AO3 has been scraped, once again.
As of the time of this post, AO3 has been scraped by yet another shady individual looking to make a quick buck off the backs of hardworking hobby writers. This Reddit post here has all the details and the most current information. In short, if your fic URL ends in a number between 1 and 63,200,000 (inclusive), AND is not archive locked, your fic has been scraped and added to this database.
I have been trying to hold off on archive locking my fics for as long as possible, and I’ve managed to get by unscathed up to now. Unfortunately, my luck has run out and I am archive locking all of my current and future stories. I’m sorry to my lovelies who read and comment without an account; I love you all. But I have to do what is best for me and my work. Thank you for your understanding.
An excerpt from the linked post:
“The dataset of AO3 on HuggingFace is currently disabled, meaning: you can’t download it, but you can still see the relevant information about the dataset, and it could become available again if the copyright infringement/DMCA takedown requests are countered. As of April 23 (today), the AO3 dataset has only 4 copyright infringement notices. I encourage everyone to file one, since (quoting): ‘the scraper has not agreed to take down the entire repo. At this time, the scraper has agreed with taking down art from the person who owns the copyright. That means each of you will need to request a takedown.’
“EDIT: I apologize for not including this in the OG post, but yes, as others in the comments have said, the database ‘was created by processing works with IDs from 1 to 63,200,000 that are publicly accessible.’ Work ID means the number in the URL of the work, so if your work has an ID between 1 and 63,200,000, then your work is in the dataset and you can file a DMCA or a copyright infringement notice. The CSV thing on PaperDemon is just a list that you privately (via email) send to the user who made the dataset so they can identify your work in the dataset and delete it. So you can just copy and paste your works’ IDs into an Excel file and send that.”
Please go to the post for more info.
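If you have a lot of fics and don't want to check each URL by eye, here is a rough Python sketch of the ID check described above. It assumes your work URLs contain `/works/<number>`, which is my guess at the usual AO3 link shape and not something the linked post spells out. The function and constant names are made up for illustration. It only checks the ID range; it cannot tell whether a work was archive locked when the scrape happened.

```python
import re

# Upper bound of the scraped ID range, as quoted in the linked post.
MAX_SCRAPED_ID = 63_200_000


def work_id(url):
    """Return the work ID found in a URL, or None if there isn't one.

    Assumes the URL contains "/works/<number>" (an assumption about
    the link format, not something confirmed by the post).
    """
    match = re.search(r"/works/(\d+)", url)
    return int(match.group(1)) if match else None


def in_scraped_range(url):
    """True if the work ID is between 1 and 63,200,000 inclusive."""
    wid = work_id(url)
    return wid is not None and 1 <= wid <= MAX_SCRAPED_ID


if __name__ == "__main__":
    # Placeholder URLs; swap in your own work links.
    my_urls = [
        "https://archiveofourown.org/works/12345",
        "https://archiveofourown.org/works/70000000",
    ]
    for url in my_urls:
        print(url, in_scraped_range(url))
```

The IDs that come back in range are the ones you would list in a takedown request or in the spreadsheet mentioned in the excerpt.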