The Hacker Who Archived Parler Explains How She Did It (and What Comes Next)
The hacker, donk_enby, explained that she only scraped what was publicly available: "I hope that it can be used to hold people accountable and to prevent more death."
By Leland Nally
January 12, 2021, 12:53pm
Those plans hit a snag when Amazon Web Services, Google, and Apple deplatformed Parler, effectively erasing it off the internet, at least temporarily. Parler was an organizing and rallying point for the far-right, including many of the Capitol Hill insurrectionists, and so its erasure from the internet threatened to destroy months of posts that could be used to better understand the attack on the Capitol.
But the quick thinking of a self-described hacker by the name of donk_enby and a host of amateur data hoarders preserved more than 56.7 terabytes of data from Parler that donk_enby and open source investigators believe could be useful in piecing together what happened last Wednesday and in the weeks and months leading up to it. donk_enby was able to scrape and archive nearly the entire content of the website after it became clear that hundreds of Trump supporters had uploaded potentially incriminating photos and videos of themselves to the platform, many filmed from inside the Capitol itself.
When news of donk_enby's archival efforts broke, several viral tweets, Reddit posts, and Facebook posts claimed that she had captured private information, scans of driver's licenses and IDs, and other highly sensitive information. She said those posts are “not at all” accurate.
“Everything we grabbed was publicly available on the web, we just made a permanent public snapshot of it,” donk_enby told me.
Nevertheless, with the FBI, state and local law enforcement, and open-source investigators looking for media from Wednesday's attack, the archive could be highly useful to a whole host of people.
“I hope that it can be used to hold people accountable and to prevent more death,” she said. “I think people should be allowed to have their own opinion as long as they can act civilized, on Wednesday we saw what can happen if they don’t.”
On Saturday, Amazon Web Services announced that it would no longer host Parler, cutting the company off from one of the largest web hosts in the world. The move was set to be effective Sunday at midnight. The clock was ticking.
When rumors of Parler’s imminent deletion began to circulate, donk_enby, who has been researching Parler for months, understood that a trove of important information about America’s most prominent far-right extremist groups was at risk of being permanently hidden from the public eye. In a monumental effort, donk_enby and a few fellow hackers and researchers managed to capture and archive nearly every post, photo and video on Parler before it was shut down.
“Last night was all gas no brakes,” she told me Monday.
When word of donk_enby’s project broke online, competing theories circulated about what information had actually been pulled. What donk_enby actually did was an old-school scrape of already publicly available information. Using a jailbroken iPad and Ghidra, a piece of reverse-engineering software designed and publicly released by the National Security Agency, donk_enby managed to exploit weaknesses in the website’s design to pull the URLs of every single public post on Parler in sequential order, from the very first to the very last, allowing her to then capture and archive the contents.
The task of downloading that data, what she called the “big pull”, was a race against the clock: Amazon was set to revoke Parler’s hosting services within hours, and over 50 terabytes of data had to be pulled from the site in order to be effectively archived. After donk_enby tweeted about the content she was scraping from Parler, the Archive Team, a volunteer collection of hackers and data researchers who have saved a host of other dying sites, took notice and joined in her effort. “The Archive Team deserves a lot of credit for orchestrating the big pull,” donk_enby told me, saying that the group paid the steep server costs and constructed a tool that allowed anonymous Twitter users to volunteer their own bandwidth to help speed the transfer, which at one point peaked at 50 GB per second. The extra speed proved critical, and the group effort managed to capture 96% of Parler’s content by midnight.
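To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that "GB" means gigabytes, that units are decimal (1 TB = 1,000 GB), and that the 56.7-terabyte total reported elsewhere in this story is the amount transferred. The function name is made up for illustration. The result is only a lower bound, since the peak rate was reached "at one point" and was not sustained.

```python
# Figures reported in the article
TOTAL_TB = 56.7        # total archive size, in terabytes
PEAK_GB_PER_S = 50     # peak transfer rate, in gigabytes per second

# Assumption: decimal units, 1 TB = 1000 GB
GB_PER_TB = 1000


def min_transfer_seconds(total_tb, rate_gb_per_s, gb_per_tb=GB_PER_TB):
    """Lower bound on transfer time if the given rate were sustained throughout."""
    return total_tb * gb_per_tb / rate_gb_per_s


seconds = min_transfer_seconds(TOTAL_TB, PEAK_GB_PER_S)
minutes = seconds / 60
print(f"At a sustained peak rate: about {minutes:.1f} minutes")
```

Under these assumptions the whole archive would move in under 20 minutes at the peak rate, which suggests that the real effort, spread over hours, ran well below that peak most of the time.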
In December, donk_enby published details about Parler's iOS app on her GitHub, which Archive Team used to help them scrape the site. At the time, she posted on her GitHub that the API could be used "to solve fun mysteries such as:
- Is my dad on Parler?
- Who was on Parler before it first started gaining popularity when Candice Owens tweeted about in December 2018?
- Is Parler really the world's most secure social network? (no)"
donk_enby had originally intended to grab data only from the day of the Capitol takeover, but found that the poor construction and security of Parler allowed her to capture, essentially, the entire website. That ended up being 56.7 terabytes of data, which included every public post on Parler, 412 million files in all—including 150 million photos and more than 1 million videos. Each of these had embedded metadata like date, time and GPS coordinates—unlike most social media sites, Parler does not strip metadata from media its users upload, which, crucially, could be useful for law enforcement and open source investigators.
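For readers curious what that embedded location data looks like: photo metadata in the EXIF format stores latitude and longitude as degrees, minutes and seconds plus a hemisphere letter, which mapping tools usually need converted to signed decimal degrees. The Python sketch below shows that conversion only. The function name and the sample coordinates are made up for illustration and are not taken from the archive or from any tool donk_enby or the Archive Team used.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) and a hemisphere
    letter ('N', 'S', 'E' or 'W') into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(v) for v in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical sample values, for illustration only
lat = dms_to_decimal((40, 26, 46), "N")
lon = dms_to_decimal((79, 58, 56), "W")
print(f"{lat:.6f}, {lon:.6f}")
```

A pair of decimal coordinates like this, together with the timestamp stored alongside it, is what lets a photo or video be placed on a map.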
The data is currently being processed and should be available to browse in a couple days, according to donk_enby. Early archives of it are already cropping up as torrent files and are being shared on IRC channels and different git sites. One of the hosters posted this message on their website: "the files were shared from this site, and made into a torrent file so the distribution is mostly out of my hands now," they said. "the data has also been shared with researches and archival organizations." Metadata archives have been uploaded and new scripts have been written to help parse and plot the data.
Users of Parler have responded with threats.
“All the hate and threats I’m getting make it all the more satisfying. I don’t know the full extent of what’s in there but people are afraid,” she said.
A screenshot she posted from a group named the North Central Florida Patriots called out her Twitter handle and named her “the rat running the operation”:
“Bad news. Left extremists have captured and archived over 70TB of data from parler severs. This includes posts, personal information, locations, videos, images etc.
The intent is a mass dox and a list to hold patriots “accountable”. It is too late to scrub your data, and its already archived. There is nothing you can do to prevent whats already happened. All you can do is prepare for the fallout.”
Parler has since registered its domain with Epik, a service that hosts other similar platforms used by far-right groups like Gab and 8chan, and is now suing Amazon.
It’s worth noting that the FBI could have gotten the server information on their own, but what this kind of public dump does is empower other hackers, researchers, activists, and antifascist members of the public to identify suspects on their own and make their names and faces public. It also preserves posts organizing the insurrection and other violent threats, rhetoric, and planning done by the far-right groups involved in the takeover, an important piece of information when trying to explain the Capitol Police’s and federal government's inexplicable lack of preparation for the January 6th violence. donk_enby told me that the data is “already being processed to extract metadata, pull still frames, and maybe run some computer vision analysis.” Social media has proved a powerful force in identifying people at riots and protests. As I write this, the FBI is posting screengrabs on Twitter asking for help identifying suspects photographed inside the Capitol.
While donk_enby’s information will surely prove valuable to antifascist groups and others who have a vested interest in naming and shaming right-wing extremists, the level playing field of the internet makes it just as likely to aid the state in seeking prosecution. While donk_enby didn't archive this for the explicit purpose of helping law enforcement (she considers herself an anarcho-socialist and said the data would have utility for those leading crowd-sourced identification efforts), she acknowledges they may find it useful. “Once people start sifting through our archive, it should point them to where they can find the actual legally admissible evidence,” she said.
"I saw what online disinformation can do first hand ...I hope to inspire people like me to use their skillset for political purposes - hacking IS political,” donk_enby told me, “we’ve been slipping down a slippery slope that got us to what happened on Wednesday for a very long time.”