The US military has enlisted academics to fight a new enemy: Twitter bots.
The Defense Advanced Research Projects Agency (DARPA) held a special contest last year to identify so-called "influence bots" — "realistic, automated identities that illicitly shape discussion on sites like Twitter and Facebook."
The fascinating 4-week competition, called the DARPA Twitter Bot Challenge, was detailed in a paper published this week.
The paper minces no words about how dangerous it is that human-like bots on social media can accelerate recruitment to organizations like ISIS, or grant governments the ability to spread misinformation to their people. Proven uses of influence bots in the wild are rare, the paper notes, but the threat is real.
And so, the surprisingly simple test. DARPA placed "39 pro-vaccination influence bots" onto a fake, Twitter-like social network. Importantly, competing teams didn't know how many influence bots there were in total.
Teams from the University of Southern California, Indiana University, Georgia Tech, Sentimetrix, IBM, and Boston Fusion worked over the four weeks to find them all.
With 8.5% of all Twitter users being bots, per the company's own metrics, it's important to weed out the bots that go beyond trying to sell you weight-loss plans and work-at-home methods and cross the line into politics.
But actually making that distinction can be a challenge, as the paper notes.
Sentimetrix technically won the challenge, reporting 39 correct guesses and one false positive, a full six days before the end of the four-week contest period. But USC was the most accurate, going 39 for 39.
How to detect a robot
DARPA combined the teams' various approaches into a three-step process, each step of which will need better software support to become faster and more accurate:
- Initial bot detection: Language analysis can show who's using statistically unnatural, bot-generated words and phrases. Using multiple hashtags in a post can also be a flag. And if an account posts a lot, and consistently across all 24 hours of the day, the chances it's a bot go up.
- Clustering, outliers, and network analysis: That first step may only identify a few bots. But bots tend to follow bots, so you can trace connections outward from your initial findings and get a good statistical sense of robot social circles.
- Classification/Outlier analysis: The more positives you find with the first two steps, the easier it is to extrapolate from them and find the rest.
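
To make the first two steps concrete, here is a minimal Python sketch. It is not the code any of the teams used. The `Account` structure, the three red flags, and every threshold are hypothetical, chosen only to mirror the signals described above: heavy posting, activity spread across the whole day, lots of hashtags, and bots following bots.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical record for one account on the test network."""
    name: str
    post_hours: list[int]          # hour of day (0-23) of each post
    hashtags_per_post: list[int]   # number of hashtags in each post
    follows: set[str] = field(default_factory=set)


def initial_score(acct, min_posts=50, min_active_hours=20, max_avg_hashtags=2.0):
    """Step 1: count how many illustrative red flags an account trips (0-3)."""
    flags = 0
    if len(acct.post_hours) >= min_posts:                # posts a lot
        flags += 1
    if len(set(acct.post_hours)) >= min_active_hours:    # active around the clock
        flags += 1
    tags = acct.hashtags_per_post
    if tags and sum(tags) / len(tags) > max_avg_hashtags:  # hashtag-heavy posts
        flags += 1
    return flags


def expand_by_network(accounts, seeds, min_fraction=0.5):
    """Step 2: repeatedly add accounts that mostly follow known suspects."""
    suspects = set(seeds)
    changed = True
    while changed:
        changed = False
        for acct in accounts:
            if acct.name in suspects or not acct.follows:
                continue
            fraction = len(acct.follows & suspects) / len(acct.follows)
            if fraction >= min_fraction:
                suspects.add(acct.name)
                changed = True
    return suspects


# Made-up data: one loud bot, two quiet bots, one ordinary user.
accounts = [
    Account("bot_a", [i % 24 for i in range(60)], [3] * 60, {"bot_b"}),
    Account("bot_b", [9, 10, 11, 12, 13], [0, 0, 1, 0, 0], {"bot_a", "bot_c"}),
    Account("human", [8, 9, 12, 17, 18], [0, 0, 0, 1, 0],
            {"bot_a", "friend1", "friend2"}),
    Account("bot_c", [14, 15], [1, 0], {"bot_a", "bot_b"}),
]

# Step 1 finds only the loud bot; step 2 pulls in the quiet ones.
seeds = {a.name for a in accounts if initial_score(a) >= 2}
suspects = expand_by_network(accounts, seeds)
print(sorted(seeds), sorted(suspects))
```

On this toy data, the content heuristics catch only the noisiest account, and the follow graph surfaces the two quieter ones while leaving the ordinary user alone. In practice the output would be a shortlist for people to review, not a verdict.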
A key finding from the DARPA paper, and very important to note, is that all of this required human interaction — computers just can't tell a real human from an influence bot, at least not yet.
The good news, say the authors in their paper, is that these methods can also be used to find human-run propaganda and misinformation campaigns.
The bad news is that you can expect a lot more evil propaganda bots on Twitter in the years to come.
"Bot developers are becoming increasingly sophisticated. Over the next few years, we can expect a proliferation of social media influence bots as advertisers, criminals, politicians, nation states, terrorists, and others try to influence populations," says the paper.