Boston University Twitter Collection and Analysis Toolkit (BU-TCAT)

The BU-TCAT allows Boston University students and faculty, as well as members of the Communication Technology Division of AEJMC, to examine Tweets from the Twitter Streaming API (the so-called “gardenhose” access to Twitter) and then process the data for network analysis and visualization in Gephi or similar tools. With this open-source software, social data in the millions of units is quickly and easily sorted by algorithms to find people or items of importance on Twitter, among other analytic opportunities. JoCTEC Founding Editor Jacob Groshek oversees BU-TCAT, and interested users should contact him for more information, to gain access to the system, or to add search terms.

As a resource to Boston University and the broader research community, the BU-TCAT opens up a host of analytic options that require no programming knowledge. Detailed analytic options include:

  • Timeline of Twitter activity, with minute-by-minute timestamping
  • Tweet statistics like hashtag and retweet frequencies, geocoding and unique users
  • Specific user stats: number of friends, followers, favorites, and verification
  • Activity metrics such as user visibility by mention frequency
  • Hashtag frequency, hashtag-user activity, word and identical Tweet frequency
  • Lists of individual retweets and geocoded Tweets
  • Network graphs of mentions, co-hashtags, and hashtag-user relationships
  • Cascades, alluvial diagrams, and associational profiles

Unlike many other types of Twitter collection systems or software, BU-TCAT searches do not run out or expire; they keep running until they are turned off or deleted.
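
BU-TCAT itself requires no programming, but for readers curious about what two of the analytic options listed above compute, here is a minimal Python sketch of hashtag frequency and a mention network. It is not BU-TCAT's own code. The tweet records, their `user` and `text` fields, and the function names are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Simplified patterns; real hashtag and mention rules are more involved.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def hashtag_frequencies(tweets):
    """Count how often each hashtag appears, ignoring case."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tag.lower() for tag in HASHTAG.findall(tweet["text"]))
    return counts


def mention_edges(tweets):
    """Build weighted (author, mentioned user) edges for a mention graph."""
    edges = Counter()
    for tweet in tweets:
        author = tweet["user"].lower()
        for target in MENTION.findall(tweet["text"]):
            edges[(author, target.lower())] += 1
    return edges


# Hypothetical sample data, invented for this example.
sample = [
    {"user": "alice", "text": "Watching the debate #election2016 with @bob"},
    {"user": "bob", "text": "@alice agreed! #Election2016 #politics"},
    {"user": "carol", "text": "New study on antibiotics, thanks @alice"},
]

tag_counts = hashtag_frequencies(sample)
edges = mention_edges(sample)
```

A weighted edge list of this kind (source, target, weight) is the sort of table that network tools such as Gephi can import for visualization.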

To date, the BU-TCAT system has archived more than 500 million Tweets (and counting) on a wide range of topics, such as Ferguson, antibiotics, #election2016, Netflix, politics, and more.