What the Tweetgeister?

This is a visualization of the tweets associated with Ignite Austin #2 held on October 20, 2010. A "live" visualization exists, but one will need modern-and-hip browsers (Chrome, Firefox, Safari) to see it. The technical term for this layout is a "circle packing layout", which provides a reasonably space-efficient presentation of hierarchical data. Wasted space exists (the empty part of the circles), but the viz very quickly reveals the clusters with the largest number of tweets. These populous clusters represent memes that formed during the span of time that people were tweeting about something - the Twitter zeitgeist: The Tweetgeist.
Each red dot is an individual tweet hashtagged with "#ia2" (link may not return anything in a week or two - Twitter search doesn't reach very far into the past). Each gray circle is a clustering of tweets. I used semantic techniques to perform the clustering, so that similar content is grouped together. For example:

shows a cluster related to the words "Fonts are the clothes that words wear". That is the text of the tweet that is most representative of this cluster and the nine red dots are tweets that fit into this cluster. The yellow box is a tool tip that shows the actual cluster "concept".
When the cursor is placed over a tweet, the actual text of the tweet shows up. In this case, that text is "Typography: think of the clothes words wear". Similar to the cluster concept, but not identical. The clusterer sub-clustered the cluster (did that make sense? I think it did). There are two main sub-clusters: one with one tweet and one with eight tweets. The eight-tweet cluster is further decomposed.
Now the hover-text shows the content of a tweet inside the innermost cluster. Using Twitter-speak, this tweet is a re-tweet ("RT") of something "schnee" tweeted. Note that the text "Fonts are the clothes..." is the text of the outermost cluster. The clusterer associated "Fonts are the clothes that words wear", "Typography: think of the clothes words wear" and various re-tweets. That this works is pretty nifty.
Also note that "schnee" is me.
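The post does not say which semantic technique did the clustering, so what follows is only a stand-in: a minimal Python sketch that scores how alike two tweets are using Jaccard overlap of their word sets. The function names and the exact wording of the re-tweet are assumptions made for illustration, not the Tweetgeister code.

```python
import re


def tokens(text):
    """Lowercase a tweet and return its set of words (punctuation dropped)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two tweets: shared words / all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


concept = "Fonts are the clothes that words wear"
variant = "Typography: think of the clothes words wear"
# Assumed wording of a re-tweet; the real tweet text is not quoted in the post.
retweet = "RT @schnee: Fonts are the clothes that words wear"

print(jaccard(concept, retweet))
print(jaccard(concept, variant))
```

Under this crude measure the re-tweet scores higher against the cluster concept than the paraphrase does, which is at least consistent with re-tweets being grouped most tightly. A genuinely semantic clusterer would presumably handle paraphrases better than plain word overlap.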
Using the live version of the tweetgeist, one can click on a tweet and open the Twitter page of the tweet, which is a little sugar.
"Tweetgeister" is the service that creates tweetgeists. A very private service - doing a tweetgeist is horrendously expensive in terms of time and compute resources. While I could fix both of those, the fix would cost money and hey, this is a hobby.
Posted at 11:30AM Oct 22, 2010 by schnee in General
American Politics, Redux, Redux
I have a dataset and I'm not afraid to use it. Over and over. Ad nauseam.
Two more quick riffs on the Legislative timelines (part one, part two). I decided to ignore the third parties and independents (because I'm an American, dammit - that's what we do!) and take a closer look at how the two dominant parties (Democratic and Republican) relate. I flipped the Democrats below the timeline so as to have a field of red and a field of blue. I also added decade markers to the timeline.
Senate
House
Both visualizations show initial Republican activity followed by a period of less activity (more pronounced in the Senate). Then, around 1860, the Republican party comes back to stay.
- 1860 - "the Party of Abraham Lincoln" indeed
- I'm guessing that the Republican party of the early 1800s has little in common with the later incarnation. But I'm no Political Science / History major.
(there's a story here, leading all the way to Liz Claiborne - yes, THAT Liz Claiborne)
The two parties firmly establish themselves as the status quo in the 1880s and America has had them ever since. The 1930s (the Great Depression) show a significant number of Democrats (in the Senate) launching their arcs but then in the 1940s (post-WWII), the Republicans seem to answer.
The visualizations do not show the number of Legislators who "share an arc" - it is not possible to look at an arc and determine if it represents more than one Legislator. Further riffs on the viz may do this.
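One way such a riff could work, sketched in Python: since the data is resolved only to the year, legislators with the same first and last year land on the same arc, so counting identical (start, end) pairs gives the number sharing each arc. The record layout and the sample names are made up for illustration.

```python
from collections import Counter


def arc_multiplicity(terms):
    """Count how many legislators share each (start_year, end_year) arc."""
    return Counter((start, end) for _name, start, end in terms)


# Made-up records for illustration: (name, first year, last year).
terms = [
    ("Legislator A", 1861, 1863),
    ("Legislator B", 1861, 1863),
    ("Legislator C", 1861, 1867),
]

counts = arc_multiplicity(terms)
# Keep only the arcs drawn for more than one legislator.
shared = {arc: n for arc, n in counts.items() if n > 1}
print(shared)
```

The counts could then drive something visible, such as stroke width or opacity, though that is a design choice the posts do not make.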
Posted at 09:15AM Feb 24, 2010 by schnee in General
American Politics, Redux
Visualizing the length of time that US Senators spend in office exposed some not-altogether unexpected details: Senators are serving longer. But what about the men and women who serve in the US House of Representatives?
The viz shows pretty much the same obvious conclusion: more Representatives hold office for longer as the United States gets older. Some notes about this data:
- Approximately 2700 Legislators are represented, which I don't believe is the complete list, but the complete list may not exist. That's too bad.
- The data I found is resolved only to the year, e.g. '(1865-1867)'. Inter-session appointments are thereby masked, as are the specific dates of service. To simplify the data a bit, I assumed that Representatives served from Jan 1 of the first year to Dec 31 of the last year and trusted the Law of Large Numbers to smooth over any lies that may tell.
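The Jan 1 to Dec 31 simplification in the second note can be sketched in Python as follows. Only the '(1865-1867)' range format comes from the data described above; the function name and the regular expression are assumptions.

```python
import re
from datetime import date

# Matches a year range such as '(1865-1867)'.
YEAR_RANGE = re.compile(r"\((\d{4})-(\d{4})\)")


def service_dates(text):
    """Turn a '(YYYY-YYYY)' range into (Jan 1 of first year, Dec 31 of last year)."""
    match = YEAR_RANGE.search(text)
    if match is None:
        raise ValueError(f"no year range found in {text!r}")
    first, last = int(match.group(1)), int(match.group(2))
    return date(first, 1, 1), date(last, 12, 31)


start, end = service_dates("(1865-1867)")
print(start, end, (end - start).days)
```

Every Representative with the same pair of years gets exactly the same start and end dates, so their arcs coincide.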

The simplification in the 2nd note reveals a couple of nice things:
- The striations are more easily seen. 1-, 2-, 3-, 4-term Reps are identifiable
- Party Revolutions stand out
Posted at 03:11PM Feb 22, 2010 by schnee in General
The Decline of American Politics

The Founding Fathers created the United States with the concept of citizen servants in mind. Men (originally, and land-owners at that) would represent constituents for a time and then return to their lives.
After being elected to two terms of the Presidency, George Washington decided not to run again, not wanting to create a democratic monarchy. Much later, in 1951, the Twenty-second Amendment codified Presidential Term Limits in the US Constitution. Those term limits never trickled down to the Senate or the House.

This figure (pops out to a much larger version) shows the time in office of every man and woman who, as of Feb 20, 2010, has ever served as a Senator of the United States. The x-axis begins on (anyone? anyone?) March 4, 1789 with 21 Senators launching their arcs. The x-axis is scaled linearly in time, so the length of each arc is proportional to that Senator's time in office. The height is related to the length of the arc.
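A minimal Python sketch of the date-to-pixel mapping described above. The two dates come from the post; treating each arc as a semicircle (so its height is half its length) and putting the right edge of the axis at the snapshot date are assumptions, as are the function names.

```python
from datetime import date

ORIGIN = date(1789, 3, 4)   # start of the x-axis, per the post
AS_OF = date(2010, 2, 20)   # date the data was current, per the post


def x_for(day, width_px):
    """Map a calendar date to a horizontal pixel position on the timeline."""
    span_days = (AS_OF - ORIGIN).days
    return (day - ORIGIN).days / span_days * width_px


def arc_for(start, end, width_px):
    """Return (center_x, radius) of a semicircular arc over a term of service.

    Assumption: the arc is a semicircle, so its height equals its radius,
    which is half its length along the axis.
    """
    x0, x1 = x_for(start, width_px), x_for(end, width_px)
    return (x0 + x1) / 2, (x1 - x0) / 2


# A term starting on day one of the axis and lasting roughly six years.
print(arc_for(ORIGIN, date(1795, 3, 3), 2000))
```

Because the mapping is linear, a Senator who serves twice as long gets an arc that is twice as long and, under the semicircle assumption, twice as tall.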
The right end of the image shows several arcs cut off in the middle. As I cannot predict the future, I decided that the durations of sitting Senators should be twice the time-served. Unrealistic, since that predicts that Senator Robert Byrd (D-WV) will serve for over 100 years. And, since he is over ninety, serving an additional 50+ years puts him at over 140 years old when he retires. While I wish him a long and healthy life, this seems unreasonable. The flip-side is likely true as well - I suspect Scott Brown will serve a bit longer than 34 days.
This image assumes that currently sitting Senators will not serve past the current Legislative Session (ends on Jan 3, 2011). Unrealistic as well, but at least not leading to Methuselaian Senators.
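The two ways of guessing an end date for sitting Senators can be sketched in Python as below. The snapshot and session-end dates come from the post; the function name, the rule labels, and the start date used for Senator Byrd (January 3, 1959) are assumptions added for illustration.

```python
from datetime import date

AS_OF = date(2010, 2, 20)        # snapshot date, per the post
SESSION_END = date(2011, 1, 3)   # end of the then-current session, per the post


def projected_end(start, rule, as_of=AS_OF):
    """Guess when a sitting Senator's arc ends.

    rule="double":  total service is twice the time served so far.
    rule="session": service stops when the current session ends.
    """
    if rule == "double":
        return as_of + (as_of - start)
    if rule == "session":
        return SESSION_END
    raise ValueError(f"unknown rule: {rule!r}")


byrd_start = date(1959, 1, 3)  # assumed start of Senator Byrd's service
doubled = projected_end(byrd_start, "double")
print(doubled, round((doubled - byrd_start).days / 365.25, 1), "years")
print(projected_end(byrd_start, "session"))
```

The doubling rule exaggerates most for the longest-serving Senators and least for the newest, which is why it overshoots for Byrd and likely undershoots for Scott Brown.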
One can see that over time, the heights of the arcs have increased as Senators serve more and more terms. The next image shows a crop of the larger image which reveals vertical plateaus, corresponding to one-, two-, three-, and four-term Senators.
Finally, I wanted to see if Party Affiliation shows anything interesting.
Red is Republican. Blue is Democrat. Black is Other (while historically relevant, coloring the "American (Know-Nothing)" and the "Farmer Alliance" parties doesn't seem useful). We've polarized into a two-party system (no duh) over time, and the visualization shows a Democratic bias to serving longer terms at the moment, but I haven't performed any analysis to draw any definite conclusions. Similar breakdowns are possible: state or region (North, South, Southwest, etc.) come to mind, but weren't created.
The electoral cycle becomes evident on the bottom of the timeline - the arcs tend to begin and end on concentrated points which are probably 2 years apart (1/3 of the Senate is elected every two years). Arbitrary data sets may not show this periodicity: employment duration at a company, call-center call times, and engineering compile times would all likely reveal different patterns. However, some trends may emerge if, for example, companies instituted retention policies or made a concerted effort to reduce compile times.
I extracted the base data from the PDF file located on the US Senate Art & History site. The author of this data did not make it easy to extract, but fortunately, I know a thing or two about programming. A few hours of Java hacking later, I had a little application that parses the textual data and outputs drawing commands used by ImageMagick to create the actual image.
Information is Beautiful inspired the visualization. I find this site well worth the time to read.
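The original application was written in Java and is not shown, so the following Python sketch only illustrates the general idea of emitting ImageMagick drawing commands for arcs. It relies on the `ellipse` primitive of ImageMagick's `-draw` option; the function names, colors, canvas size, and output file name are all assumptions.

```python
import shlex


def ellipse_primitive(center_x, baseline_y, radius):
    """Build an ImageMagick draw primitive for the upper half of a circle.

    The ellipse primitive takes 'cx,cy rx,ry start,end' with angles in
    degrees; 180 to 360 should sweep over the top because y grows downward.
    """
    return f"ellipse {center_x:.1f},{baseline_y:.1f} {radius:.1f},{radius:.1f} 180,360"


def convert_command(arcs, width, height, out_path="arcs.png"):
    """Assemble a `convert` argument list that draws every (cx, radius, color) arc."""
    baseline_y = height - 1  # arcs sit on the bottom edge of the canvas
    args = ["convert", "-size", f"{width}x{height}", "xc:white", "-fill", "none"]
    for center_x, radius, color in arcs:
        args += ["-stroke", color, "-draw", ellipse_primitive(center_x, baseline_y, radius)]
    args.append(out_path)
    return args


# Two made-up arcs: (center x, radius, stroke color).
cmd = convert_command([(100, 50, "red"), (180, 30, "blue")], width=400, height=200)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell where ImageMagick is installed; newer installations may expect `magick` in place of `convert`.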
The images are hosted on Flickr and are licensed under the Creative Commons "Attribution" license. Various sizes of the images are available at Flickr.
Posted at 05:24PM Feb 20, 2010 by schnee in General







