I was a primary member of the small but awesome Timesense engineering team. We built systems to rapidly detect real-world events, like earthquakes and celebrity deaths, from large, noisy streams of user-generated content.
As a full-time engineer (2012 - 2013), I helped build a new system that detected events 6x faster than our earlier one. I worked with streaming and low-latency technologies like Storm, Kafka, and HBase.
I also led the organization of a science and math talk series. I gave a talk myself, on random forests.
More memorably, I worked on numerous blue-sky prototypes during Yahoo!’s quarterly Hack Days. I still miss the adrenaline, and the night I dozed off in a conference room.
As an intern (Fall 2011), I extended the trend detection system to be multithreaded and centrally configurable via a simple XML file. This cut the deployment time for new locales from effectively impossible to a few minutes.
I also implemented a research prototype to detect geographically niche search trends. It was basically an implementation of this KDD ’12 paper by our science-team counterparts.
I was one of three interns (out of 14 from my university) to be offered a full-time position.
Some resources describing our early work back in 2010, using language models to classify trending queries:
We also expose the (currently very limited) Timesense API via YQL.