Yesterday I started my internship with the Real Time Compute team at Twitter!
I will be working to apply the findings of my performance modelling
research on Apache Storm
topologies to Twitter’s Heron.
I arrived last week and spent the time mainly getting over my jet lag and
finding the grocery store!
I really enjoyed the onboarding sessions, learning all about Twitter’s history
(the company started out as a podcasting app!) and meeting my fellow
Twitter Heron and Dhalion
Two recent papers have given more detail on Twitter’s distributed stream
processing system Heron (the successor to Apache Storm).
The first paper, Twitter Heron: Towards Extensible Streaming Engines,
details how Heron has evolved into a modular architecture that makes it
extremely flexible. Adrian Colyer gives a good summary of the paper on his
blog, The Morning Paper.
The main takeaway for me is the ease with which Heron can be augmented with new
features. As I read it I saw how I could adapt my modelling work
into a module that could aid in auto-scaling Heron topologies. Of course, not
long after this, the second paper,
Dhalion: Self-Regulating Stream Processing in Heron, came out. Someone had
already done it, but this is by no means a bad thing.
As before, Adrian Colyer gives a good summary
of the paper. Dhalion is a framework for regulating and maintaining a running
Heron cluster. The authors have implemented monitoring, diagnostic and remedial
systems to identify and fix performance issues and have shown impressive
results.
However, I see one issue with Dhalion’s current approach to resolving
performance problems. The system currently has no way to know if a proposed
resolution is likely to succeed before it is deployed. Dhalion will implement a
resolution and observe its results in order to assess if it is successful.
Updating a Heron topology (the equivalent of rebalancing in Storm) has a
latency cost and it takes some time for the topology’s performance to stabilise
after an update. Only after the topology has stabilised is it clear if a
resolution has been successful or not. If not, then Dhalion has to repeat the
process, potentially incurring further latency costs.
If Dhalion had a way to model the effects of a proposed resolution then it
could iterate to an effective solution much faster. The authors of Dhalion have
already done the lion’s share of the work, providing monitoring code and
deployment options. I am looking forward to them open-sourcing Dhalion so I can
investigate how easy it would be to integrate a modelling system, like the one
I am developing for Storm, into the resolution process.
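
To make the argument concrete, here is a toy Python sketch of the two approaches. None of this is Dhalion's or Heron's actual API: `ToyTopology`, `predicted_healthy`, `resolve_by_trial` and `resolve_with_model` are hypothetical names I made up for illustration, and the "model" simply assumes throughput scales linearly with parallelism, which a real topology will not do exactly.

```python
from dataclasses import dataclass


@dataclass
class ToyTopology:
    """Hypothetical stand-in for a running topology with one bottleneck component."""
    arrival_rate: float        # tuples/s arriving at the bottleneck
    per_instance_rate: float   # tuples/s a single instance can process
    parallelism: int           # current number of instances
    deployments: int = 0       # number of updates applied so far

    def deploy(self, parallelism: int) -> None:
        # Each call stands for one costly update-and-stabilise cycle.
        self.parallelism = parallelism
        self.deployments += 1

    def is_healthy(self) -> bool:
        # In a real system this is only known once the update has stabilised.
        return self.parallelism * self.per_instance_rate >= self.arrival_rate


def predicted_healthy(topology: ToyTopology, parallelism: int) -> bool:
    """Toy performance model (assumption: linear scaling with parallelism)."""
    return parallelism * topology.per_instance_rate >= topology.arrival_rate


def resolve_by_trial(topology: ToyTopology, step: int = 1, max_parallelism: int = 64) -> int:
    """Deploy a change, observe the result, and repeat until healthy."""
    while not topology.is_healthy() and topology.parallelism < max_parallelism:
        topology.deploy(topology.parallelism + step)
    return topology.deployments


def resolve_with_model(topology: ToyTopology, step: int = 1, max_parallelism: int = 64) -> int:
    """Search candidate configurations against the model, then deploy once."""
    candidate = topology.parallelism
    while not predicted_healthy(topology, candidate) and candidate < max_parallelism:
        candidate += step
    if candidate != topology.parallelism:
        topology.deploy(candidate)
    return topology.deployments
```

With an arrival rate of 1000 tuples/s, 150 tuples/s per instance and a starting parallelism of 2, the trial loop goes through five deployments before the topology is healthy, while the model-guided loop needs only one. The toy model here is perfect by construction; a real model would be approximate, so the deployed result would still need to be observed and checked.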
Python North East - The Next Generation
This month I officially take the reins of the Python North East user group.
Scott Walton and I are taking over
from Rowan Hargreaves and Kieran
Darcy’s fine stewardship.
I’m looking forward to the opportunity to meet many Python people and encourage
new members to try the best programming language there is!
We are in the process of refreshing the website and have set up a new chat
on Gitter. If you want to get in touch, the best way is either on Gitter or via
our Twitter account @pythonnortheast.