Thanks for asking; I'm definitely overdue to provide an update. While some of you might want to know the details of what's going on, I'm sure most of you don't, so I'll provide two versions of the update:

TL;DR VERSION:

1.0 will be released after generic buffering for all output plugins has landed and had some time to burn in, so we feel comfortable that it's stable. Unfortunately, performance issues related to the current filter and output plugin APIs have made implementing generic buffering a *lot* more work than expected. That work is happening alongside lots of other stuff that needs to get done, so it has taken (and will continue to take) a while to complete.

EXCRUCIATING DETAIL VERSION:

There is no current schedule for a 1.0 release. Originally we were targeting having one out by now, but other work (including the work of actually *using* Heka inside of Mozilla) has gotten in the way. Buffering for all outputs is the last major change on deck before 1.0, but it's a big enough change that I'm planning to put out a 0.10 release so it gets some break-in time before bumping the revision to 1.0.

A great deal of work has been done on the buffering front, but unfortunately it's ballooned into a bigger job than expected, so there's still a lot more to do.

We've put it together so that the buffering happens directly off of the router. Whenever a message_matcher match is found, Heka will check whether or not the plugin is using buffering. If not, the message will be placed on the plugin's input channel, as before. If so, the message will instead be placed in the plugin's disk queue. For each buffered plugin, a separate goroutine will run to pull messages off of the queue and feed them to the plugin.

So far things are pretty straightforward, but here's where they take a turn. In order to support retries, Heka needs to know whether or not a message was successfully delivered. Not only that, it needs to know synchronously, before it advances to the next record in the disk queue.
Unfortunately, Heka's current design makes this difficult; outputs run in their own goroutine, and expect to get all of their messages from the input channel. This means that telling Heka whether or not the message delivery was successful involves bi-directional synchronization between the goroutine that is pulling from the disk queue and the output plugin's main goroutine. I've come up with a way to get this all working while making minimal changes to existing plugin code, but the cross-goroutine synchronization slows everything down. With this scheme, I'm seeing TcpOutput throughput of about 50% of what we get with the existing, entirely-inside-the-plugin disk buffering. That's a non-starter.

We can get rid of the problem by getting rid of the separate goroutines. But that involves a complete overhaul of the output plugin API, so that the Go API looks more like the existing sandbox API. That means plugin authors would implement `ProcessMessage` and `TimerEvent` methods instead of receiving messages and ticker notifications over channels like they do now. Needless to say, rewriting every output (and filter, since with this approach they'll support buffering too) to use a fundamentally different API is a lot of work. I'm not excited about that, but I'm not sure how else to approach it.

My current thinking is that we'll let both of the APIs exist side by side for a while. This would mean existing plugins could support buffering, albeit with a performance hit, with only trivial code changes. Much improved performance would be possible, but it would require reimplementing the plugin using the newer API. (Note that the performance penalty only applies when buffering is used; the unbuffered performance would match what we have now regardless of which API is used.)

So my next steps are to hammer out the details of the new API, implement support for it in the Heka core, and update the TcpOutput to support it.
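
To make the method-based idea a bit more concrete, here is a minimal sketch of what such an API could look like. Only the method names `ProcessMessage` and `TimerEvent` come from the description above; the signatures, the `Output` interface, the `batchOutput` type, and the `feed` helper are assumptions invented for illustration, since the details of the real API have not been worked out yet.

```go
package main

import (
	"errors"
	"fmt"
)

// Message stands in for a Heka message; only a payload is modeled.
type Message struct{ Payload string }

// Output is a hypothetical method-based plugin interface. The signatures
// are guesses; only the method names appear in the original post.
type Output interface {
	ProcessMessage(m Message) error // an error means "not delivered, retry"
	TimerEvent() error              // replaces the ticker channel
}

// batchOutput collects payloads and flushes them on each timer event.
type batchOutput struct {
	pending  []string
	flushed  [][]string
	failNext bool // simulates one transient delivery failure
}

func (o *batchOutput) ProcessMessage(m Message) error {
	if o.failNext {
		o.failNext = false
		return errors.New("temporarily unavailable")
	}
	o.pending = append(o.pending, m.Payload)
	return nil
}

func (o *batchOutput) TimerEvent() error {
	if len(o.pending) > 0 {
		o.flushed = append(o.flushed, o.pending)
		o.pending = nil
	}
	return nil
}

// feed calls the plugin directly from the caller's goroutine, so the
// return value is the delivery result and no cross-goroutine handshake
// is needed. It returns how many messages were accepted before the
// first failure.
func feed(out Output, msgs []Message) (int, error) {
	for i, m := range msgs {
		if err := out.ProcessMessage(m); err != nil {
			return i, err
		}
	}
	return len(msgs), nil
}

func main() {
	out := &batchOutput{failNext: true}
	msgs := []Message{{Payload: "a"}, {Payload: "b"}}

	n, err := feed(out, msgs)
	fmt.Println("first pass accepted:", n, "error:", err)

	// Resume from the record that failed.
	n, err = feed(out, msgs[n:])
	fmt.Println("retry accepted:", n, "error:", err)

	_ = out.TimerEvent()
	fmt.Println("flushed batches:", out.flushed)
}
```

Because the caller invokes `ProcessMessage` as an ordinary method call, success or failure comes back as a return value rather than through a second channel, which is where the channel-based scheme loses its throughput.
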

I'll keep plugging away at it, but that work is happening alongside a number of other important initiatives, so it will likely take a while longer.

Hope this answers your questions. Ideas, comments, feedback welcome as always.

-r

On 04/23/2015 10:52 AM, Tom Davis wrote:
The recent 0.9.2 release prompted me to wonder again about what might be planned for 1.0. The milestone on GitHub seems quite outdated, aside from the generic buffering for output plugins (which sounds great).

Rob, Mike, do you guys have any big goals for 1.0? Any timeline? (I don't personally have a burning desire for a particular number, it's just commonly seen as the time when most of the big, breaking stuff has been finished and I'm wondering what's next for Heka)

Thanks!

_______________________________________________
Heka mailing list
Heka@mozilla.org
https://mail.mozilla.org/listinfo/heka