On 01/11/2015 12:34 AM, Tiru Srikantha wrote:
Yeah, I'm looking at this as well. I don't like losing bulk support, because that means things get slow under high load due to HTTP overhead. If I didn't care about bulk it'd just be a quick rewrite.

The other problem I ran into with bulk operations on a buffered queue is that there are four points at which you want to flush to the output:

1. The queued message count equals the configured count to send.
2. The queued size plus the new message would exceed the max bulk message size, even if the count is below the max, due to large messages.
3. A timer expires, forcing a flush even though neither the count nor the max size has been hit, so data reaches the output target in a timely manner.
4. The plugin is shutting down.

Because of this I'm rewriting a lot of the plugins/buffered_output.go file in my local fork, teasing the "read the next record from the file pile" operation apart from the "send the next record" operation. What I'm aiming for is something like a BulkBufferedOutput interface with a SendRecords(records [][]byte) method that must be implemented in lieu of BufferedOutput's SendRecord(record []byte), plus some code shared between them to buffer new messages and read buffered messages. SendRecords would not advance the cursor until the bulk operation succeeded, as you might expect.
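To make those four flush points concrete, here is a minimal sketch of a batcher that triggers on each of them. The Batcher type, its field names, and the thresholds are all hypothetical illustrations, not Heka's actual API; points 3 and 4 are covered by the caller invoking Flush directly from a timer or shutdown path.

```go
package main

import "fmt"

// Batcher accumulates serialized records and reports when a flush is due.
// All names and thresholds here are a hypothetical sketch, not Heka's API.
type Batcher struct {
	records  [][]byte
	size     int // total bytes currently queued
	maxCount int // flush point 1: batch count threshold
	maxSize  int // flush point 2: max bulk message size in bytes
}

// Add queues a record and returns any batch that became due.
func (b *Batcher) Add(rec []byte) (batch [][]byte) {
	// Point 2: flush first if adding rec would exceed the max bulk size.
	if len(b.records) > 0 && b.size+len(rec) > b.maxSize {
		batch = b.Flush()
	}
	b.records = append(b.records, rec)
	b.size += len(rec)
	// Point 1: the count threshold was reached (skipped if we already
	// flushed above; that batch still has to be sent by the caller).
	if batch == nil && len(b.records) >= b.maxCount {
		batch = b.Flush()
	}
	return batch
}

// Flush hands back everything queued. The caller invokes it directly for
// point 3 (a flush timer expiring) and point 4 (plugin shutdown).
func (b *Batcher) Flush() [][]byte {
	out := b.records
	b.records, b.size = nil, 0
	return out
}

func main() {
	b := &Batcher{maxCount: 3, maxSize: 10}
	fmt.Println(len(b.Add([]byte("aa"))))     // 0: nothing due yet
	fmt.Println(len(b.Add([]byte("bb"))))     // 0
	fmt.Println(len(b.Add([]byte("cc"))))     // 3: point 1, count hit
	fmt.Println(len(b.Add([]byte("dddddd")))) // 0
	fmt.Println(len(b.Add([]byte("eeeeee")))) // 1: point 2, size would overflow
	fmt.Println(len(b.Flush()))               // 1: timer/shutdown flush
}
```

The caller is responsible for actually sending each returned batch (and, per the cursor point above, only advancing the cursor after the send succeeds).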
I recommend reading my other message in this thread (http://is.gd/kzuhxs) for an alternate approach. I think you can achieve what you want, with less effort and better separation of concerns, by doing the batching *before* you pass data into the BufferedOutput. Ultimately each buffer "record" is just a slice of bytes; the buffer doesn't need to know or care whether that slice contains a single serialized message or an accumulated batch of them.
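For example, a batch can be framed into a single byte slice before it ever reaches the buffer, so no new buffering interface is needed. In this sketch joinBatch and sendRecord are hypothetical names (sendRecord stands in for the existing single-record path), and the newline framing is just one option that the output target might accept:

```go
package main

import (
	"bytes"
	"fmt"
)

// joinBatch frames an accumulated batch of serialized messages into one
// byte slice. The newline framing is only an example (it happens to match
// the Elasticsearch bulk API); any framing the output target accepts works.
func joinBatch(msgs [][]byte) []byte {
	var buf bytes.Buffer
	for _, m := range msgs {
		buf.Write(m)
		buf.WriteByte('\n')
	}
	return buf.Bytes()
}

// sendRecord is a stand-in for BufferedOutput's existing single-record
// path; the buffer treats whatever it receives as an opaque byte slice,
// so a whole batch can travel through it unchanged.
func sendRecord(record []byte) {
	fmt.Printf("buffered %d bytes\n", len(record))
}

func main() {
	batch := [][]byte{[]byte(`{"a":1}`), []byte(`{"b":2}`)}
	sendRecord(joinBatch(batch)) // one "record" holding the whole batch
}
```

The buffering machinery stays untouched; only the code in front of it changes.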
I'll submit a PR once I finish.
I always look forward to PRs, but I'll warn you that one taking the approach described above will likely be rejected. I'm not very keen on introducing a separate buffering interface specifically for batches when a simpler change can solve the same problems.
-r
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka

