Re: [heka] Retry with ElasticSearch output

Rob Miller Sun, 11 Jan 2015 18:38:46 -0800


On 01/10/2015 06:57 PM, Denis Shashkov wrote:

For this one case, though, an easier fix would be to have the ES output retry, 
and/or if it fails to have the data that *would* have been sent to ES to 
instead be written out to disk.


I agree with Rob that it's easier to patch the ES output and add a retry.

I've patched heka to add buffered output (like in the TCPOutput plugin). It 
works perfectly, and we have simple monitoring of buffering and sending 
performance. But... we completely lost the ability to send bulk POSTs to ES 
because BufferedOutput works only with single messages. Yes, we might use the 
flush_* parameters, but it's pointless to keep messages in the previous buffer 
only to lose the whole bulk of them in the ES output because of network errors.

Interesting. I'm assuming you mean you've added buffering support just to the ES output, and not that you've made it generally available to *all* outputs. Am I correct?

BTW, Rob, did you consider in your feature plans how to handle this situation 
(single-message processing in BufferedOutput vs. bulk-sending outputs)?

We haven't worked out the details of the new API yet, no. But adding simple batch support to the current BufferedOutput implementation would be pretty straightforward. Currently there's a `QueueRecord(*PipelinePack) error` method that serializes the pack to a byte slice (using the specified encoder), adds framing (if necessary), and writes the output bytes to the buffer. If you add alongside this a `QueueBytes([]byte) error` method that expects to receive an already serialized byte slice, then you can pass in an entire batch of encoded records and the buffer would treat it as a single record. Then SendRecord will be passed a batch at a time, and if it returns an error the entire batch will stay in the buffer for retry.

Note that, with this approach, QueueRecord would just end up encoding the pack and then calling QueueBytes. Also, we'd have to take care with nested framing. If a batch contains multiple records that are framed using the same Heka framing that the buffer uses to demarcate record boundaries, we might have issues.

That's how we'd get there with the current code. What we end up with in the long run is TBD, but it will probably be along similar lines, where an entire batch would be buffered as a single record.


Hope this helps,

-r
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
