On 01/10/2015 06:57 PM, Denis Shashkov wrote:
For this one case, though, an easier fix would be to have the ES output retry,
and/or, if it fails, have the data that *would* have been sent to ES be
written out to disk instead.
I agree with Rob that it's easier to patch the ES output and add a retry.
I've patched Heka to add buffered output (like in the TCPOutput plugin). It
works perfectly, and we have simple monitoring of buffering and sending
performance. But... we completely lost the ability to send bulk POSTs to ES,
because BufferedOutput works only with single messages. Yes, we could use the
flush_* parameters, but it's pointless to keep messages in the previous buffer
only to lose the whole bulk of them in the ES output because of network errors.
Interesting. I'm assuming you mean you've added buffering support just
to the ES output, and not that you've made it generally available to
*all* outputs. Am I correct?
BTW, Rob, did you consider in your feature plans how to handle this situation
(per-message processing in BufferedOutput vs. bulk-sending outputs)?
We haven't worked out the details of the new API yet, no. But adding
simple batch support to the current BufferedOutput implementation would
be pretty straightforward. Currently there's a
`QueueRecord(*PipelinePack) error` method that serializes the pack to a
byte slice (using the specified encoder), adds framing (if necessary),
and writes the output bytes to the buffer. If you add alongside this a
`QueueBytes([]byte) error` method that expects to receive an already
serialized byte slice, then you can pass in an entire batch of encoded
records and the buffer would treat it as a single record. Then
SendRecord will be passed a batch at a time, and if it returns an error
the entire batch will stay in the buffer for retry.
Note that, with this approach, QueueRecord would just end up encoding
the pack and then calling QueueBytes. Also, we'd have to take care with
nested framing. If a batch contains multiple records that are framed
using the same Heka framing that the buffer uses to demarcate record
boundaries, we might have issues.
That's how we'd get there with the current code. What we end up with in
the long run is TBD, but it will probably be along similar lines, where
an entire batch would be buffered as a single record.
Hope this helps,
-r
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka