On 05/18/2015 02:10 AM, Kai Storbeck wrote:
Hello Heka,
I'm currently streaming a logfile containing large XML messages. They
are separated by a long line with dashes, so I'm making use of
RegexSplitter containing those dashes.
This works, the messages are getting thrown over to elasticsearch for
indexing.
Restarting heka will now give me an error:
> 2015/05/18 10:29:56 Decoder 'b2bsoap-b2bdecoder-1' error: No match: ..
> .....
> .....
> ...
> </closing xml tag>
I perceive that this is a problem in the bookkeeping of the seek
position, as that points to the middle of a multiline record.
Yes, I think that's correct.
Can I assist in curing this? Is it curable? Is it a good starting point
to help improving heka? Or are there smaller outstanding issues to
assist with...
Sure, your help resolving this would be welcome. I took a quick peek and I
think that the issue is related to the following code:
https://github.com/mozilla-services/heka/blob/dev/plugins/logstreamer/logstreamer_input.go#L359
That's the LogstreamInput (a pool of which are managed by each LogstreamerInput)
telling the underlying stream to update the ring buffer with the latest read
position. You'll notice that it's happening there whenever n > 0, i.e. whenever
any data is successfully read from the input stream. What you're asking for is to
instead only update the read position if len(record) > 0, which implies that a
full record was retrieved.
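
To make the idea concrete, here is a rough standalone sketch. It is not Heka's
actual code: the `cursor` and `splitter` types and the `feed` method are
hypothetical names made up for illustration, and a fixed byte delimiter stands
in for the RegexSplitter. The point is only that the offset you persist should
advance when a full record has been seen, not on every successful read.

```go
package main

import (
	"bytes"
	"fmt"
)

// cursor tracks two offsets into the stream (hypothetical type, not Heka's).
type cursor struct {
	read      int64 // bytes consumed from the stream so far
	committed int64 // end of the last complete record; safe to persist
}

// splitter buffers incoming chunks and cuts records at delim.
type splitter struct {
	delim []byte
	buf   []byte
	cur   cursor
}

// feed consumes one chunk and returns any records it completed. The read
// offset moves on every call, but the committed offset only moves when a
// whole record (including its delimiter) has been found, which mirrors the
// len(record) > 0 condition described above.
func (s *splitter) feed(chunk []byte) [][]byte {
	s.buf = append(s.buf, chunk...)
	s.cur.read += int64(len(chunk))

	var records [][]byte
	for {
		i := bytes.Index(s.buf, s.delim)
		if i < 0 {
			break // only a partial record is left in the buffer
		}
		end := i + len(s.delim)
		// Copy the record so it does not alias the internal buffer.
		records = append(records, append([]byte(nil), s.buf[:i]...))
		s.buf = s.buf[end:]
		s.cur.committed += int64(end)
	}
	return records
}

func main() {
	s := &splitter{delim: []byte("\n-----\n")}
	chunks := []string{"<a>\n", "</a>\n-----\n<b>", "\n</b>\n-----\n"}
	for _, c := range chunks {
		recs := s.feed([]byte(c))
		fmt.Printf("records=%d read=%d committed=%d\n",
			len(recs), s.cur.read, s.cur.committed)
	}
}
```

If the process restarts after the second chunk, resuming from `committed`
rather than `read` means the decoder sees the start of the second record
instead of its tail.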
You'll want to test this out, though, rather than take my word for it. There's
a lot of code in there, and even if you change that code, the location might
still get flushed to disk at shutdown. Hopefully this is a good starting
point.
If you do tackle this, I think it would be nice to retain backwards
compatibility by turning the new behavior on with a config flag, say if the
user sets `update_cursor_on_record_boundary` to true, or something.
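
Something along these lines is what I have in mind for the flag. Again, this
is only a sketch: the `Config` struct and `shouldSavePosition` helper are
hypothetical, and `update_cursor_on_record_boundary` is just the name floated
above, not an existing option. With the flag left at its default of false, the
behavior stays as it is now.

```go
package main

import "fmt"

// Config is a hypothetical subset of the input's options. The TOML key is
// the one proposed in this thread; it is an assumption, not a real setting.
type Config struct {
	UpdateCursorOnRecordBoundary bool `toml:"update_cursor_on_record_boundary"`
}

// shouldSavePosition reports whether the read position may be recorded after
// a read that returned n bytes and yielded record (empty if no full record
// was available yet).
func shouldSavePosition(cfg Config, n int, record []byte) bool {
	if cfg.UpdateCursorOnRecordBoundary {
		// New, opt-in behavior: only move at a record boundary.
		return len(record) > 0
	}
	// Existing behavior: move whenever any data was read.
	return n > 0
}

func main() {
	var partial []byte // data was read, but no complete record yet

	oldCfg := Config{}
	newCfg := Config{UpdateCursorOnRecordBoundary: true}

	fmt.Println("default:", shouldSavePosition(oldCfg, 512, partial))
	fmt.Println("flag on:", shouldSavePosition(newCfg, 512, partial))
}
```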
-r
Regards,
Kai
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka