On 11/16/2015 12:19 PM, Timur Batyrshin wrote:
Hi Rob,
Parsing heka.all-report messages looks fine as well as having
DashboardOutput.
I was just hoping that there is already something ready for that :-)
No, alas. Should be pretty easy w/ cjson in a SandboxFilter, though. I'd
happily add such a filter to the Heka repo to make it available for everyone.
One more question related to my original issue but a bit away from topic
on performance:
As I’ve mentioned earlier I run heka agents on multiple boxes.
They collect local data and send it via plain TCP to heka
aggregator. This central heka server then sends traffic further to
storage via HTTP.
In my setup I see caches/logfiles on disk growing only on the collectors but
not on the aggregator.
This means that if network connectivity issues appear on the aggregator, the
caches will fill up disks on the end-boxes but not on the aggregator, right?
Yes.
Also looks like they'll even start dropping the messages.
Yup.
So the question is:
Is there a way to make HttpOutput cache outgoing traffic in the same way
TcpOutput does?
Heka v0.9 supports buffering only for the TcpOutput and ElasticSearchOutput.
Heka v0.10 adds support for disk buffering for *all* filter and output plugins.
Unfortunately, this has had some stability issues, and I've been too busy doing
things other than working on the Heka core to resolve them yet. One of
the known problems (https://github.com/mozilla-services/heka/issues/1738) seems
to have been resolved and will hopefully be merged to the versions/0.10 branch
this week, but there's another issue where Heka is generating idle pack
diagnostic messages that I'm pretty sure is related to the new disk buffering
(https://github.com/mozilla-services/heka/issues/1699) that I haven't yet had a
chance to debug. That's the last blocker I know about for a 0.10.0 final
release. I wish I could give you a time-table for resolving it, but I can't
beyond saying that getting the 0.10.0 final release out the door is on my list
of 2015 Q4 goals.
Or is there any other way to add persistence to it (without running
external services)?
The upcoming buffering is probably your best choice. There are hackish, fairly
painful manual options, such as also writing your data out to a FileOutput, or
maybe even using a SandboxFilter with `preserve_data = true` to hold on to a
sliding window of the latest set of records, but there's no support for knowing
what was missed and automatically retrying it. You'd basically have to
cross-reference what arrived with what didn't, and then manually extract the
missing records from your backup storage and add them to the end data store by hand.
Pretty painful. Sorry. :P
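
For what it's worth, the cross-referencing step could be sketched roughly like
this in Python. This is not anything Heka ships. It assumes the FileOutput
backup was written as one JSON object per line, that each record carries a
unique ID in a field called "Uuid", and that you can get the set of IDs that
actually arrived from your end data store. The function name and the field
name are just placeholders.

```python
import json
from typing import Iterable, Iterator, Set


def missing_records(backup_lines: Iterable[str], stored_ids: Set[str],
                    id_field: str = "Uuid") -> Iterator[dict]:
    """Yield backup records whose ID never reached the end data store.

    Assumptions (not from Heka itself): the backup is newline-delimited
    JSON and every record has a unique value under `id_field`.
    """
    already_yielded = set()
    for line in backup_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the backup file
        record = json.loads(line)
        record_id = record[id_field]
        # Skip records that arrived, and duplicates within the backup.
        if record_id in stored_ids or record_id in already_yielded:
            continue
        already_yielded.add(record_id)
        yield record
```

You'd feed it the open backup file plus the set of IDs from the store, and
then re-submit whatever it yields by hand.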
-r