Re: [heka] measure Heka load

Rob Miller Mon, 16 Nov 2015 12:41:07 -0800

On 11/16/2015 12:19 PM, Timur Batyrshin wrote:

Hi Rob,


Parsing heka.all-report messages looks fine as well as having
DashboardOutput.
I was just hoping that there is already something ready for that :-)

No, alas. Should be pretty easy w/ cjson in a SandboxFilter, though. I'd 
happily add such a filter to the Heka repo to make it available for everyone.
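
For illustration only, here is a rough sketch of the kind of extraction such a filter would do. It is written in Go rather than as the Lua sandbox filter itself. It assumes the report payload is JSON whose sections (e.g. "filters", "outputs") list plugin entries with metrics shaped like {"value": ..., "representation": ...}. The field names InChanLength and InChanCapacity, the sample payload, and the channelFill helper are all assumptions, so check them against a real heka.all-report message before relying on any of this.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// metric mirrors the ASSUMED {"value": ..., "representation": ...} shape.
type metric struct {
	Value          float64 `json:"value"`
	Representation string  `json:"representation"`
}

// channelFill (hypothetical helper) parses a report payload and returns, per
// plugin name, the fraction of its input channel currently in use.
func channelFill(payload []byte) (map[string]float64, error) {
	var report map[string]json.RawMessage
	if err := json.Unmarshal(payload, &report); err != nil {
		return nil, err
	}
	fill := make(map[string]float64)
	for _, raw := range report {
		// Each section is assumed to be a list of plugin entries;
		// anything else is skipped.
		var plugins []map[string]json.RawMessage
		if json.Unmarshal(raw, &plugins) != nil {
			continue
		}
		for _, p := range plugins {
			var name string
			if json.Unmarshal(p["Name"], &name) != nil {
				continue // entry without a usable name
			}
			var length, capacity metric
			if json.Unmarshal(p["InChanLength"], &length) != nil ||
				json.Unmarshal(p["InChanCapacity"], &capacity) != nil ||
				capacity.Value == 0 {
				continue // entry does not report channel usage
			}
			fill[name] = length.Value / capacity.Value
		}
	}
	return fill, nil
}

func main() {
	// Made-up sample payload, not captured from a real Heka instance.
	sample := []byte(`{
	  "filters": [{"Name": "StatFilter",
	    "InChanCapacity": {"value": 100, "representation": "count"},
	    "InChanLength": {"value": 25, "representation": "count"}}],
	  "outputs": [{"Name": "HttpOutput",
	    "InChanCapacity": {"value": 100, "representation": "count"},
	    "InChanLength": {"value": 100, "representation": "count"}}]
	}`)
	fill, err := channelFill(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	names := make([]string, 0, len(fill))
	for name := range fill {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s input channel fill: %.2f\n", name, fill[name])
	}
}
```

An input channel that stays near full would suggest that plugin is the bottleneck, which is roughly the "load" signal being asked about.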

One more question, related to my original issue but a bit off the topic
of performance:
As I’ve mentioned earlier, I run Heka agents on multiple boxes.
They collect local data and send it via plain TCP to a Heka
aggregator. This central Heka server then sends the traffic on to
storage via HTTP.
In my setup I see caches/logfiles on disk growing only on the collectors
but not on the aggregator.

This means that if network connectivity issues appear on the aggregator,
the caches will fill up disks on the end-boxes but not on the aggregator,
right?

Yes.

Also looks like they'll even start dropping the messages.

Yup.

So the question is:
Is there a way to make HttpOutput cache outgoing traffic in the same way
TcpOutput does?

Heka v0.9 supports buffering only for the TcpOutput and ElasticSearchOutput. 
Heka v0.10 adds support for disk buffering for *all* filter and output plugins. 
Unfortunately, this has had some stability issues, and I've been too busy with 
things other than the Heka core to resolve them yet. One of the known problems 
(https://github.com/mozilla-services/heka/issues/1738) seems to have been 
resolved, and the fix will hopefully be merged to the versions/0.10 branch this 
week. There's another issue, though, where Heka generates idle pack diagnostic 
messages (https://github.com/mozilla-services/heka/issues/1699). I'm pretty sure 
it's related to the new disk buffering, but I haven't yet had a chance to debug 
it. That's the last blocker I know about for a 0.10.0 final release. I wish I 
could give you a timetable for resolving it, but I can't, beyond saying that 
getting the 0.10.0 final release out the door is on my list of 2015 Q4 goals.

Or is there any other way to add persistence to it (without running
external services)?

The upcoming buffering is probably your best choice. There are hackish, fairly 
painful manual options, such as also writing your data out to a FileOutput, or 
maybe even using a SandboxFilter with `preserve_data = true` to hold on to a 
sliding window of the latest set of records. But there's no support for knowing 
what was missed and automatically retrying it. You'd basically have to 
cross-reference what arrived with what didn't, then manually extract the 
missing records from your backup storage and add them to the end data store by 
hand. Pretty painful. Sorry. :P
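
To make the cross-referencing step concrete, here is a minimal sketch in Go. It assumes every record carries a unique ID (the message UUID, say) and that you've already pulled the IDs out of both the FileOutput backup and the end data store. The missingIDs helper and the sample IDs are hypothetical, and Heka itself provides nothing like this.

```go
package main

import (
	"fmt"
	"sort"
)

// missingIDs (hypothetical helper) returns the IDs found in the backup but
// not in the end data store, sorted and without duplicates.
func missingIDs(backup, stored []string) []string {
	seen := make(map[string]struct{}, len(stored))
	for _, id := range stored {
		seen[id] = struct{}{}
	}
	var missing []string
	for _, id := range backup {
		if _, ok := seen[id]; !ok {
			missing = append(missing, id)
			seen[id] = struct{}{} // report a duplicated backup ID only once
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Made-up IDs standing in for whatever unique key your records carry.
	backup := []string{"a1", "b2", "c3", "d4"}
	stored := []string{"a1", "c3"}
	for _, id := range missingIDs(backup, stored) {
		fmt.Println("needs manual replay:", id)
	}
}
```

This only tells you which records to replay. Extracting them from the backup and loading them into the end store is still manual work.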

-r
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
