On 02/05/2015 04:39 AM, Victor Castell wrote:
Yeah, I know it's different because my input is Logstreamer and the
example in the docs was for receiving protobuf over TCP.
I want to understand. When my Logstreamer reads a message, it passes to
the decoder a Protocol Buffer message with the log line in the message
payload; is that right?
Nope, this is the misunderstanding. When Logstreamer reads a text file, it
passes to the decoder an instantiated Message struct with the log line in the
message payload. Protocol buffers aren't involved at all. The only time it
makes sense to use a ProtobufDecoder with a LogstreamerInput is if the file(s)
you're loading contain binary, protobuf encoded Heka messages, such as those
generated by Heka itself using a FileOutput with a ProtobufEncoder. This is a
valid use case; in fact at Mozilla we do this often. Heka even ships with a
command line utility called `heka-cat`
(http://hekad.readthedocs.org/en/dev/developing/testing.html#heka-cat) which
lets you browse and query the contents of such files.
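To make that valid use case concrete, a minimal sketch of the two halves might look like the following (the paths and section names here are illustrative assumptions, not taken from this thread): a FileOutput with a ProtobufEncoder writes binary Heka messages to disk, and a LogstreamerInput with a ProtobufDecoder reads them back in.

[FileOutput]
message_matcher = "TRUE"
path = "/var/log/heka/archive.log"
encoder = "ProtobufEncoder"

[archive-reader]
type = "LogstreamerInput"
log_directory = "/var/log/heka"
file_match = 'archive\.log'
decoder = "ProtobufDecoder"

Only files produced this way (or by `heka-cat`-compatible tooling) contain the framed protobuf records a ProtobufDecoder expects.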
If the files you're loading are plain text log files, however, a
ProtobufDecoder will have no idea what to do with them. It will fail on every
message. And it will slow things down considerably.
Using the following config as my input decoder (this is what I actually
tried):
[syslog-decoder]
type = "MultiDecoder"
subs = ['nginx-access-decoder', 'ProtobufDecoder']
cascade_strategy = "first-wins"
log_sub_errors = true
[ProtobufDecoder]
This should capture my nginx log lines, remove them from the decoding
"cascade", and pass all the rest to the ProtobufDecoder, which in turn
does nothing with them.
Is this correct?
The first part is correct: any successfully parsed nginx log lines won't make
it through to the ProtobufDecoder. But any messages that fail the nginx parser
will be handed to the ProtobufDecoder, which will have no idea what to do with
them.
And if it is, why is this so slow?
See above. :)
-r
On Wed, Feb 4, 2015 at 8:08 PM, Rob Miller <[email protected]> wrote:
The config that you cargo-culted from the docs is meant for an
entirely different use case. That's meant to handle cases where
you're receiving protocol buffer encoded Heka messages, each of
which contains an Nginx access log line as the message payload. This
would be useful in a case where one Heka is loading the log files
but instead of parsing them it's sending them along in protobuf
format to another Heka that's doing the parsing. The config below
would be used on the listener.
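As a sketch of that shipper/listener scenario (the hostnames, port, and section names below are assumptions for illustration only): the shipper loads the files and forwards them unparsed as protobuf, and the listener decodes them.

# On the shipping Heka:
[access-logs]
type = "LogstreamerInput"
log_directory = "/var/log/nginx"
file_match = 'access\.log'

[TcpOutput]
message_matcher = "TRUE"
address = "listener.example.com:5565"
encoder = "ProtobufEncoder"

# On the listening Heka:
[TcpInput]
address = ":5565"
decoder = "shipped-nginx-decoder"

On the listener, the ProtobufDecoder-first MultiDecoder from the docs makes sense, because every incoming message really is protobuf encoded.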
If you want to see the decoding errors all you need to do is change
your `log_sub_errors` setting from false to true.
-r
On 02/04/2015 04:19 AM, Victor Castell wrote:
I managed to get a working config but I want to understand
what's going on:
[syslog-decoder]
type = "MultiDecoder"
subs = ['nginx-access-decoder', 'rsyslog-decoder']
cascade_strategy = "first-wins"
log_sub_errors = false
In the nginx-access-decoder I'm decoding the corresponding access.log
entries from my rsyslog feed, and in the rsyslog-decoder I'm capturing
any other rsyslog entries and discarding them.
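For reference, an nginx-access-decoder of this kind is typically a SandboxDecoder wrapping the Lua nginx access log grammar that ships with Heka; the log_format value below is an assumption and must match what actually reaches Heka (if rsyslog prepends a header to each line, the format string must account for that):

[nginx-access-decoder]
type = "SandboxDecoder"
filename = "lua_decoders/nginx_access.lua"

[nginx-access-decoder.config]
log_format = '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent'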
That works well, but in my first attempt I tried the config extracted
from the documentation:
[shipped-nginx-decoder]
type = "MultiDecoder"
subs = ['ProtobufDecoder', 'nginx-access-decoder']
cascade_strategy = "all"
log_sub_errors = true
[ProtobufDecoder]
I would prefer this config to the previous one, because it can log
the errors of my nginx decoding.
The problem is that when using the ProtobufDecoder the decoding is
really slow; my nginx log processing doesn't keep up with incoming
events and is always behind the current time.
This doesn't happen with the rsyslog-decoder config; it parses the
logs really fast.
I thought it would be much faster using the internal ProtobufDecoder
than a Lua one, but that's not the case.
What's the reason for this?
On Fri, Jan 30, 2015 at 11:31 AM, Victor Castell
<[email protected]> wrote:
Didn't know of that! Life saver
Thanks!
On 30/1/2015 11:17, "Krzysztof Krzyżaniak" <[email protected]>
wrote:
On Fri, 30 Jan 2015 at 10:34, Victor Castell
<[email protected]> wrote:
> Hi!
>
> I have a centralized rsyslog formatted logfile and I'm
> extracting nginx logs from there using heka and the nginx
> access log decoder.
>
> The problem is that the parser also logs every other log
> message out to heka.log.
>
> The volume of non-nginx logs mixed into my rsyslog log is really
> huge, so the heka.log file is growing like crazy (I have log
> rotation in place, before you ask)
>
> Is there a way to conditionally/intentionally suppress the
> parsing errors of a given decoder?
You probably want to use a MultiDecoder which splits nginx logs
from the rest, and set log_sub_errors = false in the MultiDecoder
section.
eloy
--
V
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
--
V