On 02/05/2015 04:39 AM, Victor Castell wrote:
Yeah, I know it's different because my input is Logstreamer and the
example in the docs was for receiving protobuf over TCP.
I want to understand. When my Logstreamer reads a message, it passes to
the decoder a Protocol Buffer message with the log line in the message
payload; is that right?
Nope, this is the misunderstanding. When Logstreamer reads a text file, it
passes to the decoder an instantiated Message struct with the log line in the
message payload. Protocol buffers aren't involved at all. The only time it
makes sense to use a ProtobufDecoder with a LogstreamerInput is if the file(s)
you're loading contain binary, protobuf encoded Heka messages, such as those
generated by Heka itself using a FileOutput with a ProtobufEncoder. This is a
valid use case; in fact at Mozilla we do this often. Heka even ships with a
command line utility called `heka-cat`
(http://hekad.readthedocs.org/en/dev/developing/testing.html#heka-cat) which
lets you browse and query the contents of such files.
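To make that valid use case concrete, a minimal sketch of the two halves might look like the following (the paths and section names here are illustrative assumptions, not taken from this thread): a FileOutput with a ProtobufEncoder writes binary Heka messages to disk, and a LogstreamerInput with a ProtobufDecoder reads them back in.

[FileOutput]
message_matcher = "TRUE"
path = "/var/log/heka/archive.log"
encoder = "ProtobufEncoder"

[archive-reader]
type = "LogstreamerInput"
log_directory = "/var/log/heka"
file_match = 'archive\.log'
decoder = "ProtobufDecoder"

Only files produced this way (or by `heka-cat`-compatible tooling) contain the framed protobuf records a ProtobufDecoder expects.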
If the files you're loading are plain text log files, however, a
ProtobufDecoder will have no idea what to do with them. It will fail on every
message. And it will slow things down considerably.
Using the following config as my input decoder (this is what I actually
tried):
[syslog-decoder]
type = "MultiDecoder"
subs = ['nginx-access-decoder', 'ProtobufDecoder']
cascade_strategy = "first-wins"
log_sub_errors = true
[ProtobufDecoder]
This should capture my nginx log lines, remove them from the decoding
"cascade", and pass all the rest to the ProtobufDecoder, which in turn
does nothing with them.
Is this correct?
The first part is correct: any successfully parsed nginx log lines won't make
it through to the ProtobufDecoder. But any messages that fail the nginx parser
will be handed to the ProtobufDecoder, which will have no idea what to do with
them.
And if it is, why is this so slow?
See above. :)
-r
On Wed, Feb 4, 2015 at 8:08 PM, Rob Miller <[email protected]> wrote:
The config that you cargo-culted from the docs is meant for an
entirely different use case. That's meant to handle cases where
you're receiving protocol buffer encoded Heka messages, each of
which contains an Nginx access log line as the message payload. This
would be useful in a case where one Heka is loading the log files
but instead of parsing them it's sending them along in protobuf
format to another Heka that's doing the parsing. The config below
would be used on the listener.
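As a sketch of that shipper/listener scenario (the hostnames, port, and section names below are assumptions for illustration only): the shipper loads the files and forwards them unparsed as protobuf, and the listener decodes them.

# On the shipping Heka:
[access-logs]
type = "LogstreamerInput"
log_directory = "/var/log/nginx"
file_match = 'access\.log'

[TcpOutput]
message_matcher = "TRUE"
address = "listener.example.com:5565"
encoder = "ProtobufEncoder"

# On the listening Heka:
[TcpInput]
address = ":5565"
decoder = "shipped-nginx-decoder"

On the listener, the ProtobufDecoder-first MultiDecoder from the docs makes sense, because every incoming message really is protobuf encoded.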
If you want to see the decoding errors all you need to do is change
your `log_sub_errors` setting from false to true.
-r
On 02/04/2015 04:19 AM, Victor Castell wrote:
I managed to get a working config but I want to understand
what's going on:
[syslog-decoder]
type = "MultiDecoder"
subs = ['nginx-access-decoder', 'rsyslog-decoder']
cascade_strategy = "first-wins"
log_sub_errors = false
In the nginx-access-decoder I'm decoding the corresponding access.log
entries from my rsyslog feed, and in the rsyslog-decoder I'm capturing
any other rsyslog entries and discarding them.
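For reference, an nginx-access-decoder of this kind is typically a SandboxDecoder wrapping the Lua nginx access log grammar that ships with Heka; the log_format value below is an assumption and must match what actually reaches Heka (if rsyslog prepends a header to each line, the format string must account for that):

[nginx-access-decoder]
type = "SandboxDecoder"
filename = "lua_decoders/nginx_access.lua"

[nginx-access-decoder.config]
log_format = '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent'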
That works well, but in my first attempt I tried the config extracted
from the documentation:
[shipped-nginx-decoder]
type = "MultiDecoder"
subs = ['ProtobufDecoder', 'nginx-access-decoder']
cascade_strategy = "all"
log_sub_errors = true
[ProtobufDecoder]
I would prefer this config to the previous one, because it can log
the errors of my nginx decoding.
The problem is that when using the ProtobufDecoder the decoding is
really slow; my nginx log processing doesn't keep up with incoming
events and is always behind the current time.
This doesn't happen with the rsyslog-decoder config; it parses the
logs really fast.
I thought it would be much faster using the internal ProtobufDecoder
than a Lua one, but that's not the case.
What's the reason for this?
On Fri, Jan 30, 2015 at 11:31 AM, Victor Castell
<[email protected]> wrote:
Didn't know of that! Life saver
Thanks!
On 30/1/2015 11:17, "Krzysztof Krzyżaniak" <[email protected]>
wrote:
On Fri, 30 Jan 2015 at 10:34, Victor Castell
<[email protected]> wrote:
> Hi!
>
> I have a centralized rsyslog formatted logfile and I'm
> extracting nginx logs from there using heka and the nginx
> access log decoder.
>
> The problem is that the parser also logs every other log
> message out to heka.log.
>
> The volume of non-nginx logs mixed into my rsyslog log is really
> huge, so the heka.log file is growing like crazy (I have log
> rotation in place, before you ask)
>
> Is there a way to conditionally/intentionally suppress the
> parsing errors of a given decoder?
You probably want to use a MultiDecoder which splits nginx logs
from the rest, and set log_sub_errors = false in the MultiDecoder
section.
eloy
--
V
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
--
V