On 10/12/2015 01:22 PM, Timur Batyrshin wrote:
> But how should I then handle modifying fields in subsequent decoders in
> multi-decoder?

You can use `read_message('Fields[fieldname]')` to get the value of a field
and `write_message('Fields[fieldname]', value)` to mutate the value of a field.
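
As a minimal sketch of that approach (not the actual rotate_fields.lua, whose body isn't shown in this thread), a second decoder in the chain could read the fields set by the first one and write a new field back. Storing the value under the metric's name, the default field names, and the plain-Lua stubs at the top are all assumptions for illustration; inside Heka the sandbox supplies `read_config`, `read_message`, and `write_message`.

```lua
-- Stand-ins for the Heka sandbox API so the sketch also runs in plain Lua.
-- Inside Heka these functions already exist and this block is skipped.
-- The sample field values are assumptions taken from the message dump.
if not read_message then
    local fields = { Metric = "zerogw.connections.total", Value = "4" }
    function read_config(key) return nil end
    function read_message(name)
        return fields[name:match("^Fields%[(.-)%]$")]
    end
    function write_message(name, value)
        fields[name:match("^Fields%[(.-)%]$")] = value
    end
end

local metric_field = read_config("metric_field") or "Metric"
local value_field  = read_config("value_field") or "Value"

function process_message()
    -- Read fields set by an earlier decoder in the MultiDecoder chain.
    local metric = read_message("Fields[" .. metric_field .. "]")
    local value  = read_message("Fields[" .. value_field .. "]")
    if metric == nil or value == nil then
        return -1  -- signal a decode failure
    end
    -- Hypothetical behaviour: store the value under the metric's name,
    -- as a number when it parses as one.
    write_message("Fields[" .. metric .. "]", tonumber(value) or value)
    return 0
end
```

Nothing here touches `read_message("raw")`; the decoder only works with fields that an earlier decoder has already set on the message.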

> And actually how should multi-decoder be used then? Only with non-lua
> decoders?

MultiDecoder can be used with any decoders, although if you're stringing
multiple SandboxDecoders together it's more efficient to just have a single
SandboxDecoder that does all of the work.
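
A rough sketch of what "a single SandboxDecoder that does all of the work" might look like for a payload such as `zerogw.connections.total 1444658662 4`. The Lua pattern is only an approximation of the `match_regex` from the config, the field names are copied from that config, and the stubs at the top are assumptions so the snippet also runs outside Heka.

```lua
-- Stand-ins for the Heka sandbox API so the sketch also runs in plain Lua.
-- Inside Heka read_message/write_message already exist and this is skipped.
if not read_message then
    local msg = { Payload = "zerogw.connections.total 1444658662 4" }
    function read_message(name) return msg[name] end
    function write_message(name, value) msg[name] = value end
end

function process_message()
    local payload = read_message("Payload")
    if type(payload) ~= "string" then
        return -1
    end
    -- Same shape as the PayloadRegexDecoder regex: name, epoch seconds, value.
    local name, ts, value = payload:match("^(%S+) (%d+) (%d+)")
    if not name then
        return -1  -- payload did not match, report a decode failure
    end
    -- Heka timestamps are nanoseconds since the epoch.
    write_message("Timestamp", tonumber(ts) * 1e9)
    write_message("Fields[Service]", "Zerogw")
    write_message("Fields[Metric]", name)
    write_message("Fields[Value]", tonumber(value))
    return 0
end
```

Unlike the `%Value%` substitution in the original config, this version stores `Value` as a number rather than a string; whether that is wanted depends on what consumes the message downstream.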

> I think in all other places I’ve used the same technique with the Lua
> decoder being the first in the line, but not PayloadRegexDecoder as in
> this case. Could that be the reason for the issue?

In some cases Heka may store some data in the MsgBytes value, so you might
sometimes get what seems like accurate data from `read_message("raw")` in a
decoder, but you can't count on it being there, and if it is there you can't
count on it being accurate. We should maybe just have SandboxDecoders enforce
this and not allow `read_message("raw")` at all from within decoder code.

> One more point to add here: I’ve seen some Heka panics with this setup.
> These happened when there were some more Lua plugins added to the setup.
> They are also stable in my other setups, but again I don’t use
> PayloadRegexDecoder often.
> I could not reproduce that reliably so no stacktrace from me, sorry.
> Could that also be related to this issue?

Possibly? Hard to say w/o seeing a stack trace.

-r


re: modifying lua_sandbox.go
Many thanks, I’ll definitely try this for debugging setup!


Thanks,
Timur

On 12 Oct 2015 at 23:03:51, Rob Miller ([email protected]) wrote:

> Simon was right in that the problem here is your use of
> `read_message("raw")`, which should almost never be used in a decoder.
> `read_message("raw")` returns the pack.MsgBytes value. Once a message
> hits the router, this is guaranteed to have an accurate protobuf
> encoding of the message, but before the router this is not the case.
>
> If you want to use `print` in your Lua scripts you can do so by removing
> `print` from the sandbox template's `remove_entries` value in the
> lua_sandbox.go.in file:
>
> https://github.com/mozilla-services/heka/blob/dev/sandbox/lua/lua_sandbox.go.in#L60
>
>
> -r
>
>
> On 10/12/2015 07:17 AM, Timur Batyrshin wrote:
> > Hi,
> >
> > I’ve been writing a decoder for myself and have hit the following issue
> > which I can’t understand.
> >
> > When I start Heka it produces the following error message in logs:
> >
> >     2015/10/12 13:54:47 SubDecoder
> >     'zerogw-zerogw_decoder-stdout-zerogw_rotate_fields' error: FATAL:
> >     process_message() /usr/share/heka/lua_decoders/rotate_fields.lua:30:
> >     bad argument #0 to 'decode_message' (must have one string argument)
> >
> > At the same time the code for decoder is the following:
> >
> >     -- the only lines above are comments which are skipped
> >     metric_field = read_config("metric_field") or "metric"
> >     value_field = read_config("value_field") or "value"
> >
> >     function process_message()
> >         local fields = {}
> >         raw = read_message("raw") -- line 29
> >         msg = decode_message(raw) -- line 30
> >         -- other part of code is probably irrelevant as crash is seen in the above line
> >
> > (I’ve tried writing that as `decode_message(read_message("raw"))` with
> > the same effect)
> >
> > What’s really weird is that exactly the same decoder works fine on other
> > hosts.
> >
> > I’m using the following Heka config:
> >
> >     [zerogw]
> >     type = "ProcessInput"
> >     ticker_interval = 0
> >     splitter = "on_newline"
> >     decoder = "zerogw_decoder"
> >     stdout = true
> >     stderr = false
> >
> >     [zerogw.command.0]
> >     bin = "/usr/local/bin/zerogw_collector.py"
> >     args = ["-s", "tcp://127.0.0.1:5111"]
> >
> >     [on_newline]
> >     type = "TokenSplitter"
> >     delimiter = "\n"
> >
> >     [estp_decoder]
> >     type = "PayloadRegexDecoder"
> >     match_regex = '^(?P<Name>[^\s]+) (?P<Timestamp>\d+) (?P<Value>\d+)'
> >     timestamp_layout = "Epoch"
> >
> >     [estp_decoder.message_fields]
> >     Service = "Zerogw"
> >     Metric = "%Name%"
> >     Value = "%Value%"
> >
> >     [zerogw_decoder]
> >     type = "MultiDecoder"
> >     subs = ["estp_decoder", "zerogw_rotate_fields"]
> >     cascade_strategy = "all"
> >
> >     [zerogw_rotate_fields]
> >     type = "SandboxDecoder"
> >     filename = "lua_decoders/rotate_fields.lua"
> >
> >     [zerogw_rotate_fields.config]
> >     metric_field = "Metric"
> >     value_field = "Value"
> >
> > zerogw_collector.py writes about a dozen lines to stdout every 5
> > seconds, in the format seen in the message payload (see below).
> >
> > As the MultiDecoder has `cascade_strategy = "all"`, Heka dumps the messages
> > processed by the first decoder in the chain to stdout, which are the
> > following:
> >
> >     2015/10/12 14:04:22
> >     :Timestamp: 2015-10-12 14:04:22 +0000 UTC
> >     :Type: ProcessInput
> >     :Hostname: t-eu-zgw
> >     :Pid: 5212
> >     :Uuid: 2c7deb23-7961-49dc-8f57-da716d851439
> >     :Logger: zerogw
> >     :Payload: zerogw.connections.total 1444658662 4
> >     :EnvVersion:
> >     :Severity: 7
> >     :Fields:
> >         | name:"ProcessInputName" type:string value:"zerogw.stdout"
> >         | name:"ExitStatus" type:integer value:0
> >         | name:"Value" type:string value:"4"
> >         | name:"Service" type:string value:"Zerogw"
> >         | name:"Metric" type:string value:"zerogw.connections.total"
> >
> > In plain Lua I’d dump the result of `read_message("raw")` to stdout, add
> > some prints everywhere and see what happens inside, but I don’t know how to
> > do that here.
> >
> > Any clues on how I should debug such cases?
> >
> > Thanks,
> > Timur
> >
> >
> >
> > _______________________________________________
> > Heka mailing list
> > [email protected]
> > https://mail.mozilla.org/listinfo/heka
> >
