Actually I was using the function on k (just skipped it here for clarity) so iterating is necessary.
This could be otherwise solved by allocating empty table inside process_message() to later assign it to .Fields but it will trigger extra garbage collection as well.

Do you know how to measure how GC is affected in different cases?

Thanks,
Timur

On 2 Nov 2015 at 21:06:06, Rob Miller ([email protected]) wrote:

Forgot to mention, in the sample code you included, your problem will go away if instead of iterating through the fields returned from the grammar you just set the msg.Fields value every time:

    function process_message()
        local data = read_message("Payload")
        msg.Fields = grammar:match(data)
        inject_message(msg)
        return 0
    end

-r

On 11/02/2015 10:03 AM, Rob Miller wrote:
> If you're not careful to zero out the values, or to explicitly set each
> value every time, then yes, you'll end up leaking data from one
> process_message call to the next.
>
> Even so, however, it's often a good idea to define the msg table outside
> of the process_message call because then the same block of memory will
> be reused each time. If you define the table inside of process_message,
> then a new chunk of memory will be allocated with every call, which will
> go out of scope when the call exits. This will cause a great deal of
> garbage collection churn, likely impacting performance greatly.
>
> So, yes, you should be careful to not let values leak through, but it's
> generally worth taking the extra care.
>
> -r
>
>
> On 11/01/2015 06:10 AM, Timur Batyrshin wrote:
> > Hi,
> >
> > In many stock decoders I see the construct like this:
> >
> > local msg = {
> >     Timestamp = nil,
> >     EnvVersion = nil,
> >     Hostname = nil,
> >     Type = msg_type,
> >     Payload = nil,
> >     Fields = nil,
> >     Severity = nil
> > }
> >
> > function process_message()
> >
> >
> > Here the local variable is defined outside of main functions.
> >
> > From the docs here
> > (http://hekad.readthedocs.org/en/v0.10.0b1/sandbox/index.html#how-to-create-a-simple-sandbox-filter)
> >
> >
> > I understand that this variable is initialized once at Heka start and
> > after that it is reused.
> > This would mean that previous decodes could affect the subsequent
> > decodes.
> >
> > Does this sound like a bug or I'm missing something?
> >
> > I'm asking about that because I'm using the similar approach in my code
> > and I've seen leaking the old data into new messages (some non-relevant
> > parts were skipped):
> >
> > local msg = {
> >     Type = msg_type,
> >     Payload = nil,
> >     Hostname = read_config("Hostname"),
> >     Fields = {},
> > }
> >
> > function process_message()
> >     local data = read_message("Payload")
> >     fields = grammar:match(data)
> >     for k,v in pairs(fields) do
> >         msg.Fields[k] = v
> >     end
> >
> >     inject_message(msg)
> >     return 0
> > end
> >
> > In this case the fields set in the first message appeared in the
> > successive message.
> > After I've moved the local msg {} into inside of process_message() all
> > seemed to start working fine.
> >
> > Why I'm writing here about that is this behaviour could be subtly
> > affecting many other decoders in Heka.
> >
> >
> > Thanks and regards,
> > Timur
> >
> >
> > _______________________________________________
> > Heka mailing list
> > [email protected]
> > https://mail.mozilla.org/listinfo/heka
> >
>
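For the case discussed in this thread, where the keys have to be transformed and so the loop cannot be replaced by a plain assignment to msg.Fields, one possible middle ground is to keep the shared msg table but empty msg.Fields in place before refilling it. The following is only a sketch meant to run in a standalone Lua interpreter: inject_message is a stand-in that records what it receives, clear and process_fields are hypothetical names, and string.lower stands in for whatever function is actually applied to k.

```lua
-- Stand-in for Heka's inject_message (assumption, so the sketch runs outside
-- the sandbox): it records a copy of the fields seen on each call.
local injected = {}
local function inject_message(m)
    local copy = {}
    for k, v in pairs(m.Fields) do copy[k] = v end
    injected[#injected + 1] = copy
end

-- Shared message table, allocated once and reused for every call.
local msg = { Type = "example", Fields = {} }

-- Hypothetical helper: remove every key from t without allocating a new table.
-- Lua allows setting existing fields to nil while traversing with pairs.
local function clear(t)
    for k in pairs(t) do
        t[k] = nil
    end
end

-- Takes already-parsed fields as an argument instead of calling
-- read_message() and grammar:match(), which only exist inside Heka.
local function process_fields(fields)
    clear(msg.Fields)                       -- drop values left by the previous call
    for k, v in pairs(fields) do
        msg.Fields[string.lower(k)] = v     -- string.lower is a placeholder transform
    end
    inject_message(msg)
    return 0
end

process_fields({ User = "alice", Status = "200" })
process_fields({ User = "bob" })            -- no Status here, and none leaks through
```

Without the clear() call, the second message would still carry the status field from the first one, which is the leak described above.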
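On the question of how to measure the GC impact: a rough way to compare the two approaches is to stop the collector, run each variant many times, and look at how much the heap grew using the standard collectgarbage("count") call. This is a sketch for a standalone Lua interpreter only; collectgarbage may not be available inside the Heka sandbox, and allocated_kb, reuse_table and fresh_table are hypothetical names used for illustration.

```lua
-- Return the heap growth in KB caused by calling f n times
-- while the garbage collector is stopped.
local function allocated_kb(f, n)
    collectgarbage("collect")               -- start from a freshly collected heap
    collectgarbage("stop")                  -- nothing gets freed while measuring
    local before = collectgarbage("count")  -- heap size in KB
    for i = 1, n do f(i) end
    local after = collectgarbage("count")
    collectgarbage("restart")
    return after - before
end

-- Variant 1: one table allocated up front and reused on every call.
local reused = { Type = "example", Fields = {} }
local function reuse_table(i)
    reused.Fields.value = i
end

-- Variant 2: new tables allocated on every call.
-- The result is stored in an upvalue so the allocation cannot be optimized away.
local last
local function fresh_table(i)
    last = { Type = "example", Fields = { value = i } }
end

local n = 100000
local reuse_kb = allocated_kb(reuse_table, n)
local fresh_kb = allocated_kb(fresh_table, n)
print(string.format("reused table: %.1f KB, fresh tables: %.1f KB", reuse_kb, fresh_kb))
```

The absolute numbers depend on the Lua version and build, so they are only useful for comparing variants against each other on the same interpreter.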

