Hi,
I have logging statements in my Beam python code (using Flink). To improve
debugging, I add tags to the log statement, which are then indexable on my
log tool (kibana). However, the log statements that come out are not
properly formatted.

For example, statements like:

logger.exception('this is the desired msg field', kv={'some_tag': x,
'some_other_tag': y})

Comes out with a log message like:

json={"some_tag": x, "some_other_tag": y} msg=this is the desired msg field
I want the message to be simply:

this is the desired msg field

Where the two tags are separate, indexable attributes used by the log tool
(kibana).


More concretely, the below screenshot shows output from logging outside the
beam pipeline- i.e. the desired behavior. Note that the "
pricing.conversion_method" and "pricing.pricing_region" fields are tags
that I added in code.
[image: image.png]
 However, logging inside the Beam pipeline creates entries like this:
[image: image.png]
Where I want "pricingrealtime_region" and "model_version" to be separate.


In general, it seems like the kv dictionary, which I'm passing to the
logging client, is not being properly parsed, and the fields to be indexed
are not being removed. I'm ~guessing~ that this
https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto#L987
needs to be a list of key-value pairs (rather than just a string). Does
anyone have any idea or plans to fix this?

Thanks,
- Akshay

Reply via email to