The generally expected behavior is that if you don't do anything,
logging goes to stderr. Logging to non-root loggers breaks this.
(Arguably it's a bug in the Python logging libraries to have this
inconsistency, but so be it...)
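
For illustration, here is a rough standard-library sketch of that inconsistency. The logger name "demo.module" and the helper name are made up for the example, and it clears the root handlers first so the result is repeatable:

```python
import logging


def root_handler_counts():
    """Return root handler counts after a non-root and a root-level log call."""
    root = logging.getLogger()
    # Start from an unconfigured root logger (this discards existing handlers).
    for handler in list(root.handlers):
        root.removeHandler(handler)

    # Logging on a named (non-root) logger never triggers any configuration.
    logging.getLogger("demo.module").info("dropped: INFO is below WARNING")
    after_module_call = len(root.handlers)

    # The module-level helper calls basicConfig() when root has no handlers.
    # The message itself is still dropped, since root defaults to WARNING.
    logging.info("dropped as well, but a handler is now installed")
    after_root_call = len(root.handlers)

    return after_module_call, after_root_call


counts = root_handler_counts()
```

Here counts should come out as (0, 1): only the root-level call installs the default stderr handler.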

On the other hand, if you do set something up, that is respected.

I think setting up default logging makes sense, emulating what
logging.info() etc. do: they call basicConfig to attach a stderr
handler iff the root logger has no handlers. We could do this
in Pipeline.__init__, but should probably also do it when parsing
options. These two entry points should be sufficient.
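
Something along these lines is what I have in mind. This is only a sketch: ensure_default_logging is a hypothetical name, not an existing Beam function, and the optional level parameter is my assumption (leaving it as None keeps the root logger at its default WARNING level, as plain basicConfig() does):

```python
import logging


def ensure_default_logging(level=None):
    """Attach a stderr handler to the root logger unless one already exists.

    Hypothetical helper. Returns True if this call configured logging.
    """
    root = logging.getLogger()
    if root.handlers:
        # The user (or a test framework) already set something up: respect it.
        return False
    # basicConfig() adds a StreamHandler that writes to sys.stderr by default.
    # With level=None the root logger's level is left unchanged.
    logging.basicConfig(level=level)
    return True
```

If this were called from both entry points, whichever runs first would configure logging and the second call would be a no-op.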

We may need to update unit tests that don't create a Pipeline as well
in some cases, but I think the testing framework generally sets up its
own logging.

On Tue, Dec 17, 2019 at 3:09 PM Pablo Estrada <pabl...@google.com> wrote:
>
> The Python basicConfig[1] sets up a StreamHandler[2], which by default 
> publishes to stderr. This configuration is applied by default whenever 
> someone logs anything on the root logger (e.g. logging.info('abc')). With the 
> module-based logging changes, the basic config is never called on the 
> pipeline construction side, so logging depends 100% on the user, instead of 
> having a 'reasonable default'.
>
> The default configuration gets set up if the user ever logs on the root logger 
> (logging.info, ...). If they don't, then nothing is set up.
>
> I'd say that it's best to be consistent with our previous behavior, which was 
> to have a default configuration by virtue of logging on the root logger. By 
> calling basicConfig in Pipeline.__init__, we allow the user to set up a 
> configuration before creating the Pipeline object, or we set up a default at 
> that point.
>
> Thoughts?
>
> [1] https://docs.python.org/3/library/logging.html#logging.basicConfig
> [2] 
> https://docs.python.org/3/library/logging.handlers.html#logging.StreamHandler
>
> On Tue, Dec 17, 2019 at 2:42 PM Luke Cwik <lc...@google.com> wrote:
>>
>> In Beam Java, the expectation has always been that pipeline authors are 
>> responsible for setting up logging correctly during pipeline construction 
>> time and that the Beam SDK is responsible for setting up logging at pipeline 
>> execution time.
>>
>> Is this something we can solve by documenting and telling users to configure 
>> logging at pipeline construction time or is the expected Python user 
>> experience that logging always works out of the box?
>>
>> On Tue, Dec 17, 2019 at 2:15 PM Pablo Estrada <pabl...@google.com> wrote:
>>>
>>> It should not affect debuggability at pipeline runtime, as the sdk_worker 
>>> already does the appropriate setup of handlers, but it may affect debugging 
>>> of e.g. ptransform expansion, and pipeline construction issues.
>>> Best
>>> -P.
>>>
>>> On Tue, Dec 17, 2019 at 1:59 PM Udi Meiri <eh...@google.com> wrote:
>>>>
>>>> Pablo, does the issue affect debuggability of pipelines?
>>>>
>>>> On Mon, Dec 16, 2019 at 6:23 PM Chad Dombrova <chad...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Dec 16, 2019 at 5:59 PM Pablo Estrada <pabl...@google.com> wrote:
>>>>>>
>>>>>> +chad...@gmail.com is this consistent with behavior that you observed?
>>>>>
>>>>>
>>>>> I honestly can't recall, sorry.  I just remember that while I was testing 
>>>>> I updated sdk version and some logging stopped.  I *think* I was missing 
>>>>> the state/message stream, which would be on the client side after 
>>>>> pipeline construction.
>>>>>
>>>>> -chad
>>>>>
