The generally expected behavior is that if you don't do anything, logging goes to stderr. Logging to non-root loggers breaks this. (Arguably it's a bug in the Python logging libraries to have this inconsistency, but so be it...)
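
To make the inconsistency concrete, here is a minimal sketch using only the standard logging module. The logger name "my_pipeline.transforms" is hypothetical, and the handler cleanup at the top is only there to imitate a fresh interpreter.

```python
import logging

root = logging.getLogger()
# Start from a clean slate so the demo behaves like a fresh interpreter.
for handler in list(root.handlers):
    root.removeHandler(handler)

# 1) Log through a named (non-root) logger. The name is hypothetical.
#    Nothing calls basicConfig() here, so the root logger gets no handler.
#    (On Python 3 the message still reaches stderr through the bare
#    logging.lastResort handler, but only at WARNING and above.)
named = logging.getLogger("my_pipeline.transforms")
named.warning("via named logger")
handlers_after_named = len(root.handlers)

# 2) Log through the module-level helper. It calls basicConfig() because
#    the root logger has no handlers, which installs a StreamHandler.
logging.warning("via root logger")
handlers_after_root = len(root.handlers)
is_stream_handler = isinstance(root.handlers[0], logging.StreamHandler)
```

After step 1 the root logger still has no handlers. After step 2 it has exactly one StreamHandler, the default that module-level logging calls used to give us for free.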
On the other hand, if you do set something up, that is respected. I think setting up default logging (emulating what is done by logging.info(), etc., which call basicConfig to get a stderr handler iff no root handlers are present) makes sense. We could do this in Pipeline.__init__, but should also probably do it when parsing options. These two entry points should be sufficient. We may need to update non-Pipeline-creating unit tests as well in some cases, but I think generally the testing framework sets up its logging.

On Tue, Dec 17, 2019 at 3:09 PM Pablo Estrada <pabl...@google.com> wrote:
>
> The Python basicConfig[1] sets up a StreamHandler[2], which by default
> publishes to stderr. This configuration is applied by default whenever
> someone logs anything on the root logger (e.g. logging.info('abc')). With the
> module-based logging changes, the basic config is never called on the
> pipeline construction side, so logging depends 100% on the user, instead of
> having a 'reasonable default'.
>
> The default configuration gets setup if the user ever logs on the root logger
> (logging.info, ...). If they don't then, nothing is set up.
>
> I'd say that it's best to be consistent with our previous behavior, which was
> to have a default configuration by virtue of logging on the root logger. By
> calling basicConfig in Pipeline.__init__, we allow the user to set up a
> configuration before creating the Pipeline object, or we set up a default at
> that point.
>
> Thoughts?
>
> [1] https://docs.python.org/3/library/logging.html#logging.basicConfig
> [2] https://docs.python.org/3/library/logging.handlers.html#logging.StreamHandler
>
> On Tue, Dec 17, 2019 at 2:42 PM Luke Cwik <lc...@google.com> wrote:
>>
>> In Beam Java, the expectation has always been that pipeline authors are
>> responsible for setting up logging correctly during pipeline construction
>> time and that the Beam SDK is responsible for setting up logging at pipeline
>> execution time.
>>
>> Is this something we can solve by documenting and telling users to configure
>> logging at pipeline construction time or is the expected python user
>> experience that logging always works out of the box?
>>
>> On Tue, Dec 17, 2019 at 2:15 PM Pablo Estrada <pabl...@google.com> wrote:
>>>
>>> It should not affect debuggability at pipeline runtime, as the sdk_worker
>>> already does the appropriate setup of handlers, but it may affect debugging
>>> of e.g. ptransform expansion, and pipeline construction issues.
>>> Best
>>> -P.
>>>
>>> On Tue, Dec 17, 2019 at 1:59 PM Udi Meiri <eh...@google.com> wrote:
>>>>
>>>> Pablo, does the issue affect debuggability of pipelines?
>>>>
>>>> On Mon, Dec 16, 2019 at 6:23 PM Chad Dombrova <chad...@gmail.com> wrote:
>>>>>
>>>>> On Mon, Dec 16, 2019 at 5:59 PM Pablo Estrada <pabl...@google.com> wrote:
>>>>>>
>>>>>> +chad...@gmail.com is this consistent with behavior that you observed?
>>>>>
>>>>> I honestly can't recall, sorry. I just remember that while I was testing
>>>>> I updated sdk version and some logging stopped. I *think* I was missing
>>>>> the state/message stream, which would be on the client side after
>>>>> pipeline construction.
>>>>>
>>>>> -chad