A fix that calls basicConfig in Pipeline and PipelineOptions: https://github.com/apache/beam/pull/10396
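
For readers skimming the thread, here is a minimal sketch of the idea discussed below. It is not the code from the PR: `ensure_default_logging` and `SketchPipeline` are hypothetical names used only for illustration. It assumes the goal is what the thread describes: install a stderr handler on the root logger at an entry point, but only when the user has not configured anything.

```python
import logging
import sys


def ensure_default_logging():
    """Install a stderr handler on the root logger only if it has none.

    Hypothetical helper, not part of Beam.
    """
    root = logging.getLogger()
    if not root.handlers:
        # With no arguments, basicConfig() attaches a StreamHandler that
        # writes to sys.stderr. It does not change the root logger's level.
        logging.basicConfig()


class SketchPipeline:
    """Hypothetical stand-in for a Pipeline-like entry point (not Beam code)."""

    def __init__(self):
        # Any configuration the user set up before this point is left alone.
        ensure_default_logging()


if __name__ == "__main__":
    _LOGGER = logging.getLogger("my_module")  # module-level (non-root) logger
    SketchPipeline()
    _LOGGER.warning("propagates to the root logger's stderr handler")
```

Because the helper only acts when the root logger has no handlers, a user who configures logging before constructing the object keeps their own setup. Note that basicConfig() without a level argument leaves the root logger at its default WARNING level, so info-level records are still filtered unless a level is set.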

On Tue, Dec 17, 2019 at 3:17 PM Robert Bradshaw <rober...@google.com> wrote:

> The generally expected behavior is that if you don't do anything, logging goes to stderr. Logging to non-root loggers breaks this. (Arguably it's a bug in the Python logging libraries to have this inconsistency, but so be it...)
>
> On the other hand, if you do set something up, that is respected.
>
> I think setting up default logging (emulating what is done by logging.info(), etc., which is calling basicConfig to get a stderr logger iff no root handlers are present) makes sense. We could do this in Pipeline.__init__, but should also probably do it when parsing options. These two entry points should be sufficient.
>
> We may need to update non-Pipeline-creating unit tests as well in some cases, but I think generally the testing framework sets up its logging.
>
> On Tue, Dec 17, 2019 at 3:09 PM Pablo Estrada <pabl...@google.com> wrote:
> >
> > The Python basicConfig[1] sets up a StreamHandler[2], which by default publishes to stderr. This configuration is applied by default whenever someone logs anything on the root logger (e.g. logging.info('abc')). With the module-based logging changes, the basic config is never called on the pipeline construction side, so logging depends 100% on the user, instead of having a 'reasonable default'.
> >
> > The default configuration gets set up if the user ever logs on the root logger (logging.info, ...). If they don't, then nothing is set up.
> >
> > I'd say that it's best to be consistent with our previous behavior, which was to have a default configuration by virtue of logging on the root logger. By calling basicConfig in Pipeline.__init__, we allow the user to set up a configuration before creating the Pipeline object, or we set up a default at that point.
> >
> > Thoughts?
> >
> > [1] https://docs.python.org/3/library/logging.html#logging.basicConfig
> > [2] https://docs.python.org/3/library/logging.handlers.html#logging.StreamHandler
> >
> > On Tue, Dec 17, 2019 at 2:42 PM Luke Cwik <lc...@google.com> wrote:
> >>
> >> In Beam Java, the expectation has always been that pipeline authors are responsible for setting up logging correctly during pipeline construction time and that the Beam SDK is responsible for setting up logging at pipeline execution time.
> >>
> >> Is this something we can solve by documenting and telling users to configure logging at pipeline construction time, or is the expected Python user experience that logging always works out of the box?
> >>
> >> On Tue, Dec 17, 2019 at 2:15 PM Pablo Estrada <pabl...@google.com> wrote:
> >>>
> >>> It should not affect debuggability at pipeline runtime, as the sdk_worker already does the appropriate setup of handlers, but it may affect debugging of e.g. PTransform expansion and pipeline construction issues.
> >>> Best
> >>> -P.
> >>>
> >>> On Tue, Dec 17, 2019 at 1:59 PM Udi Meiri <eh...@google.com> wrote:
> >>>>
> >>>> Pablo, does the issue affect debuggability of pipelines?
> >>>>
> >>>> On Mon, Dec 16, 2019 at 6:23 PM Chad Dombrova <chad...@gmail.com> wrote:
> >>>>>
> >>>>> On Mon, Dec 16, 2019 at 5:59 PM Pablo Estrada <pabl...@google.com> wrote:
> >>>>>>
> >>>>>> +chad...@gmail.com is this consistent with behavior that you observed?
> >>>>>
> >>>>> I honestly can't recall, sorry. I just remember that while I was testing I updated the SDK version and some logging stopped. I *think* I was missing the state/message stream, which would be on the client side after pipeline construction.
> >>>>>
> >>>>> -chad