Hey guys, thanks for explaining. Yeah, this is a problem - we didn't really know that people were using other slf4j backends. slf4j is in there for historical reasons, but I think we may assume in a few places that log4j is being used, and we should minimize those assumptions.
We should patch this and get a fix into 0.9.1. Some solutions I see:

(a) Add a SparkConf option to disable this. I'm fine with this one.

(b) Ask slf4j which backend is active and only try to enforce this default if we know slf4j is using log4j. Do either of you know if this is possible? Not sure if slf4j exposes this.

(c) Just remove this default stuff. We'd rather not do this. The goal of this feature is to provide good usability for people who have linked against Spark and haven't done anything to configure logging. For beginners we try to minimize the assumptions about what else they need to know, and I've found log4j configuration is a huge mental barrier for people who are getting started.

Paul, if you submit a patch doing (a) we can merge it in. If you have any idea whether (b) is possible, I'd prefer that one, but it may not be possible or might be brittle.

- Patrick

On Fri, Feb 7, 2014 at 6:36 AM, Koert Kuipers <ko...@tresata.com> wrote:
> Totally agree with Paul: a library should not pick the slf4j backend. It
> defeats the purpose of slf4j. That big ugly warning is there to alert
> people that it's their responsibility to pick the backend...
>
> On Feb 7, 2014 3:55 AM, "Paul Brown" <p...@mult.ifario.us> wrote:
>
>> Hi, Patrick --
>>
>> From slf4j, you can either back it with log4j (which is the way that
>> Spark is shipped) or you can route log4j through slf4j and then on to a
>> different backend (e.g., logback). We're doing the latter and
>> manipulating the dependencies in the build because that's the way the
>> enclosing application is set up.
>>
>> The issue with the current situation is that there's no way for an end
>> user to choose *not* to use the log4j backend. (My short-term solution
>> was to use the Maven shade plugin to swap in a version of the Logging
>> trait with the body of that method commented out.) In addition to the
>> situation with log4j-over-slf4j and the empty enumeration of ROOT
>> appenders, you might also run afoul of someone who intentionally
>> configured log4j with an empty set of appenders at the time that Spark
>> is initializing.
>>
>> I'd be happy with any implementation that lets me choose my logging
>> backend: override default behavior via system property, plug-in
>> architecture, etc. I do think it's reasonable to expect someone adopting
>> a substantial JDK-based system like Spark to understand how to
>> initialize logging -- surely they're using logging of some kind
>> elsewhere in their application -- but if you want the default behavior
>> there as a courtesy, it might be worth putting an INFO message (versus
>> the glaring log4j WARN) on the output that says something like
>> "Initialized default logging via Log4J; pass
>> -Dspark.logging.loadDefaultLogger=false to disable this behavior." so
>> that it's both convenient and explicit.
>>
>> Cheers.
>> -- Paul
>>
>> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
>>
>> On Fri, Feb 7, 2014 at 12:05 AM, Patrick Wendell <pwend...@gmail.com>
>> wrote:
>>
>> > A config option, e.g., could just be to add:
>> >
>> > spark.logging.loadDefaultLogger (default true)
>> > If set to true, Spark will try to initialize a log4j logger if none is
>> > detected. Otherwise Spark will not modify logging behavior.
>> >
>> > Then users could just set this to false if they have a logging set-up
>> > that conflicts with this.
>> >
>> > Maybe there is a nicer fix...
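A minimal sketch of that opt-out, assuming the property name suggested above (spark.logging.loadDefaultLogger is hypothetical, not an existing Spark setting) and the bundled defaults path used by the 0.9 Logging trait:

import org.apache.log4j.{LogManager, PropertyConfigurator}

object LoggingDefaults {
  // Hypothetical opt-out: running with
  // -Dspark.logging.loadDefaultLogger=false skips all default log4j
  // initialization (property name taken from this thread, not an
  // existing Spark setting).
  private def loadDefaultLogger: Boolean =
    System.getProperty("spark.logging.loadDefaultLogger", "true").toBoolean

  def initIfNecessary(): Unit = {
    if (loadDefaultLogger &&
        !LogManager.getRootLogger.getAllAppenders.hasMoreElements) {
      // No appenders configured: load the defaults file assumed to ship
      // inside the Spark jar, as the 0.9 Logging trait does.
      val defaults = "org/apache/spark/log4j-defaults.properties"
      Option(getClass.getClassLoader.getResource(defaults))
        .foreach(url => PropertyConfigurator.configure(url))
    }
  }
}

Note that this alone doesn't fix Paul's stack overflow when log4j-over-slf4j is on the classpath - it only gives users an explicit way out - which is why (b), sketched at the end of the thread below, may be the nicer fix.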
>> > On Fri, Feb 7, 2014 at 12:03 AM, Patrick Wendell <pwend...@gmail.com>
>> > wrote:
>> > > Hey Paul,
>> > >
>> > > Thanks for digging this up. I worked on this feature and the intent
>> > > was to give users good default behavior if they didn't include any
>> > > logging configuration on the classpath.
>> > >
>> > > The problem with assuming that the command-line tooling is going to
>> > > handle this is that many people link against Spark as a library and
>> > > run their applications using their own scripts. In that case the
>> > > first thing people see when they run an application that links
>> > > against Spark is a big ugly logging warning.
>> > >
>> > > I'm not super familiar with log4j-over-slf4j, but this behavior of
>> > > returning an empty enumeration for the appenders seems a little
>> > > weird. What is the use case for using it rather than directly using
>> > > slf4j-log4j12 like Spark itself does?
>> > >
>> > > Did you have a more general fix for this in mind? Or was your plan
>> > > to just revert the existing behavior? We might be able to add a
>> > > configuration option to disable this logging default stuff. Or we
>> > > could just rip it out - but I'd like to avoid that if possible.
>> > >
>> > > - Patrick
>> > >
>> > > On Thu, Feb 6, 2014 at 11:41 PM, Paul Brown <p...@mult.ifario.us>
>> > > wrote:
>> > >> We have a few applications that embed Spark. In 0.8.0 and 0.8.1 we
>> > >> were able to use slf4j, but 0.9.0 broke that and unintentionally
>> > >> forces direct use of log4j as the logging backend.
>> > >>
>> > >> The issue is here in the org.apache.spark.Logging trait:
>> > >>
>> > >> https://github.com/apache/incubator-spark/blame/master/core/src/main/scala/org/apache/spark/Logging.scala#L107
>> > >>
>> > >> log4j-over-slf4j *always* returns an empty enumeration for
>> > >> appenders to the ROOT logger:
>> > >>
>> > >> https://github.com/qos-ch/slf4j/blob/master/log4j-over-slf4j/src/main/java/org/apache/log4j/Category.java?source=c#L81
>> > >>
>> > >> And this causes an infinite loop and an eventual stack overflow.
>> > >>
>> > >> I'm happy to submit a Jira and a patch, but it would be a
>> > >> significant enough reversal of recent changes that it's probably
>> > >> worth discussing before I sink a half hour into it. My suggestion
>> > >> would be that initialization (or not) should be left to the user,
>> > >> with reasonable default behavior supplied by the Spark command-line
>> > >> tooling and not forced on applications that incorporate Spark.
>> > >>
>> > >> Thoughts/opinions?
>> > >>
>> > >> -- Paul
>> > >> --
>> > >> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
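For option (b) at the top of the thread, slf4j does expose enough to probe the active backend: LoggerFactory.getILoggerFactory returns the bound backend's logger factory. A rough sketch under that assumption - the class-name string is what the slf4j-log4j12 binding reports, and a real patch might want a more lenient check:

import org.slf4j.LoggerFactory

object Slf4jBackend {
  // True only when slf4j is actually bound to log4j via slf4j-log4j12,
  // whose ILoggerFactory is org.slf4j.impl.Log4jLoggerFactory. With
  // log4j-over-slf4j + logback, the factory is logback's LoggerContext
  // instead, so this returns false and Spark would never call into
  // org.apache.log4j at all - avoiding the empty-appender-enumeration
  // loop Paul describes above.
  def usesLog4j: Boolean =
    LoggerFactory.getILoggerFactory.getClass.getName ==
      "org.slf4j.impl.Log4jLoggerFactory"
}

With that, the guard in the Logging trait could become something like
if (Slf4jBackend.usesLog4j && !LogManager.getRootLogger.getAllAppenders.hasMoreElements),
so the default configuration only kicks in when log4j really is the backend.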