Sean Owen created SPARK-2646:
--------------------------------

             Summary: log4j initialization not quite compatible with log4j 2.x
                 Key: SPARK-2646
                 URL: https://issues.apache.org/jira/browse/SPARK-2646
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.0.0, 1.0.1
            Reporter: Sean Owen
            Priority: Minor


The logging code that handles log4j initialization leads to an stack overflow 
error when used with log4j 2.x, which has just been released. This occurs even 
a downstream project has correctly adjusted SLF4J bindings, and that is the 
right thing to do for log4j 2.x, since it is effectively a separate project 
from 1.x.

Here is the relevant bit of Logging.scala:

{code}
  private def initializeLogging() {
    // If Log4j is being used, but is not initialized, load a default 
properties file
    val binder = StaticLoggerBinder.getSingleton
    val usingLog4j = 
binder.getLoggerFactoryClassStr.endsWith("Log4jLoggerFactory")
    val log4jInitialized = 
LogManager.getRootLogger.getAllAppenders.hasMoreElements
    if (!log4jInitialized && usingLog4j) {
      val defaultLogProps = "org/apache/spark/log4j-defaults.properties"
      Option(Utils.getSparkClassLoader.getResource(defaultLogProps)) match {
        case Some(url) =>
          PropertyConfigurator.configure(url)
          log.info(s"Using Spark's default log4j profile: $defaultLogProps")
        case None =>
          System.err.println(s"Spark was unable to load $defaultLogProps")
      }
    }
    Logging.initialized = true

    // Force a call into slf4j to initialize it. Avoids this happening from 
mutliple threads
    // and triggering this: 
http://mailman.qos.ch/pipermail/slf4j-dev/2010-April/002956.html
    log
  }
{code}

The first minor issue is that there is a call to a logger inside this method, 
which is initializing logging. In this situation, it ends up causing the 
initialization to be called recursively until the stack overflow. It would be 
slightly tidier to log this only after Logging.initialized = true. Or not at 
all. But it's not the root problem, or else, it would not work at all now. 

The calls to log4j classes here always reference log4j 1.2 no matter what. For 
example, there is not getAllAppenders in log4j 2.x. That's fine. Really, 
"usingLog4j" means "using log4j 1.2" and "log4jInitialized" means "log4j 1.2 is 
initialized".

usingLog4j should be false for log4j 2.x, because the initialization only 
matters for log4j 1.2. But, it's true, and that's the real issue. And 
log4jInitialized is always false, since calls to the log4j 1.2 API are stubs 
and no-ops in this setup, where the caller has swapped in log4j 2.x. Hence the 
loop.

This is fixed, I believe, if "usingLog4j" can be false for log4j 2.x. The SLF4J 
static binding class has the same name for both versions, unfortunately, which 
causes the issue. However they're in different packages. For example, if the 
test included "... and begins with org.slf4j", it should work, as the SLF4J 
binding for log4j 2.x is provided by log4j 2.x at the moment, and is in package 
org.apache.logging.slf4j.

Of course, I assume that SLF4J will eventually offer its own binding. I hope to 
goodness they at least name the binding class differently, or else this will 
again not work. But then some other check can probably be made.

(Credit to Agust Egilsson for finding this; at his request I'm opening a JIRA 
for him. I'll propose a PR too.)



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to