Thomas Humphries created LOG4J2-2903:
----------------------------------------

             Summary: Scala: Cannot log to HDFS with a custom appender?
                 Key: LOG4J2-2903
                 URL: https://issues.apache.org/jira/browse/LOG4J2-2903
             Project: Log4j 2
          Issue Type: Bug
          Components: Appenders, Scala API
    Affects Versions: 2.12.1
         Environment: Spark 2.4.6, Scala 2.12, Hadoop 3.2.1

Unit tests: Windows 10, Maven, Eclipse 4.7 (Scala IDE), JUnit 4

Runtime tests: Ubuntu 16.04/18.04 (test/staging), Spark cluster, HDFS cluster
            Reporter: Thomas Humphries
             Fix For: Scala 12.0
         Attachments: HDFSAppender.scala, LoggingTest.scala, MavenDependencies.txt, log4j.properties

*Overview*
[(x-post from public StackOverflow question)|https://stackoverflow.com/questions/63276838/scala-how-to-create-logs-on-hdfs-with-a-log4j-custom-appender]

We want to log our Spark job activity using log4j to write log files to HDFS.
 - Spark 2.4.6, Scala 2.12, Hadoop 3.2.1

We were unable to find a native Apache Log4j appender that writes to HDFS 
(Apache Flume is not an option), so we set out to write our own.

*Java works*

We constructed a simple custom log4j appender in Java, which works fine both in 
unit tests and at run time (spark-submit with our java.jar writes to HDFS).

*Scala does not work*

The Scala logic matches the Java version; however, logging from Scala fails 
consistently with a StackOverflowError.

We cannot find anything obviously wrong.

*The implementation:*

Here is a simple hello-world project to reproduce the issue:
 - Scala project (Maven build), containing a simple main class, and:
 - HDFSAppender class (extends log4j AppenderSkeleton)
 - LoggingTest unit test class
 - log4j configuration file for testing
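
For readers without the attachments, here is a hypothetical sketch of the shape of the test configuration (the appender class name is taken from the stack trace below; the exact file is attached as log4j.properties):

{code}
# Hypothetical sketch only -- the real file is attached as log4j.properties.
log4j.rootLogger=INFO, hdfs
log4j.appender.hdfs=com.obfuscated.test.spark.log4j.HDFSAppender
log4j.appender.hdfs.layout=org.apache.log4j.PatternLayout
log4j.appender.hdfs.layout.ConversionPattern=%d{ISO8601} %-5p %c - %m%n
{code}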

*The error*

Any logging event causes the program to crash with a stack overflow error:
{code:java}
java.util.concurrent.ExecutionException: java.lang.StackOverflowError
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:206)
        at org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
Caused by: java.lang.StackOverflowError
        at org.apache.log4j.PatternLayout.format(PatternLayout.java:506)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
        at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
{code}
The trace then continues with the following frames, which suggest a recursive 
loop rather than a race: a logging call made from inside the appender is routed 
back into the same appender:

{code:java}
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.log(Category.java:856)
        at org.slf4j.impl.Log4jLoggerAdapter.warn(Log4jLoggerAdapter.java:401)
        at org.apache.spark.internal.Logging.logWarning(Logging.scala:66)
        at org.apache.spark.internal.Logging.logWarning$(Logging.scala:65)
        at org.apache.spark.SparkContext$.logWarning(SparkContext.scala:2442)
        at org.apache.spark.SparkContext$.$anonfun$assertNoOtherContextIsRunning$5(SparkContext.scala:2500)
        at org.apache.spark.SparkContext$.$anonfun$assertNoOtherContextIsRunning$5$adapted(SparkContext.scala:2491)
        at scala.Option.foreach(Option.scala:274)
        at org.apache.spark.SparkContext$.assertNoOtherContextIsRunning(SparkContext.scala:2491)
        at org.apache.spark.SparkContext$.markPartiallyConstructed(SparkContext.scala:2568)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:85)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
        at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$5(SparkSession.scala:935)
        at scala.Option.getOrElse(Option.scala:138)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926)
        at com.obfuscated.test.spark.log4j.HDFSLogWriter.write(HDFSLogWriter.scala:23)
        at com.obfuscated.test.spark.log4j.HDFSAppender.append(HDFSAppender.scala:63)
(repeats 48 times...)
{code}
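
Reading the trace bottom-up, the cycle appears to be: HDFSAppender.append -> HDFSLogWriter.write -> SparkSession.Builder.getOrCreate -> Spark logs a warning -> log4j routes that warning back into the same appender. The following is a minimal, dependency-free Scala sketch of that re-entrancy pattern (all names are hypothetical; no Spark or log4j involved), including a thread-local guard that would break the cycle:

{code:scala}
// Hypothetical model of the cycle in the stack trace above:
// append() -> write() -> (framework logs a warning) -> back into append().
object ReentrantAppenderDemo {
  // Guard: true while the current thread is already inside append().
  private val inAppend = new ThreadLocal[Boolean] {
    override def initialValue(): Boolean = false
  }
  var written = 0 // events that reached the sink
  var dropped = 0 // re-entrant events swallowed by the guard

  def append(msg: String): Unit = {
    if (inAppend.get()) { dropped += 1; return } // break the recursion
    inAppend.set(true)
    try write(msg) finally inAppend.set(false)
  }

  // Stands in for HDFSLogWriter.write: building the SparkSession
  // emits a warning through the logging framework...
  def write(msg: String): Unit = {
    logWarning("another SparkContext may be running")
    written += 1
  }

  // ...and that warning is routed back into the same appender.
  def logWarning(msg: String): Unit = append(msg)
}
{code}

The guard does not address the underlying design question (creating a SparkSession from inside append()), but it illustrates why the frames above repeat until the stack overflows.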

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
