[ https://issues.apache.org/jira/browse/SPARK-40489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606872#comment-17606872 ]
Jungtaek Lim commented on SPARK-40489:
--------------------------------------

https://www.slf4j.org/news.html

2022-08-20 - Release of SLF4J 2.0.0
2022-09-14 - Release of SLF4J 2.0.1

The new major version was released only about a month ago, and we don't yet know much about its stability. This doesn't seem urgent enough to warrant a Critical priority, so I'm going to lower the priority.

Also, we cannot easily move forward before we are sure there is NO breakage or behavioral change when upgrading to a Spark version that migrates SLF4J 1 to SLF4J 2. We would not be happy with breaking or behavioral changes brought in by a dependency, which is why we are cautious about major version upgrades of dependencies. The comment about log4j1 is moot, as recent versions of Spark use log4j2.

> Spark 3.3.0 breaks with SLF4J 2
> -------------------------------
>
>                 Key: SPARK-40489
>                 URL: https://issues.apache.org/jira/browse/SPARK-40489
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Garret Wilson
>            Priority: Critical
>
> Spark breaks fundamentally with SLF4J 2.x because it uses
> {{StaticLoggerBinder}}.
>
> SLF4J is the logging facade that is meant to shield the application from the
> implementation, whether that is Log4J or Logback or something else.
> Historically SLF4J 1.x used a bad approach to configuration: it used
> {{StaticLoggerBinder}} (a global static singleton instance) rather than the
> Java {{ServiceLoader}} mechanism.
>
> SLF4J 2.x, which has been in development for years, has now been released. It
> finally switches to the {{ServiceLoader}} mechanism. As [described in the
> FAQ|https://www.slf4j.org/faq.html#changesInVersion200], the API should be
> compatible; an application just needs to use the latest Log4J/Logback
> implementation, which registers itself via the service loader.
> *Above all, the application must _not_ use the low-level
> {{StaticLoggerBinder}} mechanism, because it has been removed!*
>
> Unfortunately
> [{{org.apache.spark.internal.Logging}}|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala]
> uses {{StaticLoggerBinder}} and completely breaks any environment using
> SLF4J 2.x. For example, in my application I have pulled in the SLF4J 2.x API
> and the Logback 1.4.x libraries (I'm not even using Log4J). Spark breaks
> completely just trying to get a Spark session:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: org/slf4j/impl/StaticLoggerBinder
> 	at org.apache.spark.internal.Logging$.org$apache$spark$internal$Logging$$isLog4j2(Logging.scala:232)
> 	at org.apache.spark.internal.Logging.initializeLogging(Logging.scala:129)
> 	at org.apache.spark.internal.Logging.initializeLogIfNecessary(Logging.scala:115)
> 	at org.apache.spark.internal.Logging.initializeLogIfNecessary$(Logging.scala:109)
> 	at org.apache.spark.SparkContext.initializeLogIfNecessary(SparkContext.scala:84)
> 	at org.apache.spark.internal.Logging.initializeLogIfNecessary(Logging.scala:106)
> 	at org.apache.spark.internal.Logging.initializeLogIfNecessary$(Logging.scala:105)
> 	at org.apache.spark.SparkContext.initializeLogIfNecessary(SparkContext.scala:84)
> 	at org.apache.spark.internal.Logging.log(Logging.scala:53)
> 	at org.apache.spark.internal.Logging.log$(Logging.scala:51)
> 	at org.apache.spark.SparkContext.log(SparkContext.scala:84)
> 	at org.apache.spark.internal.Logging.logInfo(Logging.scala:61)
> 	at org.apache.spark.internal.Logging.logInfo$(Logging.scala:60)
> 	at org.apache.spark.SparkContext.logInfo(SparkContext.scala:84)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:195)
> 	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
> 	at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
> 	at scala.Option.getOrElse(Option.scala:201)
> 	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
> {noformat}
> This is because Spark is playing low-level tricks to find out whether the
> logging backend is Log4J, and relying on {{StaticLoggerBinder}} to do it:
> {code}
> private def isLog4j2(): Boolean = {
>   // This distinguishes the log4j 1.2 binding, currently
>   // org.slf4j.impl.Log4jLoggerFactory, from the log4j 2.0 binding, currently
>   // org.apache.logging.slf4j.Log4jLoggerFactory
>   val binderClass = StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr
>   "org.apache.logging.slf4j.Log4jLoggerFactory".equals(binderClass)
> }
> {code}
> Whatever the wisdom of Spark's relying on Log4J-specific functionality, Spark
> should not be using {{StaticLoggerBinder}} for that detection. There are many
> other approaches. (The code itself suggests one:
> {{LogManager.getRootLogger.asInstanceOf[Log4jLogger]}}. You could check
> whether the root logger actually is a {{Log4jLogger}}. There may be even
> better approaches.)
>
> The other big problem is relying on the Log4J classes themselves. By relying
> on those classes, you force me to bring in Log4J as a dependency, and its
> latest versions register themselves with the service loader mechanism,
> causing conflicting SLF4J implementations.
>
> It is paramount that you:
> * Remove all reliance on {{StaticLoggerBinder}}. If you absolutely must use
> it, please check for it using reflection!
> * Remove all static references to the Log4J classes. (In an ideal world you
> wouldn't be doing Log4J-specific things at all.) If you absolutely must do
> Log4J-specific things, access the classes via reflection; don't statically
> link them in the code.
>
> The current situation absolutely (and unnecessarily) 100% breaks the use of
> SLF4J 2.x.
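The reflection-based detection the reporter asks for can be sketched as follows. This is a hedged sketch under my own assumptions, not Spark's actual fix: the object name {{Log4j2Detection}} is hypothetical, and the approach simply looks up {{org.slf4j.LoggerFactory}} reflectively (that class exists in both SLF4J 1.x and 2.x), asks it for the active {{ILoggerFactory}}, and compares the class name against the log4j2 binding quoted in the snippet above, so no {{StaticLoggerBinder}} or Log4J class is ever statically linked.

```scala
// Hedged sketch (hypothetical helper, not Spark's actual code): detect
// whether the active SLF4J backend is the slf4j -> log4j2 binding without
// referencing org.slf4j.impl.StaticLoggerBinder, which SLF4J 2.x removed.
object Log4j2Detection {
  def isLog4j2(): Boolean =
    try {
      // org.slf4j.LoggerFactory exists in both SLF4J 1.x and 2.x, so a
      // reflective lookup works regardless of which major version is present.
      val loggerFactory = Class.forName("org.slf4j.LoggerFactory")
      val factory = loggerFactory.getMethod("getILoggerFactory").invoke(null)
      // Class name of the slf4j -> log4j2 binding, as quoted in the issue.
      factory.getClass.getName == "org.apache.logging.slf4j.Log4jLoggerFactory"
    } catch {
      // Covers ClassNotFoundException etc.: no SLF4J on the classpath at all,
      // so the backend is certainly not the log4j2 binding.
      case _: ReflectiveOperationException => false
    }
}
```

On a classpath without SLF4J, or with a non-log4j2 backend such as Logback, this returns false; it returns true only when the log4j2 binding actually backs SLF4J, which is the distinction the original {{isLog4j2()}} tried to draw via {{StaticLoggerBinder}}.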
--
This message was sent by Atlassian Jira
(v8.20.10#820010)