Antonin Fischer created ZEPPELIN-6014:
-----------------------------------------
Summary: The Docker image apache/zeppelin:0.11.1 contains macOS metadata for jar files. This leads to a java.util.zip.ZipException: zip END header not found error.
Key: ZEPPELIN-6014
URL: https://issues.apache.org/jira/browse/ZEPPELIN-6014
Project: Zeppelin
Issue Type: Bug
Components: zeppelin-integration
Affects Versions: 0.11.1
Reporter: Antonin Fischer

The Docker image {{apache/zeppelin:0.11.1}} ships macOS {{._}} (AppleDouble) metadata files alongside its jars. Because these stubs get treated as jars, the Spark interpreter fails with {{java.util.zip.ZipException: zip END header not found}}.

Exception:
{code:java}
ERROR [2024-04-12 22:50:08,356] ({FIFOScheduler-interpreter_1585837962-Worker-1} SparkInterpreter.java[open]:139) - Fail to open SparkInterpreter
scala.reflect.internal.FatalError: Error accessing /mnt/disk2/yarn/local/usercache/ondrej.cerny/appcache/application_1711465088664_2677/container_e30_1711465088664_2677_01_000001/._spark-scala-2.12-0.11.1.jar
    at scala.tools.nsc.classpath.AggregateClassPath.$anonfun$list$3(AggregateClassPath.scala:113)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scala:943)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
    at scala.collection.IterableLike.foreach(IterableLike.scala:74)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    at scala.tools.nsc.classpath.AggregateClassPath.list(AggregateClassPath.scala:101)
    at scala.tools.nsc.util.ClassPath.list(ClassPath.scala:36)
    at scala.tools.nsc.util.ClassPath.list$(ClassPath.scala:36)
    at scala.tools.nsc.classpath.AggregateClassPath.list(AggregateClassPath.scala:30)
    at scala.tools.nsc.symtab.SymbolLoaders$PackageLoader.doComplete(SymbolLoaders.scala:298)
    at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:250)
    at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1542)
    at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1514)
    at scala.reflect.internal.Mirrors$RootsBase.init(Mirrors.scala:258)
    at scala.tools.nsc.Global.rootMirror$lzycompute(Global.scala:74)
    at scala.tools.nsc.Global.rootMirror(Global.scala:72)
    at scala.tools.nsc.Global.rootMirror(Global.scala:44)
    at scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass$lzycompute(Definitions.scala:294)
    at scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass(Definitions.scala:294)
    at scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1504)
    at scala.tools.nsc.Global$Run.<init>(Global.scala:1213)
    at scala.tools.nsc.interpreter.IMain._initialize(IMain.scala:124)
    at scala.tools.nsc.interpreter.IMain.initializeSynchronous(IMain.scala:146)
    at org.apache.zeppelin.spark.SparkScala212Interpreter.createSparkILoop(SparkScala212Interpreter.scala:195)
    at org.apache.zeppelin.spark.AbstractSparkScalaInterpreter.open(AbstractSparkScalaInterpreter.java:116)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:124)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:861)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:769)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:186)
    at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:135)
    at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:42)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Error accessing /mnt/disk2/yarn/local/usercache/ondrej.cerny/appcache/application_1711465088664_2677/container_e30_1711465088664_2677_01_000001/._spark-scala-2.12-0.11.1.jar
    at scala.reflect.io.FileZipArchive.scala$reflect$io$FileZipArchive$$openZipFile(ZipArchive.scala:182)
    at scala.reflect.io.FileZipArchive.root$lzycompute(ZipArchive.scala:230)
    at scala.reflect.io.FileZipArchive.root(ZipArchive.scala:227)
    at scala.reflect.io.FileZipArchive.allDirs$lzycompute(ZipArchive.scala:264)
    at scala.reflect.io.FileZipArchive.allDirs(ZipArchive.scala:264)
    at scala.tools.nsc.classpath.ZipArchiveFileLookup.findDirEntry(ZipArchiveFileLookup.scala:76)
    at scala.tools.nsc.classpath.ZipArchiveFileLookup.list(ZipArchiveFileLookup.scala:63)
    at scala.tools.nsc.classpath.ZipArchiveFileLookup.list$(ZipArchiveFileLookup.scala:62)
    at scala.tools.nsc.classpath.ZipAndJarClassPathFactory$ZipArchiveClassPath.list(ZipAndJarFileLookupFactory.scala:58)
    at scala.tools.nsc.classpath.AggregateClassPath.$anonfun$list$3(AggregateClassPath.scala:105)
    ... 36 more
Caused by: java.util.zip.ZipException: zip END header not found
    at java.base/java.util.zip.ZipFile$Source.zerror(ZipFile.java:1769)
    at java.base/java.util.zip.ZipFile$Source.findEND(ZipFile.java:1652)
    at java.base/java.util.zip.ZipFile$Source.initCEN(ZipFile.java:1659)
    at java.base/java.util.zip.ZipFile$Source.<init>(ZipFile.java:1463)
    at java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1426)
    at java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:742)
    at java.base/java.util.zip.ZipFile$CleanableResource.get(ZipFile.java:859)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:257)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:186)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:200)
    at scala.reflect.io.FileZipArchive.scala$reflect$io$FileZipArchive$$openZipFile(ZipArchive.scala:179)
    ... 45 more
{code}

Directory listing inside the Docker image:
{code:java}
zeppelin@e2a754418536:~/interpreter/spark$ ls -la
total 31620
drwxr-xr-x 1 root root      408 Nov 14  2022 .
drwxr-xr-x 1 root root      864 Apr  3 00:56 ..
-rw-r--r-- 1 root root      163 Nov 14  2022 ._interpreter-setting.json
-rw-r--r-- 1 root root    12195 Nov 14  2022 interpreter-setting.json
-rwxr-xr-x 1 root root      163 Nov 14  2022 ._META-INF
drwxr-xr-x 1 root root      112 Apr  3 00:56 META-INF
-rwxr-xr-x 1 root root      163 Nov 14  2022 ._pyspark
drwxr-xr-x 1 root root      136 Apr  3 00:56 pyspark
-rwxr-xr-x 1 root root      163 Nov 14  2022 ._python
drwxr-xr-x 1 root root      164 Apr  3 00:56 python
-rwxr-xr-x 1 root root      163 Nov 14  2022 ._R
drwxr-xr-x 1 root root       16 Nov 14  2022 R
-rwxr-xr-x 1 root root      163 Nov 14  2022 ._scala-2.12
drwxr-xr-x 1 root root      112 Apr  3 00:56 scala-2.12
-rwxr-xr-x 1 root root      163 Nov 14  2022 ._scala-2.13
drwxr-xr-x 1 root root      112 Apr  3 00:56 scala-2.13
-rw-r--r-- 1 root root      163 Nov 14  2022 ._spark-interpreter-0.11.1.jar
-rw-r--r-- 1 root root 32332642 Nov 14  2022 spark-interpreter-0.11.1.jar
zeppelin@e2a754418536:~/interpreter/spark$ cat ._spark-interpreter-0.11.1.jar
Mac OS X 2qATTR
{code}

The {{._spark-interpreter-0.11.1.jar}} stub (163 bytes of AppleDouble metadata) is picked up onto the Java classpath next to the real jar, and the Spark driver fails as soon as it tries to open it as a zip archive.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
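A minimal Java sketch of the failure mode (the class and file names here are hypothetical, not from Zeppelin): an AppleDouble `._` stub contains only a few bytes of metadata and no zip structure, so `java.util.zip.ZipFile` rejects it the same way the Scala compiler's classpath scanner does above.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;

public class AppleDoubleStubRepro {
    public static void main(String[] args) throws Exception {
        // Fake AppleDouble stub: a few metadata bytes, no zip END header.
        File stub = File.createTempFile("._spark-interpreter-stub", ".jar");
        try (FileOutputStream out = new FileOutputStream(stub)) {
            out.write("Mac OS X".getBytes("US-ASCII"));
        }
        try (ZipFile zf = new ZipFile(stub)) {
            System.out.println("unexpectedly opened as a zip");
        } catch (ZipException e) {
            // On recent JDKs this reports "zip END header not found",
            // matching the exception in the report.
            System.out.println("ZipException: " + e.getMessage());
        } finally {
            stub.delete();
        }
    }
}
```

Anything on the classpath that is scanned as a jar will hit this, which is why the stub files alone are enough to break interpreter startup.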
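A possible remediation sketch (the directory layout below is a stand-in for {{~/interpreter/spark}}, not the actual Zeppelin build): delete every AppleDouble file before the image is published, keeping the real artifacts untouched.

```shell
# Demo directory with an AppleDouble stub next to the real jar.
demo=$(mktemp -d)
touch "$demo/._spark-interpreter-0.11.1.jar" "$demo/spark-interpreter-0.11.1.jar"

# Cleanup the image build could run: remove all '._*' metadata files.
find "$demo" -name '._*' -type f -delete

ls "$demo"   # only spark-interpreter-0.11.1.jar remains
```

Alternatively, if the distribution tarball is produced on macOS, the stubs can be suppressed at the source: macOS bsdtar honours the {{COPYFILE_DISABLE=1}} environment variable and then omits AppleDouble entries entirely.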