Hi David,

Good to hear that it works now!

That makes sense: with mvn package, Crail was never installed into the
local Maven repository, so the build of crail-spark-io probably pulled an
older version of Crail from Maven Central.
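For anyone hitting this later: package only builds the jars under
target/, while install additionally copies them into the local Maven
repository (~/.m2/repository), which is where the crail-spark-io build
resolves its crail-client dependency. In the Dockerfile the change is
roughly the following (the repo path and flags are just an example, not
necessarily David's exact invocation):

    # Before: jars are built but never installed, so crail-spark-io
    # silently falls back to an old crail-client from Maven Central.
    RUN cd /crail && mvn -DskipTests package

    # After: the freshly built 1.2-incubating-SNAPSHOT artifacts are
    # installed into ~/.m2/repository and picked up by crail-spark-io.
    RUN cd /crail && mvn -DskipTests install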
Regards,
Adrian

On 6/24/19 21:42, David Crespi wrote:
> I found the issue. My Dockerfile was using mvn package instead of mvn
> install for the crail repo, so crail-spark-io was never finding the
> correct jars. All better now!
>
> Regards,
>
> David
>
> From: David Crespi <[email protected]>
> Sent: Monday, June 24, 2019 8:03 AM
> To: Jonas Pfefferle <[email protected]>; [email protected];
> Adrian Schüpbach Gribex <[email protected]>
> Subject: RE: Crail-Spark Shuffle Manager config error
>
> I started a new build for a container image this morning and I am still
> getting this error when building the crail-spark code.
>
> These should all be fresh pulls of all the repos, so there shouldn't be
> any random jars lying around in the container. Is this missing a file or
> something?
>
> [INFO] /crail-spark-io/src/main/scala:-1: info: compiling
> [INFO] Compiling 12 source files to /crail-spark-io/target/classes at
> 1561388304205
> [ERROR] /crail-spark-io/src/main/scala/org/apache/spark/storage/CrailDispatcher.scala:119:
> error: value createConfigurationFromFile is not a member of object
> org.apache.crail.conf.CrailConfiguration
> [ERROR] val crailConf = CrailConfiguration.createConfigurationFromFile();
> [ERROR]                                     ^
> [ERROR] one error found
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 15.262 s
> [INFO] Finished at: 2019-06-24T07:58:27-07:00
> [INFO] ------------------------------------------------------------------------
>
> Regards,
>
> David
>
> ________________________________
> From: Jonas Pfefferle <[email protected]>
> Sent: Friday, June 21, 2019 1:20:14 AM
> To: [email protected]; David Crespi; Adrian Schüpbach Gribex
> Subject: Re: Crail-Spark Shuffle Manager config error
>
> The hash in the version looks correct. Make sure there are no old jars
> floating around in your jars directory. I suggest deleting all of them
> and copying the new ones in from assembly/...
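> Something like the following (I am writing /crail/jars because that is
> the directory on your CLASSPATH; the exact assembly output path depends
> on your build, so treat it as a placeholder):
>
>     rm /crail/jars/*.jar
>     cp assembly/target/<unpacked-assembly>/jars/*.jar /crail/jars/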
> Regards,
> Jonas
>
> On Thu, 20 Jun 2019 16:40:39 +0000 David Crespi <[email protected]> wrote:
>> Did a fresh pull in the morning.
>> I am running with Spark 2.4.2, so I had to make changes to the pom.xml
>> file.
>>
>> This is the Crail version: v1.1-7-ga6e622f
>> Does this look correct?
>>
>> Regards,
>>
>> David
>>
>> ________________________________
>> From: Adrian Schüpbach Gribex <[email protected]>
>> Sent: Wednesday, June 19, 2019 7:27:52 AM
>> To: [email protected]; David Crespi; [email protected]
>> Subject: RE: Crail-Spark Shuffle Manager config error
>>
>> Hi David
>>
>> Do you use the latest Apache Crail from master?
>>
>> It works only with this version.
>>
>> Regards
>> Adrian
>>
>> On 19 June 2019 16:19:05 CEST, David Crespi <[email protected]> wrote:
>>
>> Adrian,
>>
>> Did you change the code in crail-spark-io?
>>
>> I'm getting a build error now.
>>
>> [INFO] /crail-spark-io/src/main/scala:-1: info: compiling
>> [INFO] Compiling 12 source files to /crail-spark-io/target/classes at
>> 1560953910105
>> [ERROR] /crail-spark-io/src/main/scala/org/apache/spark/storage/CrailDispatcher.scala:119:
>> error: value createConfigurationFromFile is not a member of object
>> org.apache.crail.conf.CrailConfiguration
>> [ERROR] val crailConf = CrailConfiguration.createConfigurationFromFile();
>> [ERROR]                                     ^
>> [ERROR] one error found
>> [INFO] ------------------------------------------------------------------------
>> [INFO] BUILD FAILURE
>> [INFO] ------------------------------------------------------------------------
>> [INFO] Total time: 15.007 s
>> [INFO] Finished at: 2019-06-19T07:18:33-07:00
>>
>> Regards,
>>
>> David
>> ________________________________
>> From: Adrian Schuepbach <[email protected]>
>> Sent: Wednesday, June 19, 2019 5:28:30 AM
>> To: [email protected]
>> Subject: Re: Crail-Spark Shuffle Manager config error
>>
>> Hi David
>>
>> I changed the code to use the new API to create the Crail configuration.
>> Please pull, build and install the newest version.
>>
>> Please also remove the old jars from the directory the classpath points
>> to: if there are multiple jars of different versions on the classpath,
>> it is unclear which one will be taken.
>>
>> Best regards
>> Adrian
>>
>> On 6/19/19 13:43, Adrian Schuepbach wrote:
>> Hi David
>>
>> This is caused by the API change for creating a Crail configuration
>> object. The new API has three different static methods to create the
>> Crail configuration instead of the empty constructor.
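>> Concretely, the dispatcher's one-line change looks roughly like this
>> (createConfigurationFromFile is the factory the new
>> CrailDispatcher.scala:119 calls; the other two static methods are not
>> named in this thread):
>>
>>     import org.apache.crail.conf.CrailConfiguration
>>
>>     // Old code (built against crail-client 1.0/1.1): public no-arg
>>     // constructor. Run against the new client, this dies with the
>>     // IllegalAccessError on CrailConfiguration.<init>()V quoted below:
>>     //   val crailConf = new CrailConfiguration()
>>
>>     // New code (crail-client 1.2-incubating-SNAPSHOT): static factory
>>     val crailConf = CrailConfiguration.createConfigurationFromFile()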
>> I am adapting the dependent repositories to the new API.
>>
>> What is a bit unclear to me is why you hit this. The crail-dispatcher's
>> dependency is on crail-client 1.0, but the new API is only available
>> on the current master (version 1.2-incubating-SNAPSHOT).
>>
>> If you built Apache Crail from source, you get 1.2-incubating-SNAPSHOT,
>> not the 1.0 version. I would have expected that you could not even
>> build crail-spark-io.
>>
>> In any case, the fix will be ready shortly.
>>
>> Regards
>> Adrian
>>
>> On 6/19/19 09:21, Jonas Pfefferle wrote:
>> Hi David,
>>
>> I assume you are running with the latest Crail master. We just pushed a
>> change to the CrailConfiguration initialization which we have not yet
>> adapted in the shuffle plugin (should be a one-line fix). @Adrian
>> Can you take a look?
>>
>> Regards,
>> Jonas
>>
>> On Tue, 18 Jun 2019 23:24:48 +0000 David Crespi <[email protected]> wrote:
>> Hi,
>> I'm getting what looks to be a configuration error when trying to use
>> the CrailShuffleManager
>> (spark.shuffle.manager org.apache.spark.shuffle.crail.CrailShuffleManager).
>>
>> It seems like a basic error, but other things run okay until I add the
>> line above to my spark-defaults.conf file.
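>> For completeness, the Crail-related part of the file looks like this
>> (the serializer line matches what CrailDispatcher prints in the log
>> below, though that may just be its default; the rest of the file is
>> omitted):
>>
>>     spark.shuffle.manager   org.apache.spark.shuffle.crail.CrailShuffleManager
>>     spark.crail.serializer  org.apache.spark.serializer.CrailSparkSerializer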
>> I have my environment variable for crail home set, as well as one for
>> the disni libs:
>>
>> LD_LIBRARY_PATH=/usr/local/lib
>>
>> $ ls -l /usr/local/lib/
>> total 156
>> -rwxr-xr-x 1 root root    947 Jun 18 08:11 libdisni.la
>> lrwxrwxrwx 1 root root     17 Jun 18 08:11 libdisni.so -> libdisni.so.0.0.0
>> lrwxrwxrwx 1 root root     17 Jun 18 08:11 libdisni.so.0 -> libdisni.so.0.0.0
>> -rwxr-xr-x 1 root root 149784 Jun 18 08:11 libdisni.so.0.0.0
>>
>> I also have an environment variable for the classpath set:
>>
>> CLASSPATH=/disni/target/*:/jNVMf/target/*:/crail/jars/*
>>
>> Could the classpath variable be the issue?
>>
>> 19/06/18 15:59:47 DEBUG Client: getting client out of cache:
>> org.apache.hadoop.ipc.Client@7bebcd65
>> 19/06/18 15:59:47 DEBUG PerformanceAdvisory: Both short-circuit local
>> reads and UNIX domain socket are disabled.
>> 19/06/18 15:59:47 DEBUG DataTransferSaslUtil: DataTransferProtocol not
>> using SaslPropertiesResolver, no QOP found in configuration for
>> dfs.data.transfer.protection
>> 19/06/18 15:59:48 INFO MemoryStore: Block broadcast_0 stored as values
>> in memory (estimated size 288.9 KB, free 366.0 MB)
>> 19/06/18 15:59:48 DEBUG BlockManager: Put block broadcast_0 locally
>> took 123 ms
>> 19/06/18 15:59:48 DEBUG BlockManager: Putting block broadcast_0
>> without replication took 125 ms
>> 19/06/18 15:59:48 INFO MemoryStore: Block broadcast_0_piece0 stored as
>> bytes in memory (estimated size 23.8 KB, free 366.0 MB)
>> 19/06/18 15:59:48 INFO BlockManagerInfo: Added broadcast_0_piece0 in
>> memory on master:34103 (size: 23.8 KB, free: 366.3 MB)
>> 19/06/18 15:59:48 DEBUG BlockManagerMaster: Updated info of block
>> broadcast_0_piece0
>> 19/06/18 15:59:48 DEBUG BlockManager: Told master about block
>> broadcast_0_piece0
>> 19/06/18 15:59:48 DEBUG BlockManager: Put block broadcast_0_piece0
>> locally took 7 ms
>> 19/06/18 15:59:48 DEBUG BlockManager: Putting block broadcast_0_piece0
>> without replication took 8 ms
>> 19/06/18 15:59:48 INFO SparkContext: Created broadcast 0 from
>> newAPIHadoopFile at TeraSort.scala:60
>> 19/06/18 15:59:48 DEBUG Client: The ping interval is 60000 ms.
>> 19/06/18 15:59:48 DEBUG Client: Connecting to
>> NameNode-1/192.168.3.7:54310
>> 19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to
>> NameNode-1/192.168.3.7:54310 from hduser: starting, having connections 1
>> 19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to
>> NameNode-1/192.168.3.7:54310 from hduser sending #0
>> 19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to
>> NameNode-1/192.168.3.7:54310 from hduser got value #0
>> 19/06/18 15:59:48 DEBUG ProtobufRpcEngine: Call: getFileInfo took 56ms
>> 19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to
>> NameNode-1/192.168.3.7:54310 from hduser sending #1
>> 19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to
>> NameNode-1/192.168.3.7:54310 from hduser got value #1
>> 19/06/18 15:59:48 DEBUG ProtobufRpcEngine: Call: getListing took 3ms
>> 19/06/18 15:59:48 DEBUG FileInputFormat: Time taken to get
>> FileStatuses: 142
>> 19/06/18 15:59:48 INFO FileInputFormat: Total input paths to process : 2
>> 19/06/18 15:59:48 DEBUG FileInputFormat: Total # of splits generated
>> by getSplits: 2, TimeTaken: 145
>> 19/06/18 15:59:48 DEBUG FileCommitProtocol: Creating committer
>> org.apache.spark.internal.io.HadoopMapReduceCommitProtocol; job 1;
>> output=hdfs://NameNode-1:54310/tmp/data_sort; dynamic=false
>> 19/06/18 15:59:48 DEBUG FileCommitProtocol: Using (String, String,
>> Boolean) constructor
>> 19/06/18 15:59:48 INFO FileOutputCommitter: File Output Committer
>> Algorithm version is 1
>> 19/06/18 15:59:48 DEBUG DFSClient: /tmp/data_sort/_temporary/0:
>> masked=rwxr-xr-x
>> 19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to
>> NameNode-1/192.168.3.7:54310 from hduser sending #2
>> 19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to
>> NameNode-1/192.168.3.7:54310 from hduser got value #2
>> 19/06/18 15:59:48 DEBUG ProtobufRpcEngine: Call: mkdirs took 3ms
>> 19/06/18 15:59:48 DEBUG ClosureCleaner: Cleaning lambda: $anonfun$write$1
>> 19/06/18 15:59:48 DEBUG ClosureCleaner: +++ Lambda closure
>> ($anonfun$write$1) is now cleaned +++
>> 19/06/18 15:59:48 INFO SparkContext: Starting job: runJob at
>> SparkHadoopWriter.scala:78
>> 19/06/18 15:59:48 INFO CrailDispatcher: CrailStore starting version 400
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.deleteonclose false
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.deleteOnStart true
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.preallocate 0
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.writeAhead 0
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.debug false
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.serializer
>> org.apache.spark.serializer.CrailSparkSerializer
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.shuffle.affinity true
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.shuffle.outstanding 1
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.shuffle.storageclass 0
>> 19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.broadcast.storageclass 0
>> Exception in thread "dag-scheduler-event-loop"
>> java.lang.IllegalAccessError: tried to access method
>> org.apache.crail.conf.CrailConfiguration.<init>()V from class
>> org.apache.spark.storage.CrailDispatcher
>>   at org.apache.spark.storage.CrailDispatcher.org$apache$spark$storage$CrailDispatcher$$init(CrailDispatcher.scala:119)
>>   at org.apache.spark.storage.CrailDispatcher$.get(CrailDispatcher.scala:662)
>>   at org.apache.spark.shuffle.crail.CrailShuffleManager.registerShuffle(CrailShuffleManager.scala:52)
>>   at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:94)
>>   at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:87)
>>   at org.apache.spark.rdd.RDD.$anonfun$dependencies$2(RDD.scala:240)
>>   at scala.Option.getOrElse(Option.scala:138)
>>   at org.apache.spark.rdd.RDD.dependencies(RDD.scala:238)
>>   at org.apache.spark.scheduler.DAGScheduler.getShuffleDependencies(DAGScheduler.scala:512)
>>   at org.apache.spark.scheduler.DAGScheduler.getOrCreateParentStages(DAGScheduler.scala:461)
>>   at org.apache.spark.scheduler.DAGScheduler.createResultStage(DAGScheduler.scala:448)
>>   at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:962)
>>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2067)
>>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
>>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
>>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
>>
>> Regards,
>>
>> David
>>
>> --
>> Adrian Schüpbach, Dr. sc. ETH Zürich

--
Adrian Schüpbach, Dr. sc. ETH Zürich
