Is it possible to come up with a code snippet which reproduces the following?
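
Something along the lines of the sketch below would be ideal, if it still
triggers the error on your setup. To be clear, this is only a guess at the
code path involved (cached RDDs combined through PartitionerAwareUnionRDD,
with a toSet inside mapPartitions, so that cached blocks may be fetched from
remote executors); the data, partitioner, and per-partition logic are made
up, since I don't know your actual workload.

import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object ChunkedByteBufferRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("chunked-byte-buffer-repro"))

    // Two pair RDDs sharing the same partitioner, cached so that reading a
    // partition from another executor goes through BlockManager.getRemoteBytes.
    val part = new HashPartitioner(8)
    val a = sc.parallelize(1 to 100000).map(i => (i, i)).partitionBy(part).cache()
    val b = sc.parallelize(1 to 100000).map(i => (i, -i)).partitionBy(part).cache()
    a.count(); b.count()  // materialize the cached blocks

    // sc.union over RDDs with identical partitioners goes through
    // PartitionerAwareUnionRDD, the RDD that appears in your stack trace.
    val union = sc.union(a, b)
    val setSizes = union.mapPartitions(iter => Iterator(iter.toSet.size))
    println(setSizes.collect().toSeq)

    sc.stop()
  }
}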

Thanks

On Fri, May 13, 2016 at 8:13 AM, Raghava Mutharaju <
m.vijayaragh...@gmail.com> wrote:

> I am able to run my application after compiling the Spark source in the
> following way:
>
> ./dev/change-scala-version.sh 2.11
>
> ./dev/make-distribution.sh --name spark-2.0.0-snapshot-bin-hadoop2.6 --tgz
> -Phadoop-2.6 -DskipTests
>
> But while the application is running I get the following exception, which
> I was not getting with Spark 1.6.1. Any idea why this might be happening?
>
> java.lang.IllegalArgumentException: requirement failed: chunks must be non-empty
>     at scala.Predef$.require(Predef.scala:224)
>     at org.apache.spark.util.io.ChunkedByteBuffer.<init>(ChunkedByteBuffer.scala:41)
>     at org.apache.spark.util.io.ChunkedByteBuffer.<init>(ChunkedByteBuffer.scala:52)
>     at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:580)
>     at org.apache.spark.storage.BlockManager.getRemoteValues(BlockManager.scala:514)
>     at org.apache.spark.storage.BlockManager.get(BlockManager.scala:601)
>     at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:653)
>     at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:329)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:280)
>     at org.apache.spark.rdd.PartitionerAwareUnionRDD$$anonfun$compute$1.apply(PartitionerAwareUnionRDD.scala:100)
>     at org.apache.spark.rdd.PartitionerAwareUnionRDD$$anonfun$compute$1.apply(PartitionerAwareUnionRDD.scala:99)
>     at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>     at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>     at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
>     at scala.collection.mutable.SetBuilder.$plus$plus$eq(SetBuilder.scala:20)
>     at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
>     at scala.collection.AbstractIterator.to(Iterator.scala:1336)
>     at scala.collection.TraversableOnce$class.toSet(TraversableOnce.scala:304)
>     at scala.collection.AbstractIterator.toSet(Iterator.scala:1336)
>     at org.daselab.sparkel.SparkELHDFSTestCopy$$anonfun$45.apply(SparkELHDFSTestCopy.scala:392)
>     at org.daselab.sparkel.SparkELHDFSTestCopy$$anonfun$45.apply(SparkELHDFSTestCopy.scala:391)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$22.apply(RDD.scala:756)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$22.apply(RDD.scala:756)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>     at org.apache.spark.scheduler.Task.run(Task.scala:85)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> On Fri, May 13, 2016 at 6:33 AM, Raghava Mutharaju <
> m.vijayaragh...@gmail.com> wrote:
>
>> Thank you for the response.
>>
>> I used the following command to build from source:
>>
>> build/mvn -Dhadoop.version=2.6.4 -Phadoop-2.6 -DskipTests clean package
>>
>> Would this put the required jars in .ivy2 during the build process? If so,
>> how can I make the Spark distribution runnable, so that I can use it on
>> other machines as well (make-distribution.sh no longer exists in the Spark
>> root folder)?
>>
>> For compiling my application, I put the following lines in build.sbt:
>>
>> packAutoSettings
>> val spark = "org.apache.spark" %% "spark-core" % "2.0.0-SNAPSHOT"
>> val sparksql = "org.apache.spark" % "spark-sql_2.11" % "2.0.0-SNAPSHOT"
>>
>> lazy val root = (project in file(".")).
>>   settings(
>>     name := "sparkel",
>>     version := "0.1.0",
>>     scalaVersion := "2.11.8",
>>     libraryDependencies += spark,
>>     libraryDependencies += sparksql
>>   )
>>
>>
>> Regards,
>> Raghava.
>>
>>
>> On Fri, May 13, 2016 at 12:23 AM, Luciano Resende <luckbr1...@gmail.com>
>> wrote:
>>
>>> Spark has moved to build using Scala 2.11 by default in master/trunk.
>>>
>>> As for 2.0.0-SNAPSHOT, that is the version of master/trunk, and you might
>>> be missing some modules/profiles in your build. What command did you use to
>>> build?
>>>
>>> On Thu, May 12, 2016 at 9:01 PM, Raghava Mutharaju <
>>> m.vijayaragh...@gmail.com> wrote:
>>>
>>>> Hello All,
>>>>
>>>> I built Spark from the source code available at
>>>> https://github.com/apache/spark/. Although I didn't specify the
>>>> "-Dscala-2.11" option (to build with Scala 2.11), the build messages show
>>>> that it ended up using Scala 2.11. Now, what Spark version should I use in
>>>> my application's sbt build? I tried the following:
>>>>
>>>> val spark = "org.apache.spark" %% "spark-core" % "2.0.0-SNAPSHOT"
>>>> val sparksql = "org.apache.spark" % "spark-sql_2.11" % "2.0.0-SNAPSHOT"
>>>>
>>>> and scalaVersion := "2.11.8"
>>>>
>>>> But this setting of the Spark version gives an sbt error:
>>>>
>>>> unresolved dependency: org.apache.spark#spark-core_2.11;2.0.0-SNAPSHOT
>>>>
>>>> I guess this is because the repository doesn't contain 2.0.0-SNAPSHOT.
>>>> Does this mean the only option is to put all the required jars in the lib
>>>> folder (unmanaged dependencies)?
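>>>>
>>>> Or would it work to run the Maven build with the "install" goal (so the
>>>> snapshot jars land in the local Maven repository) and then point sbt at
>>>> that repository? Something like the following build.sbt, which I haven't
>>>> tried yet, with the local Maven repo added as a resolver:
>>>>
>>>> // build.sbt (untested sketch)
>>>> scalaVersion := "2.11.8"
>>>>
>>>> // sbt does not look in ~/.m2 by default
>>>> resolvers += Resolver.mavenLocal
>>>>
>>>> libraryDependencies ++= Seq(
>>>>   "org.apache.spark" %% "spark-core" % "2.0.0-SNAPSHOT",
>>>>   "org.apache.spark" %% "spark-sql"  % "2.0.0-SNAPSHOT"
>>>> )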
>>>>
>>>> Regards,
>>>> Raghava.
>>>>
>>>
>>>
>>>
>>> --
>>> Luciano Resende
>>> http://twitter.com/lresende1975
>>> http://lresende.blogspot.com/
>>>
>>
>>
>>
>> --
>> Regards,
>> Raghava
>> http://raghavam.github.io
>>
>
>
>
> --
> Regards,
> Raghava
> http://raghavam.github.io
>
