Re: key not found: sportingpulse.com in Spark SQL 1.5.0
This is a bug in DataFrame caching. You can avoid caching or turn off compression. It is fixed in Spark 1.5.1.

On Sat, Oct 31, 2015 at 2:31 AM, Silvio Fiorito <silvio.fior...@granturing.com> wrote:

> I don’t believe I have it on 1.5.1. Are you able to test the data locally
> to confirm, or is it too large?
>
> From: "Zhang, Jingyu"
> Date: Friday, October 30, 2015 at 7:31 PM
> To: Silvio Fiorito
> Cc: Ted Yu, user
> Subject: Re: key not found: sportingpulse.com in Spark SQL 1.5.0
>
> Thanks Silvio and Ted,
>
> Can you please let me know how to fix these intermittent issues? Should I
> wait for EMR to upgrade to Spark 1.5.1, or change my code from DataFrames
> back to plain Spark map-reduce?
>
> Regards,
>
> Jingyu
>
> [...]
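For readers stuck on 1.5.0 until EMR upgrades, the "turn off compression" workaround above refers to Spark SQL's in-memory columnar compression setting. A sketch — the `spark.sql.inMemoryColumnarStorage.compressed` property exists in the 1.5.x line, but verify the name against your version's configuration docs before relying on it:

```scala
// Workaround sketch for Spark 1.5.0: disable in-memory columnar compression
// so the dictionary encoder implicated in the stack trace is never used
// when caching DataFrames. Requires a live SQLContext (e.g. in spark-shell).
sqlContext.setConf("spark.sql.inMemoryColumnarStorage.compressed", "false")

// The same setting can be passed at submit time instead:
//   spark-submit --conf spark.sql.inMemoryColumnarStorage.compressed=false ...
```

Caching without compression costs more memory per cached partition, so it is a stopgap until the 1.5.1 fix is available.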
Re: key not found: sportingpulse.com in Spark SQL 1.5.0
I don’t believe I have it on 1.5.1. Are you able to test the data locally to confirm, or is it too large?

From: "Zhang, Jingyu" <jingyu.zh...@news.com.au>
Date: Friday, October 30, 2015 at 7:31 PM
To: Silvio Fiorito <silvio.fior...@granturing.com>
Cc: Ted Yu <yuzhih...@gmail.com>, user <user@spark.apache.org>
Subject: Re: key not found: sportingpulse.com in Spark SQL 1.5.0

Thanks Silvio and Ted,

Can you please let me know how to fix these intermittent issues? Should I wait for EMR to upgrade to Spark 1.5.1, or change my code from DataFrames back to plain Spark map-reduce?

Regards,

Jingyu

On 31 October 2015 at 09:40, Silvio Fiorito <silvio.fior...@granturing.com> wrote:

> It's something due to the columnar compression. I've seen similar
> intermittent issues when caching DataFrames. "sportingpulse.com" is a
> value in one of the columns of the DF.
>
> From: Ted Yu <yuzhih...@gmail.com>
> Sent: 10/30/2015 6:33 PM
> To: Zhang, Jingyu <jingyu.zh...@news.com.au>
> Cc: user <user@spark.apache.org>
> Subject: Re: key not found: sportingpulse.com in Spark SQL 1.5.0
>
> I searched for sportingpulse in *.scala and *.java files under the 1.5
> branch. There was no hit.
>
> mvn dependency doesn't show sportingpulse either.
>
> Is it possible this is specific to EMR?
>
> Cheers
>
> On Fri, Oct 30, 2015 at 2:57 PM, Zhang, Jingyu <jingyu.zh...@news.com.au> wrote:
>
>> There is no problem in Spark SQL 1.5.1, but the error "key not found:
>> sportingpulse.com" shows up when I use 1.5.0.
>>
>> I have to use 1.5.0 because that is the version AWS EMR supports. Can
>> anyone tell me why Spark uses "sportingpulse.com" and how to fix it?
>>
>> Thanks.
Caused by: java.util.NoSuchElementException: key not found: sportingpulse.com
	at scala.collection.MapLike$class.default(MapLike.scala:228)
	at scala.collection.AbstractMap.default(Map.scala:58)
	at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
	at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
	at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
	at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
	at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
	at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
	at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
	at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
	at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
	at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
	at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.rdd.MapPartitionsWithPreparationRDD.compute(MapPartitionsWithPreparationRDD.scala:63)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:88)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
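The top frames (`HashMap.apply` → `MapLike$class.default` → `NoSuchElementException`) match what a dictionary encoder does: build a value-to-id map in one pass over the column, then look each value up during compression. Purely as an illustration — this is not Spark's `DictionaryEncoding` implementation — a plain Scala map lookup for a value the dictionary never recorded fails with exactly this message:

```scala
import scala.collection.mutable
import scala.util.Try

// Illustration only: a toy dictionary encoder. Each distinct column value
// is recorded with a small integer id; compression then replaces values
// with their ids via a map lookup.
val dictionary = mutable.HashMap[String, Int]()

def record(values: Seq[String]): Unit =
  values.foreach(v => dictionary.getOrElseUpdate(v, dictionary.size))

record(Seq("example.com", "other.org"))

// A recorded value encodes fine:
assert(dictionary("example.com") == 0)

// A value the dictionary never saw fails the same way the trace above does:
// mutable.HashMap.apply -> MapLike$class.default ->
//   java.util.NoSuchElementException: key not found: sportingpulse.com
val failure = Try(dictionary("sportingpulse.com"))
assert(failure.isFailure)
println(failure.failed.get.getMessage)  // key not found: sportingpulse.com
```

This is why the "missing key" is just a column value (a domain name in this data), not anything Spark itself references — consistent with Ted's finding that `sportingpulse` appears nowhere in the Spark source.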
Re: key not found: sportingpulse.com in Spark SQL 1.5.0
Thanks Silvio and Ted,

Can you please let me know how to fix these intermittent issues? Should I wait for EMR to upgrade to Spark 1.5.1, or change my code from DataFrames back to plain Spark map-reduce?

Regards,

Jingyu

On 31 October 2015 at 09:40, Silvio Fiorito wrote:

> It's something due to the columnar compression. I've seen similar
> intermittent issues when caching DataFrames. "sportingpulse.com" is a
> value in one of the columns of the DF.
>
> From: Ted Yu
> Sent: 10/30/2015 6:33 PM
> To: Zhang, Jingyu
> Cc: user
> Subject: Re: key not found: sportingpulse.com in Spark SQL 1.5.0
>
> I searched for sportingpulse in *.scala and *.java files under the 1.5
> branch. There was no hit.
>
> mvn dependency doesn't show sportingpulse either.
>
> Is it possible this is specific to EMR?
>
> Cheers
>
> On Fri, Oct 30, 2015 at 2:57 PM, Zhang, Jingyu wrote:
>
>> There is no problem in Spark SQL 1.5.1, but the error "key not found:
>> sportingpulse.com" shows up when I use 1.5.0.
>>
>> I have to use 1.5.0 because that is the version AWS EMR supports. Can
>> anyone tell me why Spark uses "sportingpulse.com" and how to fix it?
>>
>> Thanks.
>> [...]
RE: key not found: sportingpulse.com in Spark SQL 1.5.0
It's something due to the columnar compression. I've seen similar intermittent issues when caching DataFrames. "sportingpulse.com" is a value in one of the columns of the DF.

From: Ted Yu <yuzhih...@gmail.com>
Sent: 10/30/2015 6:33 PM
To: Zhang, Jingyu <jingyu.zh...@news.com.au>
Cc: user <user@spark.apache.org>
Subject: Re: key not found: sportingpulse.com in Spark SQL 1.5.0

I searched for sportingpulse in *.scala and *.java files under the 1.5 branch. There was no hit.

mvn dependency doesn't show sportingpulse either.

Is it possible this is specific to EMR?

Cheers

On Fri, Oct 30, 2015 at 2:57 PM, Zhang, Jingyu <jingyu.zh...@news.com.au> wrote:

> There is no problem in Spark SQL 1.5.1, but the error "key not found:
> sportingpulse.com" shows up when I use 1.5.0.
>
> I have to use 1.5.0 because that is the version AWS EMR supports. Can
> anyone tell me why Spark uses "sportingpulse.com" and how to fix it?
>
> Thanks.
[...]

This message and its attachments may contain legally privileged or confidential information. It is intended solely for the named addressee. If you are not the addressee indicated in this message or responsible for delivery of the message to the addressee, you may not copy or deliver this message or its attachments to anyone. Rather, you should permanently delete this message and its attachments and kindly notify the sender by reply e-mail. Any content of this message and its attachments which does not relate to the official business of the sending company must be taken not to have been sent or endorsed by that company or any of its related entities. No warranty is made that the e-mail or attachments are free from computer virus or other defect.
Re: key not found: sportingpulse.com in Spark SQL 1.5.0
I searched for sportingpulse in *.scala and *.java files under the 1.5 branch. There was no hit.

mvn dependency doesn't show sportingpulse either.

Is it possible this is specific to EMR?

Cheers

On Fri, Oct 30, 2015 at 2:57 PM, Zhang, Jingyu wrote:

> There is no problem in Spark SQL 1.5.1, but the error "key not found:
> sportingpulse.com" shows up when I use 1.5.0.
>
> I have to use 1.5.0 because that is the version AWS EMR supports. Can
> anyone tell me why Spark uses "sportingpulse.com" and how to fix it?
>
> Thanks.
>
> [...]