What is the relationship between reduceByKey and spark.driver.maxResultSize?
I have a job that is running into intermittent errors with [SparkDriver] java.lang.OutOfMemoryError: Java heap space. Before I was getting this error I was getting errors saying the result size exceeded spark.driver.maxResultSize.

This does not make any sense to me, as there are no actions in my job that send data to the driver - just a pull of data from S3, a map and reduceByKey, and then conversion to a dataframe and a saveAsTable action that puts the results back on S3.

I've found a few references to reduceByKey and spark.driver.maxResultSize having some importance, but cannot fathom how this setting could be related.

Would greatly appreciate any advice.

Thanks in advance,

Tom
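For what it's worth, a back-of-the-envelope sketch of one plausible explanation (hedged - I can't be certain this is what is happening in this particular job): even without an explicit collect, every task returns a small serialized result (accumulator updates, task metrics, shuffle map output status) to the driver, and Spark counts these against spark.driver.maxResultSize. A reduceByKey over many partitions means many tasks, so the per-task overhead can add up. All numbers below are illustrative guesses, not measured values:

```python
# Hypothetical estimate of per-task results that flow back to the driver
# even when the job's only action writes to S3. Both parameters are
# illustrative, not measured.

def estimated_driver_result_bytes(num_tasks, bytes_per_task_result):
    """Rough total of per-task results accumulated on the driver."""
    return num_tasks * bytes_per_task_result

# e.g. 200,000 shuffle tasks, each returning ~10 KiB of serialized status:
total = estimated_driver_result_bytes(200_000, 10 * 1024)
print(f"{total / 1024**3:.2f} GiB")  # comfortably above the 1g default limit
```

If something like this is the cause, coalescing to fewer partitions or raising spark.driver.maxResultSize would be the usual knobs to try.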
java.lang.NoSuchMethodError and yarn-client mode
Hi,

I have a problem trying to get a fairly simple app working which makes use of native avro libraries. The app runs fine on my local machine and in yarn-cluster mode, but when I try to run it on EMR in yarn-client mode I get the error below. I'm aware this is a version problem, as EMR runs an earlier version of avro, and I am trying to use avro-1.7.7.

What's confusing me a great deal is the fact that this runs fine in yarn-cluster mode. What is it about yarn-cluster mode that means the application has access to the correct version of the avro library? I need to run in yarn-client mode as I will be caching data to the driver machine in between batches. I think in yarn-cluster mode the driver can run on any machine in the cluster, so this would not work.

Grateful for any advice as I'm really stuck on this. AWS support are trying, but they don't seem to know why this is happening either!

Just to note, I'm aware of the Databricks spark-avro project and have used it. This is an investigation to see if I can use RDDs instead of dataframes.

java.lang.NoSuchMethodError: org.apache.avro.Schema$Parser.parse(Ljava/lang/String;[Ljava/lang/String;)Lorg/apache/avro/Schema;
at ophan.thrift.event.Event.(Event.java:10)
at SimpleApp$.main(SimpleApp.scala:25)
at SimpleApp.main(SimpleApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Thanks,

Tom
Re: java.lang.NoSuchMethodError and yarn-client mode
Thanks for your reply Aniket. Ok I've done this and I'm still confused.

Output from running locally shows:

file:/home/tom/spark-avro/target/scala-2.10/simpleapp.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/conf/
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunjce_provider.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/zipfs.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/localedata.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/dnsns.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunec.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunpkcs11.jar
saving text file...
done!

In yarn-client mode:

file:/home/hadoop/simpleapp.jar
file:/usr/lib/hadoop/hadoop-auth-2.6.0-amzn-0.jar
...
*file:/usr/lib/hadoop-mapreduce/avro-1.7.4.jar*
...

And in yarn-cluster mode:

file:/mnt/yarn/usercache/hadoop/appcache/application_1441787021820_0004/container_1441787021820_0004_01_01/__app__.jar
...
*file:/usr/lib/hadoop/lib/avro-1.7.4.jar*
...
saving text file...
done!

In yarn-cluster mode it doesn't appear to have sight of the fat jar (simpleapp), but it can see avro-1.7.4, and yet it runs fine!

Thanks,
Tom

On Wed, Sep 9, 2015 at 9:49 AM Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:

> Hi Tom
>
> There has to be a difference in classpaths in yarn-client and yarn-cluster
> mode. Perhaps a good starting point would be to print classpath as a first
> thing in SimpleApp.main. It should give clues around why it works in
> yarn-cluster mode.
>
> Thanks,
> Aniket
>
> On Wed, Sep 9, 2015, 2:11 PM Tom Seddon <mr.tom.sed...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a problem trying to get a fairly simple app working which makes
>> use of native avro libraries.
Re: SparkSQL DF.explode with Nulls
I figured it out. Needed a block-style map and a check for null. The case class is just to name the transformed columns.

case class Component(name: String, loadTimeMs: Long)

avroFile.filter($"lazyComponents.components".isNotNull)
  .explode($"lazyComponents.components") { case Row(lazyComponents: Seq[Row]) =>
    lazyComponents.map { x =>
      val name = x.getString(0)
      val loadTimeMs = if (x.isNullAt(1)) 0L else x.getLong(1)
      Component(name, loadTimeMs)
    }
  }
  .select('pageViewId, 'name, 'loadTimeMs).take(20).foreach(println)

On Thu, Jun 4, 2015 at 12:05 PM Tom Seddon <mr.tom.sed...@gmail.com> wrote:

> Hi,
>
> I've worked out how to use explode on my input avro dataset. Explode works
> fine for the first 10 records, but my big problem is that loadTimeMs has
> nulls in it, which I think is causing the error. Any ideas how I can trap
> those nulls?
> ...
SparkSQL DF.explode with Nulls
Hi,

I've worked out how to use explode on my input avro dataset with the following structure:

root
 |-- pageViewId: string (nullable = false)
 |-- components: array (nullable = true)
 |    |-- element: struct (containsNull = false)
 |    |    |-- name: string (nullable = false)
 |    |    |-- loadTimeMs: long (nullable = true)

I'm trying to turn this into this layout, with repeated pageViewIds for each row of my components:

root
 |-- pageViewId: string (nullable = false)
 |-- name: string (nullable = false)
 |-- loadTimeMs: long (nullable = true)

Explode works fine for the first 10 records using this bit of code, but my big problem is that loadTimeMs has nulls in it, which I think is causing the error. Any ideas how I can trap those nulls? Perhaps by converting to zeros, and then I can deal with them later? I tried writing a udf which just takes the loadTimeMs column and swaps nulls for zeros, but this separates the struct and then I don't know how to use explode.

avroFile.filter($"lazyComponents.components".isNotNull)
  .explode($"lazyComponents.components") { case Row(lazyComponents: Seq[Row]) =>
    lazyComponents.map(x => x.getString(0) -> x.getLong(1))
  }
  .select('pageViewId, '_1, '_2)
  .take(10).foreach(println)

15/06/04 12:01:21 ERROR Executor: Exception in task 0.0 in stage 19.0 (TID 65)
java.lang.RuntimeException: Failed to check null bit for primitive long value.
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.expressions.GenericRow.getLong(rows.scala:87)
at $line127.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1$$anonfun$apply$1.apply(console:33)
at $line127.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1$$anonfun$apply$1.apply(console:33)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at $line127.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(console:33)
at $line127.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(console:33)
at scala.Function1$$anonfun$andThen$1.apply(Function1.scala:55)
at org.apache.spark.sql.catalyst.expressions.UserDefinedGenerator.eval(generators.scala:89)
at org.apache.spark.sql.execution.Generate$$anonfun$2$$anonfun$apply$1.apply(Generate.scala:71)
at org.apache.spark.sql.execution.Generate$$anonfun$2$$anonfun$apply$1.apply(Generate.scala:70)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:122)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:122)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
PySpark saveAsTextFile gzip
Hi, I've searched but can't seem to find a PySpark example. How do I write compressed text file output to S3 using PySpark saveAsTextFile? Thanks, Tom
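Not an authoritative answer, but a sketch of the approach I believe works: PySpark's saveAsTextFile takes an optional compressionCodecClass argument naming a Hadoop codec class. The bucket path and app name below are made up, and the guarded block needs a real Spark installation to run:

```python
def save_gzipped(rdd, path):
    # saveAsTextFile accepts a Hadoop compression codec class name;
    # GzipCodec produces .gz part files under the output path.
    rdd.saveAsTextFile(
        path,
        compressionCodecClass="org.apache.hadoop.io.compress.GzipCodec",
    )

if __name__ == "__main__":
    from pyspark import SparkContext  # requires a Spark installation

    sc = SparkContext(appName="gzip-demo")  # hypothetical app name
    save_gzipped(sc.parallelize(["line one", "line two"]),
                 "s3n://my-bucket/output/")  # hypothetical bucket
```

Reading the compressed output back with sc.textFile should decompress transparently, since Hadoop picks the codec from the .gz extension.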
Efficient way to split an input data set into different output files
I'm trying to set up a PySpark ETL job that takes in JSON log files and spits out fact table files for upload to Redshift. Is there an efficient way to send different event types to different outputs without having to just read the same cached RDD twice?

I have my first RDD, which is just a json-parsed version of the input data, and I need to create a flattened page views dataset off this based on eventType = 'INITIAL', and then a page events dataset from the same RDD based on eventType = 'ADDITIONAL'. Ideally I'd like the output files for both these tables to be written at the same time, so I'm picturing a function with one input RDD in and two RDDs out, or a function utilising two CSV writers.

I'm using mapPartitions at the moment to write to files like this:

def write_records(records):
    output = StringIO.StringIO()
    writer = vlad.CsvUnicodeWriter(output, dialect='excel')
    for record in records:
        writer.writerow(record)
    return [output.getvalue()]

and I use this in the call to write the file as follows (pageviews and events get created off the same json-parsed RDD by filtering on INITIAL or ADDITIONAL respectively):

pageviews.mapPartitions(write_records).saveAsTextFile('s3n://output/pageviews/')
events.mapPartitions(write_records).saveAsTextFile('s3n://output/events/')

Is there a way to change this so that both are written in the same process?
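For what it's worth, the usual pattern here is to cache the parsed RDD and filter it twice - as far as I know the RDD API has no built-in one-pass multi-output save (short of a custom Hadoop OutputFormat), and a second filter over a cached RDD is cheap. A minimal sketch, using a stdlib csv writer in place of vlad.CsvUnicodeWriter; paths, field names, and the flattening maps are illustrative assumptions:

```python
import csv
import io
import json

def write_records(records):
    # mapPartitions helper: render one partition's rows as a single CSV chunk.
    output = io.StringIO()
    writer = csv.writer(output)  # stand-in for vlad.CsvUnicodeWriter
    for record in records:
        writer.writerow(record)
    return [output.getvalue()]

if __name__ == "__main__":
    from pyspark import SparkContext  # requires a Spark installation

    sc = SparkContext(appName="split-demo")  # hypothetical app name
    # Parse once, cache, then filter per event type (paths are made up).
    parsed = sc.textFile("s3n://input/logs/").map(json.loads).cache()

    pageviews = parsed.filter(lambda e: e["eventType"] == "INITIAL") \
                      .map(lambda e: [e["pageViewId"]])  # flatten as needed
    events = parsed.filter(lambda e: e["eventType"] == "ADDITIONAL") \
                   .map(lambda e: [e["pageViewId"]])

    pageviews.mapPartitions(write_records).saveAsTextFile("s3n://output/pageviews/")
    events.mapPartitions(write_records).saveAsTextFile("s3n://output/events/")
```

The two saveAsTextFile actions still run as two jobs, but the cache means the JSON parsing happens only once.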
Re: Broadcast failure with variable size of ~ 500mb with key already cancelled ?
Hi,

Just wondering if anyone has any advice about this issue, as I am experiencing the same thing. I'm working with multiple broadcast variables in PySpark, most of which are small, but one of around 4.5GB, using 10 workers at 31GB memory each and a driver with the same spec. It's not running out of memory as far as I can see, but it definitely only happens when I add the large broadcast. Would be most grateful for advice.

I tried playing around with the last 3 conf settings below, but no luck:

SparkConf().set("spark.master.memory", "26")
  .set("spark.executor.memory", "26")
  .set("spark.worker.memory", "26")
  .set("spark.driver.memory", "26")
  .set("spark.storage.memoryFraction", "1")
  .set("spark.core.connection.ack.wait.timeout", "6000")
  .set("spark.akka.frameSize", "50")

Thanks,
Tom

On 24 October 2014 12:31, htailor <hemant.tai...@live.co.uk> wrote:

Hi All,

I am relatively new to spark and currently having troubles with broadcasting large variables ~500mb in size. The broadcast fails with an error shown below and the memory usage on the hosts also blows up.

Our hardware consists of 8 hosts (1 x 64gb (driver) and 7 x 32gb (workers)) and we are using Spark 1.1 (Python) via Cloudera CDH 5.2. We have managed to replicate the error using a test script shown below. I would be interested to know if anyone has seen this before with broadcasting or knows of a fix.

=== ERROR ==

14/10/24 08:20:04 INFO BlockManager: Found block rdd_11_31 locally
14/10/24 08:20:08 INFO ConnectionManager: Key not valid ? sun.nio.ch.SelectionKeyImpl@fbc6caf
14/10/24 08:20:08 INFO ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(pigeon3.ldn.ebs.io,55316)
14/10/24 08:20:08 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(pigeon3.ldn.ebs.io,55316)
14/10/24 08:20:08 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(pigeon3.ldn.ebs.io,55316)
14/10/24 08:20:08 INFO ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl@fbc6caf
java.nio.channels.CancelledKeyException
at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
14/10/24 08:20:13 INFO ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(pigeon7.ldn.ebs.io,52524)
14/10/24 08:20:13 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(pigeon7.ldn.ebs.io,52524)
14/10/24 08:20:13 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(pigeon7.ldn.ebs.io,52524)
14/10/24 08:20:15 INFO ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(toppigeon.ldn.ebs.io,34370)
14/10/24 08:20:15 INFO ConnectionManager: Key not valid ? sun.nio.ch.SelectionKeyImpl@3ecfdb7e
14/10/24 08:20:15 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(toppigeon.ldn.ebs.io,34370)
14/10/24 08:20:15 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(toppigeon.ldn.ebs.io,34370)
14/10/24 08:20:15 INFO ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl@3ecfdb7e
java.nio.channels.CancelledKeyException
at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
14/10/24 08:20:19 INFO ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(pigeon8.ldn.ebs.io,48628)
14/10/24 08:20:19 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(pigeon8.ldn.ebs.io,48628)
14/10/24 08:20:19 INFO ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(pigeon6.ldn.ebs.io,44996)
14/10/24 08:20:19 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(pigeon6.ldn.ebs.io,44996)
14/10/24 08:20:27 ERROR SendingConnection: Exception while reading SendingConnection to ConnectionManagerId(pigeon8.ldn.ebs.io,48628)
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:252)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:295)
at org.apache.spark.network.SendingConnection.read(Connection.scala:390)
at org.apache.spark.network.ConnectionManager$$anon$7.run(ConnectionManager.scala:199)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
14/10/24 08:20:27 ERROR SendingConnection: Exception while reading SendingConnection to ConnectionManagerId(pigeon6.ldn.ebs.io,44996)
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:252)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:295)
Re: ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId
Yes please, can you share? I am getting this error after expanding my application to include a large broadcast variable. Would be good to know if it can be fixed with configuration.

On 23 October 2014 18:04, Michael Campbell <michael.campb...@gmail.com> wrote:

> Can you list what your fix was so others can benefit?
>
> On Wed, Oct 22, 2014 at 8:15 PM, arthur.hk.c...@gmail.com <arthur.hk.c...@gmail.com> wrote:
>
>> Hi,
>>
>> I have managed to resolve it because of a wrong setting. Please ignore this.
>>
>> Regards
>> Arthur
>>
>> On 23 Oct, 2014, at 5:14 am, arthur.hk.c...@gmail.com <arthur.hk.c...@gmail.com> wrote:
>>
>>> 14/10/23 05:09:04 WARN ConnectionManager: All connections not cleaned up