There was a closure over the config object lurking around, but in any case upgrading Typesafe Config to 1.2.0 did the trick, so it does seem to have been a bug in Typesafe Config.
Thanks Matei!

On Thu, Apr 10, 2014 at 8:46 AM, Nick Pentreath <nick.pentre...@gmail.com> wrote:

> Ok I thought it may be closing over the config option. I am using config
> for job configuration, but extracting vals from that. So not sure why as I
> thought I'd avoided closing over it. Will go back to source and see where
> it is creeping in.
>
> On Thu, Apr 10, 2014 at 8:42 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>
>> I haven't seen this but it may be a bug in Typesafe Config, since this is
>> serializing a Config object. We don't actually use Typesafe Config
>> ourselves.
>>
>> Do you have any nulls in the data itself by any chance? And do you know
>> how that Config object is getting there?
>>
>> Matei
>>
>> On Apr 9, 2014, at 11:38 PM, Nick Pentreath <nick.pentre...@gmail.com>
>> wrote:
>>
>> Anyone have a chance to look at this?
>>
>> Am I just doing something silly somewhere?
>>
>> If it makes any difference, I am using the elasticsearch-hadoop plugin
>> for ESInputFormat. But as I say, I can parse the data (count, first() etc).
>> I just can't save it as text file.
>>
>> On Tue, Apr 8, 2014 at 4:50 PM, Nick Pentreath
>> <nick.pentre...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> I'm using Spark 0.9.0.
>>>
>>> When calling saveAsTextFile on a custom hadoop inputformat (loaded with
>>> newAPIHadoopRDD), I get the following error below.
>>>
>>> If I call count, I get the correct count of number of records, so the
>>> inputformat is being read correctly... the issue only appears when trying
>>> to use saveAsTextFile.
>>>
>>> If I call first() I get the correct output, also. So it doesn't appear
>>> to be anything with the data or inputformat.
>>>
>>> Any idea what the actual problem is, since this stack trace is not
>>> obvious (though it seems to be in ResultTask which ultimately causes this).
>>>
>>> Is this a known issue at all?
>>>
>>> ======
>>>
>>> 14/04/08 16:00:46 ERROR OneForOneStrategy:
>>> java.lang.NullPointerException
>>> at com.typesafe.config.impl.SerializedConfigValue.writeOrigin(SerializedConfigValue.java:202)
>>> at com.typesafe.config.impl.ConfigImplUtil.writeOrigin(ConfigImplUtil.java:228)
>>> at com.typesafe.config.ConfigException.writeObject(ConfigException.java:58)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:975)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1480)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1528)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1528)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1528)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1528)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1528)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1528)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:346)
>>> at scala.collection.immutable.$colon$colon.writeObject(List.scala:379)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:975)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1480)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1528)
>>> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:346)
>>> at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:28)
>>> at org.apache.spark.scheduler.ResultTask$.serializeInfo(ResultTask.scala:48)
>>> at org.apache.spark.scheduler.ResultTask.writeExternal(ResultTask.scala:123)
>>> at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1443)
>>> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1414)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>>> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:346)
>>> at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:28)
>>> at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:48)
>>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:778)
>>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:724)
>>> at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:554)
>>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
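
For anyone landing on this thread with a similar trace: the diagnosis discussed above is that a closure captured the whole config object, so Java serialization tried to write it along with the task. The sketch below illustrates only that capture mechanism in plain Java, without Spark or Typesafe Config. The names ClosureCaptureDemo, JobConfig, SerializableFunction, capturingConfig and capturingValue are hypothetical and made up for the illustration. JobConfig is deliberately not Serializable so that the capture shows up as a failure. It does not reproduce the NullPointerException from the trace, which came from inside Typesafe Config's own serialization code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class ClosureCaptureDemo {

    // Hypothetical stand-in for a job configuration object.
    // Deliberately NOT Serializable, so capturing it is easy to detect.
    static final class JobConfig {
        private final String separator;

        JobConfig(String separator) {
            this.separator = separator;
        }

        String separator() {
            return separator;
        }
    }

    // A function type that Java serialization can write, loosely analogous
    // to a closure that has to be shipped to another JVM.
    interface SerializableFunction<A, B> extends Function<A, B>, Serializable {}

    // The lambda refers to `config`, so the whole JobConfig instance becomes
    // part of the closure and must be serialized with it.
    static SerializableFunction<String, String> capturingConfig(JobConfig config) {
        return record -> record + config.separator();
    }

    // The needed value is copied into a local first, so the lambda captures
    // only a String and the JobConfig stays behind.
    static SerializableFunction<String, String> capturingValue(JobConfig config) {
        final String separator = config.separator();
        return record -> record + separator;
    }

    // Returns true if plain Java serialization can write the object.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        JobConfig config = new JobConfig("\t");
        System.out.println(canSerialize(capturingConfig(config))); // prints false
        System.out.println(canSerialize(capturingValue(config)));  // prints true
    }
}
```

Running a check like canSerialize over a closure before handing it to the job is a cheap way to find where a config reference is creeping in. In Scala the same capture can happen less visibly, for example when the closure refers to a field of an enclosing class and therefore pulls in the whole enclosing instance.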