The classname given in the stack trace is com.rick.reports.Reports. In the output from the jar command, the class is com.defend7.reports.Reports.
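A quick way to confirm which of the two generated classes is actually loadable from a given classpath (a minimal diagnostic sketch, not code from this thread; the two names below are copied from your jar listing and stack trace):

    // Minimal check: ask the JVM to load each candidate class by its
    // runtime name. The '$' is how the JVM names the nested class.
    object ClassCheck {
      def main(args: Array[String]): Unit =
        Seq("com.defend7.reports.Reports$SensorReports",
            "com.rick.reports.Reports$SensorReports").foreach { name =>
          try { Class.forName(name); println("loadable:     " + name) }
          catch { case _: ClassNotFoundException => println("NOT loadable: " + name) }
        }
    }

If only the com.defend7 copy loads, the com.rick import in the Scala job is being compiled against an older set of generated sources; the package of a generated protobuf class comes from the java_package option in the .proto (or the proto package if java_package is unset), so regenerating from the current .proto and rebuilding the assembly should make the two names agree.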
FYI

On Mon, Feb 23, 2015 at 9:33 PM, necro351 . <necro...@gmail.com> wrote:

> Hi Ted,
>
> Yes, it appears to be:
>
> rick@ubuntu:~/go/src/rick/sparksprint/containers/tests/StreamingReports$ jar tvf ../../../analyzer/spark/target/scala-2.10/rick-processors-assembly-1.0.jar | grep SensorReports
>  1128 Mon Feb 23 17:34:46 PST 2015 com/defend7/reports/Reports$SensorReports$1.class
> 13507 Mon Feb 23 17:34:46 PST 2015 com/defend7/reports/Reports$SensorReports$Builder.class
> 10640 Mon Feb 23 17:34:46 PST 2015 com/defend7/reports/Reports$SensorReports.class
>   815 Mon Feb 23 17:34:46 PST 2015 com/defend7/reports/Reports$SensorReportsOrBuilder.class
>
> On Mon, Feb 23, 2015 at 8:57:18 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> bq. Caused by: java.lang.ClassNotFoundException: com.rick.reports.Reports$SensorReports
>>
>> Is the Reports$SensorReports class in rick-processors-assembly-1.0.jar?
>>
>> Thanks
>>
>> On Mon, Feb 23, 2015 at 8:43 PM, necro351 . <necro...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I am trying to deserialize some data encoded using protobuf from within Spark and am getting class-not-found exceptions. I have narrowed the program down to something very simple that shows the problem exactly (see 'The Program' below), and hopefully someone can tell me the easy fix :)
>>>
>>> The situation is this: I have some protobuf reports in /tmp/reports. I also have a Spark project containing the Scala code below (under 'The Program') as well as a Java file defining SensorReports, all in the same src sub-tree. It is built using sbt in the standard way. The Spark job reads the reports from /tmp/reports and prints them to the console. When I build and run my Spark job with spark-submit, everything works as expected and the reports are printed out. When I uncomment the 'XXX' variant in the Scala program and try to print the reports from within a SparkContext, I get the class-not-found exceptions, and I don't understand why. Once I get this working I will want to do more than just print the reports from within the SparkContext.
>>>
>>> My reading of the documentation tells me that my Spark job should have access to everything in the submitted jar, and that jar includes the Java code generated by the protobuf library, which defines SensorReports. This is the spark-submit invocation I use after building my job as an assembly with the sbt-assembly plugin:
>>>
>>> spark-submit --class com.rick.processors.NewReportProcessor --master local[*] ../../../analyzer/spark/target/scala-2.10/rick-processors-assembly-1.0.jar
>>>
>>> I have also tried adding the jar programmatically using sc.addJar, but that does not help. I found a pull request from July (https://github.com/apache/spark/pull/181) that seems related, but it went into Spark 1.2.0 (which is what I am currently using), so I don't think that's it.
>>>
>>> Any ideas? Thanks!
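>>>
>>> For reference, a sketch of an alternative shape (not the code from the
>>> failing run below): build the RDD from the raw byte arrays and call
>>> parseFrom inside the closure. Only Array[Byte] is then Java-serialized
>>> into the partitions, so protobuf's readResolve never runs during task
>>> deserialization; this still assumes the generated SensorReports class
>>> really is in the submitted assembly.
>>>
>>>     val bytesRdd = sc.makeRDD(protoBuffsBinary.toSeq)
>>>     bytesRdd.map(bundle => SensorReports.parseFrom(bundle))
>>>             .foreach(x => println(x.toString))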
>>>
>>> The Program:
>>> ==========
>>>
>>> package com.rick.processors
>>>
>>> import java.io.File
>>> import java.nio.file.{Path, Files, FileSystems}
>>> import org.apache.spark.{SparkContext, SparkConf}
>>> import com.rick.reports.Reports.SensorReports
>>>
>>> object NewReportProcessor {
>>>   private val sparkConf = new SparkConf().setAppName("ReportProcessor")
>>>   private val sc = new SparkContext(sparkConf)
>>>
>>>   def main(args: Array[String]) = {
>>>     val protoBuffsBinary = localFileReports()
>>>     val sensorReportsBundles = protoBuffsBinary.map(bundle =>
>>>       SensorReports.parseFrom(bundle))
>>>     // XXX: Printing from within the SparkContext throws class-not-found
>>>     // exceptions, why?
>>>     // sc.makeRDD(sensorReportsBundles).foreach((x: SensorReports) =>
>>>     //   println(x.toString))
>>>     sensorReportsBundles.foreach((x: SensorReports) => println(x.toString))
>>>   }
>>>
>>>   private def localFileReports() = {
>>>     val reportDir = new File("/tmp/reports")
>>>     val reportFiles = reportDir.listFiles.filter(_.getName.endsWith(".report"))
>>>     reportFiles.map(file => {
>>>       val path = FileSystems.getDefault().getPath("/tmp/reports", file.getName())
>>>       Files.readAllBytes(path)
>>>     })
>>>   }
>>> }
>>>
>>> The Class-not-found exceptions:
>>> =========================
>>>
>>> Spark assembly has been built with Hive, including Datanucleus jars on classpath
>>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>>> 15/02/23 17:35:03 WARN Utils: Your hostname, ubuntu resolves to a loopback address: 127.0.1.1; using 192.168.241.128 instead (on interface eth0)
>>> 15/02/23 17:35:03 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
>>> 15/02/23 17:35:04 INFO SecurityManager: Changing view acls to: rick
>>> 15/02/23 17:35:04 INFO SecurityManager: Changing modify acls to: rick
>>> 15/02/23 17:35:04 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rick); users with modify permissions: Set(rick)
>>> 15/02/23 17:35:04 INFO Slf4jLogger: Slf4jLogger started
>>> 15/02/23 17:35:04 INFO Remoting: Starting remoting
>>> 15/02/23 17:35:04 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.241.128:38110]
>>> 15/02/23 17:35:04 INFO Utils: Successfully started service 'sparkDriver' on port 38110.
>>> 15/02/23 17:35:04 INFO SparkEnv: Registering MapOutputTracker
>>> 15/02/23 17:35:04 INFO SparkEnv: Registering BlockManagerMaster
>>> 15/02/23 17:35:04 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20150223173504-b26c
>>> 15/02/23 17:35:04 INFO MemoryStore: MemoryStore started with capacity 267.3 MB
>>> 15/02/23 17:35:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> 15/02/23 17:35:05 INFO HttpFileServer: HTTP File server directory is /tmp/spark-c77dbc9a-d626-4991-a9b7-f593acafbe64
>>> 15/02/23 17:35:05 INFO HttpServer: Starting HTTP Server
>>> 15/02/23 17:35:05 INFO Utils: Successfully started service 'HTTP file server' on port 50950.
>>> 15/02/23 17:35:05 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
>>> 15/02/23 17:35:05 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
>>> 15/02/23 17:35:05 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
>>> 15/02/23 17:35:06 WARN Utils: Service 'SparkUI' could not bind on port 4043. Attempting port 4044.
>>> 15/02/23 17:35:06 WARN Utils: Service 'SparkUI' could not bind on port 4044. Attempting port 4045.
>>> 15/02/23 17:35:06 WARN Utils: Service 'SparkUI' could not bind on port 4045. Attempting port 4046.
>>> 15/02/23 17:35:06 INFO Utils: Successfully started service 'SparkUI' on port 4046.
>>> 15/02/23 17:35:06 INFO SparkUI: Started SparkUI at http://192.168.241.128:4046
>>> 15/02/23 17:35:06 INFO SparkContext: Added JAR file:/home/rick/go/src/rick/sparksprint/containers/tests/StreamingReports/../../../analyzer/spark/target/scala-2.10/rick-processors-assembly-1.0.jar at http://192.168.241.128:50950/jars/rick-processors-assembly-1.0.jar with timestamp 1424741706610
>>> 15/02/23 17:35:06 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@192.168.241.128:38110/user/HeartbeatReceiver
>>> 15/02/23 17:35:07 INFO NettyBlockTransferService: Server created on 57801
>>> 15/02/23 17:35:07 INFO BlockManagerMaster: Trying to register BlockManager
>>> 15/02/23 17:35:07 INFO BlockManagerMasterActor: Registering block manager localhost:57801 with 267.3 MB RAM, BlockManagerId(<driver>, localhost, 57801)
>>> 15/02/23 17:35:07 INFO BlockManagerMaster: Registered BlockManager
>>> 15/02/23 17:35:07 INFO SparkContext: Starting job: foreach at NewReportProcessor.scala:17
>>> 15/02/23 17:35:07 INFO DAGScheduler: Got job 0 (foreach at NewReportProcessor.scala:17) with 1 output partitions (allowLocal=false)
>>> 15/02/23 17:35:07 INFO DAGScheduler: Final stage: Stage 0(foreach at NewReportProcessor.scala:17)
>>> 15/02/23 17:35:07 INFO DAGScheduler: Parents of final stage: List()
>>> 15/02/23 17:35:07 INFO DAGScheduler: Missing parents: List()
>>> 15/02/23 17:35:07 INFO DAGScheduler: Submitting Stage 0 (ParallelCollectionRDD[0] at makeRDD at NewReportProcessor.scala:17), which has no missing parents
>>> 15/02/23 17:35:07 INFO MemoryStore: ensureFreeSpace(1360) called with curMem=0, maxMem=280248975
>>> 15/02/23 17:35:07 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1360.0 B, free 267.3 MB)
>>> 15/02/23 17:35:07 INFO MemoryStore: ensureFreeSpace(1071) called with curMem=1360, maxMem=280248975
>>> 15/02/23 17:35:07 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1071.0 B, free 267.3 MB)
>>> 15/02/23 17:35:07 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:57801 (size: 1071.0 B, free: 267.3 MB)
>>> 15/02/23 17:35:07 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
>>> 15/02/23 17:35:07 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:838
>>> 15/02/23 17:35:07 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (ParallelCollectionRDD[0] at makeRDD at NewReportProcessor.scala:17)
>>> 15/02/23 17:35:07 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
>>> 15/02/23 17:35:07 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 5587 bytes)
>>> 15/02/23 17:35:07 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
>>> 15/02/23 17:35:07 INFO Executor: Fetching http://192.168.241.128:50950/jars/rick-processors-assembly-1.0.jar with timestamp 1424741706610
>>> 15/02/23 17:35:08 INFO Utils: Fetching http://192.168.241.128:50950/jars/rick-processors-assembly-1.0.jar to /tmp/fetchFileTemp2793880583189398319.tmp
>>> 15/02/23 17:35:08 INFO Executor: Adding file:/tmp/spark-bdec3945-52d1-42bf-8b7a-30f14f492a42/rick-processors-assembly-1.0.jar to class loader
>>> 15/02/23 17:35:08 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
>>> java.io.IOException: java.lang.RuntimeException: Unable to find proto buffer class
>>>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:988)
>>>         at org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>         at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>>>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>>>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>         at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
>>>         at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
>>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.RuntimeException: Unable to find proto buffer class
>>>         at com.google.protobuf.GeneratedMessageLite$SerializedForm.readResolve(GeneratedMessageLite.java:775)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>         at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1104)
>>>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1807)
>>>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>         at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
>>>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
>>>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>         at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
>>>         at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:74)
>>>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:985)
>>>         ... 20 more
>>> Caused by: java.lang.ClassNotFoundException: com.rick.reports.Reports$SensorReports
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>         at java.lang.Class.forName0(Native Method)
>>>         at java.lang.Class.forName(Class.java:191)
>>>         at com.google.protobuf.GeneratedMessageLite$SerializedForm.readResolve(GeneratedMessageLite.java:768)
>>>         ... 37 more
>>> 15/02/23 17:35:08 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.io.IOException: java.lang.RuntimeException: Unable to find proto buffer class
>>>         [stack trace identical to the executor exception above]
>>>
>>> 15/02/23 17:35:08 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
>>> 15/02/23 17:35:08 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
>>> 15/02/23 17:35:08 INFO TaskSchedulerImpl: Cancelling stage 0
>>> 15/02/23 17:35:08 INFO DAGScheduler: Job 0 failed: foreach at NewReportProcessor.scala:17, took 0.644071 s
>>> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.io.IOException: java.lang.RuntimeException: Unable to find proto buffer class
>>>         [stack trace identical to the executor exception above]
>>>
>>> Driver stacktrace:
>>>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
>>>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
>>>         at scala.Option.foreach(Option.scala:236)
>>>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
>>>         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
>>>         at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>>         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
>>>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>         at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>>>         at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>>>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
>>>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>
>>> The Proper Report Output:
>>> ====================
>>>
>>> reports {
>>>   network_report {
>>>     begin_time: 1424380054789676056
>>>     end_time: 1424380054789740740
>>>     source: "<some-IP-address>:80"
>>>     destination: "<some-IP-address>:46792"
>>>     protocol: TCP
>>>     stream_id: 3
>>>     stream_status: 8
>>>   }
>>>   http_report {
>>>     request {
>>>       method: "GET"
>>>       etc...