Seems like a bug, could you file a JIRA? @Tim: Patrick said you take a look at Mesos related issues. Could you take a look at this. Thanks!
TD On Fri, Mar 27, 2015 at 1:25 PM, Ondrej Smola <ondrej.sm...@gmail.com> wrote: > Yes, only when using fine grained mode and replication > (StorageLevel.MEMORY_ONLY_2 > etc). > > 2015-03-27 19:06 GMT+01:00 Tathagata Das <t...@databricks.com>: > >> Does it fail with just Spark jobs (using storage levels) on non-coarse >> mode? >> >> TD >> >> On Fri, Mar 27, 2015 at 4:39 AM, Ondrej Smola <ondrej.sm...@gmail.com> >> wrote: >> >>> More info >>> >>> when using *spark.mesos.coarse* everything works as expected. I think >>> this must be a bug in spark-mesos integration. >>> >>> >>> 2015-03-27 9:23 GMT+01:00 Ondrej Smola <ondrej.sm...@gmail.com>: >>> >>>> It happens only when StorageLevel is used with 1 replica ( >>>> StorageLevel.MEMORY_ONLY_2,StorageLevel.MEMORY_AND_DISK_2) , >>>> StorageLevel.MEMORY_ONLY ,StorageLevel.MEMORY_AND_DISK works - the >>>> problems must be clearly somewhere between mesos-spark . From console I see >>>> that spark is trying to replicate to nodes -> nodes show up in Mesos active >>>> tasks ... but they always fail with ClassNotFoundE. >>>> >>>> 2015-03-27 0:52 GMT+01:00 Tathagata Das <t...@databricks.com>: >>>> >>>>> Could you try running a simpler spark streaming program with receiver >>>>> (may be socketStream) and see if that works. >>>>> >>>>> TD >>>>> >>>>> On Thu, Mar 26, 2015 at 2:08 PM, Ondrej Smola <ondrej.sm...@gmail.com> >>>>> wrote: >>>>> >>>>>> Hi thanks for reply, >>>>>> >>>>>> yes I have custom receiver -> but it has simple logic .. pop ids from >>>>>> redis queue -> load docs based on ids from elastic and store them in >>>>>> spark. >>>>>> No classloader modifications. I am running multiple Spark batch jobs >>>>>> (with >>>>>> user supplied partitioning) and they have no problems, debug in local >>>>>> mode >>>>>> show no errors. >>>>>> >>>>>> 2015-03-26 21:47 GMT+01:00 Tathagata Das <t...@databricks.com>: >>>>>> >>>>>>> Here are few steps to debug. >>>>>>> >>>>>>> 1. Try using replication from a Spark job: sc.parallelize(1 to 100, >>>>>>> 100).persist(StorageLevel.MEMORY_ONLY_2).count() >>>>>>> 2. If one works, then we know that there is probably nothing wrong >>>>>>> with the Spark installation, and probably in the threads related to the >>>>>>> receivers receiving the data. Are you writing a custom receiver? Are you >>>>>>> somehow playing around with the class loader in the custom receiver? >>>>>>> >>>>>>> TD >>>>>>> >>>>>>> >>>>>>> On Thu, Mar 26, 2015 at 10:59 AM, Ondrej Smola < >>>>>>> ondrej.sm...@gmail.com> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I am running spark streaming v 1.3.0 (running inside Docker) on >>>>>>>> Mesos 0.21.1. Spark streaming is started using Marathon -> docker >>>>>>>> container >>>>>>>> gets deployed and starts streaming (from custom Actor). Spark binary is >>>>>>>> located on shared GlusterFS volume. Data is streamed from >>>>>>>> Elasticsearch/Redis. When new batch arrives Spark tries to replicate >>>>>>>> it but >>>>>>>> fails with following error : >>>>>>>> >>>>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0 of size 2840 >>>>>>>> dropped from memory (free 278017782) >>>>>>>> 15/03/26 14:50:00 INFO BlockManager: Removing block >>>>>>>> broadcast_0_piece0 >>>>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0_piece0 of >>>>>>>> size 1658 dropped from memory (free 278019440) >>>>>>>> 15/03/26 14:50:00 INFO BlockManagerMaster: Updated info of block >>>>>>>> broadcast_0_piece0 >>>>>>>> 15/03/26 14:50:00 ERROR TransportRequestHandler: Error while >>>>>>>> invoking RpcHandler#receive() on RPC id 7178767328921933569 >>>>>>>> java.lang.ClassNotFoundException: >>>>>>>> org/apache/spark/storage/StorageLevel >>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>> at java.lang.Class.forName(Class.java:344) >>>>>>>> at >>>>>>>> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:65) >>>>>>>> at >>>>>>>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) >>>>>>>> at >>>>>>>> java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) >>>>>>>> at >>>>>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) >>>>>>>> at >>>>>>>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) >>>>>>>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) >>>>>>>> at >>>>>>>> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68) >>>>>>>> at >>>>>>>> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:88) >>>>>>>> at >>>>>>>> org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:65) >>>>>>>> at >>>>>>>> org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:124) >>>>>>>> at >>>>>>>> org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:97) >>>>>>>> at >>>>>>>> org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:91) >>>>>>>> at >>>>>>>> org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:44) >>>>>>>> at >>>>>>>> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) >>>>>>>> at >>>>>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) >>>>>>>> at >>>>>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) >>>>>>>> at >>>>>>>> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) >>>>>>>> at >>>>>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) >>>>>>>> at >>>>>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) >>>>>>>> at >>>>>>>> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163) >>>>>>>> at >>>>>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) >>>>>>>> at >>>>>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) >>>>>>>> at >>>>>>>> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) >>>>>>>> at >>>>>>>> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130) >>>>>>>> at >>>>>>>> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) >>>>>>>> at >>>>>>>> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) >>>>>>>> at >>>>>>>> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) >>>>>>>> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) >>>>>>>> at >>>>>>>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) >>>>>>>> at java.lang.Thread.run(Thread.java:745) >>>>>>>> 15/03/26 14:50:01 ERROR TransportRequestHandler: Error while >>>>>>>> invoking RpcHandler#receive() on RPC id 9001562482648380222 >>>>>>>> >>>>>>>> From mesos UI i see unpacked spark binary and my assembly jar in >>>>>>>> place (on running driver and on replication targets). I have other >>>>>>>> spark >>>>>>>> BATCH jobs running from same base docker image OK. When there is no >>>>>>>> incoming data exception is not thrown. Spark config : >>>>>>>> >>>>>>>> spark.master >>>>>>>> >>>>>>>> mesos://zk://incomparable-brush.maas:2181,cumbersome-match.maas:2181,voluminous-toys.maas:2181/mesos >>>>>>>> spark.serializer org.apache.spark.serializer.KryoSerializer >>>>>>>> spark.executor.uri >>>>>>>> file:///master/spark/spark-1.3.0-bin-hadoop2.4.tgz >>>>>>>> spark.local.dir /opt/spark_tmp >>>>>>>> >>>>>>>> spark.driver.port 41000 >>>>>>>> spark.executor.port 41016 >>>>>>>> spark.fileserver.port 41032 >>>>>>>> spark.broadcast.port 41048 >>>>>>>> spark.replClassServer.port 41064 >>>>>>>> spark.blockManager.port 41080 >>>>>>>> spark.ui.port 41096 >>>>>>>> spark.history.ui.port 41112 >>>>>>>> >>>>>>>> Thanks for any help >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >