Does it work as expected with a smaller batch interval or a lighter load? Could it be that it's accumulating too many events over 3 minutes?
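To put a rough number on that, here is a minimal sketch in plain Java (no Spark needed). It uses the 30,000 events per second and the 3 minute batch interval from the original message. The class name, the method names and the partition counts are only hypothetical illustrations, not anything taken from the actual job, and it assumes events are spread evenly across partitions.

```java
final class BatchSizing {

    /** Events that pile up in one streaming batch at a steady ingest rate. */
    static long eventsPerBatch(long eventsPerSecond, long batchIntervalSeconds) {
        return eventsPerSecond * batchIntervalSeconds;
    }

    /**
     * Approximate events handled by one task, assuming the batch is spread
     * evenly over the given number of partitions (rounded up).
     */
    static long eventsPerTask(long eventsPerBatch, int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        return (eventsPerBatch + partitions - 1) / partitions;
    }

    public static void main(String[] args) {
        // Figures from the original message: 30,000 events/s, 3 minute batches.
        long perBatch = eventsPerBatch(30_000L, 3 * 60L);
        System.out.println("events per batch: " + perBatch);

        // Hypothetical partition counts, chosen only for illustration.
        int[] hypotheticalPartitionCounts = {10, 40, 200};
        for (int partitions : hypotheticalPartitionCounts) {
            System.out.println(partitions + " partitions -> about "
                    + eventsPerTask(perBatch, partitions) + " events per task");
        }
    }
}
```

At those rates each 3 minute batch holds 5,400,000 events, so how much memory a single task needs depends heavily on how many partitions the batch is split into.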
You could also try increasing the parallelism via repartition to ensure smaller tasks that can safely fit in working memory.

> On 28 Oct 2015, at 17:45, Afshartous, Nick <nafshart...@turbine.com> wrote:
>
> Hi, we are load testing our Spark 1.3 streaming (reading from Kafka) job and
> seeing a problem. This is running in AWS/Yarn and the streaming batch
> interval is set to 3 minutes and this is a ten node cluster.
>
> Testing at 30,000 events per second we are seeing the streaming job get stuck
> (stack trace below) for over an hour.
>
> Thanks on any insights or suggestions.
> --
> Nick
>
> org.apache.spark.streaming.api.java.AbstractJavaDStreamLike.mapPartitionsToPair(JavaDStreamLike.scala:43)
> com.wb.analytics.spark.services.streaming.drivers.StreamingKafkaConsumerDriver.runStream(StreamingKafkaConsumerDriver.java:125)
> com.wb.analytics.spark.services.streaming.drivers.StreamingKafkaConsumerDriver.main(StreamingKafkaConsumerDriver.java:71)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:606)
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org