Hi, we are load testing our Spark 1.3 Streaming job (reading from Kafka) and seeing a problem. The job runs on YARN in AWS on a ten-node cluster, and the streaming batch interval is set to 3 minutes.
Testing at 30,000 events per second, we are seeing the streaming job get stuck (stack trace below) for over an hour. Thanks for any insights or suggestions.

-- Nick

org.apache.spark.streaming.api.java.AbstractJavaDStreamLike.mapPartitionsToPair(JavaDStreamLike.scala:43)
com.wb.analytics.spark.services.streaming.drivers.StreamingKafkaConsumerDriver.runStream(StreamingKafkaConsumerDriver.java:125)
com.wb.analytics.spark.services.streaming.drivers.StreamingKafkaConsumerDriver.main(StreamingKafkaConsumerDriver.java:71)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:606)
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)