Re: Spark Streaming not working
Any solution please?

On Fri, Apr 10, 2020 at 11:04 PM Debabrata Ghosh wrote:
> Hi,
> I have a spark streaming application where Kafka is producing records
> but unfortunately spark streaming isn't able to consume those.
>
> I am hitting the following error:
>
> 20/04/10 17:28:04 ERROR Executor: Exception in task 0.5 in stage 0.0 (TID 24)
> java.lang.AssertionError: assertion failed: Failed to get records for
> spark-executor-service-spark-ingestion dice-ingestion 11 0 after polling for 12
>     at scala.Predef$.assert(Predef.scala:170)
>     at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:74)
>     at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:223)
>     at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:189)
>
> Would you please be able to help with a resolution.
>
> Thanks,
> Debu
Re: Spark Streaming not working
Yes, the Kafka producer is producing records from the same host - I rechecked the Kafka connection and the connection is there. I came across this URL but was unable to understand it:
https://stackoverflow.com/questions/42264669/spark-streaming-assertion-failed-failed-to-get-records-for-spark-executor-a-gro

On Fri, Apr 10, 2020 at 11:14 PM Srinivas V wrote:
> Check that your broker details are correct, and verify that you have network
> connectivity from your client box to the Kafka broker host.
>
> On Fri, Apr 10, 2020 at 11:04 PM Debabrata Ghosh wrote:
>> Hi,
>> I have a spark streaming application where Kafka is producing records
>> but unfortunately spark streaming isn't able to consume those.
>>
>> I am hitting the following error:
>>
>> 20/04/10 17:28:04 ERROR Executor: Exception in task 0.5 in stage 0.0 (TID 24)
>> java.lang.AssertionError: assertion failed: Failed to get records for
>> spark-executor-service-spark-ingestion dice-ingestion 11 0 after polling for 12
>>     at scala.Predef$.assert(Predef.scala:170)
>>     at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:74)
>>     at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:223)
>>     at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:189)
>>
>> Would you please be able to help with a resolution.
>>
>> Thanks,
>> Debu
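The Stack Overflow link above points at the executor-side consumer poll timeout: the assertion is raised when the cached Kafka consumer gets no records back within spark.streaming.kafka.consumer.poll.ms. A minimal sketch of raising that timeout, assuming the kafka010 direct stream API is in use (the app name, batch interval and the 120000 ms value are only placeholders, not taken from the original post):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Give the executor-side cached consumer more time to fetch records before the
// "Failed to get records ... after polling for ..." assertion fires.
val conf = new SparkConf()
  .setAppName("spark-ingestion")                             // placeholder app name
  .set("spark.streaming.kafka.consumer.poll.ms", "120000")   // example value; tune for your cluster

val ssc = new StreamingContext(conf, Seconds(10))            // placeholder batch interval

If the timeout is already generous, the same symptom can also come from the brokers being unreachable from the executors, which is what the reply below is getting at.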
Re: Spark Streaming not working
Check that your broker details are correct, and verify that you have network connectivity from your client box to the Kafka broker host.

On Fri, Apr 10, 2020 at 11:04 PM Debabrata Ghosh wrote:
> Hi,
> I have a spark streaming application where Kafka is producing records
> but unfortunately spark streaming isn't able to consume those.
>
> I am hitting the following error:
>
> 20/04/10 17:28:04 ERROR Executor: Exception in task 0.5 in stage 0.0 (TID 24)
> java.lang.AssertionError: assertion failed: Failed to get records for
> spark-executor-service-spark-ingestion dice-ingestion 11 0 after polling for 12
>     at scala.Predef$.assert(Predef.scala:170)
>     at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:74)
>     at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:223)
>     at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:189)
>
> Would you please be able to help with a resolution.
>
> Thanks,
> Debu
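For context, the broker details referred to here live in the consumer's kafkaParams. A minimal sketch of a kafka010 direct stream follows; the broker addresses, app name and batch interval are placeholders, while the group id and topic simply mirror the names that appear in the error log:

import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val conf = new SparkConf().setAppName("spark-ingestion")   // placeholder app name
val ssc  = new StreamingContext(conf, Seconds(10))         // placeholder batch interval

// Broker addresses are placeholders; group id and topic mirror the error log.
val kafkaParams = Map[String, Object](
  ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG        -> "broker1:9092,broker2:9092",
  ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG   -> classOf[StringDeserializer],
  ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
  ConsumerConfig.GROUP_ID_CONFIG                 -> "service-spark-ingestion",
  ConsumerConfig.AUTO_OFFSET_RESET_CONFIG        -> "latest"
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Seq("dice-ingestion"), kafkaParams)
)

// Simple sanity check: print how many records each batch pulls in.
stream.foreachRDD { rdd => println(s"records in batch: ${rdd.count()}") }

ssc.start()
ssc.awaitTermination()

Note that the executors, not just the driver, need to be able to resolve and reach the addresses listed in bootstrap.servers.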
Spark Streaming not working
Hi,

I have a spark streaming application where Kafka is producing records but unfortunately spark streaming isn't able to consume those.

I am hitting the following error:

20/04/10 17:28:04 ERROR Executor: Exception in task 0.5 in stage 0.0 (TID 24)
java.lang.AssertionError: assertion failed: Failed to get records for spark-executor-service-spark-ingestion dice-ingestion 11 0 after polling for 12
    at scala.Predef$.assert(Predef.scala:170)
    at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:74)
    at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:223)
    at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:189)

Would you please be able to help with a resolution.

Thanks,
Debu
Re: [External Sender] Re: Driver pods stuck in running state indefinitely
No, there was no internal domain issue. As I mentioned, I saw this issue only on a few nodes of the cluster.

On Thu, Apr 9, 2020 at 10:49 PM Wei Zhang wrote:
> Is there any internal domain name resolving issue?
>
> > Caused by: java.net.UnknownHostException:
> > spark-1586333186571-driver-svc.fractal-segmentation.svc
>
> -z
>
> From: Prudhvi Chennuru (CONT)
> Sent: Friday, April 10, 2020 2:44
> To: user
> Subject: Driver pods stuck in running state indefinitely
>
> Hi,
>
> We are running spark batch jobs on K8s.
>   Kubernetes version: 1.11.5
>   Spark version: 2.3.2
>   Docker version: 19.3.8
>
> Issue: a few driver pods are stuck in running state indefinitely with the error:
>
> ```
> Initial job has not accepted any resources; check your cluster UI
> to ensure that workers are registered and have sufficient resources.
> ```
>
> Below is the log of the errored-out executor pods:
>
> ```
> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1858)
>     at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:63)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:293)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
>     at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
>     at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
>     at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
>     at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
>     at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:63)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
>     ... 4 more
> Caused by: java.io.IOException: Failed to connect to spark-1586333186571-driver-svc.fractal-segmentation.svc:7078
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
>     at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
>     at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
>     at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.UnknownHostException: spark-1586333186571-driver-svc.fractal-segmentation.svc
>     at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
>     at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>     at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>     at java.net.InetAddress.getByName(InetAddress.java:1076)
>     at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
>     at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
>     at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
>     at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
>     at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
>     at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
>     at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
>     at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)
>     at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:208)
>     at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:49)
>     at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:188)
>     at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:174)
>     at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
>     at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
>     at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
>     at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
> ```
Spark hangs while reading from jdbc - does nothing
Hi all,

I am on Spark 2.4.4 with Scala 2.11.12, running in cluster mode on Mesos. I am ingesting from an Oracle database using spark.read.jdbc. I am seeing a strange issue where Spark just hangs and does nothing, not starting any new tasks. Normally this job finishes in 30 stages, but sometimes it stops at 29 completed stages and doesn't start the last stage. The Spark job is idle and there are no pending or active tasks. What could be the problem?

Thanks.
--
Cheers,
Ruijing Li
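One thing worth checking, sketched below under assumptions not stated in the message: if the JDBC read is unpartitioned, the whole table is pulled through a single task, which can look very much like a job idling on its final stage. All connection details, table and column names here are placeholders:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("oracle-ingest").getOrCreate()  // placeholder app name

// All URL, table, column and credential values below are placeholders.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//db-host:1521/SERVICE")
  .option("dbtable", "SCHEMA.SOME_TABLE")
  .option("user", "username")
  .option("password", "password")
  // Without the four options below, the whole table is read by a single task,
  // which can look like the job hanging on its last stage.
  .option("partitionColumn", "ID")        // a numeric, roughly evenly distributed column
  .option("lowerBound", "1")
  .option("upperBound", "1000000")
  .option("numPartitions", "8")
  .option("fetchsize", "10000")           // rows fetched per round trip to Oracle
  .load()

println(df.rdd.getNumPartitions)          // should report 8 with the options above

If the read is already partitioned, the executor thread dumps in the Spark UI for the stuck stage are the next place to look.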
Re: How to import PySpark into Jupyter
Hello Yasir,

You need to check your 'PYTHONPATH' environment variable. On Windows, if I do a "pip install", the package is installed in "lib\site-packages" under the python folder. If I "print(sys.path)", I see "lib\site-packages" as one of the entries, and I can expect "import <package_name>" to work.

Find the installation location of 'findspark' and add it to the PYTHONPATH. You can also do that inside your script, for example:

import sys
sys.path.append(r'X:\PathTo\findspark\module')  # raw string so the backslashes aren't treated as escapes

Hope it works.

Regards,
Akchhaya Sharma

On Fri, Apr 10, 2020 at 4:35 PM Yasir Elgohary wrote:
> Peace dear all,
>
> I hope you all are well and healthy...
>
> I am brand new to Spark/Hadoop. My env. is: Windows 7 with Jupyter/Anaconda
> and Spark/Hadoop all installed on my laptop. How can I run the following
> without errors:
>
> import findspark
> findspark.init()
> findspark.find()
> from pyspark.sql import SparkSession
>
> This is the error msg. I get:
>
> ModuleNotFoundError: No module named 'findspark'
>
> It seems I am missing something for Spark to run well with Jupyter/Anaconda
> on Windows 7.
>
> Cheers
Fwd: How to import PySpark into Jupyter
Peace dear all,

I hope you all are well and healthy...

I am brand new to Spark/Hadoop. My env. is: Windows 7 with Jupyter/Anaconda and Spark/Hadoop all installed on my laptop. How can I run the following without errors:

import findspark
findspark.init()
findspark.find()
from pyspark.sql import SparkSession

This is the error msg. I get:

ModuleNotFoundError: No module named 'findspark'

It seems I am missing something for Spark to run well with Jupyter/Anaconda on Windows 7.

Cheers