Re: LiveListenerBus is occupying most of the Driver Memory and frequent GC is degrading the performance

2020-08-11 Thread Waleed Fateem
Hi Teja, The only thought I have is to consider decreasing the spark.scheduler.listenerbus.eventqueue.capacity parameter. That should decrease the driver memory pressure, but of course you'll probably end up dropping events more frequently, meaning you can't really trust anything you
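
For illustration only, here is a minimal sketch of how that parameter could be passed on the command line. The helper names (listener_bus_conf, build_submit_command) and the application name my_app.py are hypothetical and not from the thread; the sketch assumes the property is supplied through spark-submit's --conf flag before the application starts, and that the default capacity is 10000 as documented for Spark 2.3 and later.

```python
import shlex

# Documented default for spark.scheduler.listenerbus.eventqueue.capacity
# in Spark 2.3+ (assumption: your Spark version uses the same default).
DEFAULT_CAPACITY = 10000


def listener_bus_conf(capacity):
    """Return spark-submit arguments that set the listener bus queue capacity.

    Hypothetical helper for illustration; Spark requires the value to be > 0.
    """
    if capacity <= 0:
        raise ValueError("capacity must be a positive integer")
    return [
        "--conf",
        f"spark.scheduler.listenerbus.eventqueue.capacity={capacity}",
    ]


def build_submit_command(app, capacity):
    """Assemble a spark-submit argument list for the given application."""
    return ["spark-submit", *listener_bus_conf(capacity), app]


if __name__ == "__main__":
    # Halve the default capacity: less driver memory held by queued events,
    # at the cost of events being dropped sooner when listeners fall behind.
    print(shlex.join(build_submit_command("my_app.py", DEFAULT_CAPACITY // 2)))
```

A smaller queue trades memory for completeness: whatever depends on listener events (the UI, event logs) may become less reliable once events are dropped.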

Re: Spark hangs while reading from jdbc - does nothing Removing Guess work from trouble shooting

2020-04-24 Thread Waleed Fateem
Are you running this in local mode? If not, are you even sure that the hanging is occurring on the driver's side? Did you check the Spark UI to see if there is a straggler task or not? If you do have a straggler/hanging task, and in case this is not an application running in local mode then you

Re: Spark stuck at removing broadcast variable

2020-04-18 Thread Waleed Fateem
This might be obvious, but just checking anyway: did you confirm whether all of the messages have already been consumed by Spark? If that's the case then I wouldn't expect much to happen unless new data comes into your Kafka topic. If you're a hundred percent sure that there's still plenty
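
One rough way to frame the "has everything been consumed?" check is as per-partition lag: the latest offset in each partition minus the offset the job has already read. The sketch below is an assumption-laden illustration, not something from the thread. The function name partition_lag is hypothetical, and the two offset maps are taken as given inputs (how you obtain them from Kafka or from the streaming job's progress is out of scope here).

```python
def partition_lag(end_offsets, consumed_offsets):
    """Return {partition: messages not yet consumed}.

    end_offsets:      {partition: latest offset in the topic partition}
    consumed_offsets: {partition: next offset the job would read}

    Assumption: a partition missing from consumed_offsets is treated as
    unread from offset 0, which ignores retention and log compaction.
    """
    lag = {}
    for partition, end in end_offsets.items():
        consumed = consumed_offsets.get(partition, 0)
        # Clamp at zero in case the two snapshots were taken at different times.
        lag[partition] = max(end - consumed, 0)
    return lag


def fully_consumed(end_offsets, consumed_offsets):
    """True when no partition has outstanding messages."""
    return all(v == 0 for v in partition_lag(end_offsets, consumed_offsets).values())
```

If every partition shows zero lag, an idle job is probably just waiting for new data rather than hanging.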

Re: Spark Streaming on Compact Kafka topic - consumers 1 message per partition per batch

2020-04-01 Thread Waleed Fateem
Well this is interesting. Not sure if this is the expected behavior. The log messages you have referenced are actually printed out by the Kafka Consumer itself (org.apache.kafka.clients.consumer.internals.Fetcher). That log message belongs to a new feature added starting with Kafka 1.1: