Well, that could be the problem. A SQL database is essentially a big synchronizer. If you have a lot of Spark tasks all bottlenecking on a single database socket (is the database clustered, or colocated with the Spark workers?), then you will have blocked threads on the database server.
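To make the "big synchronizer" point concrete, here is a rough sketch of what happens when concurrent tasks share one connection. It is not Spark code and not your setup: `SingleSocketDatabase`, `run_tasks`, and the 20 ms query latency are all made-up names and numbers for illustration, and plain Python threads stand in for Spark tasks.

```python
import threading
import time


class SingleSocketDatabase:
    """Hypothetical stand-in for a database reached over one shared connection."""

    def __init__(self, query_seconds=0.02):
        self.query_seconds = query_seconds   # assumed per-query latency
        self._socket = threading.Lock()      # only one query may use the socket at a time
        self._counter = threading.Lock()     # protects the bookkeeping below
        self.in_flight = 0
        self.max_in_flight = 0

    def query(self):
        with self._socket:                   # every caller queues up here
            with self._counter:
                self.in_flight += 1
                self.max_in_flight = max(self.max_in_flight, self.in_flight)
            time.sleep(self.query_seconds)   # simulated query latency
            with self._counter:
                self.in_flight -= 1


def run_tasks(db, n_tasks):
    """Start n_tasks threads that each issue one query; return the wall time."""
    threads = [threading.Thread(target=db.query) for _ in range(n_tasks)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    db = SingleSocketDatabase()
    elapsed = run_tasks(db, n_tasks=8)
    print(f"max concurrent queries: {db.max_in_flight}")
    print(f"wall time: {elapsed:.2f}s for 8 tasks of {db.query_seconds}s each")
```

Even though eight tasks start at once, only one query is ever in flight, so the wall time grows with the number of tasks rather than staying near the latency of a single query. Adding more workers in front of the same socket would not help in this model.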
-------- Original message --------
From: Malcolm Lockyer <malcolm.lock...@hapara.com>
Date: 05/30/2016 10:40 PM (GMT-05:00)
To: user@spark.apache.org
Subject: Re: Spark + Kafka processing trouble

On Tue, May 31, 2016 at 1:56 PM, Darren Govoni <dar...@ontrenet.com> wrote:
> So you are calling a SQL query (to a single database) within a spark
> operation distributed across your workers?

Yes, but currently with very small sets of data (1-10,000) and on a
single (dev) machine right now.

(sorry didn't reply to the list)