Reporting errors from Spark SQL

2016-08-18 Thread yael aharon
Hello, I am working on an SQL editor powered by Spark SQL. When the SQL is not valid, I would like to show the user the line number and column number where the first error occurred. I am having a hard time finding a mechanism that gives me that information programmatically.
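
One possible approach, sketched here under assumptions rather than offered as a confirmed answer: Spark 2.x parse errors appear to embed the location in the exception message as "(line N, pos M)", so an editor could recover it from the message text. The helper name error_position and the sample message below are hypothetical, and the message format is an assumption that may differ between Spark versions.

```python
import re
from typing import Optional, Tuple

# Assumption: the parse error message contains "(line N, pos M)".
_POSITION = re.compile(r"\(line (\d+), pos (\d+)\)")


def error_position(message: str) -> Optional[Tuple[int, int]]:
    """Return (line, pos) for the first location found in message, or None."""
    match = _POSITION.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical message text, shaped like a Spark SQL parse error.
sample = (
    "\nmismatched input 'FORM' expecting <EOF>(line 1, pos 9)\n\n"
    "== SQL ==\nSELECT a FORM t\n---------^^^\n"
)
print(error_position(sample))  # (1, 9)
```

In this sample the position looks like a zero-based offset within the line, so an editor would likely add one before showing a column number. On the Scala side, the exception itself seems to expose the same values as optional line and startPosition fields, which would avoid parsing the message at all.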

System.exit in local mode?

2016-05-26 Thread yael aharon
Hello, I have noticed that in https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/SparkUncaughtExceptionHandler.scala Spark calls System.exit when it encounters an uncaught exception. I have an application that runs Spark in local mode, and would like

Allowing parallelism in Spark local mode

2016-02-12 Thread yael aharon
Hello, I have an application that receives requests over HTTP and uses Spark in local mode to process them. Each request runs in its own thread. It seems that Spark is queueing the jobs and processing them one at a time. When 2 requests arrive simultaneously, the processing time for

Unexplained sleep time

2015-10-08 Thread yael aharon
Hello, I am working on improving the performance of our Spark on YARN applications. Scanning through the logs, I found the following lines: [2015-10-07T16:25:17.245-04:00] [DataProcessing] [INFO] [] [org.apache.spark.Logging$class] [tid:main] [userID:yarn] Started progress reporter thread - sleep

Adding Kafka topics to a running streaming context

2015-08-27 Thread yael aharon
Hello, my streaming application needs to be able to start consuming new Kafka topics at arbitrary times. I know I can stop and restart the streaming context when I need to introduce a new stream, but that seems quite disruptive. I am wondering if other people have this situation and if there is a more elegant