Multithreaded vs Spark Executor

2015-09-11 Thread Rachana Srivastava
Hello all, We are getting a stream of input data from a Kafka queue using the Spark Streaming API. For each data element we want to run parallel threads to process a set of feature lists (nearly 100 features or more). Since feature list creation is independent of each other we would like to execu
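
One way to get that parallelism from Spark itself, rather than from hand-rolled threads, is to fan each record out into (record, feature) pairs so the scheduler spreads the independent computations across executor cores. A minimal sketch in Java, assuming a JavaDStream<String> of Kafka records named lines; computeFeature and the feature names are hypothetical stand-ins:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import scala.Tuple2;

    // ~100 independent feature names; placeholder values.
    List<String> features = Arrays.asList("f1", "f2", "f3" /* ..., "f100" */);

    // Fan out: one (record, feature) element per feature, so Spark can
    // schedule the independent feature computations across executor cores.
    JavaDStream<Tuple2<String, String>> pairs = lines.flatMap(record -> {
        List<Tuple2<String, String>> out = new ArrayList<>();
        for (String f : features) {
            out.add(new Tuple2<>(record, f));
        }
        return out; // Spark 1.x flatMap expects an Iterable
    });

    // Spread the pairs over more partitions, then compute each feature.
    JavaDStream<String> results = pairs
        .repartition(100)
        .map(p -> computeFeature(p._2(), p._1()));

    // Hypothetical stand-in for the real per-feature computation.
    static String computeFeature(String feature, String record) {
        return feature + "=" + record.length();
    }

The repartition count is a tuning knob; something near the total number of executor cores is a common starting point.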

SIGTERM 15 Issue : Spark Streaming for ingesting huge text files using custom Receiver

2015-09-11 Thread Varadhan, Jawahar
Hi all, I have coded a custom receiver which receives Kafka messages. These Kafka messages have FTP server credentials in them. The receiver then opens the message and uses the FTP credentials in it to connect to the FTP server. It then streams this huge text file (3.3 GB). Finally this stre
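
For reference, a custom receiver's skeleton looks roughly like this. This is a minimal sketch, not the code from the thread; openFtpStream is a hypothetical placeholder for the FTP client. Pushing the file line by line keeps any single store() call small, so the receiver never buffers the whole 3.3 GB in memory (one common cause of containers being killed with SIGTERM):

    import java.io.BufferedReader;
    import org.apache.spark.storage.StorageLevel;
    import org.apache.spark.streaming.receiver.Receiver;

    public class FtpFileReceiver extends Receiver<String> {

        public FtpFileReceiver() {
            // Replicated memory-and-disk storage so received blocks
            // survive a receiver failure.
            super(StorageLevel.MEMORY_AND_DISK_2());
        }

        @Override
        public void onStart() {
            // Receive on a separate thread; onStart must not block.
            new Thread(this::receive, "ftp-file-receiver").start();
        }

        @Override
        public void onStop() {
            // Nothing to do: the receiving thread checks isStopped().
        }

        private void receive() {
            try {
                BufferedReader reader = openFtpStream(); // hypothetical FTP client
                String line;
                while (!isStopped() && (line = reader.readLine()) != null) {
                    store(line); // push small pieces, never the whole file
                }
                reader.close();
                restart("Finished file, waiting for the next Kafka message");
            } catch (Exception e) {
                restart("Error streaming file over FTP", e);
            }
        }

        private BufferedReader openFtpStream() {
            // Placeholder: connect with the credentials carried in the
            // Kafka message and return a reader over the remote file.
            throw new UnsupportedOperationException("wire in an FTP client");
        }
    }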

Re: Concurrency issue in SQLExecution.withNewExecutionId

2015-09-11 Thread Olivier Toupin
@Andrew_Or-2 I am using Scala futures.

Re: New JavaRDD Inside JavaPairDStream

2015-09-11 Thread Cody Koeninger
No, in general you can't make new RDDs in code running on the executors. It looks like your properties file is a constant; why not process it at the beginning of the job and broadcast the result? On Fri, Sep 11, 2015 at 2:09 PM, Rachana Srivastava < rachana.srivast...@markmonitor.com> wrote: > H
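
A minimal sketch of that suggestion, reusing the jsc context from the original snippet and assuming a JavaDStream<String> named lines; the file path and property key are placeholders:

    import java.io.FileInputStream;
    import java.util.Properties;
    import org.apache.spark.broadcast.Broadcast;
    import org.apache.spark.streaming.api.java.JavaDStream;

    // Driver side: read the constant properties file exactly once.
    Properties props = new Properties();
    props.load(new FileInputStream("/path/to/feature.properties"));

    // java.util.Properties is Serializable, so it can be broadcast as-is.
    Broadcast<Properties> broadcastProps = jsc.broadcast(props);

    // Executor side: read the broadcast value inside the transformation;
    // no new RDDs are created on the executors.
    JavaDStream<String> tagged = lines.map(record ->
        record + "|" + broadcastProps.value().getProperty("someKey"));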

New JavaRDD Inside JavaPairDStream

2015-09-11 Thread Rachana Srivastava
Hello all, Can we invoke a JavaRDD while processing a stream from Kafka? For example, the following code is throwing a serialization exception. Not sure if this is feasible. JavaStreamingContext jssc = new JavaStreamingContext(jsc, Durations.seconds(5)); JavaPairReceiverInputDStream messages

Re: SparkR driver side JNI

2015-09-11 Thread Renyi Xiong
got it! thanks a lot. On Fri, Sep 11, 2015 at 11:10 AM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Its possible -- in the sense that a lot of designs are possible. But > AFAIK there are no clean interfaces for getting all the arguments / > SparkConf options from spark-submit and

New Spark json endpoints

2015-09-11 Thread Kevin Chen
Hello Spark Devs, I noticed that [SPARK-3454], which introduces new JSON endpoints at /api/v1/[path] for information previously only shown on the web UI, does not expose several useful properties about Spark jobs that are exposed on the web UI and on the unofficial /json endpoint. Specific exam
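
For context, the SPARK-3454 endpoints are plain HTTP, so any client can read them. A minimal sketch, assuming a driver UI listening on localhost:4040 (the history server serves the same paths, typically on port 18080):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class SparkRestProbe {
        public static void main(String[] args) throws Exception {
            // Lists the applications known to this UI as a JSON array.
            URL url = new URL("http://localhost:4040/api/v1/applications");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }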

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Ryan Williams
yea, wfm now, thanks! On Fri, Sep 11, 2015 at 2:16 PM Jonathan Kelly wrote: > I just clicked the > http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.spark%22 > link provided above by Ryan, and I see 1.5.0. Was this just fixed within > the past hour, or is some caching causing some peo

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Jonathan Kelly
I just clicked the http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.spark%22 link provided above by Ryan, and I see 1.5.0. Was this just fixed within the past hour, or is some caching causing some people not to see it? On Fri, Sep 11, 2015 at 10:24 AM, Reynold Xin wrote: > It is alre

Re: SparkR driver side JNI

2015-09-11 Thread Shivaram Venkataraman
It's possible -- in the sense that a lot of designs are possible. But AFAIK there are no clean interfaces for getting all the arguments / SparkConf options from spark-submit, and it's all the trickier to handle scenarios where the first JVM has already created a SparkContext that you want to use f

Re: SparkR driver side JNI

2015-09-11 Thread Renyi Xiong
Forgot to reply all. I see. But what prevents, e.g., the R driver from getting those command-line arguments from spark-submit and setting them with SparkConf on the R driver's in-process JVM through JNI? On Thu, Sep 10, 2015 at 9:29 PM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Yeah in additi

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Reynold Xin
It is already there, but the search is not updated. Not sure what's going on with Maven Central search. http://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.10/1.5.0/ On Fri, Sep 11, 2015 at 10:21 AM, Ryan Williams < ryan.blake.willi...@gmail.com> wrote: > Any idea why 1.5.0 is not i

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Ted Yu
This is related: https://issues.apache.org/jira/browse/SPARK-10557 On Fri, Sep 11, 2015 at 10:21 AM, Ryan Williams < ryan.blake.willi...@gmail.com> wrote: > Any idea why 1.5.0 is not in Maven central yet? > Is that a separa

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Ryan Williams
Any idea why 1.5.0 is not in Maven central yet? Is that a separate release process? On Wed, Sep 9, 2015 at 12:40 PM andy petrella wrote: > You can try it out really quickly by "building" a Spark Notebook from > http://spark-

Spark 1.5.0: setting up debug env

2015-09-11 Thread lonikar
I have set up a Spark debug env on Windows and Mac, and thought it's worth sharing given some of the issues I encountered and that the instructions given here did not work for *Eclipse* (possibly outd

Re: MongoDB and Spark

2015-09-11 Thread Corey Nolet
Unfortunately, MongoDB does not directly expose its locality via its client API, so the problem with trying to schedule Spark tasks against it is that the tasks themselves cannot be scheduled locally on nodes containing query results, which means you can only assume most results will be sent over th

Re: Spark 1.5: How to trigger expression execution through UnsafeRow/TungstenProject

2015-09-11 Thread lonikar
thanks that worked

Re: Spark 1.5.x: Java files in src/main/scala and vice versa

2015-09-11 Thread lonikar
It does not cause any problem when building with Maven. But when doing eclipse:eclipse, the generated .classpath files contained only . This caused all the .scala sources to be ignored and caused all kinds of Eclipse build errors. It resolved only when I added prebuilt jars to the Java build path,

Re: MongoDB and Spark

2015-09-11 Thread Sandeep Giri
I think it should be possible by loading the collections as RDDs and then doing a union on them. Regards, Sandeep Giri, +1 347 781 4573 (US) +91-953-899-8962 (IN) www.KnowBigData.com. Phone: +1-253-397-1945 (Office)
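
A minimal sketch of that approach, assuming the mongo-hadoop connector (com.mongodb.hadoop.MongoInputFormat); hosts, database, and collection names are placeholders. With this connector each RDD reads its collection from the "mongo.input.uri" key, so a small helper builds one Configuration per collection and union merges the results:

    import com.mongodb.hadoop.MongoInputFormat;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.bson.BSONObject;

    // One small Configuration per collection; the connector reads the
    // collection to scan from the "mongo.input.uri" key.
    static JavaPairRDD<Object, BSONObject> collectionRdd(JavaSparkContext sc, String uri) {
        Configuration conf = new Configuration();
        conf.set("mongo.input.uri", uri); // e.g. mongodb://localhost:27017/mydb.users
        return sc.newAPIHadoopRDD(conf, MongoInputFormat.class,
                Object.class, BSONObject.class);
    }

    // Usage: load two collections and treat them as one dataset.
    JavaPairRDD<Object, BSONObject> all =
        collectionRdd(sc, "mongodb://localhost:27017/mydb.users")
            .union(collectionRdd(sc, "mongodb://localhost:27017/mydb.orders"));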

Re: MongoDB and Spark

2015-09-11 Thread Sandeep Giri
Use map-reduce. On Fri, Sep 11, 2015, 14:32 Mishra, Abhishek wrote: > Hello, > Is there any way to query multiple collections from MongoDB using Spark > and Java? And I want to create only one Configuration object. Please help > if anyone has something regarding this. > Thank Y