Re: JavaRDD using Reflection

2015-09-14 Thread Ajay Singal
Hello Rachana, The easiest way would be to start by creating a 'parent' JavaRDD and running different filters (based on different input arguments) to create the respective 'child' JavaRDDs dynamically. Notice that the creation of these child RDDs is handled by the application driver. Hope this helps.
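
A minimal sketch of the approach described above, assuming a toy String dataset and illustrative filter arguments (the type, values, and class names below are examples, not from the original thread):

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class ParentChildRdds {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("parent-child-rdds").setMaster("local[*]");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // The 'parent' RDD is created once by the driver.
                JavaRDD<String> parent = sc.parallelize(
                    Arrays.asList("a:1", "b:2", "a:3", "c:4"));

                // Child RDDs are derived dynamically, one filter per input argument.
                List<String> filterArgs = Arrays.asList("a", "b", "c");
                Map<String, JavaRDD<String>> children = new HashMap<>();
                for (String arg : filterArgs) {
                    children.put(arg, parent.filter(line -> line.startsWith(arg + ":")));
                }

                // Filters are lazy; nothing runs until an action is invoked on a child.
                children.forEach((key, rdd) ->
                    System.out.println(key + " -> " + rdd.count()));
            }
        }
    }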

Re: Controlling number of executors on Mesos vs YARN

2015-08-13 Thread Ajay Singal
Hi Tim, An option like spark.mesos.executor.max to cap the number of executors per node/application would be very useful. However, an option like spark.mesos.executor.num to specify the desired number of executors per node would provide even better control. Thanks, Ajay
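
For context, the per-node options discussed in this thread (spark.mesos.executor.max, spark.mesos.executor.num) are proposals, not settings that exist in Spark. A rough sketch of the controls that do exist for coarse-grained Mesos mode, with a placeholder Mesos master URL:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class MesosResourceCaps {
        public static void main(String[] args) {
            // Existing knobs: cap total cores across the cluster and size each
            // executor's memory. The per-node executor cap discussed in the
            // thread (spark.mesos.executor.max) is only a proposal here.
            SparkConf conf = new SparkConf()
                .setAppName("mesos-resource-caps")
                .setMaster("mesos://zk://mesos-master:2181/mesos") // placeholder master URL
                .set("spark.cores.max", "16")          // total cores for the application
                .set("spark.executor.memory", "4g");   // memory per executor

            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                System.out.println("cores cap: " + sc.getConf().get("spark.cores.max"));
            }
        }
    }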

Re: Controlling number of executors on Mesos vs YARN

2015-08-13 Thread Ajay Singal
You're referring to both fine-grained and coarse-grained mode? A desired number of executors per node could be interesting, but it can't be guaranteed (or we could try, and abort the job if that fails). How would you imagine this new option actually working? Tim

Re: How to increase parallelism of a Spark cluster?

2015-08-03 Thread Ajay Singal
Hi Sujit, From experimenting with Spark (and from other documentation), my understanding is as follows: 1. Each application consists of one or more Jobs; 2. Each Job has one or more Stages; 3. Each Stage creates one or more Tasks (normally, one Task per Partition); 4. Master
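
A small sketch illustrating point 3, that a stage normally runs one task per partition; the data size and partition counts below are arbitrary examples:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class TasksPerPartition {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("tasks-per-partition").setMaster("local[4]");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                List<Integer> data = new ArrayList<>();
                for (int i = 0; i < 1000; i++) {
                    data.add(i);
                }

                // Ask for 8 partitions explicitly: the count() action below runs
                // as one stage with 8 tasks, one task per partition.
                JavaRDD<Integer> rdd = sc.parallelize(data, 8);
                System.out.println("partitions: " + rdd.getNumPartitions());
                System.out.println("count: " + rdd.count());

                // Repartitioning changes the task count of subsequent stages.
                JavaRDD<Integer> wider = rdd.repartition(16);
                System.out.println("partitions after repartition: " + wider.getNumPartitions());
            }
        }
    }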

Re: Facing problem in Oracle VM Virtual Box

2015-07-24 Thread Ajay Singal
Hi Chintan, This is more of an Oracle VirtualBox virtualization issue than a Spark issue. VT-x is hardware-assisted virtualization, and it is required by Oracle VirtualBox for all 64-bit guests. The error message indicates that either your processor does not support VT-x (but your VM is

Re: ERROR SparkUI: Failed to bind SparkUI java.net.BindException: Address already in use: Service 'SparkUI' failed after 16 retries!

2015-07-24 Thread Ajay Singal
is to define a function to find an open port and use that. Thanks, Joji John
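
A rough sketch of the workaround Joji describes, probing for a free port before building the SparkConf; the helper name findFreePort is illustrative, and note the small race window between releasing the probe socket and Spark binding to the port:

    import java.io.IOException;
    import java.net.ServerSocket;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class DynamicUiPort {
        // Bind to port 0 so the OS picks a free ephemeral port, then release it.
        static int findFreePort() throws IOException {
            try (ServerSocket socket = new ServerSocket(0)) {
                return socket.getLocalPort();
            }
        }

        public static void main(String[] args) throws IOException {
            int uiPort = findFreePort();
            SparkConf conf = new SparkConf()
                .setAppName("dynamic-ui-port")
                .setMaster("local[*]")
                .set("spark.ui.port", Integer.toString(uiPort));

            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                System.out.println("SparkUI bound to port " + uiPort);
            }
        }
    }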

Re: ERROR SparkUI: Failed to bind SparkUI java.net.BindException: Address already in use: Service 'SparkUI' failed after 16 retries!

2015-07-24 Thread Ajay Singal
Hi Joji, I guess there is no hard limit on the number of Spark applications running in parallel. However, you need to ensure that you do not use the same (e.g., default) port numbers for each application. In your specific case, for example, if you try using the default SparkUI port 4040 for more than
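
A minimal sketch of this suggestion: give each concurrent application its own SparkUI port rather than relying on the 4040 default (Spark also retries higher ports, up to spark.port.maxRetries, before failing). The port numbers here are arbitrary examples:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class PerAppUiPort {
        public static void main(String[] args) {
            // Pass a distinct port per application, e.g. 4041, 4042, ...
            String uiPort = args.length > 0 ? args[0] : "4041";

            SparkConf conf = new SparkConf()
                .setAppName("per-app-ui-port-" + uiPort)
                .setMaster("local[*]")
                .set("spark.ui.port", uiPort)
                // Optional: allow a few retries if the chosen port is taken.
                .set("spark.port.maxRetries", "16");

            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                System.out.println("SparkUI for this app on port " + sc.getConf().get("spark.ui.port"));
            }
        }
    }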

Instantiating/starting Spark jobs programmatically

2015-04-20 Thread Ajay Singal
Greetings, We have an analytics workflow system in production. This system is built in Java and utilizes other services (including Apache Solr). It works fine with a moderate level of data/processing load. However, when the load goes beyond a certain limit (e.g., more than 10 million
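
One common way to start Spark jobs programmatically from an existing Java service is Spark's SparkLauncher API; a minimal sketch, with placeholder paths and class names that are not from the original post:

    import org.apache.spark.launcher.SparkLauncher;

    public class ProgrammaticJobLauncher {
        public static void main(String[] args) throws Exception {
            // Launch a Spark application as a child process from plain Java code.
            Process spark = new SparkLauncher()
                .setSparkHome("/opt/spark")                          // placeholder path
                .setAppResource("/opt/jobs/analytics-job.jar")       // placeholder jar
                .setMainClass("com.example.analytics.WorkflowJob")   // placeholder class
                .setMaster("local[*]")
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .launch();

            int exitCode = spark.waitFor();
            System.out.println("Spark job finished with exit code " + exitCode);
        }
    }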