You didn't specify what the key blocker is. Why is processing time underutilized? Are your threads busy processing results, so that Spark jobs are not being submitted?
Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi

On Thu, Feb 20, 2014 at 7:30 PM, Livni, Dana <[email protected]> wrote:
> Hi,
>
> Wanted to know the best practice for a scenario we have.
>
> We run a lot of batch processing on data stored in an HBase cluster. The
> jobs are independent and need to run in parallel.
>
> The current implementation runs multiple independent processes (each of
> them multithreaded itself).
>
> Each process creates one SparkContext, and all its child threads use it.
>
> This creates a situation in which we have around 150 concurrent
> SparkContexts (each used by 5-10 threads, each performing about 4 map
> tasks).
>
> This implementation does not seem very efficient, both in memory usage
> (mainly on our batch server) and in processing time on the cluster.
>
> What would be the best way to do this?
>
> We thought of creating a service that holds only one SparkContext, to
> which all the processes and threads would send requests.
>
> Does anyone have insight into whether this would be a better solution, or
> other ideas?
>
> Thanks in advance,
> Dana
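The single-shared-context service Dana proposes can be sketched generically. The sketch below is plain Python with a stub class standing in for the real SparkContext, so it runs without a cluster; all names (`SharedContext`, `submit`, `run_batches`) are illustrative assumptions, not Spark API. The point is the shape: one process owns the single expensive context, and many worker threads submit jobs to it instead of each process creating its own.

```python
# Sketch of the "one shared context" service pattern (illustrative only).
# In real Spark code, SharedContext would wrap a single SparkContext and
# submit() would launch an actual job; a SparkContext accepts concurrent
# job submissions from multiple threads, which is what makes this viable.
import threading
from concurrent.futures import ThreadPoolExecutor

class SharedContext:
    """Owns the single expensive context; callable from many threads."""
    def __init__(self):
        self._lock = threading.Lock()  # guards only our job counter
        self.jobs_run = 0

    def submit(self, data):
        # Stand-in for a real job submission; here we just square
        # each element so the sketch is runnable anywhere.
        with self._lock:
            self.jobs_run += 1
        return [x * x for x in data]

def run_batches(batches):
    # One context for all threads, instead of one per process.
    ctx = SharedContext()
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(ctx.submit, batches))
    return ctx.jobs_run, results

if __name__ == "__main__":
    jobs, out = run_batches([[1, 2], [3, 4]])
    print(jobs, out)  # 2 [[1, 4], [9, 16]]
```

With this shape, the 150 independent contexts collapse into one, and concurrency is bounded by the thread pool rather than by how many processes happen to be running.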
