I am using FAIR mode, but only because I found no other way. I think there is a limit on the number of jobs Spark can run in parallel. Is there a way to run more jobs in parallel? This would be acceptable, because this SparkContext is only used during web-service calls. I looked at the Spark configuration page and tried a few settings, but they did not seem to work. I am using Spark 2.3.1.
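For what it's worth, FAIR mode on its own only changes how concurrently running jobs share executor resources; jobs still have to be submitted from separate threads, and true parallelism is bounded by the cores available to the application. A minimal sketch of a FAIR setup with a dedicated pool (the pool name "webservice" and the file path are made up for illustration):

```
# spark-defaults.conf (or set on SparkConf before creating the context)
spark.scheduler.mode              FAIR
spark.scheduler.allocation.file   /path/to/fairscheduler.xml

<!-- fairscheduler.xml: one pool reserved for web-service jobs -->
<?xml version="1.0"?>
<allocations>
  <pool name="webservice">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
```

Each request-handling thread would then call `sc.setLocalProperty("spark.scheduler.pool", "webservice")` before triggering its job, so those jobs land in that pool instead of the default one.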
Thanks.

On Sun, Sep 23, 2018 at 6:00 PM Michael Artz <michaelea...@gmail.com> wrote:
> Are you using the scheduler in fair mode instead of fifo mode?
>
> Sent from my iPhone
>
> > On Sep 22, 2018, at 12:58 AM, Jatin Puri <purija...@gmail.com> wrote:
> >
> > Hi.
> >
> > What tactics can I apply in such a scenario?
> >
> > I have a pipeline of 10 stages, doing simple text processing. I train the data with the pipeline and, for the fitted data, do some modelling and store the results.
> >
> > I also have a web server, where I receive requests. For each request (a dataframe of a single row), I transform against the same pipeline created above and take the respective action. The problem is: calling Spark for a single row takes less than 1 second, but under higher load, Spark becomes a major bottleneck.
> >
> > One solution I can think of is a Scala re-implementation of the same pipeline that processes the requests with the help of the model generated above. But this results in duplication of code and hence maintenance.
> >
> > Is there any way I can call the same pipeline (transform) in a very light manner, just for a single row, so that it works concurrently and Spark does not remain a bottleneck?
> >
> > Thanks
> > Jatin

--
Jatin Puri
http://jatinpuri.com <http://www.jatinpuri.com>