Thank you very much for your answers, now I understand better what I have
to do! Thank you!
On Wed, 8 Feb 2017 at 22:37, Gourav Sengupta wrote:
Hi,
I am not quite sure of your use case here, but I would use spark-submit
and submit sequential jobs as steps to an EMR cluster.
Regards,
Gourav
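For reference, steps added to a running EMR cluster execute one at a time, in submission order, which gives the sequential behaviour Gourav describes. A sketch with the AWS CLI; the cluster ID, class name, and jar path below are placeholders, not values from this thread:

```shell
# Placeholder cluster ID, class, and S3 jar path -- adjust to your setup.
# Steps on one EMR cluster run sequentially by default.
aws emr add-steps \
  --cluster-id j-XXXXXXXXXXXXX \
  --steps Type=Spark,Name="job-1",ActionOnFailure=CONTINUE,Args=[--class,com.example.Job1,s3://my-bucket/jobs.jar]
```

Each step is effectively a spark-submit invocation wrapped by EMR, so a queue of jobs can be expressed as a queue of steps.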
On Wed, Feb 8, 2017 at 11:10 AM, Cosmin Posteuca wrote:
Resource management in yarn cluster mode is YARN's task, so it depends on
how you configured the queues and the scheduler there.
> On 8 Feb 2017, at 12:10, Cosmin Posteuca wrote:
I tried to run some tests on EMR in yarn cluster mode.
I have a cluster with 16 cores (8 processors with 2 threads each). If I run
one job (using 5 cores) it takes 90 seconds; if I run 2 jobs simultaneously,
both finish in 170 seconds. If I run 3 jobs simultaneously, all three finish
in 240 seconds.
If I
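A quick sanity check on those numbers: with 5 cores per job on a 16-core cluster, up to three jobs fit side by side, so near-constant wall time would be expected if YARN ran them fully in parallel. The implied speedup (timings taken from the message above) is close to 1x, suggesting the jobs were largely serialized:

```python
# Timings reported above: 1 job -> 90s, 2 jobs -> 170s, 3 jobs -> 240s.
single_job_s = 90
observed = {1: 90, 2: 170, 3: 240}

for n, t in sorted(observed.items()):
    sequential = n * single_job_s      # time if the jobs ran one after another
    speedup = sequential / t           # 1.0 = no overlap, n = fully parallel
    print(f"{n} job(s): {t}s observed vs {sequential}s sequential "
          f"-> speedup {speedup:.2f}x")
```

With full parallelism the 3-job case would stay near 90s; 240s corresponds to a speedup of only about 1.1x over strictly sequential execution.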
Hi,
Michael's answer will solve the problem in case you are using only a
SQL-based solution.
Otherwise please refer to the wonderful details mentioned here:
https://spark.apache.org/docs/latest/job-scheduling.html. With the EMR 5.3.0
release, Spark 2.1.0 is available in AWS.
(note that there is an issue
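The job-scheduling page linked above also covers within-application scheduling: setting `spark.scheduler.mode=FAIR` and assigning jobs to pools defined in an allocation file referenced by `spark.scheduler.allocation.file`. A minimal allocation file in the format from that page (the pool name and values here are illustrative):

```xml
<?xml version="1.0"?>
<!-- Pool name, weight, and minShare are illustrative, not prescriptive. -->
<allocations>
  <pool name="production">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
```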
Why couldn't you use the Spark Thrift Server?
On Feb 7, 2017, at 1:28 PM, Cosmin Posteuca wrote:
Answer for Gourav Sengupta:
I want to use the same Spark application because I want it to work as a FIFO
scheduler. My problem is that I have many jobs (not so big), and if I run an
application for every job my cluster will split resources like a FAIR
scheduler (it's what I observe, maybe I'm wrong) and exist
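The trade-off described here (one application running jobs FIFO vs. one application per job splitting the cluster FAIR-style) can be illustrated with a toy timing model. Nothing below is Spark itself; the functions and all numbers are hypothetical, chosen to echo the 16-core / 5-cores-per-job setup from this thread:

```python
# Toy model: each job needs 450 "core-seconds" of work, can use at most
# 5 cores at once, and the cluster has 16 cores in total.

def fifo_finish_times(n_jobs, work=450.0, max_cores=5):
    # One application, jobs queued FIFO: each job gets its full 5 cores
    # in turn, so job k finishes after (k + 1) * (work / max_cores) seconds.
    per_job = work / max_cores
    return [per_job * (k + 1) for k in range(n_jobs)]

def fair_finish_times(n_jobs, work=450.0, total_cores=16, max_cores=5):
    # One application per job, FAIR-style split: all jobs run at once,
    # each limited to an equal share of the cluster (capped at max_cores).
    share = min(max_cores, total_cores / n_jobs)
    return [work / share] * n_jobs

print("FIFO:", fifo_finish_times(4))
print("FAIR:", fair_finish_times(4))
```

In this model FIFO gets the first job done fastest while later jobs wait, whereas FAIR makes every job finish late but at the same time, which matches the behaviour Cosmin reports observing.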
Response for vincent:
Thanks for the answer!
Yes, I need a business solution; that's the reason why I can't use the Spark
Jobserver or Livy solutions. I will look at your GitHub to see how to build
such a system.
But I don't understand why Spark doesn't have a solution for this kind of
problem? and
Hi,
May I ask the reason for using the same Spark application? Is it because of
the time it takes to start a Spark context?
On another note, you may want to look at the number of contributors in a
GitHub repo before choosing a solution.
Regards,
Gourav
On Tue, Feb 7, 2017 at 5:26 PM,
The Spark Jobserver or Livy server are the best options for a purely
technical API. If you want to publish a business API you will probably have
to build your own app, like the one I wrote a year ago:
https://github.com/elppc/akka-spark-experiments
It combines Akka actors and a shared Spark context to serve
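The shared-context pattern behind that repo can be sketched without Akka or Spark: one long-lived context object plus a FIFO queue of job closures, drained by a single worker so submitted jobs never compete for resources. Everything below is a hypothetical illustration (the `JobServer` class and its context stand in for a real SparkContext):

```python
import queue
import threading

class JobServer:
    """Toy job server: one shared 'context', jobs executed strictly FIFO."""

    def __init__(self, context):
        self.context = context            # stands in for a SparkContext
        self.jobs = queue.Queue()         # FIFO queue of (name, fn) pairs
        self.results = {}
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, name, fn):
        # Callers enqueue work; they never touch the context directly.
        self.jobs.put((name, fn))

    def _run(self):
        while True:
            name, fn = self.jobs.get()
            self.results[name] = fn(self.context)  # one job at a time
            self.jobs.task_done()

server = JobServer(context={"app": "shared"})
server.submit("a", lambda ctx: 1 + 1)
server.submit("b", lambda ctx: 2 * 3)
server.jobs.join()                        # wait for the queue to drain
```

An actor-based version replaces the worker thread with an actor holding the context, but the queueing idea is the same.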
I think you are looking for Livy or the Spark Jobserver.
On Wed, 8 Feb 2017 at 12:37 am, Cosmin Posteuca wrote:
I want to run different jobs on demand with the same Spark context, but I
don't know how exactly I can do this.
I try to get the current context, but it seems to create a new Spark context
(with new executors).
I call spark-submit to add new jobs.
I run the code on Amazon EMR (3 instances, 4 cores & 16GB RAM /
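One likely explanation for the behaviour above: PySpark's `SparkContext.getOrCreate()` (and `SparkSession.builder.getOrCreate()`) only reuse a context that is active in the same driver process, and every `spark-submit` starts a new driver process, so each submission necessarily gets a fresh context. The semantics can be sketched in pure Python; the `Context` class here is a hypothetical stand-in, not Spark's implementation:

```python
class Context:
    _active = None  # process-wide singleton, like Spark's active context

    def __init__(self, app_name):
        self.app_name = app_name

    @classmethod
    def get_or_create(cls, app_name="default"):
        # Reuse the active context if one exists; a separate spark-submit
        # is a separate process, so it can never see this singleton and
        # always ends up creating its own context.
        if cls._active is None:
            cls._active = cls(app_name)
        return cls._active

a = Context.get_or_create("jobs")
b = Context.get_or_create("other")
assert a is b  # within one process, the first context is reused
```

Sharing one real context across job submissions therefore requires a long-running driver process that accepts work, which is exactly what Livy and the Spark Jobserver provide.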