Re: Issue running Spark 1.4 on Yarn

2015-06-11 Thread matvey14
No, this is just a random queue name I picked when submitting the job;
there's no specific configuration for it. I'm not logged in, so I don't
have the default fair scheduler configuration in front of me, but I
don't think that's the problem. The cluster is completely idle and
there aren't any jobs being executed, so it can't be hitting any of the
fair scheduler's limits.
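
If it were a limits problem it should show up in fair-scheduler.xml.
For reference, a minimal entry capping a queue looks roughly like this
(the file path assumes a CDH-style package install, and the values are
only illustrations, not something I've confirmed on our cluster):

$ cat /etc/hadoop/conf/fair-scheduler.xml
<?xml version="1.0"?>
<allocations>
  <queue name="thequeue">
    <!-- caps matching the Max Resources shown on the RM page -->
    <maxResources>6655 mb,4 vcores</maxResources>
    <maxRunningApps>10</maxRunningApps>
  </queue>
</allocations>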






Re: Issue running Spark 1.4 on Yarn

2015-06-10 Thread matvey14
Hi nsalian,

For some reason the rest of this thread isn't showing up here. The
NodeManager isn't busy. I'll copy/paste the earlier messages below; the
details are in there.



I've tried running a Hadoop app pointing to the same queue. Same thing
now: the job doesn't get accepted. I've cleared out the queue and
killed all the pending jobs, but the queue is still unusable.

It seems like an issue with YARN, but it's specifically Spark that
leaves the queue in this state. I've run a Hadoop job in a for loop
10x, specifying the queue explicitly, just to double-check.
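
Roughly the loop I used, from memory (the examples jar path assumes a
CDH package layout):

# submit 10 MR jobs to the same queue, naming it explicitly
for i in $(seq 1 10); do
  hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi \
    -Dmapreduce.job.queuename=thequeue 2 10
done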

On Tue, Jun 9, 2015 at 4:45 PM, Matt Kapilevich matve...@gmail.com wrote:
From the RM scheduler, I see 3 applications currently stuck in the
root.thequeue queue.

Used Resources: memory:0, vCores:0
Num Active Applications: 0
Num Pending Applications: 3
Min Resources: memory:0, vCores:0
Max Resources: memory:6655, vCores:4
Steady Fair Share: memory:1664, vCores:0
Instantaneous Fair Share: memory:6655, vCores:0
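
The stuck apps should also be visible from the CLI; applications still
waiting for an AM container sit in the ACCEPTED state:

$ yarn application -list -appStates ACCEPTED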

On Tue, Jun 9, 2015 at 4:30 PM, Matt Kapilevich matve...@gmail.com wrote:
Yes! If I either specify a different queue or don't specify a queue at all,
it works.
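
Concretely, something like this (the examples jar path is from the
Spark 1.4 binary distribution, quoted from memory):

# hangs in ACCEPTED: the broken queue
$ spark-submit --master yarn-cluster --queue thequeue \
    --class org.apache.spark.examples.SparkPi \
    lib/spark-examples-1.4.0-hadoop2.6.0.jar 10

# runs fine: no queue specified
$ spark-submit --master yarn-cluster \
    --class org.apache.spark.examples.SparkPi \
    lib/spark-examples-1.4.0-hadoop2.6.0.jar 10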

On Tue, Jun 9, 2015 at 4:25 PM, Marcelo Vanzin van...@cloudera.com wrote:
Does it work if you don't specify a queue?

On Tue, Jun 9, 2015 at 1:21 PM, Matt Kapilevich matve...@gmail.com wrote:
Hi Marcelo,

Yes, restarting YARN fixes this behavior, and it again works the first
few times. The only consistent thing is that once Spark job submissions
stop working, they stay broken for good.

On Tue, Jun 9, 2015 at 4:12 PM, Marcelo Vanzin van...@cloudera.com wrote:
Apologies, I see you already posted everything from the RM logs that
mentions your stuck app.

Have you tried restarting the YARN cluster to see if that changes
anything? Does it go back to the "first few tries work" behaviour?
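
(On a package-based CDH install that's roughly the commands below; if
the cluster is managed by Cloudera Manager, restart the YARN service
from there instead.)

$ sudo service hadoop-yarn-resourcemanager restart
# and on each worker node:
$ sudo service hadoop-yarn-nodemanager restart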

I run 1.4 on top of CDH 5.4 pretty often and haven't seen anything like
this.


On Tue, Jun 9, 2015 at 1:01 PM, Marcelo Vanzin van...@cloudera.com wrote:
On Tue, Jun 9, 2015 at 11:31 AM, Matt Kapilevich matve...@gmail.com wrote:
Like I mentioned earlier, I'm able to execute Hadoop jobs fine even
now; this problem is specific to Spark.

That doesn't necessarily mean anything. Spark apps have different resource
requirements than Hadoop apps.
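
If you want to rule that out, you could shrink what Spark asks for and
see whether the submission still sticks; a sketch (jar path from
memory):

$ spark-submit --master yarn-cluster --queue thequeue \
    --driver-memory 512m \
    --executor-memory 512m \
    --num-executors 1 \
    --class org.apache.spark.examples.SparkPi \
    lib/spark-examples-1.4.0-hadoop2.6.0.jar 10

Note that each container request also includes a memory overhead on
top of the heap sizes above.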
 
Check your RM logs for any line that mentions your Spark app id. That may
give you some insight into what's happening or not.
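
Something along these lines, with your real application id substituted
for the placeholder (the log path is typical for a package install but
varies by setup):

# the application id below is a placeholder -- use the one from your app
$ grep application_1433865456789_0001 \
    /var/log/hadoop-yarn/yarn-yarn-resourcemanager-*.log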

--
Marcelo