Re: Hello

2016-06-20 Thread Joseph Bradley
Hi Harmeet, I'll add one more item to the other advice: The community is in the process of putting together a roadmap JIRA for 2.1 for ML: https://issues.apache.org/jira/browse/SPARK-15581 This JIRA lists some of the major items and links to a few umbrella JIRAs with subtasks. I'd expect this

Re: Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-20 Thread Jonathan Kelly
OK, JIRA created: https://issues.apache.org/jira/browse/SPARK-16080 Also, after looking at the code a bit I think I see the reason. If I'm correct, it may actually be a very easy fix. On Mon, Jun 20, 2016 at 1:21 PM Marcelo Vanzin wrote: > It doesn't hurt to have a bug

Re: Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-20 Thread Marcelo Vanzin
It doesn't hurt to have a bug tracking it, in case anyone else has time to look at it before I do. On Mon, Jun 20, 2016 at 1:20 PM, Jonathan Kelly wrote: > Thanks for the confirmation! Shall I cut a JIRA issue? > > On Mon, Jun 20, 2016 at 10:42 AM Marcelo Vanzin

Re: Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-20 Thread Jonathan Kelly
Thanks for the confirmation! Shall I cut a JIRA issue? On Mon, Jun 20, 2016 at 10:42 AM Marcelo Vanzin wrote: > I just tried this locally and can see the wrong behavior you mention. > I'm running a somewhat old build of 2.0, but I'll take a look. > > On Mon, Jun 20, 2016 at

Re: Question about equality of o.a.s.sql.Row

2016-06-20 Thread dhruve ashar
In Scala, "==" and "!=" are not operators but methods, which are defined here as: The expression x == that is equivalent to if (x eq null) that eq null else x.equals(that). The expression x != that is equivalent to true if !(this == that) So
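The definition quoted above can be tried out with a small sketch. The names `EqualityDemo`, `Point`, and `nullSafeEquals` are hypothetical and exist only for this illustration; they are not part of Spark or of the Scala library. The helper simply spells out the quoted expansion for reference types.

```scala
object EqualityDemo {
  // Hypothetical case class, used only for illustration.
  final case class Point(x: Int, y: Int)

  // Hand-written version of the quoted expansion of `x == that`
  // for reference types (an illustration, not the library source).
  def nullSafeEquals(x: AnyRef, that: AnyRef): Boolean =
    if (x eq null) that eq null else x.equals(that)

  def main(args: Array[String]): Unit = {
    val a = Point(1, 2)
    val b = Point(1, 2)
    val n: Point = null

    println(a == b)                  // true: == delegates to equals
    println(a eq b)                  // false: eq compares references
    println(n == a)                  // false, and no NullPointerException
    println(nullSafeEquals(n, null)) // true: both sides are null
  }
}
```

Under this reading, "o1 != o2" and "!o1.equals(o2)" give the same answer whenever o1 is non-null, so the difference between them is limited to null handling.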

Integrating a Native ARchive (NAR) into Apache Spark

2016-06-20 Thread Erik O'Shaughnessy
Hello, I’ve cobbled together a JNI interface and packaged it with the nar-maven-plugin, but I’m frankly out of my depth with Java/Scala/Maven and could use some pointers about how to make my new JNI object available for import in spark-shell. My NAR project builds a simple Foo object that

Re: Inconsistent joinWith behavior?

2016-06-20 Thread Richard Marscher
Hi, thanks for the response. I have created a JIRA ticket: https://issues.apache.org/jira/browse/SPARK-16076 On Mon, Jun 20, 2016 at 2:52 PM, Yin Huai wrote: > Hello Richard, > > Looks like the Dataset is Dataset[(Int, Int)]. I guess for the case of > "ds.joinWith(other,

Re: Question about equality of o.a.s.sql.Row

2016-06-20 Thread Michael Armbrust
> > This is because two objects are compared by "o1 != o2" instead of > "o1.equals(o2)" at > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala#L408 Even equals(...) does not do what you want on the JVM: scala> Array(1,2).equals(Array(1,2))
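The point about arrays on the JVM can be reproduced with a minimal sketch. `ArrayEqualityDemo` and its method names are hypothetical; the sketch shows only standard JVM and Scala behaviour and makes no claim about how `Row` compares its fields.

```scala
object ArrayEqualityDemo {
  // Array.equals is inherited from java.lang.Object: reference comparison.
  def referenceEquals(a: Array[Int], b: Array[Int]): Boolean = a.equals(b)

  // Element-wise comparison using the Scala collections API.
  def elementwiseEquals(a: Array[Int], b: Array[Int]): Boolean = a.sameElements(b)

  // Element-wise comparison using the Java standard library.
  def javaUtilEquals(a: Array[Int], b: Array[Int]): Boolean =
    java.util.Arrays.equals(a, b)

  def main(args: Array[String]): Unit = {
    val a = Array(1, 2)
    val b = Array(1, 2)

    println(referenceEquals(a, b))   // false: two distinct array objects
    println(a == b)                  // false: == ends up at the same equals
    println(elementwiseEquals(a, b)) // true
    println(javaUtilEquals(a, b))    // true
  }
}
```

Any comparison that should treat arrays with equal contents as equal therefore needs an explicit element-wise check such as `sameElements` or `java.util.Arrays.equals`.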

Re: Inconsistent joinWith behavior?

2016-06-20 Thread Yin Huai
Hello Richard, Looks like the Dataset is Dataset[(Int, Int)]. I guess that in the case of "ds.joinWith(other, expr, Outer).map({ case (t, u) => (Option(t), Option(u)) })", we are trying to use null to create an "(Int, Int)" and somehow it ends up as a Tuple2 holding default values. Can you create a

Re: Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-20 Thread Marcelo Vanzin
I just tried this locally and can see the wrong behavior you mention. I'm running a somewhat old build of 2.0, but I'll take a look. On Mon, Jun 20, 2016 at 7:04 AM, Jonathan Kelly wrote: > Does anybody have any thoughts on this? > > On Fri, Jun 17, 2016 at 6:36 PM

Re: cutting 1.6.2 rc and 2.0.0 rc this week?

2016-06-20 Thread Sean Owen
This is only my opinion, but, I really do not expect a 1.7.0. I can imagine a 1.6.3 bug fix release in a few months, but kind of doubt it would continue much past that. On Mon, Jun 20, 2016 at 5:36 PM, Stephen Hellberg wrote: > Sean Owen wrote >> Clearly we need to keep the

Re: How to explain SchedulerBackend.reviveOffers()?

2016-06-20 Thread Matei Zaharia
Hi Jacek, This applies to all schedulers actually -- it just tells Spark to re-check the available nodes and possibly launch tasks on them, because a new stage was submitted. Then when any node is available, the scheduler will call the TaskSetManager with an "offer" for the node. Matei > On

Re: cutting 1.6.2 rc and 2.0.0 rc this week?

2016-06-20 Thread Stephen Hellberg
Sean Owen wrote > Clearly we need to keep the 1.x line going for a bit... Is there any perspective on just how long 'a bit' might be? I'm not sure I've found any prior description in our community of a (long-term?) support commitment previously - are we talking months, or years?

Inconsistent joinWith behavior?

2016-06-20 Thread Richard Marscher
I know recently outer join was changed to preserve actual nulls through the join in https://github.com/apache/spark/pull/13425. I am seeing what seems like inconsistent behavior though based on how the join is interacted with. In one case the default datatype values are still used instead of nulls

Re: Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-20 Thread Jonathan Kelly
Does anybody have any thoughts on this? On Fri, Jun 17, 2016 at 6:36 PM Jonathan Kelly wrote: > I'm trying to debug a problem in Spark 2.0.0-SNAPSHOT > (commit bdf5fe4143e5a1a393d97d0030e76d35791ee248) where Spark's > log4j.properties is not getting picked up in the

How to explain SchedulerBackend.reviveOffers()?

2016-06-20 Thread Jacek Laskowski
Hi, Whenever I see `backend.reviveOffers()` I struggle to explain properly what it does. My understanding is that it requests a SchedulerBackend (that's responsible for talking to a cluster manager) to... that's the point where I'm not sure. How would you explain