Re: Dataframe.fillna from 1.3.0

2015-04-20 Thread Reynold Xin
Ah, I see. You can do something like df.select(coalesce(df("a"), lit(0.0))) On Mon, Apr 20, 2015 at 1:44 PM, Olivier Girardot o.girar...@lateral-thoughts.com wrote: From PySpark it seems to me that fillna relies on Java/Scala code, which is why I was wondering. Thank you for answering :)
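The coalesce trick above works because COALESCE returns its first non-null argument, so pairing a column with lit(0.0) substitutes 0.0 wherever the column is null. A minimal pure-Python sketch of that semantics (illustration only; the real Spark coalesce operates column-wise on DataFrame expressions, and this helper name is just for demonstration):

```python
def coalesce(*values):
    """Return the first argument that is not None, mimicking SQL COALESCE."""
    for v in values:
        if v is not None:
            return v
    return None

# Applying the same idea row by row: replace missing values in column "a" with 0.0.
rows = [{"a": 1.5}, {"a": None}, {"a": 3.0}]
filled = [coalesce(r["a"], 0.0) for r in rows]
# filled == [1.5, 0.0, 3.0]
```

In Spark the same substitution happens in one pass over the column, which is why it is an efficient stand-in for fillna on 1.3.0.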

Re: Dataframe.fillna from 1.3.0

2015-04-20 Thread Reynold Xin
You can just create a fillna function based on the 1.3.1 implementation of fillna, no? On Mon, Apr 20, 2015 at 2:48 AM, Olivier Girardot o.girar...@lateral-thoughts.com wrote: a UDF might be a good idea, no? On Mon, Apr 20, 2015 at 11:17, Olivier Girardot o.girar...@lateral-thoughts.com
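The suggestion here is to backport the 1.3.1 fillna yourself. As a sketch of what such a helper does, here is the same semantics over plain Python dict rows (the function name, signature, and row representation are hypothetical; the real 1.3.1 implementation delegates to JVM-side column expressions rather than touching Python rows):

```python
def fillna_rows(rows, value, subset=None):
    """Sketch of DataFrame.fillna semantics over a list of dict rows.

    Replaces None in each row with `value`; `subset`, if given, limits
    which columns are touched.
    """
    out = []
    for row in rows:
        cols = subset if subset is not None else row.keys()
        new_row = dict(row)
        for c in cols:
            if new_row.get(c) is None:
                new_row[c] = value
        out.append(new_row)
    return out

rows = [{"x": None, "y": 2}, {"x": 1, "y": None}]
all_filled = fillna_rows(rows, 0)              # fills every column
x_filled = fillna_rows(rows, 0, subset=["x"])  # fills only column x
```

Mapping over records like this in pure Python is exactly the inefficiency the original question wants to avoid, which is why backporting the column-based implementation (or using coalesce) is preferable.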

Re: [sql] Dataframe how to check null values

2015-04-20 Thread Ted Yu
I found: https://issues.apache.org/jira/browse/SPARK-6573 On Apr 20, 2015, at 4:29 AM, Peter Rudenko petro.rude...@gmail.com wrote: Sounds very good. Is there a jira for this? It would be cool to have in 1.4, because currently you cannot use the dataframe.describe function with NaN values; you need to

Re: [sql] Dataframe how to check null values

2015-04-20 Thread Peter Rudenko
Sounds very good. Is there a jira for this? It would be cool to have in 1.4, because currently you cannot use the dataframe.describe function with NaN values; you need to filter all the columns manually. Thanks, Peter Rudenko On 2015-04-02 21:18, Reynold Xin wrote: Incidentally, we were discussing this
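The manual workaround Peter describes, filtering NaN out of every column before computing summary statistics, can be sketched in pure Python (a hypothetical illustration of the tedium, not the real describe implementation, which computes these aggregates per column on the cluster):

```python
import math

def drop_nan(values):
    """Keep only values that are neither None nor NaN."""
    return [v for v in values if v is not None and not math.isnan(v)]

def describe(values):
    """Sketch of per-column summary stats (count, mean, min, max)
    after manual NaN filtering."""
    clean = drop_nan(values)
    n = len(clean)
    return {
        "count": n,
        "mean": sum(clean) / n if n else float("nan"),
        "min": min(clean) if n else float("nan"),
        "max": max(clean) if n else float("nan"),
    }

col = [1.0, float("nan"), 3.0, None]
stats = describe(col)  # computed over [1.0, 3.0] only
```

Doing this per column by hand is what SPARK-6573 aims to make unnecessary by handling NaN inside describe itself.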

Re: How to use Spark Streaming .jar file that I've built using a different branch than master?

2015-04-20 Thread Emre Sevinc
Apparently, after *only* building Spark Streaming, I also have to: mvn --projects assembly/ -DskipTests clean install so that my test project uses the new version when I pass it to spark-submit. -- Emre Sevinç On Mon, Apr 20, 2015 at 10:58 AM, Emre Sevinc emre.sev...@gmail.com wrote:

Re: Dataframe.fillna from 1.3.0

2015-04-20 Thread Olivier Girardot
a UDF might be a good idea, no? On Mon, Apr 20, 2015 at 11:17, Olivier Girardot o.girar...@lateral-thoughts.com wrote: Hi everyone, let's assume I'm stuck on 1.3.0; how can I benefit from the *fillna* API in PySpark? Is there any efficient alternative to mapping over the records myself?

Re: How to use Spark Streaming .jar file that I've built using a different branch than master?

2015-04-20 Thread Emre Sevinc
I thought it was spark-submit that was configuring and arranging everything related to the classpath (am I wrong?); at least that's how I've used Spark so far. Is there a way to do it using spark-submit? -- Emre On Mon, Apr 20, 2015 at 11:06 AM, Akhil Das ak...@sigmoidanalytics.com wrote: I think you can

Re: Addition of new Metrics for killed executors.

2015-04-20 Thread Archit Thakur
Hi Twinkle, We have a use case where we want to debug how and why an executor got killed. It could be because of a stack overflow, GC, or any other unexpected scenario. The driver UI shows no information about killed executors, so I was just curious how people

How to use Spark Streaming .jar file that I've built using a different branch than master?

2015-04-20 Thread Emre Sevinc
Hello, I'm building a different version of Spark Streaming (based on a branch other than master) in my application for testing purposes, but it seems that spark-submit is ignoring my newly built Spark Streaming .jar and using an older version. Here's some context: I'm on a different

Re: How to use Spark Streaming .jar file that I've built using a different branch than master?

2015-04-20 Thread Akhil Das
I think you can override the SPARK_CLASSPATH with your newly built jar. Thanks Best Regards On Mon, Apr 20, 2015 at 2:28 PM, Emre Sevinc emre.sev...@gmail.com wrote: Hello, I'm building a different version of Spark Streaming (based on a different branch than master) in my application for

Dataframe.fillna from 1.3.0

2015-04-20 Thread Olivier Girardot
Hi everyone, let's assume I'm stuck on 1.3.0; how can I benefit from the *fillna* API in PySpark? Is there any efficient alternative to mapping over the records myself? Regards, Olivier.

Re: Addition of new Metrics for killed executors.

2015-04-20 Thread twinkle sachdeva
Hi Archit, What is your use case and what kind of metrics are you planning to add? Thanks, Twinkle On Fri, Apr 17, 2015 at 4:07 PM, Archit Thakur archit279tha...@gmail.com wrote: Hi, We are planning to add new Metrics in Spark for the executors that got killed during the execution. Was