How do we convert a Dataset that includes timestamp columns to an RDD?

2015-12-16 Thread Yu Ishikawa
Hi all, When I tried to convert a Dataset which includes a TimestampType column to an RDD on the master branch in spark-shell, I got an error: `org.apache.spark.SparkException: Task not serializable`. How do we convert a Dataset that includes a timestamp column to an RDD? Here is the example code and the error:

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Jean-Baptiste Onofré
+1 (non-binding) Tested in standalone and YARN with different samples. Regards JB On 12/16/2015 10:32 PM, Michael Armbrust wrote: Please vote on releasing the following candidate as Apache Spark version 1.6.0! The vote is open until Saturday, December 19, 2015 at 18:00 UTC and passes if a maj

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Yin Huai
+1 On Wed, Dec 16, 2015 at 7:19 PM, Patrick Wendell wrote: > +1 > > On Wed, Dec 16, 2015 at 6:15 PM, Ted Yu wrote: > >> Ran test suite (minus docker-integration-tests) >> All passed >> >> +1 >> >> [INFO] Spark Project External ZeroMQ .. SUCCESS [ >> 13.647 s] >> [INFO] Spark

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Patrick Wendell
+1 On Wed, Dec 16, 2015 at 6:15 PM, Ted Yu wrote: > Ran test suite (minus docker-integration-tests) > All passed > > +1 > > [INFO] Spark Project External ZeroMQ .. SUCCESS [ > 13.647 s] > [INFO] Spark Project External Kafka ... SUCCESS [ > 45.424 s] > [INF

Re: Re: does spark really support label expr like && or || ?

2015-12-16 Thread Marcelo Vanzin
On Wed, Dec 16, 2015 at 6:31 PM, Allen Zhang wrote: > so , my question is does the spark.yarn.executor.nodeLabelExpression and > spark.yarn.am.nodeLabelExpression really support "EXPRESSION" like and &&, > or ||, or even ! and so on. Spark doesn't do anything with those values except pass them to
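As a rough illustration of the pass-through described above, the sketch below only assembles `spark-submit` arguments for the two properties named in the thread and performs no validation of the expression itself. The helper name `label_conf_args` and the sample label `foo` are hypothetical; whether YARN accepts operators such as `&&` or `||` in an expression is not something this sketch (or Spark) decides.

```python
def label_conf_args(executor_expr=None, am_expr=None):
    """Build --conf arguments for the node-label properties (hypothetical helper).

    The expressions are forwarded verbatim; nothing here parses or checks them.
    """
    args = []
    if executor_expr is not None:
        args += ["--conf", f"spark.yarn.executor.nodeLabelExpression={executor_expr}"]
    if am_expr is not None:
        args += ["--conf", f"spark.yarn.am.nodeLabelExpression={am_expr}"]
    return args


# Assemble an argument vector; the application arguments are omitted on purpose.
argv = ["spark-submit"] + label_conf_args(executor_expr="foo", am_expr="foo")
print(argv)
```

Keeping the arguments as a list (for example when handing them to `subprocess.run`) avoids the shell interpreting characters such as `&&` or `||` before the value ever reaches Spark.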

Re: does spark really support label expr like && or || ?

2015-12-16 Thread Allen Zhang
More detailed commands: 2. yarn rmadmin -replaceLabelsOnNode spark-dev:54321,foo; yarn rmadmin -replaceLabelsOnNode sut-1:54321,bar; yarn rmadmin -replaceLabelsOnNode sut-2:54321,bye; yarn rmadmin -replaceLabelsOnNode sut-3:54321,foo; At 2015-12-17 10:31:20, "Allen Zhang" wrote:

Re:Re: does spark really support label expr like && or || ?

2015-12-16 Thread Allen Zhang
Hi Ted, I have 4 vms(spark-dev, sut-1, sut-2, sut-3): With these commands: 1. yarn rmadmin -addToClusterNodeLabels foo,bar,bye 2. yarn rmadmin -replaceLabelsOnNode spark-dev:54321,foo;yarn rmadmin -replaceLabelsOnNode sut-1:54321,foo, same to sut-2 and sut-3 3. yarn rmadmin -refreshQueues I

Re: java.lang.NoSuchMethodError while saving a random forest model Spark version 1.5

2015-12-16 Thread Joseph Bradley
This method is tested in the Spark 1.5 unit tests, so I'd guess it's a problem with the Parquet dependency. What version of Parquet are you building Spark 1.5 off of? (I'm not that familiar with Parquet issues myself, but hopefully a SQL person can chime in.) On Tue, Dec 15, 2015 at 3:23 PM, Rac

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Ted Yu
Ran test suite (minus docker-integration-tests) All passed +1 [INFO] Spark Project External ZeroMQ .. SUCCESS [ 13.647 s] [INFO] Spark Project External Kafka ... SUCCESS [ 45.424 s] [INFO] Spark Project Examples . SUCCESS [02:06

Re: Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Saisai Shao
+1 (non-binding) after SPARK-12345 is merged. On Thu, Dec 17, 2015 at 9:55 AM, Allen Zhang wrote: > plus 1 > > > > > > > At 2015-12-17 09:39:39, "Joseph Bradley" wrote: > > +1 > > On Wed, Dec 16, 2015 at 5:26 PM, Reynold Xin wrote: > >> +1 >> >> >> On Wed, Dec 16, 2015 at 5:24 PM, Mark Hamstra >>

Spark basicOperators

2015-12-16 Thread sara mustafa
Hi, The class org.apache.spark.sql.execution.basicOperators.scala contains the implementation of multiple operators. How could I measure the execution time of any one operator? Thanks,

Re:Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Allen Zhang
plus 1 At 2015-12-17 09:39:39, "Joseph Bradley" wrote: +1 On Wed, Dec 16, 2015 at 5:26 PM, Reynold Xin wrote: +1 On Wed, Dec 16, 2015 at 5:24 PM, Mark Hamstra wrote: +1 On Wed, Dec 16, 2015 at 1:32 PM, Michael Armbrust wrote: Please vote on releasing the following candidate as Apa

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Joseph Bradley
+1 On Wed, Dec 16, 2015 at 5:26 PM, Reynold Xin wrote: > +1 > > > On Wed, Dec 16, 2015 at 5:24 PM, Mark Hamstra > wrote: > >> +1 >> >> On Wed, Dec 16, 2015 at 1:32 PM, Michael Armbrust > > wrote: >> >>> Please vote on releasing the following candidate as Apache Spark version >>> 1.6.0! >>> >>>

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Reynold Xin
+1 On Wed, Dec 16, 2015 at 5:24 PM, Mark Hamstra wrote: > +1 > > On Wed, Dec 16, 2015 at 1:32 PM, Michael Armbrust > wrote: > >> Please vote on releasing the following candidate as Apache Spark version >> 1.6.0! >> >> The vote is open until Saturday, December 19, 2015 at 18:00 UTC and >> passe

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Mark Hamstra
+1 On Wed, Dec 16, 2015 at 1:32 PM, Michael Armbrust wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.6.0! > > The vote is open until Saturday, December 19, 2015 at 18:00 UTC and > passes if a majority of at least 3 +1 PMC votes are cast. > > [ ] +1 Release t

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Michael Armbrust
+1 On Wed, Dec 16, 2015 at 4:37 PM, Andrew Or wrote: > +1 > > Mesos cluster mode regression in RC2 is now fixed (SPARK-12345 > / PR10332 > ). > > Also tested on standalone client and cluster mode. No

Re: JIRA: Wrong dates from imported JIRAs

2015-12-16 Thread Josh Rosen
Personally, I'd rather avoid the risk of breaking things during the reimport. In my experience we've had a lot of unforeseen problems with JIRA import/export and the benefit here doesn't seem huge (this issue only impacts people that are searching for the oldest JIRAs across all projects, which I t

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Andrew Or
+1 Mesos cluster mode regression in RC2 is now fixed (SPARK-12345 / PR10332). Also tested on standalone client and cluster mode. No problems. 2015-12-16 15:16 GMT-08:00 Rad Gruchalski: > I also not

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Rad Gruchalski
I also noticed that spark.replClassServer.host and spark.replClassServer.port aren't used anymore. The transport now happens over the main RpcEnv. Kind regards, Radek Gruchalski ra...@gruchalski.com de.linkedin.com/in/radgru

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Marcelo Vanzin
I was going to say that spark.executor.port is not used anymore in 1.6, but damn, there's still that akka backend hanging around there even when netty is being used... we should fix this, should be a simple one-liner. On Wed, Dec 16, 2015 at 2:35 PM, singinpirate wrote: > -0 (non-binding) > > I h

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread singinpirate
-0 (non-binding) I have observed that when we set spark.executor.port in 1.6, we get thrown a NPE in SparkEnv$.create(SparkEnv.scala:259). It used to work in 1.5.2. Is anyone else seeing this? On Wed, Dec 16, 2015 at 2:26 PM Jiří Syrový wrote: > +1 Tested in standalone mode and so far seems to

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Jiří Syrový
+1 Tested in standalone mode and so far seems to be fairly stable. 2015-12-16 22:32 GMT+01:00 Michael Armbrust : > Please vote on releasing the following candidate as Apache Spark version > 1.6.0! > > The vote is open until Saturday, December 19, 2015 at 18:00 UTC and > passes if a majority of at

[VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-16 Thread Michael Armbrust
Please vote on releasing the following candidate as Apache Spark version 1.6.0! The vote is open until Saturday, December 19, 2015 at 18:00 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.6.0 [ ] -1 Do not release this package becaus

Re: Spark 1.6 - Hive remote metastore not working

2015-12-16 Thread syepes
Thanks for the info.

Re: Spark 1.6 - Hive remote metastore not working

2015-12-16 Thread Yin Huai
Oh, I see. In your log, I guess you can find a line like "Initializing execution hive, version". The line you showed is actually associated with execution hive, which is a fake metastore that is used by Spark SQL internally. Logs related to the real metastore (the metastore storing table metadata and e

Re: Spark 1.6 - Hive remote metastore not working

2015-12-16 Thread syepes
Thanks for the reply. The thing is that with 1.5 it never showed messages like the following: 15/12/16 00:06:11 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY 15/12/16 00:06:11 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException This is a bit misl

Re: Spark 1.6 - Hive remote metastore not working

2015-12-16 Thread Yin Huai
I see 15/12/16 00:06:13 INFO metastore: Trying to connect to metastore with URI thrift://remoteNode:9083 15/12/16 00:06:14 INFO metastore: Connected to metastore. Looks like you were connected to your remote metastore. On Tue, Dec 15, 2015 at 3:31 PM, syepes wrote: > Hello, > > I am testing o

Re: JIRA: Wrong dates from imported JIRAs

2015-12-16 Thread Lars Francke
Any other opinions on this? On Fri, Dec 11, 2015 at 9:54 AM, Lars Francke wrote: > That's a good point. I assume there's always a small risk but it's at > least the documented way from Atlassian to change the creation date so I'd > hope it should be okay. I'd build the minimal CSV file. > > I ag

Re: Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Aaron
Regarding the PR: sure, let me update the documentation; I'll send it out shortly. My fork is on GitHub. Is a PR from there OK? Cheers, Aaron On Wed, Dec 16, 2015 at 11:33 AM, Timothy Chen wrote: > Yes if want to manually override what IP to use to be contacted by the > master you can set LIPROCESS_

Re: Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Timothy Chen
Yes, if you want to manually override which IP the master uses to contact you, you can set LIBPROCESS_IP and LIBPROCESS_PORT. These are Mesos-specific settings. We can definitely update the docs. Note that in the future as we move to use the new Mesos Http API these configurations won't be needed (
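A minimal sketch of the override described above, assuming the shell is launched from a wrapper script: it builds an environment with `LIBPROCESS_IP` (and optionally `LIBPROCESS_PORT`) set before starting the client. The helper name `mesos_client_env` and the address `10.0.0.5` are hypothetical placeholders, not values from this thread.

```python
import os


def mesos_client_env(ip, port=None, base_env=None):
    """Return a copy of the environment with the libprocess overrides applied.

    `ip` should be an address the Mesos master can reach (assumption: the
    caller knows which interface that is).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LIBPROCESS_IP"] = ip
    if port is not None:
        env["LIBPROCESS_PORT"] = str(port)
    return env


# Start from an empty base environment so the result is easy to inspect.
env = mesos_client_env("10.0.0.5", base_env={})
print(env)
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.run` when launching the Spark shell; the exact launch command is left out because it depends on the installation.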

Re: Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Iulian Dragoș
Sure, documenting this would be great, I just wanted to understand the context. There is a related ticket: SPARK-5488. Would you mind opening a PR? On Wed, Dec 16, 2015 at 5:11 PM, Aaron wrote: > Basically, my hostname doesn't resolve to an "ac

Re: Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Iulian Dragoș
LIBPROCESS_IP has zero hits in the Spark code base. This seems to be a Mesos-specific setting. Have you tried setting SPARK_LOCAL_IP? On Wed, Dec 16, 2015 at 5:07 PM, Aaron wrote: > Found this thread that talked about it to help understand it better: > > > https://mail-archives.apache.org/mod_m

Re: Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Aaron
Basically, my hostname doesn't resolve to an "accessible" IP address...which isn't a big deal, I normally set SPARK_LOCAL_IP when I am doing things on a YARN cluster. But, we've moved to a Mesos Cluster recently, and had to track down when it wasn't working...I assumed (badly obviously) that setti

Re: Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Aaron
Found this thread that talked about it to help understand it better: https://mail-archives.apache.org/mod_mbox/mesos-user/201507.mbox/%3ccajq68qf9pejgnwomasm2dqchyaxpcaovnfkfgggxxpzj2jo...@mail.gmail.com%3E > > When you run Spark on Mesos it needs to run > > spark driver > mesos scheduler > > and

Re: Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Aaron
The more I read and look into this, the more it seems the cause is that the Spark Mesos Scheduler resolves to something that can't be reached (e.g. localhost). So, maybe this just needs to be added to the docs, for the case where your host cannot or does not resolve to the IP address you want. Maybe just a footnote or something? On Wed, Dec

Re: Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Iulian Dragoș
Hi Aaron, I never had to use that variable. What is it for? On Wed, Dec 16, 2015 at 2:00 PM, Aaron wrote: > In going through running various Spark jobs, both Spark 1.5.2 and the > new Spark 1.6 SNAPSHOTs, on a Mesos cluster (currently 0.25), we > noticed that is in order to run the Spark shells

Update to Spar Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-16 Thread Aaron
While running various Spark jobs, on both Spark 1.5.2 and the new Spark 1.6 SNAPSHOTs, on a Mesos cluster (currently 0.25), we noticed that in order to run the Spark shells (both Python and Scala), we needed to set the LIBPROCESS_IP environment variable before running. Was curious if th

A bug in Spark ML? NoSuchElementException while using RandomForest for regression.

2015-12-16 Thread Eugene Morozov
Hi! I've looked through the issues and haven't found anything like that, so I've created a new one. Everything needed to reproduce it is attached: https://issues.apache.org/jira/browse/SPARK-12367 Could you please take a look and, if possible, advise any workaround? Thank you in advance. -- Be well! Jean

Re: does spark really support label expr like && or || ?

2015-12-16 Thread Ted Yu
Allen: Since you mentioned scheduling, I assume you were talking about node label support in YARN. If that is the case, can you give us some more information: how node labels are set up in the YARN cluster; how you specified node labels in the application; which Hadoop and Spark releases you are using. Cheers >

Re: BIRCH clustering algorithm

2015-12-16 Thread Dzeno
Hi Joseph, Thank you for your tips. Thanks, Dzeno > On Dec 15, 2015, at 10:58 PM, Joseph Bradley wrote: > > Hi Dzeno, > > I'm not familiar with the algorithm myself, but if you have an important use > case for it, you could open a JIRA to discuss it. However, if it is a less > common al

Re:Re: does spark really support label expr like && or || ?

2015-12-16 Thread Allen Zhang
Yes, I've tried that as well. It does NOT work at all. Any feedback is welcome. Thanks, Allen At 2015-12-16 17:00:42, "Chang Ya-Hsuan" wrote: are you trying to do dataframe boolean expression? please use '&' for 'and', '|' for 'or', '~' for 'not' when building DataFrame boolean expressions.

Re: does spark really support label expr like && or || ?

2015-12-16 Thread Chang Ya-Hsuan
Are you trying to build a DataFrame boolean expression? Please use '&' for 'and', '|' for 'or', '~' for 'not' when building DataFrame boolean expressions. Example: >>> df = sqlContext.range(10) >>> df.where( (df.id==1) | ~(df.id==1)) DataFrame[id: bigint] On Wed, Dec 16, 2015 at 4:32 PM, Allen Zhang
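To show why the advice above uses `&`, `|` and `~`: Python's `and`, `or` and `not` keywords cannot be overloaded, so expression-building objects rely on the bitwise operators instead. The sketch below uses a toy class, `Expr`, which is a hypothetical stand-in and not PySpark's own column type; it only builds a readable string so the structure of the expression is visible.

```python
class Expr:
    """Toy column expression (hypothetical; not the PySpark Column class)."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        # Comparison returns a new expression instead of a bool.
        return Expr(f"({self.text} = {other!r})")

    def __and__(self, other):
        return Expr(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Expr(f"({self.text} OR {other.text})")

    def __invert__(self):
        return Expr(f"(NOT {self.text})")

    def __repr__(self):
        return self.text


col_id = Expr("id")
# Parentheses are needed because | and & bind more tightly than == in Python.
cond = (col_id == 1) | ~(col_id == 1)
print(cond)
```

The parentheses around each comparison matter in real DataFrame code as well, since without them Python would try to evaluate `1 | ~(...)` before applying `==`.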

does spark really support label expr like && or || ?

2015-12-16 Thread Allen Zhang
Hi All, does the Spark label expression really support "&&" or "||" or even "!" for label-based scheduling? I tried that but it does NOT work. Best Regards, Allen