Hi everyone,

I think there's a blocker in PySpark: the "when" function seems to be broken in Python, while the Scala API works fine. Here's a snippet demonstrating the problem with Spark 1.4.0 RC3:

In [1]: df = sqlCtx.createDataFrame([(1, "1"), (2, "2"), (1, "2"), (1, "2")], ["key", "value"])

In [2]: from pyspark.sql import functions as F

In [8]: df.select(df.key, F.when(df.key > 1, 0).when(df.key == 0, 2).otherwise(1)).show()
+---+---------------------------------+
|key|CASE WHEN (key = 0) THEN 2 ELSE 1|
+---+---------------------------------+
|  1|                                1|
|  2|                                1|
|  1|                                1|
|  1|                                1|
+---+---------------------------------+

Note that the first branch, WHEN (key > 1) THEN 0, is silently dropped from the generated expression, so the row with key = 2 wrongly evaluates to 1.

In Scala I get the expected expression and behaviour:

scala> val df = sqlContext.createDataFrame(List((1, "1"), (2, "2"), (1, "2"), (1, "2"))).toDF("key", "value")
scala> import org.apache.spark.sql.functions._
scala> df.select(df("key"), when(df("key") > 1, 0).when(df("key") === 2, 2).otherwise(1)).show()
+---+-------------------------------------------------------+
|key|CASE WHEN (key > 1) THEN 0 WHEN (key = 2) THEN 2 ELSE 1|
+---+-------------------------------------------------------+
|  1|                                                      1|
|  2|                                                      0|
|  1|                                                      1|
|  1|                                                      1|
+---+-------------------------------------------------------+

I've opened a JIRA (https://issues.apache.org/jira/browse/SPARK-8038) and submitted a fix here: https://github.com/apache/spark/pull/6580
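For the curious, the gist of it (my paraphrase; see the PR for the actual patch): Column.when in python/pyspark/sql/column.py was building a brand-new CASE expression via functions.when instead of chaining onto the Column it was called on, so every previously accumulated branch was dropped. A minimal sketch of the corrected method, inside the Column class:

# Sketch only; this paraphrases the fix, the real patch is in PR 6580.
def when(self, condition, value):
    """Add another WHEN branch to this Column's CASE expression."""
    if not isinstance(condition, Column):
        raise TypeError("condition should be a Column")
    v = value._jc if isinstance(value, Column) else value
    # Chain on self._jc: going through functions.when() here instead
    # would start a fresh CASE and discard the earlier branches,
    # which is exactly the bug shown above.
    jc = self._jc.when(condition._jc, v)
    return Column(jc)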
Regards,

Olivier.

On Tue, Jun 2, 2015 at 07:34, Bobby Chowdary <bobby.chowdar...@gmail.com>
wrote:

> Hi Patrick,
> Thanks for clarifying. No issues with functionality.
> +1 (non-binding)
>
> Thanks
> Bobby
>
> On Mon, Jun 1, 2015 at 9:41 PM, Patrick Wendell <pwend...@gmail.com>
> wrote:
>
>> Hey Bobby,
>>
>> Those are generic warnings that the hadoop libraries throw. If you are
>> using MapRFS they shouldn't matter, since you are using the MapR client
>> and not the default hadoop client.
>>
>> Do you have any issues with functionality... or was it just seeing the
>> warnings that was the concern?
>>
>> Thanks for helping test!
>>
>> - Patrick
>>
>> On Mon, Jun 1, 2015 at 5:18 PM, Bobby Chowdary
>> <bobby.chowdar...@gmail.com> wrote:
>> > Hive Context works on RC3 for MapR after adding
>> > spark.sql.hive.metastore.sharedPrefixes as suggested in SPARK-7819.
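(Chiming in on the MapR point, in case it helps anyone else: a rough sketch of what setting that option from PySpark might look like. The prefix list below is just a placeholder; the class prefixes that actually need to be shared on MapR are listed in SPARK-7819.)

# Sketch only, not tested on MapR: the sharedPrefixes value here is a
# placeholder; see SPARK-7819 for the classes that actually need sharing.
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = (SparkConf()
        .set("spark.sql.hive.metastore.sharedPrefixes",
             "com.mysql.jdbc,org.postgresql,com.mapr.fs"))  # placeholder
sc = SparkContext(conf=conf)
sqlContext = HiveContext(sc)  # metastore client is created with this conf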
>> > However, there still seem to be some other issues with native
>> > libraries; I get the warning below:
>> >
>> > WARN NativeCodeLoader: Unable to load native-hadoop library for your
>> > platform... using builtin-java classes where applicable.
>> >
>> > I tried setting SPARK_LIBRARY_PATH and --driver-library-path, with no
>> > luck.
>> >
>> > Built on Mac OS X, running on CentOS 7 with JDK 1.6 and JDK 1.8
>> > (tried both):
>> >
>> > make-distribution.sh --tgz --skip-java-test -Phive -Phive-0.13.1
>> > -Pmapr4 -Pnetlib-lgpl -Phive-thriftserver
>> >
>> > On Mon, Jun 1, 2015 at 3:05 PM, Sean Owen <so...@cloudera.com> wrote:
>> >>
>> >> I get a bunch of failures in VersionsSuite with build/test params
>> >> "-Pyarn -Phive -Phadoop-2.6":
>> >>
>> >> - success sanity check *** FAILED ***
>> >>   java.lang.RuntimeException: [download failed:
>> >>   org.jboss.netty#netty;3.2.2.Final!netty.jar(bundle), download failed:
>> >>   commons-net#commons-net;3.1!commons-net.jar]
>> >>   at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:978)
>> >>
>> >> ... but maybe I missed the memo about how to build for Hive? Do I
>> >> still need another Hive profile?
>> >>
>> >> Other tests, signatures, etc. look good.
>> >>
>> >> On Sat, May 30, 2015 at 12:40 AM, Patrick Wendell <pwend...@gmail.com>
>> >> wrote:
>> >> > Please vote on releasing the following candidate as Apache Spark
>> >> > version 1.4.0!
>> >> >
>> >> > The tag to be voted on is v1.4.0-rc3 (commit dd109a8):
>> >> > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=dd109a8746ec07c7c83995890fc2c0cd7a693730
>> >> >
>> >> > The release files, including signatures, digests, etc. can be found at:
>> >> > http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc3-bin/
>> >> >
>> >> > Release artifacts are signed with the following key:
>> >> > https://people.apache.org/keys/committer/pwendell.asc
>> >> >
>> >> > The staging repository for this release can be found at:
>> >> > [published as version: 1.4.0]
>> >> > https://repository.apache.org/content/repositories/orgapachespark-1109/
>> >> > [published as version: 1.4.0-rc3]
>> >> > https://repository.apache.org/content/repositories/orgapachespark-1110/
>> >> >
>> >> > The documentation corresponding to this release can be found at:
>> >> > http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc3-docs/
>> >> >
>> >> > Please vote on releasing this package as Apache Spark 1.4.0!
>> >> >
>> >> > The vote is open until Tuesday, June 02, at 00:32 UTC and passes
>> >> > if a majority of at least 3 +1 PMC votes are cast.
>> >> >
>> >> > [ ] +1 Release this package as Apache Spark 1.4.0
>> >> > [ ] -1 Do not release this package because ...
>> >> >
>> >> > To learn more about Apache Spark, please see
>> >> > http://spark.apache.org/
>> >> >
>> >> > == What has changed since RC1 ==
>> >> > Below is a list of bug fixes that went into this RC:
>> >> > http://s.apache.org/vN
>> >> >
>> >> > == How can I help test this release? ==
>> >> > If you are a Spark user, you can help us test this release by
>> >> > taking a Spark 1.3 workload, running it on this release candidate,
>> >> > and reporting any regressions.
>> >> >
>> >> > == What justifies a -1 vote for this release? ==
>> >> > This vote is happening towards the end of the 1.4 QA period,
>> >> > so -1 votes should only occur for significant regressions from 1.3.1.
>> >> > Bugs already present in 1.3.X, minor regressions, or bugs related
>> >> > to new features will not block this release.
>> >> >
>> >> > ---------------------------------------------------------------------
>> >> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> >> > For additional commands, e-mail: dev-h...@spark.apache.org
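One last note on "how can I help test this release?", for anyone without a Spark 1.3 workload handy: even a tiny sanity check against the RC binaries is useful. A minimal sketch (the workload here is made up, of course; a real existing job is a much better test):

# Hypothetical smoke test to run against the RC build.
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext("local[2]", "rc3-smoke-test")
sqlCtx = SQLContext(sc)

# Core: a trivial map/sum; sum of 0..99 doubled is 9900.
assert sc.parallelize(range(100)).map(lambda x: x * 2).sum() == 9900

# SQL: a trivial DataFrame filter; keys 50..99 give 50 rows.
df = sqlCtx.createDataFrame([(i, str(i % 3)) for i in range(100)],
                            ["key", "value"])
assert df.filter(df.key > 49).count() == 50

sc.stop()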