Hi everyone,

I think there's a blocker in PySpark: the "when" function seems to be broken in Python, while the Scala API works fine. Here's a snippet demonstrating the problem with Spark 1.4.0 RC3:

In [1]: df = sqlCtx.createDataFrame([(1, "1"), (2, "2"), (1, "2"), (1, "2")], ["key", "value"])

In [2]: from pyspark.sql import functions as F

In [8]: df.select(df.key, F.when(df.key > 1, 0).when(df.key == 0, 2).otherwise(1)).show()
+---+---------------------------------+
|key|CASE WHEN (key = 0) THEN 2 ELSE 1|
+---+---------------------------------+
|  1|                                1|
|  2|                                1|
|  1|                                1|
|  1|                                1|
+---+---------------------------------+

Note that the first branch, WHEN (key > 1) THEN 0, is silently dropped from the generated expression, so the row with key = 2 wrongly evaluates to 1.

In Scala I get the expected expression and behaviour:

scala> val df = sqlContext.createDataFrame(List((1, "1"), (2, "2"), (1, "2"), (1, "2"))).toDF("key", "value")
scala> import org.apache.spark.sql.functions._
scala> df.select(df("key"), when(df("key") > 1, 0).when(df("key") === 2, 2).otherwise(1)).show()
+---+-------------------------------------------------------+
|key|CASE WHEN (key > 1) THEN 0 WHEN (key = 2) THEN 2 ELSE 1|
+---+-------------------------------------------------------+
|  1|                                                      1|
|  2|                                                      0|
|  1|                                                      1|
|  1|                                                      1|
+---+-------------------------------------------------------+

I've opened a JIRA (https://issues.apache.org/jira/browse/SPARK-8038) and submitted a fix here: https://github.com/apache/spark/pull/6580
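For the curious, the gist of it (my paraphrase; see the PR for the actual patch): Column.when in python/pyspark/sql/column.py was building a brand-new CASE expression via functions.when instead of chaining onto the Column it was called on, so every previously accumulated branch was dropped. A minimal sketch of the corrected method, inside the Column class:

# Sketch only; this paraphrases the fix, the real patch is in PR 6580.
def when(self, condition, value):
    """Add another WHEN branch to this Column's CASE expression."""
    if not isinstance(condition, Column):
        raise TypeError("condition should be a Column")
    v = value._jc if isinstance(value, Column) else value
    # Chain on self._jc: going through functions.when() here instead
    # would start a fresh CASE and discard the earlier branches,
    # which is exactly the bug shown above.
    jc = self._jc.when(condition._jc, v)
    return Column(jc)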
Regards,

Olivier.

On Tue, Jun 2, 2015 at 07:34, Bobby Chowdary <bobby.chowdar...@gmail.com>
wrote:

> Hi Patrick,
> Thanks for clarifying. No issues with functionality.
> +1 (non-binding)
>
> Thanks
> Bobby
>
> On Mon, Jun 1, 2015 at 9:41 PM, Patrick Wendell <pwend...@gmail.com>
> wrote:
>
>> Hey Bobby,
>>
>> Those are generic warnings that the hadoop libraries throw. If you are
>> using MapRFS they shouldn't matter, since you are using the MapR client
>> and not the default hadoop client.
>>
>> Do you have any issues with functionality... or was it just seeing the
>> warnings that was the concern?
>>
>> Thanks for helping test!
>>
>> - Patrick
>>
>> On Mon, Jun 1, 2015 at 5:18 PM, Bobby Chowdary
>> <bobby.chowdar...@gmail.com> wrote:
>> > Hive Context works on RC3 for MapR after adding
>> > spark.sql.hive.metastore.sharedPrefixes as suggested in SPARK-7819.
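(Chiming in on the MapR point, in case it helps anyone else: a rough sketch of what setting that option from PySpark might look like. The prefix list below is just a placeholder; the class prefixes that actually need to be shared on MapR are listed in SPARK-7819.)

# Sketch only, not tested on MapR: the sharedPrefixes value here is a
# placeholder; see SPARK-7819 for the classes that actually need sharing.
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = (SparkConf()
        .set("spark.sql.hive.metastore.sharedPrefixes",
             "com.mysql.jdbc,org.postgresql,com.mapr.fs"))  # placeholder
sc = SparkContext(conf=conf)
sqlContext = HiveContext(sc)  # metastore client is created with this conf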
>> > However, there still seem to be some other issues with native
>> > libraries; I get the warning below:
>> >
>> > WARN NativeCodeLoader: Unable to load native-hadoop library for your
>> > platform... using builtin-java classes where applicable.
>> >
>> > I tried setting SPARK_LIBRARY_PATH and --driver-library-path, with no
>> > luck.
>> >
>> > Built on Mac OS X, running on CentOS 7 with JDK 1.6 and JDK 1.8
>> > (tried both):
>> >
>> > make-distribution.sh --tgz --skip-java-test -Phive -Phive-0.13.1
>> > -Pmapr4 -Pnetlib-lgpl -Phive-thriftserver
>> >
>> > On Mon, Jun 1, 2015 at 3:05 PM, Sean Owen <so...@cloudera.com> wrote:
>> >>
>> >> I get a bunch of failures in VersionsSuite with build/test params
>> >> "-Pyarn -Phive -Phadoop-2.6":
>> >>
>> >> - success sanity check *** FAILED ***
>> >>   java.lang.RuntimeException: [download failed:
>> >>   org.jboss.netty#netty;3.2.2.Final!netty.jar(bundle), download failed:
>> >>   commons-net#commons-net;3.1!commons-net.jar]
>> >>   at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:978)
>> >>
>> >> ... but maybe I missed the memo about how to build for Hive? Do I
>> >> still need another Hive profile?
>> >>
>> >> Other tests, signatures, etc. look good.
>> >>
>> >> On Sat, May 30, 2015 at 12:40 AM, Patrick Wendell <pwend...@gmail.com>
>> >> wrote:
>> >> > Please vote on releasing the following candidate as Apache Spark
>> >> > version 1.4.0!
>> >> >
>> >> > The tag to be voted on is v1.4.0-rc3 (commit dd109a8):
>> >> > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=dd109a8746ec07c7c83995890fc2c0cd7a693730
>> >> >
>> >> > The release files, including signatures, digests, etc. can be found at:
>> >> > http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc3-bin/
>> >> >
>> >> > Release artifacts are signed with the following key:
>> >> > https://people.apache.org/keys/committer/pwendell.asc
>> >> >
>> >> > The staging repository for this release can be found at:
>> >> > [published as version: 1.4.0]
>> >> > https://repository.apache.org/content/repositories/orgapachespark-1109/
>> >> > [published as version: 1.4.0-rc3]
>> >> > https://repository.apache.org/content/repositories/orgapachespark-1110/
>> >> >
>> >> > The documentation corresponding to this release can be found at:
>> >> > http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc3-docs/
>> >> >
>> >> > Please vote on releasing this package as Apache Spark 1.4.0!
>> >> >
>> >> > The vote is open until Tuesday, June 02, at 00:32 UTC and passes
>> >> > if a majority of at least 3 +1 PMC votes are cast.
>> >> >
>> >> > [ ] +1 Release this package as Apache Spark 1.4.0
>> >> > [ ] -1 Do not release this package because ...
>> >> >
>> >> > To learn more about Apache Spark, please see
>> >> > http://spark.apache.org/
>> >> >
>> >> > == What has changed since RC1 ==
>> >> > Below is a list of bug fixes that went into this RC:
>> >> > http://s.apache.org/vN
>> >> >
>> >> > == How can I help test this release? ==
>> >> > If you are a Spark user, you can help us test this release by
>> >> > taking a Spark 1.3 workload, running it on this release candidate,
>> >> > and reporting any regressions.
>> >> >
>> >> > == What justifies a -1 vote for this release? ==
>> >> > This vote is happening towards the end of the 1.4 QA period,
>> >> > so -1 votes should only occur for significant regressions from 1.3.1.
>> >> > Bugs already present in 1.3.X, minor regressions, or bugs related
>> >> > to new features will not block this release.
>> >> >
>> >> > ---------------------------------------------------------------------
>> >> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> >> > For additional commands, e-mail: dev-h...@spark.apache.org
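One last note on "how can I help test this release?", for anyone without a Spark 1.3 workload handy: even a tiny sanity check against the RC binaries is useful. A minimal sketch (the workload here is made up, of course; a real existing job is a much better test):

# Hypothetical smoke test to run against the RC build.
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext("local[2]", "rc3-smoke-test")
sqlCtx = SQLContext(sc)

# Core: a trivial map/sum; sum of 0..99 doubled is 9900.
assert sc.parallelize(range(100)).map(lambda x: x * 2).sum() == 9900

# SQL: a trivial DataFrame filter; keys 50..99 give 50 rows.
df = sqlCtx.createDataFrame([(i, str(i % 3)) for i in range(100)],
                            ["key", "value"])
assert df.filter(df.key > 49).count() == 50

sc.stop()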