I'm working on one of the Palantir teams using Spark, and here is our
feedback:

We hit three issues when upgrading to Spark 1.4.0. I'm not sure they
qualify as a -1, since they come from our use of non-public APIs and of
multiple Spark contexts for testing, but I do want to bring them up for
awareness =)

   1. Our UDT was serializing to a StringType, but strings are now
   represented internally as UTF8String, so we had to change our UDT to use
   UTF8String.apply() when serializing and UTF8String.toString() to convert
   back to String.
   2. createDataFrame, when used with UDTs, used to accept values in the
   serialized Catalyst form. Now they are expected to be in the UDT's Java
   class form (I think this change would have affected us in 1.3.1 already,
   since we were on 1.3.0).
   3. Derby database lifecycle management issue with HiveContext. We have
   been using a SparkContextResource JUnit Rule that we wrote, which sets up
   and then tears down a SparkContext and HiveContext between unit test runs
   within the same process (possibly the same thread as well). Multiple
   contexts are never in use at once. This worked in 1.3.0, but now when we
   try to create the HiveContext for the second unit test, it fails with the
   exception below. I have a feeling it might have something to do with the
   Hive object being thread local, and us not explicitly closing the
   HiveContext and everything it holds. The full stack trace is here:
   https://gist.github.com/justinuang/0403d49cdeedf91727cd

Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@5dea2446, see the next exception for details.
        at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
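
To make the suspected failure mode in item 3 concrete, here is a minimal
sketch, assuming the problem is that an embedded database can only be booted
once per JVM until it is shut down. It uses no Spark, Hive, Derby, or JUnit
classes: FakeJvm, FakeHiveContext, and ContextLifecycleSketch are
hypothetical stand-ins invented for illustration, and the real cause in 1.4.0
may well be different.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for per-JVM embedded database state.
// Assumption: a database name can be booted only once until it is shut down.
final class FakeJvm {
    private final Set<String> booted = new HashSet<>();

    void boot(String db) {
        if (!booted.add(db)) {
            throw new IllegalStateException("Failed to start database '" + db + "'");
        }
    }

    void shutdown(String db) {
        booted.remove(db);
    }
}

// Hypothetical stand-in for a context that boots the metastore on creation.
final class FakeHiveContext implements AutoCloseable {
    private final FakeJvm jvm;
    private final String db;

    FakeHiveContext(FakeJvm jvm, String db) {
        jvm.boot(db); // throws if a previous context never released the database
        this.jvm = jvm;
        this.db = db;
    }

    @Override
    public void close() {
        jvm.shutdown(db);
    }
}

final class ContextLifecycleSketch {
    // Runs two "unit tests" back to back in one simulated JVM and records
    // whether each one managed to create its context.
    static List<String> runTwoTests(boolean closeOnTeardown) {
        FakeJvm jvm = new FakeJvm();
        List<String> outcomes = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            FakeHiveContext ctx = null;
            try {
                ctx = new FakeHiveContext(jvm, "metastore_db"); // setup step
                outcomes.add("ok");
            } catch (IllegalStateException e) {
                outcomes.add("failed");
            } finally {
                if (ctx != null && closeOnTeardown) {
                    ctx.close(); // teardown step releases the database
                }
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println(runTwoTests(false)); // teardown does not release
        System.out.println(runTwoTests(true));  // teardown releases
    }
}
```

In this model the second test fails only when teardown leaves the database
booted, which is the shape of the error above. Whether explicitly closing
whatever HiveContext holds would have the same effect in the real setup is
untested here.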


On Wed, May 20, 2015 at 10:35 AM Imran Rashid <iras...@cloudera.com> wrote:

> -1
>
> discovered I accidentally removed master & worker json endpoints, will
> restore
> https://issues.apache.org/jira/browse/SPARK-7760
>
> On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell <pwend...@gmail.com>
> wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 1.4.0!
>>
>> The tag to be voted on is v1.4.0-rc1 (commit 777a081):
>>
>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e
>>
>> The release files, including signatures, digests, etc. can be found at:
>> http://people.apache.org/~pwendell/spark-1.4.0-rc1/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1092/
>>
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/
>>
>> Please vote on releasing this package as Apache Spark 1.4.0!
>>
>> The vote is open until Friday, May 22, at 17:03 UTC and passes
>> if a majority of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Spark 1.4.0
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see
>> http://spark.apache.org/
>>
>> == How can I help test this release? ==
>> If you are a Spark user, you can help us test this release by
>> taking a Spark 1.3 workload and running on this release candidate,
>> then reporting any regressions.
>>
>> == What justifies a -1 vote for this release? ==
>> This vote is happening towards the end of the 1.4 QA period,
>> so -1 votes should only occur for significant regressions from 1.3.1.
>> Bugs already present in 1.3.X, minor regressions, or bugs related
>> to new features will not block this release.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>>
>>
>
