Re: [VOTE] Release Apache Spark 1.0.0 (RC10)

Kevin Markey Thu, 22 May 2014 00:33:55 -0700

I've discovered that one of the anomalies I encountered was due to an (embarrassing? humorous?) user error. See the user list thread "Failed RC-10 yarn-cluster job for FS closed error when cleaning up staging directory" for my discussion. With the user error corrected, the FS closed exception only prevents deletion of the staging directory, but does not affect completion with "SUCCESS." The FS closed exception still needs some investigation, at least by me.

I tried the patch reported by SPARK-1898, but it didn't fix the problem without fixing the user error. I did not attempt to test my fix without the patch, so I can't pass judgment on the patch.

Although this is merely a pseudocluster-based test -- I can't reconfigure our cluster with RC-10 -- I'll now change my vote to...


+1.

Thanks all who helped.
Kevin



On 05/21/2014 09:18 PM, Tom Graves wrote:

I don't think Kevin's issue would be with an api change in YarnClientImpl since 
in both cases he says he is using hadoop 2.3.0.  I'll take a look at his post 
in the user list.

Tom




On Wednesday, May 21, 2014 7:01 PM, Colin McCabe <cmcc...@alumni.cmu.edu> wrote:


Hi Kevin,

Can you try https://issues.apache.org/jira/browse/SPARK-1898 to see if it
fixes your issue?

Running in YARN cluster mode, I had a similar issue where Spark was able to
create a Driver and an Executor via YARN, but then it stopped making any
progress.

Note: I was using a pre-release version of CDH5.1.0, not 2.3 like you were
using.

best,
Colin



On Wed, May 21, 2014 at 3:34 PM, Kevin Markey <kevin.mar...@oracle.com> wrote:

0

Abstaining because I'm not sure if my failures are due to Spark,
configuration, or other factors...

Compiled and deployed RC10 for YARN, Hadoop 2.3 per Spark 1.0.0 Yarn
documentation.  No problems.
Rebuilt applications against RC10 and Hadoop 2.3.0 (plain vanilla Apache
release).
Updated scripts for various applications.
Application had successfully compiled and run against Spark 0.9.1 and
Hadoop 2.3.0.
Ran in "yarn-cluster" mode.
Application ran to conclusion except that it ultimately failed because of
an exception when Spark tried to clean up the staging directory.  Also,
where before Yarn would report the running program as "RUNNING", it only
reported this application as "ACCEPTED".  It appeared to run two containers
when the first instance never reported that it was RUNNING.

I will post a separate note to the USER list about the specifics.

Thanks
Kevin Markey



On 05/21/2014 10:58 AM, Mark Hamstra wrote:

+1


On Tue, May 20, 2014 at 11:09 PM, Henry Saputra <henry.sapu...@gmail.com>
wrote:

Signature and hash for source look good
No external executable package with source - good
Compiled with git and maven - good
Ran examples and sample programs locally and standalone - good

+1

- Henry



On Tue, May 20, 2014 at 1:13 PM, Tathagata Das
<tathagata.das1...@gmail.com> wrote:

Please vote on releasing the following candidate as Apache Spark version
1.0.0!

This has a few bug fixes on top of rc9:
SPARK-1875: https://github.com/apache/spark/pull/824
SPARK-1876: https://github.com/apache/spark/pull/819
SPARK-1878: https://github.com/apache/spark/pull/822
SPARK-1879: https://github.com/apache/spark/pull/823

The tag to be voted on is v1.0.0-rc10 (commit d8070234):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=d807023479ce10aec28ef3c1ab646ddefc2e663c

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~tdas/spark-1.0.0-rc10/

The release artifacts are signed with the following key:
https://people.apache.org/keys/committer/tdas.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1018/

The documentation corresponding to this release can be found at:
http://people.apache.org/~tdas/spark-1.0.0-rc10-docs/

The full list of changes in this release can be found at:
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=blob;f=CHANGES.txt;h=d21f0ace6326e099360975002797eb7cba9d5273;hb=d807023479ce10aec28ef3c1ab646ddefc2e663c

Please vote on releasing this package as Apache Spark 1.0.0!

The vote is open until Friday, May 23, at 20:00 UTC and passes if
a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.0.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

====== API Changes ======
We welcome users to compile Spark applications against 1.0. There are
a few API changes in this release. Here are links to the associated
upgrade guides - user facing changes have been kept as small as
possible.

Changes to ML vector specification:
http://people.apache.org/~tdas/spark-1.0.0-rc10-docs/mllib-guide.html#from-09-to-10

Changes to the Java API:
http://people.apache.org/~tdas/spark-1.0.0-rc10-docs/java-programming-guide.html#upgrading-from-pre-10-versions-of-spark

Changes to the streaming API:
http://people.apache.org/~tdas/spark-1.0.0-rc10-docs/streaming-programming-guide.html#migration-guide-from-091-or-below-to-1x

Changes to the GraphX API:
http://people.apache.org/~tdas/spark-1.0.0-rc10-docs/graphx-programming-guide.html#upgrade-guide-from-spark-091

Other changes:
coGroup and related functions now return Iterable[T] instead of Seq[T]
==> Call toSeq on the result to restore the old behavior

SparkContext.jarOfClass returns Option[String] instead of Seq[String]
==> Call toSeq on the result to restore old behavior
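
As a rough illustration of the two "Call toSeq" notes above, here is a minimal sketch in plain Scala with no Spark dependency. The object name UpgradeSketch, the sample values, and the jar path are all hypothetical; they only stand in for values of the types the notes describe (Iterable[T] and Option[String]) and are not taken from the Spark API itself.

```scala
// Plain Scala only: these values mimic the result types described in the
// "Other changes" notes; nothing here calls Spark.
object UpgradeSketch {
  // Hypothetical stand-in for one grouped value, typed as Iterable[T]
  // the way the 1.0 coGroup results are described.
  val grouped: Iterable[Int] = List(1, 2, 3)

  // toSeq gives back the Seq[T] that code written against 0.9.x expects.
  val groupedSeq: Seq[Int] = grouped.toSeq

  // Hypothetical stand-ins for a jarOfClass-style result: the jar is
  // either found (Some) or not (None).
  val foundJar: Option[String] = Some("/path/to/app.jar")
  val missingJar: Option[String] = None

  // toSeq on an Option yields a one-element or empty Seq, which matches
  // the Seq[String] shape of the old return type.
  val foundJarSeq: Seq[String] = foundJar.toSeq
  val missingJarSeq: Seq[String] = missingJar.toSeq
}
```

In application code the same toSeq call would presumably be applied directly to the value returned by the Spark method, so call sites that pattern-match or index into a Seq keep compiling.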
