I've discovered that one of the anomalies I encountered was due to an
(embarrassing? humorous?) user error. See the user list thread "Failed
RC-10 yarn-cluster job for FS closed error when cleaning up staging
directory" for my discussion. With the user error corrected, the FS
closed exception
I retested several different cases...
1. FS closed exception shows up ONLY in RC-10, not in Spark 0.9.1, with
both Hadoop 2.2 and 2.3.
2. SPARK-1898 has no effect on my use cases.
3. The failure to report that the underlying application is RUNNING
and that it has succeeded is due ONLY to my
Hi Kevin,
On Thu, May 22, 2014 at 9:49 AM, Kevin Markey kevin.mar...@oracle.com wrote:
The FS closed exception only affects the cleanup of the staging directory,
not the final success or failure. I've not yet tested the effect of
changing my application's initialization, use, or closing of
The FileSystem cache is something that has caused a lot of pain over the
years. Unfortunately we (in Hadoop core) can't change the way it works now
because there are too many users depending on the current behavior.
Basically, the idea is that when you request a FileSystem with certain
options
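For anyone following along, here is a minimal sketch of the cache behavior
being described, against the stock Hadoop FileSystem API (my own
illustration, not code from this thread):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.FileSystem

    object FsCacheSketch {
      def main(args: Array[String]): Unit = {
        val conf = new Configuration()

        // FileSystem.get() consults a JVM-wide cache keyed by
        // (scheme, authority, current user); both calls below
        // return the SAME instance.
        val fs1 = FileSystem.get(conf)
        val fs2 = FileSystem.get(conf)
        assert(fs1 eq fs2)

        // Closing the shared instance breaks every other holder:
        // on HDFS, subsequent calls through fs2 fail with
        // java.io.IOException: Filesystem closed.
        fs1.close()

        // A component that needs its own lifecycle can bypass the cache:
        val privateFs = FileSystem.newInstance(conf)
        privateFs.close() // affects only this instance
      }
    }

There is also a per-scheme escape hatch, e.g. setting
fs.hdfs.impl.disable.cache=true, for applications that cannot avoid
closing the instances they get.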
In Spark 0.9.0 and 0.9.1, we stopped using the FileSystem cache correctly,
and we just recently resumed using it in 1.0 (and in 0.9.2) when this issue
was fixed: https://issues.apache.org/jira/browse/SPARK-1676
Prior to this fix, each Spark task created and cached its own FileSystems
due to a bug
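As a rough illustration of that leak pattern (my own sketch, not Spark's
actual code path; "task-1" and "task-2" are made-up user names): the cache
key includes the UserGroupInformation, so work executed under a fresh UGI
each time gets, and caches, a fresh FileSystem each time:

    import java.security.PrivilegedExceptionAction
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.FileSystem
    import org.apache.hadoop.security.UserGroupInformation

    object PerUgiFsSketch {
      // Fetch a FileSystem while running as the given user.
      def getFsAs(user: String, conf: Configuration): FileSystem =
        UserGroupInformation.createRemoteUser(user).doAs(
          new PrivilegedExceptionAction[FileSystem] {
            override def run(): FileSystem = FileSystem.get(conf)
          })

      def main(args: Array[String]): Unit = {
        val conf = new Configuration()
        // Distinct UGIs mean distinct cache keys, hence distinct
        // instances: one cache entry per "task", never reclaimed
        // unless someone closes it.
        val fs1 = getFsAs("task-1", conf)
        val fs2 = getFsAs("task-2", conf)
        assert(!(fs1 eq fs2))
      }
    }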
Thank you, all! This is quite helpful.
We have been arguing about how to handle this issue across a growing
application. Unfortunately, the Hadoop FileSystem Javadoc should
document all of this, but doesn't!
Kevin
On 05/22/2014 01:48 PM, Aaron Davidson wrote:
In Spark 0.9.0 and 0.9.1, we stopped using
Hey all,
On further testing, I came across a bug that breaks execution of
pyspark scripts on YARN.
https://issues.apache.org/jira/browse/SPARK-1900
This is a blocker and worth cutting a new RC.
We also found a fix for a known issue that prevents additional jar
files from being specified through
On Thu, May 22, 2014 at 12:48 PM, Aaron Davidson ilike...@gmail.com wrote:
In Spark 0.9.0 and 0.9.1, we stopped using the FileSystem cache correctly,
and we just recently resumed using it in 1.0 (and in 0.9.2) when this issue
was fixed: https://issues.apache.org/jira/browse/SPARK-1676
Looks like SPARK-1900 is a blocker for YARN, and we might as well add
SPARK-1870 while we're at it.
TD or Patrick, could you kindly send out an email with [CANCEL] prefixed
in the subject for the RC10 vote, to help people follow the active VOTE
threads? The VOTE emails are getting a bit hard to follow.
- Henry
Right! Doing that.
TD
On Thu, May 22, 2014 at 3:07 PM, Henry Saputra henry.sapu...@gmail.com wrote:
Looks like SPARK-1900 is a blocker for YARN, and we might as well add
SPARK-1870 while we're at it.
TD or Patrick, could you kindly send out an email with [CANCEL] prefixed
in the subject for the RC10 vote, to
+1
On Tue, May 20, 2014 at 11:09 PM, Henry Saputra henry.sapu...@gmail.com wrote:
Signature and hash for source looks good
No external executable package with source - good
Compiled with git and maven - good
Ran examples and sample programs locally and standalone - good
+1
- Henry
On
0
Abstaining because I'm not sure whether my failures are due to Spark,
configuration, or other factors...
Compiled and deployed RC10 for YARN, Hadoop 2.3, per the Spark 1.0.0 YARN
documentation. No problems.
Rebuilt applications against RC10 and Hadoop 2.3.0 (plain vanilla Apache
release).
Updated
Hi Kevin,
Can you try https://issues.apache.org/jira/browse/SPARK-1898 to see if it
fixes your issue?
Running in YARN cluster mode, I had a similar issue where Spark was able to
create a Driver and an Executor via YARN, but then it stopped making any
progress.
Note: I was using a pre-release
Has anyone tried PySpark on YARN and gotten it to work? I was having issues
when I built Spark on Red Hat; a build on my Mac had worked before, but now
a build on my Mac doesn't work either.
Tom
On Tuesday, May 20, 2014 3:14 PM, Tathagata Das tathagata.das1...@gmail.com
wrote:
I don't think Kevin's issue would be with an API change in YarnClientImpl,
since in both cases he says he is using Hadoop 2.3.0. I'll take a look at
his post in the user list.
Tom
On Wednesday, May 21, 2014 7:01 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:
Hi Kevin,
Can you try
Please vote on releasing the following candidate as Apache Spark version 1.0.0!
This has a few bug fixes on top of rc9:
SPARK-1875: https://github.com/apache/spark/pull/824
SPARK-1876: https://github.com/apache/spark/pull/819
SPARK-1878: https://github.com/apache/spark/pull/822
SPARK-1879:
+1
On Tue, May 20, 2014 at 5:26 PM, Andrew Or and...@databricks.com wrote:
+1
2014-05-20 13:13 GMT-07:00 Tathagata Das tathagata.das1...@gmail.com:
Please vote on releasing the following candidate as Apache Spark version
1.0.0!
This has a few bug fixes on top of rc9:
SPARK-1875:
+1 (non-binding)
I have:
- checked signatures and checksums of the files
- built the code from the git repo using both sbt and mvn (against hadoop 2.3.0)
- ran a few simple jobs in local, yarn-client and yarn-cluster mode
Haven't explicitly tested any of the recent fixes, streaming, or SQL.
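(For reference, a trivial job of the kind that works for this sort of smoke
test; a hypothetical example, not the exact jobs run here. Submit the same
jar once per mode with spark-submit --master local / yarn-client /
yarn-cluster.)

    import org.apache.spark.{SparkConf, SparkContext}

    object SmokeTest {
      def main(args: Array[String]): Unit = {
        // Master is left to spark-submit so one jar covers all modes.
        val sc = new SparkContext(new SparkConf().setAppName("rc10-smoke-test"))
        val sum = sc.parallelize(1 to 1000, 8).map(_ * 2).reduce(_ + _)
        println(s"sum = $sum") // expect 1001000
        sc.stop()
      }
    }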
On
+1
Tested it on both Windows and Mac OS X, with both Scala and Python. Confirmed
that the issues in the previous RC were fixed.
Matei
On May 20, 2014, at 5:28 PM, Marcelo Vanzin van...@cloudera.com wrote:
+1 (non-binding)
I have:
- checked signatures and checksums of the files
- built