Re: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread witgo
-1
The following bug should be fixed:
https://issues.apache.org/jira/browse/SPARK-2677





-- Original --
From: Tathagata Das tathagata.das1...@gmail.com
Date: Sat, Jul 26, 2014 07:08 AM
To: dev@spark.apache.org

Subject:  [VOTE] Release Apache Spark 1.0.2 (RC1)



Please vote on releasing the following candidate as Apache Spark version 1.0.2.

This release fixes a number of bugs in Spark 1.0.1.
Some of the notable ones are
- SPARK-2452: Known issue in Spark 1.0.1 caused by the attempted fix for
SPARK-1199. The fix was reverted for 1.0.2.
- SPARK-2576: NoClassDefFoundError when executing a Spark SQL query on an
HDFS CSV file.
The full list is at http://s.apache.org/9NJ

The tag to be voted on is v1.0.2-rc1 (commit 8fb6f00e):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=8fb6f00e195fb258f3f70f04756e07c259a2351f

The release files, including signatures, digests, etc can be found at:
http://people.apache.org/~tdas/spark-1.0.2-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/tdas.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1024/

The documentation corresponding to this release can be found at:
http://people.apache.org/~tdas/spark-1.0.2-rc1-docs/

Please vote on releasing this package as Apache Spark 1.0.2!

The vote is open until Tuesday, July 29, at 23:00 UTC and passes if
a majority of at least 3 +1 PMC votes are cast.
[ ] +1 Release this package as Apache Spark 1.0.2
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

Re: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Andrew Ash
Is that a regression since 1.0.0?
On Jul 27, 2014 10:43 AM, witgo wi...@qq.com wrote:

 -1
 The following bug should be fixed:
 https://issues.apache.org/jira/browse/SPARK-2677





 -- Original --
 From: Tathagata Das tathagata.das1...@gmail.com
 Date: Sat, Jul 26, 2014 07:08 AM
 To: dev@spark.apache.org

 Subject:  [VOTE] Release Apache Spark 1.0.2 (RC1)



 Please vote on releasing the following candidate as Apache Spark version
 1.0.2.

 This release fixes a number of bugs in Spark 1.0.1.
 Some of the notable ones are
 - SPARK-2452: Known issue in Spark 1.0.1 caused by the attempted fix for
 SPARK-1199. The fix was reverted for 1.0.2.
 - SPARK-2576: NoClassDefFoundError when executing a Spark SQL query on an
 HDFS CSV file.
 The full list is at http://s.apache.org/9NJ

 The tag to be voted on is v1.0.2-rc1 (commit 8fb6f00e):

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=8fb6f00e195fb258f3f70f04756e07c259a2351f

 The release files, including signatures, digests, etc can be found at:
 http://people.apache.org/~tdas/spark-1.0.2-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/tdas.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1024/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~tdas/spark-1.0.2-rc1-docs/

 Please vote on releasing this package as Apache Spark 1.0.2!

 The vote is open until Tuesday, July 29, at 23:00 UTC and passes if
 a majority of at least 3 +1 PMC votes are cast.
 [ ] +1 Release this package as Apache Spark 1.0.2
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/


Re: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Patrick Wendell
SPARK-2677 is a long-standing issue and not a regression. Also, as far
as I can see there is no patch for it or clear understanding of the
cause. This type of bug does not warrant holding a release. If we fix
SPARK-2677 we can just make another release with the fix.

On Sun, Jul 27, 2014 at 10:47 AM, Andrew Ash and...@andrewash.com wrote:
 Is that a regression since 1.0.0?
 On Jul 27, 2014 10:43 AM, witgo wi...@qq.com wrote:

 -1
 The following bug should be fixed:
 https://issues.apache.org/jira/browse/SPARK-2677





 -- Original --
 From: Tathagata Das tathagata.das1...@gmail.com
 Date: Sat, Jul 26, 2014 07:08 AM
 To: dev@spark.apache.org

 Subject:  [VOTE] Release Apache Spark 1.0.2 (RC1)



 Please vote on releasing the following candidate as Apache Spark version
 1.0.2.

 This release fixes a number of bugs in Spark 1.0.1.
 Some of the notable ones are
 - SPARK-2452: Known issue in Spark 1.0.1 caused by the attempted fix for
 SPARK-1199. The fix was reverted for 1.0.2.
 - SPARK-2576: NoClassDefFoundError when executing a Spark SQL query on an
 HDFS CSV file.
 The full list is at http://s.apache.org/9NJ

 The tag to be voted on is v1.0.2-rc1 (commit 8fb6f00e):

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=8fb6f00e195fb258f3f70f04756e07c259a2351f

 The release files, including signatures, digests, etc can be found at:
 http://people.apache.org/~tdas/spark-1.0.2-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/tdas.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1024/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~tdas/spark-1.0.2-rc1-docs/

 Please vote on releasing this package as Apache Spark 1.0.2!

 The vote is open until Tuesday, July 29, at 23:00 UTC and passes if
 a majority of at least 3 +1 PMC votes are cast.
 [ ] +1 Release this package as Apache Spark 1.0.2
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/


branch-1.1 will be cut on Friday

2014-07-27 Thread Patrick Wendell
Hey All,

Just a heads up, we'll cut branch-1.1 this Friday, August 1st. Once
the release branch is cut we'll start community QA and go into the
normal triage process for merging patches into that branch.

For Spark core, we'll be conservative in merging things past the
freeze date (e.g. high priority fixes) to ensure a healthy amount of
time for testing. A key focus of this release in core is improving
overall stability and resilience of Spark core.

As always, I encourage committers and contributors to help review
patches this week so we can get as many things in as possible.
People have been quite active recently, which is great!

Good luck!
- Patrick


Re: branch-1.1 will be cut on Friday

2014-07-27 Thread Nan Zhu
Good news, we will see an official version containing the JDBC server very soon!

Also, I have several pending PRs; can anyone continue the review process
this week?

Avoid overwriting already-set SPARK_HOME in spark-submit: 
https://github.com/apache/spark/pull/1331

fix locality inversion bug in TaskSetManager: 
https://github.com/apache/spark/pull/1313 (Matei and Mridulm are working on it)

Allow multiple executor per worker in Standalone mode: 
https://github.com/apache/spark/pull/731 

Ensure actor is self-contained  in DAGScheduler: 
https://github.com/apache/spark/pull/637

Best, 

-- 
Nan Zhu


On Sunday, July 27, 2014 at 2:31 PM, Patrick Wendell wrote:

 Hey All,
 
 Just a heads up, we'll cut branch-1.1 on this Friday, August 1st. Once
 the release branch is cut we'll start community QA and go into the
 normal triage process for merging patches into that branch.
 
 For Spark core, we'll be conservative in merging things past the
 freeze date (e.g. high priority fixes) to ensure a healthy amount of
 time for testing. A key focus of this release in core is improving
 overall stability and resilience of Spark core.
 
 As always, I encourage committers and contributors to help review
 patches this week so we can get as many things in as possible.
 People have been quite active recently, which is great!
 
 Good luck!
 - Patrick
 
 




Re: Utilize newer hadoop releases WAS: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Patrick Wendell
Hey Ted,

We always intend Spark to work with the newer Hadoop versions and
encourage Spark users to use the newest Hadoop versions for best
performance.

We do try to be liberal in terms of supporting older versions as well.
This is because many people run older HDFS versions and we want Spark
to read and write data from them. So far we've been willing to do this
despite some maintenance cost.

The reason is that for many users it's very expensive to do a
wholesale upgrade of HDFS, but trying out new versions of Spark is
much easier. For instance, some of the largest scale Spark users run
fairly old or forked HDFS versions.

- Patrick

On Sun, Jul 27, 2014 at 12:01 PM, Ted Yu yuzhih...@gmail.com wrote:
 Thanks for replying, Patrick.

 The intention of my first email was for utilizing newer hadoop releases for
 their bug fixes. I am still looking for clean way of passing hadoop release
 version number to individual classes.
 Using newer hadoop releases would encourage pushing bug fixes / new
 features upstream. Ultimately Spark code would become cleaner.

 Cheers

 On Sun, Jul 27, 2014 at 8:52 AM, Patrick Wendell pwend...@gmail.com wrote:

 Ted - technically I think you are correct, although I wouldn't
 recommend disabling this lock. This lock is not expensive (acquired
 once per task, as are many other locks already). Also, we've seen some
 cases where Hadoop concurrency bugs ended up requiring multiple fixes
 - concurrency of client access is not well tested in the Hadoop
 codebase since most of the Hadoop tools do not use concurrent access.
 So in general it's good to be conservative in what we expect of the
 Hadoop client libraries.

 If you'd like to discuss this further, please fork a new thread, since
 this is a vote thread. Thanks!
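
A minimal sketch of the guarded-instantiation pattern under discussion, assuming only hadoop-common on the classpath. The names below are illustrative; the real lock is HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK in org.apache.spark.rdd.HadoopRDD.

    import org.apache.hadoop.conf.Configuration

    object ConfigurationGuard {
      // Hadoop's Configuration constructor has had thread-safety problems
      // (see HADOOP-10456), so instantiation is serialized behind one lock.
      private val CONFIGURATION_INSTANTIATION_LOCK = new Object()

      // Acquired once per task, which is cheap relative to task overhead.
      def newConfiguration(): Configuration =
        CONFIGURATION_INSTANTIATION_LOCK.synchronized {
          new Configuration()
        }
    }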

 On Fri, Jul 25, 2014 at 10:14 PM, Ted Yu yuzhih...@gmail.com wrote:
  HADOOP-10456 is fixed in hadoop 2.4.1
 
  Does this mean that synchronization
  on HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK can be bypassed for hadoop
  2.4.1 ?
 
  Cheers
 
 
  On Fri, Jul 25, 2014 at 6:00 PM, Patrick Wendell pwend...@gmail.com
 wrote:
 
   The most important issue in this release is actually an amendment to
   an earlier fix. The original fix caused a deadlock which was a
   regression from 1.0.0 to 1.0.1:
 
  Issue:
  https://issues.apache.org/jira/browse/SPARK-1097
 
  1.0.1 Fix:
  https://github.com/apache/spark/pull/1273/files (had a deadlock)
 
  1.0.2 Fix:
  https://github.com/apache/spark/pull/1409/files
 
  I failed to correctly label this on JIRA, but I've updated it!
 
  On Fri, Jul 25, 2014 at 5:35 PM, Michael Armbrust
  mich...@databricks.com wrote:
   That query is looking at Fix Version not Target Version.  The fact
  that
   the first one is still open is only because the bug is not resolved in
   master.  It is fixed in 1.0.2.  The second one is partially fixed in
  1.0.2,
   but is not worth blocking the release for.
  
  
   On Fri, Jul 25, 2014 at 4:23 PM, Nicholas Chammas 
   nicholas.cham...@gmail.com wrote:
  
   TD, there are a couple of unresolved issues slated for 1.0.2
   
  
 
 https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.0.2%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC
   .
   Should they be edited somehow?
  
  
   On Fri, Jul 25, 2014 at 7:08 PM, Tathagata Das 
   tathagata.das1...@gmail.com
   wrote:
  
Please vote on releasing the following candidate as Apache Spark
  version
1.0.2.
   
This release fixes a number of bugs in Spark 1.0.1.
Some of the notable ones are
- SPARK-2452: Known issue is Spark 1.0.1 caused by attempted fix
 for
SPARK-1199. The fix was reverted for 1.0.2.
- SPARK-2576: NoClassDefFoundError when executing Spark QL query on
HDFS CSV file.
The full list is at http://s.apache.org/9NJ
   
The tag to be voted on is v1.0.2-rc1 (commit 8fb6f00e):
   
   
  
 
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=8fb6f00e195fb258f3f70f04756e07c259a2351f
   
The release files, including signatures, digests, etc can be found
 at:
http://people.apache.org/~tdas/spark-1.0.2-rc1/
   
Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/tdas.asc
   
The staging repository for this release can be found at:
   
  https://repository.apache.org/content/repositories/orgapachespark-1024/
   
The documentation corresponding to this release can be found at:
http://people.apache.org/~tdas/spark-1.0.2-rc1-docs/
   
Please vote on releasing this package as Apache Spark 1.0.2!
   
The vote is open until Tuesday, July 29, at 23:00 UTC and passes if
a majority of at least 3 +1 PMC votes are cast.
[ ] +1 Release this package as Apache Spark 1.0.2
[ ] -1 Do not release this package because ...
   
To learn more about Apache Spark, please see
http://spark.apache.org/
   
  
 



Re: Utilize newer hadoop releases WAS: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Sean Owen
Good idea, although it gets difficult in the context of multiple
distributions. Say change X is not present in version A, but present
in version B. If you depend on X, what version can you look for to
detect it? The distribution will return A or A+X or somesuch, but
testing for A will give an incorrect answer, and the code can't be
expected to look for everyone's A+X versions. Actually inspecting
the code is more robust if a bit messier.

On Sun, Jul 27, 2014 at 9:50 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
 For this particular issue, it would be good to know if Hadoop provides an API 
 to determine the Hadoop version. If not, maybe that can be added to Hadoop in 
 its next release, and we can check for it with reflection. We recently added 
 a SparkContext.version() method in Spark to let you tell the version.

 Matei

 On Jul 27, 2014, at 12:19 PM, Patrick Wendell pwend...@gmail.com wrote:

 Hey Ted,

 We always intend Spark to work with the newer Hadoop versions and
 encourage Spark users to use the newest Hadoop versions for best
 performance.

 We do try to be liberal in terms of supporting older versions as well.
 This is because many people run older HDFS versions and we want Spark
 to read and write data from them. So far we've been willing to do this
 despite some maintenance cost.

 The reason is that for many users it's very expensive to do a
 whole-sale upgrade of HDFS, but trying out new versions of Spark is
 much easier. For instance, some of the largest scale Spark users run
 fairly old or forked HDFS versions.

 - Patrick

 On Sun, Jul 27, 2014 at 12:01 PM, Ted Yu yuzhih...@gmail.com wrote:
 Thanks for replying, Patrick.

 The intention of my first email was for utilizing newer hadoop releases for
 their bug fixes. I am still looking for clean way of passing hadoop release
 version number to individual classes.
 Using newer hadoop releases would encourage pushing bug fixes / new
 features upstream. Ultimately Spark code would become cleaner.

 Cheers

 On Sun, Jul 27, 2014 at 8:52 AM, Patrick Wendell pwend...@gmail.com wrote:

 Ted - technically I think you are correct, although I wouldn't
 recommend disabling this lock. This lock is not expensive (acquired
 once per task, as are many other locks already). Also, we've seen some
 cases where Hadoop concurrency bugs ended up requiring multiple fixes
 - concurrency of client access is not well tested in the Hadoop
 codebase since most of the Hadoop tools to not use concurrent access.
 So in general it's good to be conservative in what we expect of the
 Hadoop client libraries.

 If you'd like to discuss this further, please fork a new thread, since
 this is a vote thread. Thanks!

 On Fri, Jul 25, 2014 at 10:14 PM, Ted Yu yuzhih...@gmail.com wrote:
 HADOOP-10456 is fixed in hadoop 2.4.1

 Does this mean that synchronization
 on HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK can be bypassed for hadoop
 2.4.1 ?

 Cheers


 On Fri, Jul 25, 2014 at 6:00 PM, Patrick Wendell pwend...@gmail.com
 wrote:

  The most important issue in this release is actually an amendment to
  an earlier fix. The original fix caused a deadlock which was a
  regression from 1.0.0 to 1.0.1:

 Issue:
 https://issues.apache.org/jira/browse/SPARK-1097

 1.0.1 Fix:
 https://github.com/apache/spark/pull/1273/files (had a deadlock)

 1.0.2 Fix:
 https://github.com/apache/spark/pull/1409/files

 I failed to correctly label this on JIRA, but I've updated it!

 On Fri, Jul 25, 2014 at 5:35 PM, Michael Armbrust
 mich...@databricks.com wrote:
 That query is looking at Fix Version not Target Version.  The fact
 that
 the first one is still open is only because the bug is not resolved in
 master.  It is fixed in 1.0.2.  The second one is partially fixed in
 1.0.2,
 but is not worth blocking the release for.


 On Fri, Jul 25, 2014 at 4:23 PM, Nicholas Chammas 
 nicholas.cham...@gmail.com wrote:

 TD, there are a couple of unresolved issues slated for 1.0.2
 


 https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.0.2%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC
 .
 Should they be edited somehow?


 On Fri, Jul 25, 2014 at 7:08 PM, Tathagata Das 
 tathagata.das1...@gmail.com
 wrote:

 Please vote on releasing the following candidate as Apache Spark
 version
 1.0.2.

 This release fixes a number of bugs in Spark 1.0.1.
 Some of the notable ones are
 - SPARK-2452: Known issue is Spark 1.0.1 caused by attempted fix
 for
 SPARK-1199. The fix was reverted for 1.0.2.
 - SPARK-2576: NoClassDefFoundError when executing Spark QL query on
 HDFS CSV file.
 The full list is at http://s.apache.org/9NJ

 The tag to be voted on is v1.0.2-rc1 (commit 8fb6f00e):




 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=8fb6f00e195fb258f3f70f04756e07c259a2351f

 The release files, including signatures, digests, etc can be found
 at:
 http://people.apache.org/~tdas/spark-1.0.2-rc1/

 Release artifacts are signed with the 

Re: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Matei Zaharia
+1

Tested this on Mac OS X.

Matei

On Jul 25, 2014, at 4:08 PM, Tathagata Das tathagata.das1...@gmail.com wrote:

 Please vote on releasing the following candidate as Apache Spark version 
 1.0.2.
 
 This release fixes a number of bugs in Spark 1.0.1.
 Some of the notable ones are
 - SPARK-2452: Known issue in Spark 1.0.1 caused by the attempted fix for
 SPARK-1199. The fix was reverted for 1.0.2.
 - SPARK-2576: NoClassDefFoundError when executing a Spark SQL query on an
 HDFS CSV file.
 The full list is at http://s.apache.org/9NJ
 
 The tag to be voted on is v1.0.2-rc1 (commit 8fb6f00e):
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=8fb6f00e195fb258f3f70f04756e07c259a2351f
 
 The release files, including signatures, digests, etc can be found at:
 http://people.apache.org/~tdas/spark-1.0.2-rc1/
 
 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/tdas.asc
 
 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1024/
 
 The documentation corresponding to this release can be found at:
 http://people.apache.org/~tdas/spark-1.0.2-rc1-docs/
 
 Please vote on releasing this package as Apache Spark 1.0.2!
 
 The vote is open until Tuesday, July 29, at 23:00 UTC and passes if
 a majority of at least 3 +1 PMC votes are cast.
 [ ] +1 Release this package as Apache Spark 1.0.2
 [ ] -1 Do not release this package because ...
 
 To learn more about Apache Spark, please see
 http://spark.apache.org/



Re: Utilize newer hadoop releases WAS: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Matei Zaharia
We could also do this, though it would be great if the Hadoop project provided 
this version number as at least a baseline. It's up to distributors to decide 
which version they report but I imagine they won't remove stuff that's in the 
reported version number.

Matei
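
A minimal sketch of the reflection-based check Matei describes, assuming the client jars expose org.apache.hadoop.util.VersionInfo (present in stock Hadoop releases; vendor builds may report a distribution-specific string, which is exactly the caveat Sean raises):

    object HadoopVersionProbe {
      // Returns the Hadoop version reported at runtime, or None if the class
      // or method is missing, without compiling against a specific Hadoop.
      def reportedVersion: Option[String] =
        try {
          val clazz = Class.forName("org.apache.hadoop.util.VersionInfo")
          Option(clazz.getMethod("getVersion").invoke(null)).map(_.toString)
        } catch {
          case _: ClassNotFoundException | _: NoSuchMethodException => None
        }
    }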

On Jul 27, 2014, at 1:57 PM, Sean Owen so...@cloudera.com wrote:

 Good idea, although it gets difficult in the context of multiple
 distributions. Say change X is not present in version A, but present
 in version B. If you depend on X, what version can you look for to
 detect it? The distribution will return A or A+X or somesuch, but
 testing for A will give an incorrect answer, and the code can't be
 expected to look for everyone's A+X versions. Actually inspecting
 the code is more robust if a bit messier.
 
 On Sun, Jul 27, 2014 at 9:50 PM, Matei Zaharia matei.zaha...@gmail.com 
 wrote:
 For this particular issue, it would be good to know if Hadoop provides an 
 API to determine the Hadoop version. If not, maybe that can be added to 
 Hadoop in its next release, and we can check for it with reflection. We 
 recently added a SparkContext.version() method in Spark to let you tell the 
 version.
 
 Matei
 
 On Jul 27, 2014, at 12:19 PM, Patrick Wendell pwend...@gmail.com wrote:
 
 Hey Ted,
 
 We always intend Spark to work with the newer Hadoop versions and
 encourage Spark users to use the newest Hadoop versions for best
 performance.
 
 We do try to be liberal in terms of supporting older versions as well.
 This is because many people run older HDFS versions and we want Spark
 to read and write data from them. So far we've been willing to do this
 despite some maintenance cost.
 
 The reason is that for many users it's very expensive to do a
 whole-sale upgrade of HDFS, but trying out new versions of Spark is
 much easier. For instance, some of the largest scale Spark users run
 fairly old or forked HDFS versions.
 
 - Patrick
 
 On Sun, Jul 27, 2014 at 12:01 PM, Ted Yu yuzhih...@gmail.com wrote:
 Thanks for replying, Patrick.
 
 The intention of my first email was for utilizing newer hadoop releases for
 their bug fixes. I am still looking for clean way of passing hadoop release
 version number to individual classes.
 Using newer hadoop releases would encourage pushing bug fixes / new
 features upstream. Ultimately Spark code would become cleaner.
 
 Cheers
 
 On Sun, Jul 27, 2014 at 8:52 AM, Patrick Wendell pwend...@gmail.com 
 wrote:
 
 Ted - technically I think you are correct, although I wouldn't
 recommend disabling this lock. This lock is not expensive (acquired
 once per task, as are many other locks already). Also, we've seen some
 cases where Hadoop concurrency bugs ended up requiring multiple fixes
 - concurrency of client access is not well tested in the Hadoop
 codebase since most of the Hadoop tools to not use concurrent access.
 So in general it's good to be conservative in what we expect of the
 Hadoop client libraries.
 
 If you'd like to discuss this further, please fork a new thread, since
 this is a vote thread. Thanks!
 
 On Fri, Jul 25, 2014 at 10:14 PM, Ted Yu yuzhih...@gmail.com wrote:
 HADOOP-10456 is fixed in hadoop 2.4.1
 
 Does this mean that synchronization
 on HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK can be bypassed for hadoop
 2.4.1 ?
 
 Cheers
 
 
 On Fri, Jul 25, 2014 at 6:00 PM, Patrick Wendell pwend...@gmail.com
 wrote:
 
  The most important issue in this release is actually an amendment to
  an earlier fix. The original fix caused a deadlock which was a
  regression from 1.0.0 to 1.0.1:
 
 Issue:
 https://issues.apache.org/jira/browse/SPARK-1097
 
 1.0.1 Fix:
 https://github.com/apache/spark/pull/1273/files (had a deadlock)
 
 1.0.2 Fix:
 https://github.com/apache/spark/pull/1409/files
 
 I failed to correctly label this on JIRA, but I've updated it!
 
 On Fri, Jul 25, 2014 at 5:35 PM, Michael Armbrust
 mich...@databricks.com wrote:
 That query is looking at Fix Version not Target Version.  The fact
 that
 the first one is still open is only because the bug is not resolved in
 master.  It is fixed in 1.0.2.  The second one is partially fixed in
 1.0.2,
 but is not worth blocking the release for.
 
 
 On Fri, Jul 25, 2014 at 4:23 PM, Nicholas Chammas 
 nicholas.cham...@gmail.com wrote:
 
 TD, there are a couple of unresolved issues slated for 1.0.2
 
 
 
 https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.0.2%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC
 .
 Should they be edited somehow?
 
 
 On Fri, Jul 25, 2014 at 7:08 PM, Tathagata Das 
 tathagata.das1...@gmail.com
 wrote:
 
 Please vote on releasing the following candidate as Apache Spark
 version
 1.0.2.
 
 This release fixes a number of bugs in Spark 1.0.1.
 Some of the notable ones are
 - SPARK-2452: Known issue is Spark 1.0.1 caused by attempted fix
 for
 SPARK-1199. The fix was reverted for 1.0.2.
 - SPARK-2576: NoClassDefFoundError when executing Spark QL query on
 HDFS CSV 

new JDBC server test cases seems failed ?

2014-07-27 Thread Nan Zhu
Hi, all

It seems that the JDBC test cases failed unexpectedly in Jenkins?


[info] - test query execution against a Hive Thrift server *** FAILED ***
[info]   java.sql.SQLException: Could not open connection to jdbc:hive2://localhost:45518/: java.net.ConnectException: Connection refused
[info]   at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:146)
[info]   at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:123)
[info]   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
[info]   at java.sql.DriverManager.getConnection(DriverManager.java:571)
[info]   at java.sql.DriverManager.getConnection(DriverManager.java:215)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.getConnection(HiveThriftServer2Suite.scala:131)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.createStatement(HiveThriftServer2Suite.scala:134)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite$$anonfun$1.apply$mcV$sp(HiveThriftServer2Suite.scala:110)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite$$anonfun$1.apply(HiveThriftServer2Suite.scala:107)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite$$anonfun$1.apply(HiveThriftServer2Suite.scala:107)
[info]   ...
[info]   Cause: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
[info]   at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
[info]   at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
[info]   at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
[info]   at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:144)
[info]   at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:123)
[info]   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
[info]   at java.sql.DriverManager.getConnection(DriverManager.java:571)
[info]   at java.sql.DriverManager.getConnection(DriverManager.java:215)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.getConnection(HiveThriftServer2Suite.scala:131)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.createStatement(HiveThriftServer2Suite.scala:134)
[info]   ...
[info]   Cause: java.net.ConnectException: Connection refused
[info]   at java.net.PlainSocketImpl.socketConnect(Native Method)
[info]   at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
[info]   at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
[info]   at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
[info]   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
[info]   at java.net.Socket.connect(Socket.java:579)
[info]   at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
[info]   at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
[info]   at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
[info]   at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:144)
[info]   ...
[info] CliSuite:
Executing: create table hive_test1(key int, val string);, expecting output: OK
[warn] four warnings found
[warn] Note: /home/jenkins/workspace/SparkPullRequestBuilder@4/core/src/test/java/org/apache/spark/JavaAPISuite.java uses or overrides a deprecated API.
[warn] Note: Recompile with -Xlint:deprecation for details.
[info] - simple commands *** FAILED ***
[info]   java.lang.AssertionError: assertion failed: Didn't find OK in the output:
[info]   at scala.Predef$.assert(Predef.scala:179)
[info]   at org.apache.spark.sql.hive.thriftserver.TestUtils$class.waitForQuery(TestUtils.scala:70)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite.waitForQuery(CliSuite.scala:25)
[info]   at org.apache.spark.sql.hive.thriftserver.TestUtils$class.executeQuery(TestUtils.scala:62)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite.executeQuery(CliSuite.scala:25)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$1.apply$mcV$sp(CliSuite.scala:53)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$1.apply(CliSuite.scala:51)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$1.apply(CliSuite.scala:51)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
14/07/27 17:06:43 INFO ClientBase: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
[info] ...
[info] ScalaTest
[info] Run completed in 41 seconds, 789 milliseconds.
[info] Total number of tests run: 2
[info]
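
For context, a stripped-down sketch of the connection attempt that fails above, assuming only the Hive JDBC driver on the classpath (port 45518 is taken from the log; the suite chooses it at runtime, and the user/password values here are placeholders):

    import java.sql.DriverManager

    object ThriftServerPing {
      def main(args: Array[String]): Unit = {
        // Register org.apache.hive.jdbc.HiveDriver, then open a session.
        Class.forName("org.apache.hive.jdbc.HiveDriver")
        // "Connection refused" means nothing is listening on the port, i.e.
        // the Thrift server did not come up before the test tried to connect.
        val conn = DriverManager.getConnection("jdbc:hive2://localhost:45518/", "user", "")
        conn.close()
      }
    }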

Re: new JDBC server test cases seems failed ?

2014-07-27 Thread Michael Armbrust
How recent is this? We've already reverted this patch once due to failing
tests. It would be helpful to include a link to the failed build. If it's
failing again we'll have to revert again.


On Sun, Jul 27, 2014 at 5:26 PM, Nan Zhu zhunanmcg...@gmail.com wrote:

 Hi, all

 It seems that the JDBC test cases are failed unexpectedly in Jenkins?


 [info] - test query execution against a Hive Thrift server *** FAILED ***
 [info] java.sql.SQLException: Could not open connection to
 jdbc:hive2://localhost:45518/: java.net.ConnectException: Connection
 refused [info] at
 org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:146)
 [info] at
 org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:123) [info]
 at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) [info] at
 java.sql.DriverManager.getConnection(DriverManager.java:571) [info] at
 java.sql.DriverManager.getConnection(DriverManager.java:215) [info] at
 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.getConnection(HiveThriftServer2Suite.scala:131)
 [info] at
 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.createStatement(HiveThriftServer2Suite.scala:134)
 [info] at
 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite$$anonfun$1.apply$mcV$sp(HiveThriftServer2Suite.scala:110)
 [info] at org.apache.spark.sql.hive.thri
 ftserver.HiveThriftServer2Suite$$anonfun$1.apply(HiveThriftServer2Suite.scala:107)
 [info] at
 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite$$anonfun$1.apply(HiveThriftServer2Suite.scala:107)
 [info] ... [info] Cause: org.apache.thrift.transport.TTransportException:
 java.net.ConnectException: Connection refused [info] at
 org.apache.thrift.transport.TSocket.open(TSocket.java:185) [info] at
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
 [info] at
 org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
 [info] at
 org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:144)
 [info] at
 org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:123) [info]
 at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) [info] at
 java.sql.DriverManager.getConnection(DriverManager.java:571) [info] at
 java.sql.DriverManager.getConnection(DriverManager.java:215) [info] at
 org.apache.spark.sql.hive.thriftserver.H
 iveThriftServer2Suite.getConnection(HiveThriftServer2Suite.scala:131)
 [info] at
 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.createStatement(HiveThriftServer2Suite.scala:134)
 [info] ... [info] Cause: java.net.ConnectException: Connection refused
 [info] at java.net.PlainSocketImpl.socketConnect(Native Method) [info] at
 java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
 [info] at
 java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
 [info] at
 java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
 [info] at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) [info]
 at java.net.Socket.connect(Socket.java:579) [info] at
 org.apache.thrift.transport.TSocket.open(TSocket.java:180) [info] at
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
 [info] at
 org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
 [info] at org.apache.hive.jdbc.HiveConn
 ection.openTransport(HiveConnection.java:144) [info] ... [info] CliSuite:
 Executing: create table hive_test1(key int, val string);, expecting output:
 OK [warn] four warnings found [warn] Note:
 /home/jenkins/workspace/SparkPullRequestBuilder@4/core/src/test/java/org/apache/spark/JavaAPISuite.java
 uses or overrides a deprecated API. [warn] Note: Recompile with
 -Xlint:deprecation for details. [info] - simple commands *** FAILED ***
 [info] java.lang.AssertionError: assertion failed: Didn't find OK in the
 output: [info] at scala.Predef$.assert(Predef.scala:179) [info] at
 org.apache.spark.sql.hive.thriftserver.TestUtils$class.waitForQuery(TestUtils.scala:70)
 [info] at
 org.apache.spark.sql.hive.thriftserver.CliSuite.waitForQuery(CliSuite.scala:25)
 [info] at
 org.apache.spark.sql.hive.thriftserver.TestUtils$class.executeQuery(TestUtils.scala:62)
 [info] at
 org.apache.spark.sql.hive.thriftserver.CliSuite.executeQuery(CliSuite.scala:25)
 [info] at org.apache.spark.sql.hive.thriftserver.CliSuite
 $$anonfun$1.apply$mcV$sp(CliSuite.scala:53) [info] at
 org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$1.apply(CliSuite.scala:51)
 [info] at
 org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$1.apply(CliSuite.scala:51)
 [info] at
 org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
 [info] at
 org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
 [log4j:WARN No appenders could be found for logger
 (org.apache.hadoop.util.Shell). log4j:WARN Please initialize the log4j
 system properly. log4j:WARN See
 

Re: new JDBC server test cases seems failed ?

2014-07-27 Thread Patrick Wendell
I'm going to revert it again - Cheng can you try to look into this? Thanks.

On Sun, Jul 27, 2014 at 6:06 PM, Nan Zhu zhunanmcg...@gmail.com wrote:
 it's 20 minutes ago

 https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17259/consoleFull

 --
 Nan Zhu


 On Sunday, July 27, 2014 at 8:53 PM, Michael Armbrust wrote:

 How recent is this? We've already reverted this patch once due to failing
 tests. It would be helpful to include a link to the failed build. If its
 failing again we'll have to revert again.


 On Sun, Jul 27, 2014 at 5:26 PM, Nan Zhu zhunanmcg...@gmail.com 
 (mailto:zhunanmcg...@gmail.com) wrote:

  Hi, all
 
  It seems that the JDBC test cases are failed unexpectedly in Jenkins?
 
 
  [info] - test query execution against a Hive Thrift server *** FAILED ***
  [info] java.sql.SQLException: Could not open connection to
  jdbc:hive2://localhost:45518/: java.net.ConnectException: Connection
  refused [info] at
  org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:146)
  [info] at
  org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:123) [info]
  at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) [info] at
  java.sql.DriverManager.getConnection(DriverManager.java:571) [info] at
  java.sql.DriverManager.getConnection(DriverManager.java:215) [info] at
  org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.getConnection(HiveThriftServer2Suite.scala:131)
  [info] at
  org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.createStatement(HiveThriftServer2Suite.scala:134)
  [info] at
  org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite$$anonfun$1.apply$mcV$sp(HiveThriftServer2Suite.scala:110)
  [info] at org.apache.spark.sql.hive.thri
  ftserver.HiveThriftServer2Suite$$anonfun$1.apply(HiveThriftServer2Suite.scala:107)
  [info] at
  org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite$$anonfun$1.apply(HiveThriftServer2Suite.scala:107)
  [info] ... [info] Cause: org.apache.thrift.transport.TTransportException:
  java.net.ConnectException: Connection refused [info] at
  org.apache.thrift.transport.TSocket.open(TSocket.java:185) [info] at
  org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
  [info] at
  org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
  [info] at
  org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:144)
  [info] at
  org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:123) [info]
  at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) [info] at
  java.sql.DriverManager.getConnection(DriverManager.java:571) [info] at
  java.sql.DriverManager.getConnection(DriverManager.java:215) [info] at
  org.apache.spark.sql.hive.thriftserver.H
  iveThriftServer2Suite.getConnection(HiveThriftServer2Suite.scala:131)
  [info] at
  org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Suite.createStatement(HiveThriftServer2Suite.scala:134)
  [info] ... [info] Cause: java.net.ConnectException: Connection refused
  [info] at java.net.PlainSocketImpl.socketConnect(Native Method) [info] at
  java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
  [info] at
  java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
  [info] at
  java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
  [info] at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) [info]
  at java.net.Socket.connect(Socket.java:579) [info] at
  org.apache.thrift.transport.TSocket.open(TSocket.java:180) [info] at
  org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
  [info] at
  org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
  [info] at org.apache.hive.jdbc.HiveConn
  ection.openTransport(HiveConnection.java:144) [info] ... [info] CliSuite:
  Executing: create table hive_test1(key int, val string);, expecting output:
  OK [warn] four warnings found [warn] Note:
  /home/jenkins/workspace/SparkPullRequestBuilder@4/core/src/test/java/org/apache/spark/JavaAPISuite.java
  uses or overrides a deprecated API. [warn] Note: Recompile with
  -Xlint:deprecation for details. [info] - simple commands *** FAILED ***
  [info] java.lang.AssertionError: assertion failed: Didn't find OK in the
  output: [info] at scala.Predef$.assert(Predef.scala:179) [info] at
  org.apache.spark.sql.hive.thriftserver.TestUtils$class.waitForQuery(TestUtils.scala:70)
  [info] at
  org.apache.spark.sql.hive.thriftserver.CliSuite.waitForQuery(CliSuite.scala:25)
  [info] at
  org.apache.spark.sql.hive.thriftserver.TestUtils$class.executeQuery(TestUtils.scala:62)
  [info] at
  org.apache.spark.sql.hive.thriftserver.CliSuite.executeQuery(CliSuite.scala:25)
  [info] at org.apache.spark.sql.hive.thriftserver.CliSuite
  $$anonfun$1.apply$mcV$sp(CliSuite.scala:53) [info] at
  org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$1.apply(CliSuite.scala:51)
  [info] 

No such file or directory errors running tests

2014-07-27 Thread Stephen Boesch
I have pulled the latest from GitHub this afternoon. There are many, many
errors:

source_home/assembly/target/scala-2.10: No such file or directory

This causes many tests to fail.

Here is the command line I am running

mvn -Pyarn -Phadoop-2.3 -Phive package test


Re: No such file or directory errors running tests

2014-07-27 Thread Reynold Xin
To run through all the tests you'd need to create the assembly jar first.


I've seen this asked a few times. Maybe we should make it more obvious.



http://spark.apache.org/docs/latest/building-with-maven.html

Spark Tests in Maven

Tests are run by default via the ScalaTest Maven plugin
http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin.
Some of them require Spark to be packaged first, so always run mvn package
with -DskipTests the first time. You can then run the tests with mvn
-Dhadoop.version=... test.

The ScalaTest plugin also supports running only a specific test suite as
follows:

mvn -Dhadoop.version=... -DwildcardSuites=org.apache.spark.repl.ReplSuite test





On Sun, Jul 27, 2014 at 7:07 PM, Stephen Boesch java...@gmail.com wrote:

 I have pulled latest from github this afternoon.   There are many many
 errors:

 source_home/assembly/target/scala-2.10: No such file or directory

 This causes many tests to fail.

 Here is the command line I am running

 mvn -Pyarn -Phadoop-2.3 -Phive package test



Re: No such file or directory errors running tests

2014-07-27 Thread Stephen Boesch
Hi Reynold,
  Thanks for responding here. Yes, I had looked at the Building with Maven
page in the past. I had not noticed that the package step must happen
*before* the test. I had assumed it was a corequisite, as seen in my
command line.

So the following sequence appears to work fine (so far so good - well past
when the prior attempts failed):


 mvn -Pyarn -Phadoop-2.3 -DskipTests -Phive clean package
mvn -Pyarn -Phadoop-2.3 -Phive test

As for documentation, yes, adding another sentence to that same Building with
Maven page would likely be helpful to future generations.


2014-07-27 19:10 GMT-07:00 Reynold Xin r...@databricks.com:

 To run through all the tests you'd need to create the assembly jar first.


 I've seen this asked a few times. Maybe we should make it more obvious.



 http://spark.apache.org/docs/latest/building-with-maven.html

 Spark Tests in Maven

 Tests are run by default via the ScalaTest Maven plugin
 http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin.
 Some of the require Spark to be packaged first, so always run mvn package
  with -DskipTests the first time. You can then run the tests with mvn
 -Dhadoop.version=... test.

 The ScalaTest plugin also supports running only a specific test suite as
 follows:

 mvn -Dhadoop.version=... -DwildcardSuites=org.apache.spark.repl.ReplSuite
 test





 On Sun, Jul 27, 2014 at 7:07 PM, Stephen Boesch java...@gmail.com wrote:

  I have pulled latest from github this afternoon.   There are many many
  errors:
 
  source_home/assembly/target/scala-2.10: No such file or directory
 
  This causes many tests to fail.
 
  Here is the command line I am running
 
  mvn -Pyarn -Phadoop-2.3 -Phive package test
 



Can I translate the documentations of Spark in Japanese?

2014-07-27 Thread Yu Ishikawa
Hi all,

I'm Yu Ishikawa, a Japanese.
I would like to translate the documentation of Spark 1.0.x officially.
If I translate it and send a pull request, can you merge it?
And where would be the best directory in which to put the Japanese documentation?

Best,
Yu





Re: No such file or directory errors running tests

2014-07-27 Thread Reynold Xin
Would you like to submit a pull request? All doc source code is in the
docs folder. Cheers.



On Sun, Jul 27, 2014 at 7:35 PM, Stephen Boesch java...@gmail.com wrote:

 i Reynold,
   thanks for responding here. Yes I had looked at the building with maven
 page in the past.  I have not noticed  that the package step must happen
 *before *the test.  I had assumed it were a corequisite -as seen in my
 command line.

 So the following sequence appears to work fine (so far so good - well past
 when the prior attempts failed):


  mvn -Pyarn -Phadoop-2.3 -DskipTests -Phive clean package
 mvn -Pyarn -Phadoop-2.3 -Phive test

 AFA documentation,  yes adding another sentence to that same Building with
 Maven page would likely be helpful to future generations.


 2014-07-27 19:10 GMT-07:00 Reynold Xin r...@databricks.com:

  To run through all the tests you'd need to create the assembly jar first.
 
 
  I've seen this asked a few times. Maybe we should make it more obvious.
 
 
 
  http://spark.apache.org/docs/latest/building-with-maven.html
 
  Spark Tests in Maven
 
  Tests are run by default via the ScalaTest Maven plugin
  http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin.
  Some of the require Spark to be packaged first, so always run mvn package
   with -DskipTests the first time. You can then run the tests with mvn
  -Dhadoop.version=... test.
 
  The ScalaTest plugin also supports running only a specific test suite as
  follows:
 
  mvn -Dhadoop.version=... -DwildcardSuites=org.apache.spark.repl.ReplSuite
  test
 
 
 
 
 
  On Sun, Jul 27, 2014 at 7:07 PM, Stephen Boesch java...@gmail.com
 wrote:
 
   I have pulled latest from github this afternoon.   There are many many
   errors:
  
   source_home/assembly/target/scala-2.10: No such file or directory
  
   This causes many tests to fail.
  
   Here is the command line I am running
  
   mvn -Pyarn -Phadoop-2.3 -Phive package test
  
 



Re: No such file or directory errors running tests

2014-07-27 Thread Stephen Boesch
OK, I'll do it after confirming all the tests run.


2014-07-27 19:36 GMT-07:00 Reynold Xin r...@databricks.com:

 Would you like to submit a pull request? All doc source code are in the
 docs folder. Cheers.



 On Sun, Jul 27, 2014 at 7:35 PM, Stephen Boesch java...@gmail.com wrote:

  i Reynold,
thanks for responding here. Yes I had looked at the building with maven
  page in the past.  I have not noticed  that the package step must
 happen
  *before *the test.  I had assumed it were a corequisite -as seen in my
  command line.
 
  So the following sequence appears to work fine (so far so good - well
 past
  when the prior attempts failed):
 
 
   mvn -Pyarn -Phadoop-2.3 -DskipTests -Phive clean package
  mvn -Pyarn -Phadoop-2.3 -Phive test
 
  AFA documentation,  yes adding another sentence to that same Building
 with
  Maven page would likely be helpful to future generations.
 
 
  2014-07-27 19:10 GMT-07:00 Reynold Xin r...@databricks.com:
 
   To run through all the tests you'd need to create the assembly jar
 first.
  
  
   I've seen this asked a few times. Maybe we should make it more obvious.
  
  
  
   http://spark.apache.org/docs/latest/building-with-maven.html
  
   Spark Tests in Maven
  
   Tests are run by default via the ScalaTest Maven plugin
   http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin
 .
   Some of the require Spark to be packaged first, so always run mvn
 package
with -DskipTests the first time. You can then run the tests with mvn
   -Dhadoop.version=... test.
  
   The ScalaTest plugin also supports running only a specific test suite
 as
   follows:
  
   mvn -Dhadoop.version=...
 -DwildcardSuites=org.apache.spark.repl.ReplSuite
   test
  
  
  
  
  
   On Sun, Jul 27, 2014 at 7:07 PM, Stephen Boesch java...@gmail.com
  wrote:
  
I have pulled latest from github this afternoon.   There are many
 many
errors:
   
source_home/assembly/target/scala-2.10: No such file or directory
   
This causes many tests to fail.
   
Here is the command line I am running
   
mvn -Pyarn -Phadoop-2.3 -Phive package test
   
  
 



Re: No such file or directory errors running tests

2014-07-27 Thread Steve Nunez
Whilst we're on this topic, I'd be interested to see if you get Hive
failures. I'm trying to build on a Mac using HDP and seem to be getting
failures related to Parquet. I'll know for sure once I get in tomorrow and
confirm with engineering, but this is likely because the version of Hive
is 0.12.0, and Parquet is only supported in Hive 0.13 (HDP is 0.13).

Any idea on what it would take to bump the Hive version up to the latest?

Regards,
- SteveN



On 7/27/14, 19:39, Stephen Boesch java...@gmail.com wrote:

 OK i'll do it after confirming all the tests run


2014-07-27 19:36 GMT-07:00 Reynold Xin r...@databricks.com:

 Would you like to submit a pull request? All doc source code are in the
 docs folder. Cheers.



 On Sun, Jul 27, 2014 at 7:35 PM, Stephen Boesch java...@gmail.com
wrote:

  i Reynold,
thanks for responding here. Yes I had looked at the building with
maven
  page in the past.  I have not noticed  that the package step must
 happen
  *before *the test.  I had assumed it were a corequisite -as seen in my
  command line.
 
  So the following sequence appears to work fine (so far so good - well
 past
  when the prior attempts failed):
 
 
   mvn -Pyarn -Phadoop-2.3 -DskipTests -Phive clean package
  mvn -Pyarn -Phadoop-2.3 -Phive test
 
  AFA documentation,  yes adding another sentence to that same Building
 with
  Maven page would likely be helpful to future generations.
 
 
  2014-07-27 19:10 GMT-07:00 Reynold Xin r...@databricks.com:
 
   To run through all the tests you'd need to create the assembly jar
 first.
  
  
   I've seen this asked a few times. Maybe we should make it more
obvious.
  
  
  
   http://spark.apache.org/docs/latest/building-with-maven.html
  
   Spark Tests in Maven
  
   Tests are run by default via the ScalaTest Maven plugin
   
http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin
 .
   Some of the require Spark to be packaged first, so always run mvn
 package
with -DskipTests the first time. You can then run the tests with
mvn
   -Dhadoop.version=... test.
  
   The ScalaTest plugin also supports running only a specific test
suite
 as
   follows:
  
   mvn -Dhadoop.version=...
 -DwildcardSuites=org.apache.spark.repl.ReplSuite
   test
  
  
  
  
  
   On Sun, Jul 27, 2014 at 7:07 PM, Stephen Boesch java...@gmail.com
  wrote:
  
I have pulled latest from github this afternoon.   There are many
 many
errors:
   
source_home/assembly/target/scala-2.10: No such file or
directory
   
This causes many tests to fail.
   
Here is the command line I am running
   
mvn -Pyarn -Phadoop-2.3 -Phive package test
   
  
 






Re: No such file or directory errors running tests

2014-07-27 Thread Stephen Boesch
Hi Steve,
  I am running on the CDH 5.0.0 VM (which is CentOS 6.5). Given the
difference in OS and Hadoop distro between us, my results are not likely to
be of direct help to you. But in any case I will let you know (likely
offline).


2014-07-27 20:02 GMT-07:00 Steve Nunez snu...@hortonworks.com:

 Whilst we're on this topic, I'd be interested to see if you get Hive
 failures. I'm trying to build on a Mac using HDP and seem to be getting
 failures related to Parquet. I'll know for sure once I get in tomorrow and
 confirm with engineering, but this is likely because the version of Hive
 is 0.12.0, and Parquet is only supported in Hive 0.13 (HDP is 0.13).

 Any idea on what it would take to bump the Hive version up to the latest?

 Regards,
 - SteveN



 On 7/27/14, 19:39, Stephen Boesch java...@gmail.com wrote:

  OK i'll do it after confirming all the tests run
 
 
 2014-07-27 19:36 GMT-07:00 Reynold Xin r...@databricks.com:
 
  Would you like to submit a pull request? All doc source code are in the
  docs folder. Cheers.
 
 
 
  On Sun, Jul 27, 2014 at 7:35 PM, Stephen Boesch java...@gmail.com
 wrote:
 
   i Reynold,
 thanks for responding here. Yes I had looked at the building with
 maven
   page in the past.  I have not noticed  that the package step must
  happen
   *before *the test.  I had assumed it were a corequisite -as seen in my
   command line.
  
   So the following sequence appears to work fine (so far so good - well
  past
   when the prior attempts failed):
  
  
mvn -Pyarn -Phadoop-2.3 -DskipTests -Phive clean package
   mvn -Pyarn -Phadoop-2.3 -Phive test
  
   AFA documentation,  yes adding another sentence to that same Building
  with
   Maven page would likely be helpful to future generations.
  
  
   2014-07-27 19:10 GMT-07:00 Reynold Xin r...@databricks.com:
  
To run through all the tests you'd need to create the assembly jar
  first.
   
   
I've seen this asked a few times. Maybe we should make it more
 obvious.
   
   
   
http://spark.apache.org/docs/latest/building-with-maven.html
   
Spark Tests in Maven
   
Tests are run by default via the ScalaTest Maven plugin
   
 http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin
  .
Some of the require Spark to be packaged first, so always run mvn
  package
 with -DskipTests the first time. You can then run the tests with
 mvn
-Dhadoop.version=... test.
   
The ScalaTest plugin also supports running only a specific test
 suite
  as
follows:
   
mvn -Dhadoop.version=...
  -DwildcardSuites=org.apache.spark.repl.ReplSuite
test
   
   
   
   
   
On Sun, Jul 27, 2014 at 7:07 PM, Stephen Boesch java...@gmail.com
   wrote:
   
 I have pulled latest from github this afternoon.   There are many
  many
 errors:

 source_home/assembly/target/scala-2.10: No such file or
 directory

 This causes many tests to fail.

 Here is the command line I am running

 mvn -Pyarn -Phadoop-2.3 -Phive package test

   
  
 






Re: Can I translate the documentations of Spark in Japanese?

2014-07-27 Thread Patrick Wendell
Hey Yu,

I think we could definitely put a pointer to documentation in other
languages that is hosted somewhere else, but since we are not in a
position to maintain this, I'm not sure we could merge it into the
mainline Spark codebase. I'd be interested to know what other projects
do about this situation!

- Patrick

On Sun, Jul 27, 2014 at 7:34 PM, Yu Ishikawa
yuu.ishikawa+sp...@gmail.com wrote:
 Hi all,

 I'm Yu Ishikawa, a Japanese.
 I would like to translate the documentations of Spark 1.0.x officially.
 If I will translate them and send a pull request, then can you merge it ?
 And where is the best directory to create the Japanese documentations ?

 Best,
 Yu


