[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1172#discussion_r14053159

--- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorBackend.scala ---
@@ -26,4 +26,7 @@ import org.apache.spark.TaskState.TaskState
  */
 private[spark] trait ExecutorBackend {
   def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer)
+
+  // Exists as a work around for SPARK-1112. This only exists in branch-1.x of Spark.
+  def akkaFrameSize(): Long = Long.MaxValue
--- End diff --

The `MesosExecutorBackend` sends results through Mesos, not Akka. The LocalBackend sends a message to an actor within the same actor system... which I assumed won't go over TCP.

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46773549 Jenkins, retest this please.
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46773550 Merged build triggered.
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46773551 Merged build started.
[GitHub] spark pull request: SPARK-1416: PySpark support for SequenceFile a...
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/455#issuecomment-46773577

1.1 is not released yet. This PR is in master but not in 1.0 (it may be released in 1.0.1, or if not then 1.1). So you'll have to clone master and run sbt/sbt publish-local, which will publish the Maven and sbt artifacts to your local repos.

Sent from Mailbox

On Sun, Jun 22, 2014 at 1:22 AM, Russell Jurney notificati...@github.com wrote:

Thanks a ton! One thing - how can I pull spark core 1.1 from maven? [ERROR] Failed to execute goal on project avro: Could not resolve dependencies for project example:avro:jar:0.1: Could not find artifact org.apache.spark:spark-core_2.10:jar:1.1.0-SNAPSHOT in scala-tools.org ( http://scala-tools.org/repo-releases) - [Help 1]

On Fri, Jun 20, 2014 at 10:45 PM, MLnick notificati...@github.com wrote:

@rjurney https://github.com/rjurney this works for me (building Spark from current master): https://gist.github.com/MLnick/5864741781b9340cb211 if you run mvn package and then add that to SPARK_CLASSPATH and use it in the IPython console. However it seems to come through as only strings (not a dict). I verified that if I take only the string field and explicitly convert to string (i.e. Map[String, String]) then it works. I suspect then that Avro doesn't have the type information at all, so Pyrolite cannot pickle it. I guess you might have to do something more in depth in the AvroConverter to read the type info from the Avro schema and do a cast...

Reply to this email directly or view it on GitHub https://github.com/apache/spark/pull/455#issuecomment-46745394.

-- Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com

--- Reply to this email directly or view it on GitHub: https://github.com/apache/spark/pull/455#issuecomment-46767642 ---
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46773627 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46773676 Merged build triggered.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46773677 Merged build started.
[GitHub] spark pull request: SPARK-1996. Remove use of special Maven repo f...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1170
[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/980#discussion_r14053246

--- Diff: external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaInputDStream.scala ---
@@ -112,10 +114,14 @@ class KafkaReceiver[
     val topicMessageStreams = consumerConnector.createMessageStreams(
       topics, keyDecoder, valueDecoder)
-
-    // Start the messages handler for each partition
-    topicMessageStreams.values.foreach { streams =>
-      streams.foreach { stream => executorPool.submit(new MessageHandler(stream)) }
+    val executorPool = Executors.newFixedThreadPool(topics.values.sum)
--- End diff --

Minor - but to avoid a name collision with Spark's own `Executor` we usually try to call variables like this `threadPool`.
[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/980#discussion_r14053248

--- Diff: external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaInputDStream.scala ---
@@ -112,10 +114,14 @@ class KafkaReceiver[
     val topicMessageStreams = consumerConnector.createMessageStreams(
       topics, keyDecoder, valueDecoder)
-
-    // Start the messages handler for each partition
-    topicMessageStreams.values.foreach { streams =>
-      streams.foreach { stream => executorPool.submit(new MessageHandler(stream)) }
+    val executorPool = Executors.newFixedThreadPool(topics.values.sum)
--- End diff --

I see that actually you didn't add this name, so nevermind!
[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/980#issuecomment-46774049 LGTM pending tests. Jenkins, retest this please.
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46774097 Merged build finished. All automated tests passed.
[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/980#issuecomment-46774099 Merged build triggered.
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46774098 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16006/
[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/980#issuecomment-46774102 Merged build started.
[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/1175

SPARK-2231: dev/run-tests should include YARN and use a recent Hadoop version

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pwendell/spark test-hadoop-version

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1175.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #1175

commit 9210ef400f7e907d9f67f396bbb7806558d61930
Author: Patrick Wendell pwend...@gmail.com
Date: 2014-06-22T06:56:02Z

SPARK-2231: dev/run-tests should include YARN and use a recent Hadoop version
[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1175#issuecomment-46774162 Merged build triggered.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46774166 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16007/
[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1175#issuecomment-46774164 Merged build started.
[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1082#discussion_r14053327

--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala ---
@@ -559,6 +559,19 @@ class JavaSparkContext(val sc: SparkContext) extends JavaSparkContextVarargsWork
   def getLocalProperty(key: String): String = sc.getLocalProperty(key)
 
   /**
+   * Get a set of RDD IDs that have marked themselves as persistent via cache() call.
+   * Note that this does not necessarily mean the caching or computation was successful.
+   */
+  def getPersistentRddIds(): java.util.Set[Int] =
+    setAsJavaSet(sc.getPersistentRDDs.keySet)
+
+  /**
+   * Unpersist an RDD from memory and/or disk storage
--- End diff --

Minor: needs to end with a `.`
[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1082#issuecomment-46774517 LGTM with a minor comment that can be addressed on merge. @rxin any further comments?
[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/980#issuecomment-46774624 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16008/
[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1175#issuecomment-46774725 Merged build finished. All automated tests passed.
[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1175#issuecomment-46774727 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16009/
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1172#discussion_r14053359

--- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorBackend.scala ---
@@ -26,4 +26,7 @@ import org.apache.spark.TaskState.TaskState
  */
 private[spark] trait ExecutorBackend {
   def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer)
+
+  // Exists as a work around for SPARK-1112. This only exists in branch-1.x of Spark.
+  def akkaFrameSize(): Long = Long.MaxValue
--- End diff --

I see. So the only real change is that in certain cases the LocalBackend will no longer use the BlockManager for returning results. Sounds fine.
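The distinction discussed above — results small enough to ride inside the status-update message versus results that must detour through the BlockManager — comes down to a size check against the backend's frame limit. A minimal sketch of that dispatch in Python, with illustrative names and sizes (the real logic lives in Spark's Scala executor code):

```python
# Long.MaxValue sentinel: the default akkaFrameSize() for backends that
# don't send results over Akka (Mesos, local), so the limit never triggers.
LONG_MAX = 2**63 - 1

def choose_result_path(result_size, akka_frame_size):
    """Return how a task result travels back to the driver.

    Hypothetical helper illustrating the SPARK-1112 workaround: a result
    larger than the frame size is stored in the BlockManager and only a
    block id is sent; anything else is serialized into the message itself.
    """
    if result_size > akka_frame_size:
        return "indirect"  # store in BlockManager, send only a block id
    return "direct"        # fits inside the status-update message

# A 100 MB result over a 10 MB frame must go indirect...
path_large = choose_result_path(100 * 2**20, 10 * 2**20)
# ...but with the sentinel, the BlockManager path is never taken.
path_sentinel = choose_result_path(100 * 2**20, LONG_MAX)
```

With `akkaFrameSize()` defaulting to `Long.MaxValue`, the `indirect` branch is unreachable for those backends — which is exactly the behavior change aarondav notes for the LocalBackend.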
[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1175
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46775109 Jenkins, retest this please.
[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/980
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46775157 Merged build started.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46775153 Merged build triggered.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46775720 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16010/
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46775719 Merged build finished.
[GitHub] spark pull request: SPARK-2229: FileAppender throws an IllegalArgume...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1174#issuecomment-46778033 +1 I literally ran into this too 6 hours ago and had the same fix. It's from the change for SPARK-1940. I think it's a good idea that tests be run on Java 6 as a result; this is another of several issues that would have been caught by that.
[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...
Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-46778186 The micro-averaged precision and recall are equal for a multiclass classifier, because sum(fn_i) = sum(fp_i), i.e. they are both just the sum of all non-diagonal elements in the confusion matrix. F1-measure, as a harmonic mean of two equal numbers, also equals P and R. For more details please refer to the book Introduction to IR by Manning.
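The claim above — that micro-averaged precision, recall, and F1 coincide for a single-label multiclass classifier because the off-diagonal counts are simultaneously false positives and false negatives — can be checked numerically. The 3x3 confusion matrix below is made up for illustration:

```python
# Rows = true class, columns = predicted class; made-up counts.
confusion = [
    [50,  3,  2],
    [ 4, 40,  6],
    [ 1,  5, 60],
]
n = len(confusion)

# Diagonal cells are true positives; every off-diagonal cell counts once as a
# false positive (for its predicted class) and once as a false negative (for
# its true class), so the two totals are the same sum over the same cells.
tp = sum(confusion[i][i] for i in range(n))
fp = sum(confusion[r][c] for r in range(n) for c in range(n) if r != c)
fn = sum(confusion[r][c] for r in range(n) for c in range(n) if c != r)

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
```

Since fp == fn, precision and recall share the same denominator and are identical, and the harmonic mean of two equal numbers is that number, so micro F1 equals both.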
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46778249 Merged build started.
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46778248 Merged build triggered.
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46779166 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16011/
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46779165 Merged build finished. All automated tests passed.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-46781514 Ah OK, it did fail for me locally with `sbt clean assembly test`. Sorry, this did in fact have a problem. I think Akka does need the old Netty; the second commit was a change too far. The first commit is the one cleaning up the immediate issue. I dropped the second commit and rebased and all is well. Let's see what Jenkins makes of it.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-46781581 Merged build started.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-46782563 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16012/
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-46782562 Merged build finished. All automated tests passed.
[GitHub] spark pull request: [Spark 1199][WIP] Changed wrappers to not use ...
GitHub user ScrapCodes opened a pull request: https://github.com/apache/spark/pull/1176

[Spark 1199][WIP] Changed wrappers to not use vals and thus avoid the path-dependent types problem.

TODO: Write description. Basically it fails for one particular scenario and I am having a tough time debugging it :)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ScrapCodes/spark-1 SPARK-1199/repl-case-class-fix

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1176.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #1176

commit c2a6498bb40253da9195d01927caa2748919ad96
Author: Prashant Sharma prashan...@imaginea.com
Date: 2014-06-18T12:34:12Z

Back porting scala 2.11 SI-7747's changes on top of my patch.

commit fa7ffca15d0d6cd1c8e2a0064ba4f12f35d5f263
Author: Prashant Sharma prashan...@imaginea.com
Date: 2014-06-19T12:06:08Z

Added a convenience for debugging the generated wrappers as it exists in the scala 2.11 repl.
[GitHub] spark pull request: [Spark 1199][WIP] Changed wrappers to not use ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1176#issuecomment-46785360 Merged build triggered.
[GitHub] spark pull request: [SPARK-1199][WIP] Changed wrappers to not use ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1176#issuecomment-46785365 Merged build started.
[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...
Github user xiejuncs commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-46786256 It makes sense. You are right: sum(fn_i) = sum(fp_i), so the recall and precision are the same. Thanks very much.
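The identity behind this exchange is easy to check numerically: in single-label multiclass classification, every misclassified sample is simultaneously one false negative (for its true class) and one false positive (for the predicted class), so the totals agree and micro-averaged precision equals micro-averaged recall. A minimal standalone sketch (plain Python, not the MLlib evaluator; all names are illustrative):

```python
from collections import Counter

def micro_precision_recall(y_true, y_pred):
    # Every misclassified sample counts once as a false positive for the
    # predicted class and once as a false negative for the true class,
    # so sum(fp_i) == sum(fn_i) always holds.
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    fp_i = Counter(p for t, p in zip(y_true, y_pred) if t != p)
    fn_i = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    assert sum(fp_i.values()) == sum(fn_i.values())
    precision = tp / (tp + sum(fp_i.values()))
    recall = tp / (tp + sum(fn_i.values()))
    return precision, recall

p, r = micro_precision_recall([0, 1, 2, 2, 1], [0, 2, 2, 1, 1])
assert p == r == 0.6  # identical, since the misclassification totals match
```

Note this only holds when each sample has exactly one true label and one prediction; in multi-label settings the totals can diverge.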
[GitHub] spark pull request: [SPARK-1199][WIP] Changed wrappers to not use ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1176#issuecomment-46786334 Merged build finished.
[GitHub] spark pull request: [SPARK-1199][WIP] Changed wrappers to not use ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1176#issuecomment-46786336 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16013/
[GitHub] spark pull request: Update BasicOperationsSuite.scala
Github user baishuo commented on the pull request: https://github.com/apache/spark/pull/1084#issuecomment-46786699 let me do a check
[GitHub] spark pull request: SPARK-1416: PySpark support for SequenceFile a...
Github user rjurney commented on the pull request: https://github.com/apache/spark/pull/455#issuecomment-46789138 Thanks, master doesn't build for me. Is there a particular commit you recommend using?
[error]
[error] last tree to typer: Literal(Constant(org.apache.spark.sql.catalyst.types.PrimitiveType))
[error]               symbol: null
[error]    symbol definition: null
[error]                  tpe: Class(classOf[org.apache.spark.sql.catalyst.types.PrimitiveType])
[error]        symbol owners:
[error]       context owners: object TestSQLContext -> package test
[error]
[error] == Enclosing template or block ==
[error]
[error] Template( // val <local TestSQLContext>: <notype> in object TestSQLContext, tree.tpe=org.apache.spark.sql.test.TestSQLContext.type
[error]   org.apache.spark.sql.SQLContext // parents
[error]   ValDef(
[error]     private
[error]     _
[error]     <tpt>
[error]     <empty>
[error]   )
[error]   // 2 statements
[error]   DefDef( // private def readResolve(): Object in object TestSQLContext
[error]     method private synthetic
[error]     readResolve
[error]     []
[error]     List(Nil)
[error]     <tpt> // tree.tpe=Object
[error]     test.this.TestSQLContext // object TestSQLContext in package test, tree.tpe=org.apache.spark.sql.test.TestSQLContext.type
[error]   )
[error]   DefDef( // def <init>(): org.apache.spark.sql.test.TestSQLContext.type in object TestSQLContext
[error]     method
[error]     <init>
[error]     []
[error]     List(Nil)
[error]     <tpt> // tree.tpe=org.apache.spark.sql.test.TestSQLContext.type
[error]     Block( // tree.tpe=Unit
[error]       Apply( // def <init>(sparkContext: org.apache.spark.SparkContext): org.apache.spark.sql.SQLContext in class SQLContext, tree.tpe=org.apache.spark.sql.SQLContext
[error]         TestSQLContext.super.<init> // def <init>(sparkContext: org.apache.spark.SparkContext): org.apache.spark.sql.SQLContext in class SQLContext, tree.tpe=(sparkContext: org.apache.spark.SparkContext)org.apache.spark.sql.SQLContext
[error]         Apply( // def <init>(master: String,appName: String,conf: org.apache.spark.SparkConf): org.apache.spark.SparkContext in class SparkContext, tree.tpe=org.apache.spark.SparkContext
[error]           new org.apache.spark.SparkContext.<init> // def <init>(master: String,appName: String,conf: org.apache.spark.SparkConf): org.apache.spark.SparkContext in class SparkContext, tree.tpe=(master: String, appName: String, conf: org.apache.spark.SparkConf)org.apache.spark.SparkContext
[error]           // 3 arguments
[error]           local
[error]           TestSQLContext
[error]           Apply( // def <init>(): org.apache.spark.SparkConf in class SparkConf, tree.tpe=org.apache.spark.SparkConf
[error]             new org.apache.spark.SparkConf.<init> // def <init>(): org.apache.spark.SparkConf in class SparkConf, tree.tpe=()org.apache.spark.SparkConf
[error]             Nil
[error]           )
[error]         )
[error]       )
[error]       ()
[error]     )
[error]   )
[error] )
[error]
[error] == Expanded type of tree ==
[error]
[error] ConstantType(
[error]   value = Constant(org.apache.spark.sql.catalyst.types.PrimitiveType)
[error] )
[error]
[error] uncaught exception during compilation: java.lang.AssertionError
java.lang.AssertionError: assertion failed: List(object package$DebugNode, object package$DebugNode)
  at scala.reflect.internal.Symbols$Symbol.suchThat(Symbols.scala:1678)
  at scala.reflect.internal.Symbols$ClassSymbol.companionModule0(Symbols.scala:2988)
  at scala.reflect.internal.Symbols$ClassSymbol.companionModule(Symbols.scala:2991)
  at scala.tools.nsc.backend.jvm.GenASM$JPlainBuilder.genClass(GenASM.scala:1371)
  at scala.tools.nsc.backend.jvm.GenASM$AsmPhase.run(GenASM.scala:120)
  at scala.tools.nsc.Global$Run.compileUnitsInternal(Global.scala:1583)
  at scala.tools.nsc.Global$Run.compileUnits(Global.scala:1557)
  at scala.tools.nsc.Global$Run.compileSources(Global.scala:1553)
  at scala.tools.nsc.Global$Run.compile(Global.scala:1662)
  at xsbt.CachedCompiler0.run(CompilerInterface.scala:123)
  at xsbt.CachedCompiler0.run(CompilerInterface.scala:99)
  at xsbt.CompilerInterface.run(CompilerInterface.scala:27)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1173#issuecomment-46789151 Thanks. I've merged this in master.
[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1173
[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1082#issuecomment-46789198 Yup looks good to me.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46789575 I think these are failing because our tests assume that in local mode we enforce the frame size limit (which we actually don't need to). I'll make the appropriate adjustments in a bit.
[GitHub] spark pull request: add a materialize method to materialize Vertex...
GitHub user bxshi opened a pull request: https://github.com/apache/spark/pull/1177 add a materialize method to materialize VertexRDD by calling RDD's count It seems one cannot materialize a VertexRDD by simply calling its count method, which VertexRDD overrides, but calling RDD's count does materialize it. Is this designed so you can get the count without materializing the VertexRDD? If so, do you think it is necessary to add a materialize method to VertexRDD? By the way, is count() the cheapest way to materialize an RDD, or does it cost the same resources as other actions? Best, You can merge this pull request into a Git repository by running: $ git pull https://github.com/bxshi/spark materialize_vertexRDD Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1177.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1177 commit 3be5d6a6f6285c6276d80210bf477c483c09c2f9 Author: bxshi baoxu@gmail.com Date: 2014-06-22T20:39:52Z add a materialize method to materialize VertexRDD by calling RDD's count method
[GitHub] spark pull request: add a materialize method to materialize Vertex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1177#issuecomment-46792651 Can one of the admins verify this patch?
[GitHub] spark pull request: add a materialize method to materialize Vertex...
Github user bxshi commented on the pull request: https://github.com/apache/spark/pull/1177#issuecomment-46792759 Here's some simple code that reproduces the problem
```
val conf = new SparkConf().setAppName("HDTM")
  .setMaster("local[4]")
val sc = new SparkContext(conf)
sc.setCheckpointDir("./checkpoint")
val v = sc.parallelize(Seq[(VertexId, Long)]((0L, 0L), (1L, 1L), (2L, 2L)))
val e = sc.parallelize(Seq[Edge[Long]](Edge(0L, 1L, 0L), Edge(1L, 2L, 1L), Edge(2L, 0L, 2L)))
val g = Graph(v, e)
g.vertices.checkpoint()
g.edges.checkpoint()
g.vertices.count()
g.numEdges
println(s"${g.vertices.isCheckpointed} ${g.edges.isCheckpointed}")
g.vertices.materialize()
println(s"${g.vertices.isCheckpointed} ${g.edges.isCheckpointed}")
```
The first output is `false true` and after calling `materialize` the output is `true true`, which means the VertexRDD is correctly checkpointed.
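The repro above hinges on lazy evaluation: transformations only record work, and the first action (such as `count()`) actually runs it, which is why checkpointing does not take effect until something forces the RDD. A toy sketch of that behavior (plain Python, not Spark's API; `LazyRDD` is a made-up illustration):

```python
# Toy model of a lazily evaluated collection: map() only composes
# functions; count() is the action that actually computes and caches.
class LazyRDD:
    def __init__(self, data, fn=None):
        self._data = data
        self._fn = fn or (lambda x: x)
        self._materialized = None  # filled in by the first action

    def map(self, f):
        # Transformation: composes with the pending function, computes nothing.
        g = self._fn
        return LazyRDD(self._data, lambda x: f(g(x)))

    def count(self):
        # Action: forces evaluation and caches the computed elements.
        if self._materialized is None:
            self._materialized = [self._fn(x) for x in self._data]
        return len(self._materialized)

    @property
    def is_materialized(self):
        return self._materialized is not None

rdd = LazyRDD([1, 2, 3]).map(lambda x: x * 2)
assert not rdd.is_materialized  # nothing computed yet
rdd.count()                     # the action triggers the actual work
assert rdd.is_materialized
```

The subtlety in the PR is that VertexRDD overrides `count` to answer without doing the equivalent of the full evaluation here, so a separate forcing method is needed.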
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46793729 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1172#discussion_r14056893 --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorBackend.scala --- @@ -26,4 +26,7 @@ import org.apache.spark.TaskState.TaskState */ private[spark] trait ExecutorBackend { def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer) + + // Exists as a work around for SPARK-1112. This only exists in branch-1.x of Spark. + def akkaFrameSize(): Long = Long.MaxValue --- End diff -- That change actually alters the expectations of the unit tests, so I went ahead and just enforced the limit in the LocalBackend anyway.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46793847 Merged build triggered.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46793850 Merged build started.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46794756 Merged build finished. All automated tests passed.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46794757 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16014/
[GitHub] spark pull request: [SPARK-2124] Move aggregation into shuffle imp...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/1064#issuecomment-46798422 Hi Matei, thanks for your review, I will update the code soon.
[GitHub] spark pull request: SPARK-2229: FileAppender throw an llegalArgume...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1174#issuecomment-46799166 Thanks. I'm merging this in master. @pwendell - we probably want to run tests on JDK6 ... (if possible both in the build matrix)
[GitHub] spark pull request: SPARK-2229: FileAppender throw an llegalArgume...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1174
[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-46800258 Hi marmbrus, I updated these files per your comment tips, but I think I may have made some mistakes in the code. Could you help me and give me some tips? I will continue to work on it and debug it to make it better. Thanks a lot!
[GitHub] spark pull request: [SQL][SPARK-2212]HashJoin(Shuffled)
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/1147#issuecomment-46800787 Thank you all for the comments, I will change some of the code accordingly. This PR actually contains 2 relevant parts:
- Code refactor for Join
  - Removed `FilteredOperation` from patterns.scala, because the filters (WHERE condition and JOIN condition) have already been pushed down via `PushPredicateThroughJoin` in logical/Optimizer.scala. Discarding the combination of filters (where and join condition) makes the join pattern match cleaner and simpler.
  - Pattern matching order is actually very critical for the join operator selection in SparkStrategies.scala, hence I merged the 3 join strategies into 1.
  - The trait `BinaryJoinNode` can be utilized by `HashJoin` / `SortMergeJoin` (will implement soon) / `CartesianProduct` (inner join) / map-side join (Left/Inner/LeftSemi, assuming the right table is the build table) for all of the join types; and if we want to add code gen for the join condition, all we need to modify is the trait `BinaryJoinNode`.
- Add outer join support for HashJoin
  - With `BinaryJoinNode`, adding hash-based outer join support is easy.
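The hash-based outer join the last bullet refers to can be sketched in miniature: build a hash table on the build-side rows, probe it with the streamed side, and pad unmatched streamed rows with nulls. This is a toy illustration (plain Python, not the Spark SQL operator; all names are made up):

```python
from collections import defaultdict

def hash_left_outer_join(left, right, key_left, key_right):
    # Build phase: hash the build-side (right) rows by join key.
    table = defaultdict(list)
    for row in right:
        table[key_right(row)].append(row)
    # Probe phase: stream the left side, padding misses with None.
    out = []
    for row in left:
        matches = table.get(key_left(row))
        if matches:
            out.extend((row, m) for m in matches)
        else:
            out.append((row, None))  # outer-join padding for unmatched rows
    return out

left = [(1, "a"), (2, "b"), (3, "c")]
right = [(1, "x"), (1, "y")]
result = hash_left_outer_join(left, right, lambda r: r[0], lambda r: r[0])
# (1, "a") matches twice; (2, "b") and (3, "c") are padded with None
```

The same build/probe skeleton serves inner and left-semi joins by changing only what is emitted in the probe phase, which is the kind of sharing a common join trait enables.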
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46801344 @aarondav mind taking a final pass and merging this?
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46801472 Absolutely. LGTM, merging into branch-1.0.
[GitHub] spark pull request: Compression should be a setting for individual...
Github user ScrapCodes closed the pull request at: https://github.com/apache/spark/pull/1091
[GitHub] spark pull request: Compression should be a setting for individual...
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/1091#issuecomment-46801657 Thanks @rxin
[GitHub] spark pull request: SPARK-1937: fix issue with task locality
Github user lirui-intel commented on the pull request: https://github.com/apache/spark/pull/892#issuecomment-46801884 Sorry about the code style, and thanks @mateiz for pointing it out. I've updated the patch.
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46801936 You may have to close this manually, @pwendell, I'm not sure github will close it if it's not in master.
[GitHub] spark pull request: SPARK-1937: fix issue with task locality
Github user lirui-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/892#discussion_r14059200 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -181,16 +181,14 @@ private[spark] class TaskSetManager(
     var hadAliveLocations = false
     for (loc <- tasks(index).preferredLocations) {
       for (execId <- loc.executorId) {
-        if (sched.isExecutorAlive(execId)) {
-          addTo(pendingTasksForExecutor.getOrElseUpdate(execId, new ArrayBuffer))
-          hadAliveLocations = true
-        }
+        addTo(pendingTasksForExecutor.getOrElseUpdate(execId, new ArrayBuffer))
       }
       if (sched.hasExecutorsAliveOnHost(loc.host)) {
-        addTo(pendingTasksForHost.getOrElseUpdate(loc.host, new ArrayBuffer))
-        for (rack <- sched.getRackForHost(loc.host)) {
-          addTo(pendingTasksForRack.getOrElseUpdate(rack, new ArrayBuffer))
-        }
+        hadAliveLocations = true
       }
+      addTo(pendingTasksForHost.getOrElseUpdate(loc.host, new ArrayBuffer))
+      for (rack <- sched.getRackForHost(loc.host)) {
+        addTo(pendingTasksForRack.getOrElseUpdate(rack, new ArrayBuffer))
         hadAliveLocations = true
--- End diff -- Do you mean the TaskScheduler should provide something like hasHostOnRack, and we have to check that before setting hadAliveLocations to true?
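The data structures being reorganized in this diff follow a common scheduler pattern: each pending task index is registered under its preferred executor, host, and rack, so tasks can later be served at the best locality level available. A toy sketch of that registration (plain Python, not the real TaskSetManager; `rack_for_host` is an assumed lookup):

```python
from collections import defaultdict

class PendingTasks:
    """Register a task index under every locality level of its preferred
    locations, mirroring the executor/host/rack pending lists above."""

    def __init__(self, rack_for_host):
        self.for_executor = defaultdict(list)
        self.for_host = defaultdict(list)
        self.for_rack = defaultdict(list)
        self.rack_for_host = rack_for_host  # host -> rack mapping, assumed known

    def add_task(self, index, preferred_locations):
        # preferred_locations: list of (host, executor_id_or_None)
        for host, exec_id in preferred_locations:
            if exec_id is not None:
                self.for_executor[exec_id].append(index)  # PROCESS_LOCAL
            self.for_host[host].append(index)             # NODE_LOCAL
            rack = self.rack_for_host.get(host)
            if rack is not None:
                self.for_rack[rack].append(index)         # RACK_LOCAL

p = PendingTasks({"host1": "rackA", "host2": "rackA"})
p.add_task(0, [("host1", "exec-1")])
p.add_task(1, [("host2", None)])
assert p.for_executor["exec-1"] == [0]
assert p.for_rack["rackA"] == [0, 1]
```

The question in the thread is essentially whether registration should be gated on the location (executor, host, or rack) currently being alive, or done unconditionally with liveness checked at offer time.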
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user pwendell closed the pull request at: https://github.com/apache/spark/pull/1172
[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1172#issuecomment-46802157 Thanks, closed.
[GitHub] spark pull request: [WIP] [SQL] SPARK-1800 Add broadcast hash join...
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1163#discussion_r14059227 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetRelation.scala ---
@@ -44,10 +49,21 @@ import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, LeafNode}
  * @param path The path to the Parquet file.
  */
 private[sql] case class ParquetRelation(
-    val path: String,
-    @transient val conf: Option[Configuration] = None) extends LeafNode with MultiInstanceRelation {
+    path: String,
+    @transient conf: Option[Configuration] = None)
+  extends LeafNode
+  with MultiInstanceRelation
+  with SizeEstimatableRelation[SQLContext] {
+  self: Product =>
+
+  def estimatedSize(context: SQLContext): Long = {
--- End diff -- Here we could probably estimate the size more accurately if we also had some semantic information, like which columns we wanted, as I believe Parquet stores stats for each column. Perhaps worthy of a TODO, this seems perfectly reasonable for now.
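The idea in this comment can be sketched with hypothetical numbers: columnar formats record a size per column chunk in their file metadata, so a scan that projects only some columns can be estimated by summing just those columns' recorded sizes rather than using the whole file size. Illustrative only (made-up stats, not the Parquet metadata API):

```python
# Assumed per-column compressed sizes, as a columnar file's footer
# might record them; the names and numbers are hypothetical.
column_chunk_bytes = {
    "user_id":   4_000_000,
    "timestamp": 6_000_000,
    "payload":  90_000_000,
}

def estimated_scan_size(projected_columns, stats):
    """Sum the recorded chunk sizes of just the projected columns,
    ignoring columns absent from the stats."""
    return sum(stats[c] for c in projected_columns if c in stats)

full = sum(column_chunk_bytes.values())
narrow = estimated_scan_size(["user_id", "timestamp"], column_chunk_bytes)
assert narrow == 10_000_000 and narrow < full  # projection-aware estimate is far smaller
```

A projection-aware estimate like this can flip a planner's decision, e.g. letting a wide table still qualify for a broadcast join when only its narrow columns are read.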
[GitHub] spark pull request: spark-ec2: quote command line args
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1169#issuecomment-46803411 Thanks - I merged this into several maintenance branches and I also created this JIRA to track it: https://issues.apache.org/jira/browse/SPARK-2241
[GitHub] spark pull request: SPARK-2166 - Listing of instances to be termin...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/270#issuecomment-46804399 This was actually a pretty tough merge since we changed the spacing around a lot in `spark_ec2` recently. I went ahead and manually dealt with the merge. I also made two minor changes on merge.
[GitHub] spark pull request: SPARK-2166 - Listing of instances to be termin...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/270
[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...
Github user rahulsinghaliitd commented on the pull request: https://github.com/apache/spark/pull/1094#issuecomment-46804664 @sryza thanks for the thumbs up. Although I wonder if the approach in https://github.com/apache/spark/pull/1112 is better for passing the UI address (certainly is much cleaner).
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-46805087 Sure - it would be great to add a general heartbeat mechanism that is shared between this and the blockmanager.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: Do not mask pyspark stder...
GitHub user andrewor14 opened a pull request: https://github.com/apache/spark/pull/1178 [SPARK-2242] HOTFIX: Do not mask pyspark stderr from output This reverts a change introduced in 3870248740d83b0292ccca88a494ce19783847f0 that masked stderr from surfacing to the `bin/pyspark` shell output. This is not a bug by itself, but if, for example, your `spark.master` is not specified correctly, your Spark jobs just hang without any output instead of indicating that they cannot connect to the master. That commit was not merged in branch-1.0, so this fix is for master only. You can merge this pull request into a Git repository by running: $ git pull https://github.com/andrewor14/spark fix-python Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1178.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1178 commit 21c9d7c5af9d1647b496734dcd8fa3901bf8b19a Author: Andrew Or andrewo...@gmail.com Date: 2014-06-23T04:10:04Z Do not mask stderr from output
[GitHub] spark pull request: [SPARK-2242] HOTFIX: Do not mask pyspark stder...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46805167 Merged build triggered.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: Do not mask pyspark stder...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46805169 Merged build started.
[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...
Github user adrian-wang commented on a diff in the pull request: https://github.com/apache/spark/pull/1151#discussion_r14060166

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -369,6 +369,17 @@ class SQLQuerySuite extends QueryTest {
       (3, null)))
   }

+  test("subtract") {
+    checkAnswer(
+      sql("SELECT * FROM lowerCaseData SUBTRACT SELECT * FROM upperCaseData "),
+      (1, "a") ::
+      (2, "b") ::
+      (3, "c") ::
+      4, "d") :: Nil)
--- End diff --

Maybe you missed a '(' here
[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...
Github user adrian-wang commented on a diff in the pull request: https://github.com/apache/spark/pull/1151#discussion_r14060224

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -119,6 +119,7 @@ class SqlParser extends StandardTokenParsers with PackratParsers {
   protected val UNCACHE = Keyword("UNCACHE")
   protected val UNION = Keyword("UNION")
   protected val WHERE = Keyword("WHERE")
+  protected val SUBTRACT = Keyword("SUBTRACT")
--- End diff --

I think we'd better use MINUS or EXCEPT instead of SUBTRACT
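Whatever the keyword is called, the operator under discussion computes a set difference: the distinct rows of the left relation that do not appear in the right relation, which is what standard SQL spells EXCEPT (Oracle spells it MINUS), hence the naming suggestion. A minimal sketch of those semantics on plain Scala collections; the `except` helper is hypothetical, for illustration only, and is not part of Spark SQL:

```scala
object ExceptDemo {
  // SQL EXCEPT semantics: keep each distinct row of the left relation
  // that does not occur anywhere in the right relation.
  def except[A](left: Seq[A], right: Seq[A]): Seq[A] = {
    val rightSet = right.toSet
    left.distinct.filterNot(rightSet.contains)
  }

  def main(args: Array[String]): Unit = {
    val lower = Seq((1, "a"), (2, "b"), (3, "c"), (4, "d"))
    val upper = Seq((2, "b"), (4, "d"))
    println(except(lower, upper)) // rows unique to the left relation
  }
}
```

Note that EXCEPT also deduplicates the left side, which is why the sketch calls `distinct`; EXCEPT ALL, which preserves duplicate counts, would need multiset bookkeeping instead.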
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46806151 Merged build triggered.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46806156 Merged build started.
[GitHub] spark pull request: [SPARK-1946] Submit stage after (configured ra...
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/900#discussion_r14060589

--- Diff: yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import org.apache.spark.{Logging, SparkContext}
+import org.apache.spark.deploy.yarn.ApplicationMasterArguments
+import org.apache.spark.scheduler.TaskSchedulerImpl
+
+import scala.collection.mutable.ArrayBuffer
+
+private[spark] class YarnClusterSchedulerBackend(
+    scheduler: TaskSchedulerImpl,
+    sc: SparkContext)
+  extends CoarseGrainedSchedulerBackend(scheduler, sc.env.actorSystem)
+  with Logging {
+
+  private[spark] def addArg(optionName: String, envVar: String, sysProp: String,
+      arrayBuf: ArrayBuffer[String]) {
+    if (System.getenv(envVar) != null) {
+      arrayBuf += (optionName, System.getenv(envVar))
+    } else if (sc.getConf.contains(sysProp)) {
+      arrayBuf += (optionName, sc.getConf.get(sysProp))
+    }
+  }
+
+  override def start() {
+    super.start()
+    val argsArrayBuf = new ArrayBuffer[String]()
+    List(("--num-executors", "SPARK_EXECUTOR_INSTANCES", "spark.executor.instances"),
+      ("--num-executors", "SPARK_WORKER_INSTANCES", "spark.worker.instances"))
+      .foreach { case (optName, envVar, sysProp) => addArg(optName, envVar, sysProp, argsArrayBuf) }
+    val args = new ApplicationMasterArguments(argsArrayBuf.toArray)
+    totalExecutors.set(args.numExecutors)
--- End diff --

@kayousterhout Here ApplicationMasterArguments is used to get the default value of numExecutors (it's 2, now).
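The `addArg` helper in the diff above encodes a two-level precedence: the environment variable wins, the Spark conf entry is used only as a fallback, and nothing is appended when neither is set. A runnable sketch of that precedence logic, with the environment and conf passed in as plain Maps so it runs without a SparkContext (an assumption for illustration; the real code reads `System.getenv` and `sc.getConf`):

```scala
import scala.collection.mutable.ArrayBuffer

object ArgFallbackDemo {
  // Same precedence as addArg above: env var first, conf entry second,
  // no-op when both are absent. Appends "--option value" pairs to buf.
  def addArg(
      optionName: String,
      envVar: String,
      sysProp: String,
      env: Map[String, String],
      conf: Map[String, String],
      buf: ArrayBuffer[String]): Unit = {
    env.get(envVar).orElse(conf.get(sysProp)).foreach { value =>
      buf ++= Seq(optionName, value)
    }
  }
}
```

`Option.orElse` collapses the if/else-if chain of the original into one expression while keeping the same lookup order.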
[GitHub] spark pull request: SPARK-1729. Make Flume pull data from source, ...
Github user harishreedharan commented on the pull request: https://github.com/apache/spark/pull/807#issuecomment-46806458 @tdas - Have you gotten a chance to take a look at this? Thanks!
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46806569 Merged build finished. All automated tests passed.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46806570 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16015/
[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1151#discussion_r14060925

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -369,6 +369,17 @@ class SQLQuerySuite extends QueryTest {
       (3, null)))
   }

+  test("subtract") {
+    checkAnswer(
+      sql("SELECT * FROM lowerCaseData SUBTRACT SELECT * FROM upperCaseData "),
+      (1, "a") ::
+      (2, "b") ::
+      (3, "c") ::
+      4, "d") :: Nil)
--- End diff --

Thanks, I have corrected it
[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1151#discussion_r14060932

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -119,6 +119,7 @@ class SqlParser extends StandardTokenParsers with PackratParsers {
   protected val UNCACHE = Keyword("UNCACHE")
   protected val UNION = Keyword("UNION")
   protected val WHERE = Keyword("WHERE")
+  protected val SUBTRACT = Keyword("SUBTRACT")
--- End diff --

Thanks, I have corrected it
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46807454 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16016/
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46807453 Merged build finished.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-46807757 Jenkins test this please