[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/1172#discussion_r14053159
  
--- Diff: 
core/src/main/scala/org/apache/spark/executor/ExecutorBackend.scala ---
@@ -26,4 +26,7 @@ import org.apache.spark.TaskState.TaskState
  */
 private[spark] trait ExecutorBackend {
   def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer)
+
+  // Exists as a work around for SPARK-1112. This only exists in branch-1.x of Spark.
+  def akkaFrameSize(): Long = Long.MaxValue
--- End diff --

The `MesosExecutorBackend` sends results through Mesos, not Akka. The
`LocalBackend` sends a message to an actor within the same actor system, which
I assume won't go over TCP.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46773549
  
Jenkins, retest this please.




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46773550
  
 Merged build triggered. 




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46773551
  
Merged build started. 




[GitHub] spark pull request: SPARK-1416: PySpark support for SequenceFile a...

2014-06-22 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/455#issuecomment-46773577
  
1.1 is not released yet. This PR is in master but not in 1.0 (it may be
released in 1.0.1, or if not, then in 1.1).

So you'll have to clone master and run `sbt/sbt publish-local`, which will
publish the Maven and sbt artifacts to your local repos.



—
Sent from Mailbox

On Sun, Jun 22, 2014 at 1:22 AM, Russell Jurney notificati...@github.com
wrote:

 Thanks a ton! One thing - how can I pull spark core 1.1 from maven?
 [ERROR] Failed to execute goal on project avro: Could not resolve
 dependencies for project example:avro:jar:0.1: Could not find artifact
 org.apache.spark:spark-core_2.10:jar:1.1.0-SNAPSHOT in scala-tools.org (
 http://scala-tools.org/repo-releases) - [Help 1]
 On Fri, Jun 20, 2014 at 10:45 PM, MLnick notificati...@github.com wrote:
 @rjurney https://github.com/rjurney this works for me (building Spark
 from current master): https://gist.github.com/MLnick/5864741781b9340cb211

 if you run mvn package and then add that to SPARK_CLASSPATH and use it in
 IPython console.

 However it seems to come through as only strings (not a dict). I verified
 that if I take only the string field and explicitly convert to string
 (i.e. Map[String, String]) then it works. I suspect then that Avro doesn't
 have the type information at all, so Pyrolite cannot pickle it. I guess you
 might have to do something more in depth in the AvroConverter to read the
 type info from the Avro schema and do a cast...

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/455#issuecomment-46745394.

 -- 
 Russell Jurney twitter.com/rjurney russell.jur...@gmail.com 
datasyndrome.com
 ---
 Reply to this email directly or view it on GitHub:
 https://github.com/apache/spark/pull/455#issuecomment-46767642




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46773627
  
Jenkins, retest this please.




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46773676
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46773677
  
Merged build started. 




[GitHub] spark pull request: SPARK-1996. Remove use of special Maven repo f...

2014-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1170




[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...

2014-06-22 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/980#discussion_r14053246
  
--- Diff: 
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaInputDStream.scala
 ---
@@ -112,10 +114,14 @@ class KafkaReceiver[
 val topicMessageStreams = consumerConnector.createMessageStreams(
   topics, keyDecoder, valueDecoder)
 
-
-// Start the messages handler for each partition
-topicMessageStreams.values.foreach { streams =>
-  streams.foreach { stream => executorPool.submit(new MessageHandler(stream)) }
+val executorPool = Executors.newFixedThreadPool(topics.values.sum)
--- End diff --

Minor, but to avoid a name collision with Spark's own `Executor`, we
usually try to name variables like this `threadPool`.
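
A minimal sketch of the naming suggestion, assuming nothing about the real receiver beyond the `Executors.newFixedThreadPool(topics.values.sum)` line in the diff. The `topics` contents, the `ThreadPoolSketch` object, and `runHandlers` are hypothetical names used only for illustration; the real code submits a `MessageHandler` per stream rather than a counter.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object ThreadPoolSketch {
  // Hypothetical stand-in for the receiver's `topics` map: topic name -> consumer threads.
  val topics: Map[String, Int] = Map("clicks" -> 2, "views" -> 3)

  // Runs one task per requested consumer thread and returns how many tasks ran.
  def runHandlers(): Int = {
    val handled = new AtomicInteger(0)
    // Named `threadPool` rather than `executorPool`, following the review comment.
    val threadPool = Executors.newFixedThreadPool(topics.values.sum)
    try {
      topics.foreach { case (_, numThreads) =>
        (1 to numThreads).foreach { _ =>
          threadPool.submit(new Runnable {
            override def run(): Unit = handled.incrementAndGet()
          })
        }
      }
    } finally {
      threadPool.shutdown() // stop accepting work; already queued tasks still run
      threadPool.awaitTermination(10, TimeUnit.SECONDS)
    }
    handled.get()
  }
}
```

Shutting the pool down in a `finally` block is one way to make sure the threads are released even if submission fails.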




[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...

2014-06-22 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/980#discussion_r14053248
  
--- Diff: 
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaInputDStream.scala
 ---
@@ -112,10 +114,14 @@ class KafkaReceiver[
 val topicMessageStreams = consumerConnector.createMessageStreams(
   topics, keyDecoder, valueDecoder)
 
-
-// Start the messages handler for each partition
-topicMessageStreams.values.foreach { streams =>
-  streams.foreach { stream => executorPool.submit(new MessageHandler(stream)) }
+val executorPool = Executors.newFixedThreadPool(topics.values.sum)
--- End diff --

I see that you didn't actually add this name, so never mind!




[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/980#issuecomment-46774049
  
LGTM pending tests. Jenkins retest this please.




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46774097
  
Merged build finished. All automated tests passed.




[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/980#issuecomment-46774099
  
 Merged build triggered. 




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46774098
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16006/




[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/980#issuecomment-46774102
  
Merged build started. 




[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...

2014-06-22 Thread pwendell
GitHub user pwendell opened a pull request:

https://github.com/apache/spark/pull/1175

SPARK-2231: dev/run-tests should include YARN and use a recent Hadoop 
version

...rsion

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pwendell/spark test-hadoop-version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1175


commit 9210ef400f7e907d9f67f396bbb7806558d61930
Author: Patrick Wendell pwend...@gmail.com
Date:   2014-06-22T06:56:02Z

SPARK-2231: dev/run-tests should include YARN and use a recent Hadoop 
version






[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1175#issuecomment-46774162
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46774166
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16007/




[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1175#issuecomment-46774164
  
Merged build started. 




[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...

2014-06-22 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/1082#discussion_r14053327
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala ---
@@ -559,6 +559,19 @@ class JavaSparkContext(val sc: SparkContext) extends JavaSparkContextVarargsWork
   def getLocalProperty(key: String): String = sc.getLocalProperty(key)
 
   /**
+   * Get a set of RDD IDs that have marked themselves as persistent via cache() call.
+   * Note that this does not necessarily mean the caching or computation was successful.
+   */
+  def getPersistentRddIds(): java.util.Set[Int] =
+setAsJavaSet(sc.getPersistentRDDs.keySet)
+
+  /**
+   * Unpersist an RDD from memory and/or disk storage
--- End diff --

Minor: needs to end with a `.`
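
For readers unfamiliar with the conversion in the diff, here is a small sketch of exposing the key set of a Scala map as a `java.util.Set[Int]`. It is an illustration under assumptions, not the PR's code: the `persistentRdds` map is a hypothetical stand-in for `sc.getPersistentRDDs`, and it uses `JavaConverters` with `.asJava` where the diff calls `setAsJavaSet`.

```scala
import scala.collection.JavaConverters._

object PersistentRddIdsSketch {
  // Hypothetical stand-in for sc.getPersistentRDDs: RDD id -> a label for the RDD.
  val persistentRdds: Map[Int, String] = Map(3 -> "lines", 7 -> "words")

  /** Returns only the ids, wrapped as a java.util.Set so Java callers need no Scala collections. */
  def getPersistentRddIds(): java.util.Set[Int] = persistentRdds.keySet.asJava
}
```

As the Scaladoc in the diff notes, an id appearing in this set only means the RDD was marked as persistent, not that caching succeeded.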




[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1082#issuecomment-46774517
  
LGTM with a minor comment that can be addressed on merge. @rxin any further 
comments?




[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/980#issuecomment-46774624
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16008/




[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1175#issuecomment-46774725
  
Merged build finished. All automated tests passed.




[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1175#issuecomment-46774727
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16009/




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread aarondav
Github user aarondav commented on a diff in the pull request:

https://github.com/apache/spark/pull/1172#discussion_r14053359
  
--- Diff: 
core/src/main/scala/org/apache/spark/executor/ExecutorBackend.scala ---
@@ -26,4 +26,7 @@ import org.apache.spark.TaskState.TaskState
  */
 private[spark] trait ExecutorBackend {
   def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer)
+
+  // Exists as a work around for SPARK-1112. This only exists in branch-1.x of Spark.
+  def akkaFrameSize(): Long = Long.MaxValue
--- End diff --

I see. So the only real change is that in certain cases the LocalBackend 
will no longer use the BlockManager for returning results. Sounds fine.
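
One reading of this exchange, sketched under assumptions: a backend that keeps the default `akkaFrameSize()` of `Long.MaxValue` never exceeds the limit, so its results are always sent directly, while a backend with a real frame limit falls back to the block manager for large results. Only `akkaFrameSize()` and its default come from the diff; `ResultRoutingSketch`, the two backend classes, `Route`, and `route` are hypothetical names, and the real check may also subtract a reserved overhead.

```scala
object ResultRoutingSketch {
  // Mirrors the hook in the diff: backends that do not send results over Akka
  // keep the default, so the size check below never triggers for them.
  trait Backend {
    def akkaFrameSize(): Long = Long.MaxValue
  }

  // Hypothetical backend whose results travel in Akka messages with a frame limit.
  class AkkaBackedBackend(limitBytes: Long) extends Backend {
    override def akkaFrameSize(): Long = limitBytes
  }

  // Hypothetical backend (for example Mesos or local) that keeps the default.
  class UnlimitedBackend extends Backend

  sealed trait Route
  case object Direct extends Route          // ship the serialized result in the status update
  case object ViaBlockManager extends Route // store the result and send back only a reference

  // Assumed decision rule: results at or above the frame size go through the block manager.
  def route(backend: Backend, resultBytes: Long): Route =
    if (resultBytes >= backend.akkaFrameSize()) ViaBlockManager else Direct
}
```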




[GitHub] spark pull request: SPARK-2231: dev/run-tests should include YARN ...

2014-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1175




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46775109
  
Jenkins, retest this please.




[GitHub] spark pull request: SPARK-2034. KafkaInputDStream doesn't close re...

2014-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/980




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46775157
  
Merged build started. 




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46775153
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46775720
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16010/




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46775719
  
Merged build finished. 




[GitHub] spark pull request: SPARK-2229: FileAppender throw an llegalArgume...

2014-06-22 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/1174#issuecomment-46778033
  
+1. I literally ran into this too 6 hours ago and had the same fix. It's
from the change for SPARK-1940. As a result, I think it would be a good idea
for the tests to be run on Java 6. This is another of several issues that
would have been caught by that.




[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...

2014-06-22 Thread avulanov
Github user avulanov commented on the pull request:

https://github.com/apache/spark/pull/1155#issuecomment-46778186
  
The micro-averaged precision and recall are equal for a multiclass
classifier, because sum(fn_i) = sum(fp_i), i.e. both are just the sum of all
off-diagonal elements of the confusion matrix. The F1-measure, as the harmonic
mean of two equal numbers, also equals P and R. For more details please refer
to the book "Introduction to Information Retrieval" by Manning et al.
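
The argument can be checked on a small example. The following is a sketch with a made-up 3-class confusion matrix (rows are true labels, columns are predictions); the matrix values and the `MicroAverageSketch` name are illustrative assumptions, not data from the PR.

```scala
object MicroAverageSketch {
  // Hypothetical 3-class confusion matrix: rows are true labels, columns are predictions.
  val confusion: Array[Array[Int]] = Array(
    Array(5, 1, 0),
    Array(2, 3, 1),
    Array(0, 2, 4)
  )

  private val n = confusion.length

  // True positives are the diagonal entries.
  val tp: Int = (0 until n).map(i => confusion(i)(i)).sum
  // False positives of class i: column i without the diagonal, summed over all classes.
  val fp: Int = (0 until n).map(i => (0 until n).filter(_ != i).map(j => confusion(j)(i)).sum).sum
  // False negatives of class i: row i without the diagonal, summed over all classes.
  val fn: Int = (0 until n).map(i => (0 until n).filter(_ != i).map(j => confusion(i)(j)).sum).sum

  // Micro-averaged metrics pool the counts over all classes before dividing.
  val precision: Double = tp.toDouble / (tp + fp)
  val recall: Double = tp.toDouble / (tp + fn)
  val f1: Double = 2 * precision * recall / (precision + recall)
}
```

Both `fp` and `fn` count every off-diagonal cell exactly once, which is why the two sums coincide regardless of the matrix chosen.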




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46778249
  
Merged build started. 




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46778248
  
 Merged build triggered. 




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46779166
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16011/




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46779165
  
Merged build finished. All automated tests passed.




[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...

2014-06-22 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/906#issuecomment-46781514
  
Ah OK, it did fail for me locally with `sbt clean assembly test`. Sorry,
this did in fact have a problem. I think Akka does need the old Netty; the
second commit was a change too far. The first commit is the one cleaning up the
immediate issue. I dropped the second commit and rebased, and all is well. Let's
see what Jenkins makes of it.




[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/906#issuecomment-46781581
  
Merged build started. 




[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/906#issuecomment-46782563
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16012/




[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/906#issuecomment-46782562
  
Merged build finished. All automated tests passed.




[GitHub] spark pull request: [Spark 1199][WIP] Changed wrappers to not use ...

2014-06-22 Thread ScrapCodes
GitHub user ScrapCodes opened a pull request:

https://github.com/apache/spark/pull/1176

[Spark 1199][WIP] Changed wrappers to not use vals and thus avoid Path 
dependent types problem.

TODO: Write description. Basically, it fails for one particular scenario and 
I am having a tough time debugging it :)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ScrapCodes/spark-1 
SPARK-1199/repl-case-class-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1176.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1176


commit c2a6498bb40253da9195d01927caa2748919ad96
Author: Prashant Sharma prashan...@imaginea.com
Date:   2014-06-18T12:34:12Z

Back porting scala 2.11 SI-7747's changes on top of my patch.

commit fa7ffca15d0d6cd1c8e2a0064ba4f12f35d5f263
Author: Prashant Sharma prashan...@imaginea.com
Date:   2014-06-19T12:06:08Z

Added a convenience for debugging the generated wrappers as it exists in 
scala 2.11 repl.






[GitHub] spark pull request: [Spark 1199][WIP] Changed wrappers to not use ...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1176#issuecomment-46785360
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-1199][WIP] Changed wrappers to not use ...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1176#issuecomment-46785365
  
Merged build started. 




[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...

2014-06-22 Thread xiejuncs
Github user xiejuncs commented on the pull request:

https://github.com/apache/spark/pull/1155#issuecomment-46786256
  
It makes sense. You are right. sum(fni)=sum(fpi). The recall and precision 
are the same. Thanks very much.
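
The equality mentioned above can be checked on a small confusion matrix. This is a minimal sketch, not code from the pull request: the object name `MicroAverages`, its method names, and the example matrix are all assumptions made for illustration. It only shows that when false positives and false negatives are summed over every class, both sums count the off-diagonal entries, so micro-averaged precision and recall coincide.

```scala
// Hypothetical helper, not part of MLlib: micro-averaged metrics from a
// confusion matrix where confusion(i)(j) counts examples whose true label
// is i and whose predicted label is j.
object MicroAverages {
  // Correct predictions sit on the diagonal.
  def truePositives(confusion: Array[Array[Long]]): Long =
    confusion.indices.map(i => confusion(i)(i)).sum

  // False positives for class j: column total minus the diagonal entry.
  def falsePositives(confusion: Array[Array[Long]]): Long =
    confusion.indices.map { j =>
      confusion.indices.map(i => confusion(i)(j)).sum - confusion(j)(j)
    }.sum

  // False negatives for class i: row total minus the diagonal entry.
  def falseNegatives(confusion: Array[Array[Long]]): Long =
    confusion.indices.map(i => confusion(i).sum - confusion(i)(i)).sum

  def microPrecision(confusion: Array[Array[Long]]): Double = {
    val tp = truePositives(confusion)
    tp.toDouble / (tp + falsePositives(confusion))
  }

  def microRecall(confusion: Array[Array[Long]]): Double = {
    val tp = truePositives(confusion)
    tp.toDouble / (tp + falseNegatives(confusion))
  }
}
```

Both sums run over the same off-diagonal cells, once by column and once by row, which is why the two micro-averaged values match for any matrix.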




[GitHub] spark pull request: [SPARK-1199][WIP] Changed wrappers to not use ...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1176#issuecomment-46786334
  
Merged build finished. 




[GitHub] spark pull request: [SPARK-1199][WIP] Changed wrappers to not use ...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1176#issuecomment-46786336
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16013/




[GitHub] spark pull request: Update BasicOperationsSuite.scala

2014-06-22 Thread baishuo
Github user baishuo commented on the pull request:

https://github.com/apache/spark/pull/1084#issuecomment-46786699
  
let me do a check




[GitHub] spark pull request: SPARK-1416: PySpark support for SequenceFile a...

2014-06-22 Thread rjurney
Github user rjurney commented on the pull request:

https://github.com/apache/spark/pull/455#issuecomment-46789138
  
Thanks, master doesn't build for me. Is there a particular commit you
recommend using?

[error]

[error]   last tree to typer:
Literal(Constant(org.apache.spark.sql.catalyst.types.PrimitiveType))

[error]   symbol: null

[error]symbol definition: null

[error]  tpe:
Class(classOf[org.apache.spark.sql.catalyst.types.PrimitiveType])

[error]symbol owners:

[error]   context owners: object TestSQLContext - package test

[error]

[error] == Enclosing template or block ==

[error]

[error] Template( // val local TestSQLContext: notype in object
TestSQLContext, tree.tpe=org.apache.spark.sql.test.TestSQLContext.type

[error]   org.apache.spark.sql.SQLContext // parents

[error]   ValDef(

[error] private

[error] _

[error] tpt

[error] empty

[error]   )

[error]   // 2 statements

[error]   DefDef( // private def readResolve(): Object in object
TestSQLContext

[error] method private synthetic

[error] readResolve

[error] []

[error] List(Nil)

[error] tpt // tree.tpe=Object

[error] test.this.TestSQLContext // object TestSQLContext in package
test, tree.tpe=org.apache.spark.sql.test.TestSQLContext.type

[error]   )

[error]   DefDef( // def init():
org.apache.spark.sql.test.TestSQLContext.type in object TestSQLContext

[error] method

[error] init

[error] []

[error] List(Nil)

[error] tpt // tree.tpe=org.apache.spark.sql.test.TestSQLContext.type

[error] Block( // tree.tpe=Unit

[error]   Apply( // def init(sparkContext:
org.apache.spark.SparkContext): org.apache.spark.sql.SQLContext in class
SQLContext, tree.tpe=org.apache.spark.sql.SQLContext

[error] TestSQLContext.super.init // def init(sparkContext:
org.apache.spark.SparkContext): org.apache.spark.sql.SQLContext in class
SQLContext, tree.tpe=(sparkContext:
org.apache.spark.SparkContext)org.apache.spark.sql.SQLContext

[error] Apply( // def init(master: String,appName: String,conf:
org.apache.spark.SparkConf): org.apache.spark.SparkContext in class
SparkContext, tree.tpe=org.apache.spark.SparkContext

[error]   new org.apache.spark.SparkContext.init // def
init(master: String,appName: String,conf: org.apache.spark.SparkConf):
org.apache.spark.SparkContext in class SparkContext, tree.tpe=(master:
String, appName: String, conf:
org.apache.spark.SparkConf)org.apache.spark.SparkContext

[error]   // 3 arguments

[error]   local

[error]   TestSQLContext

[error]   Apply( // def init(): org.apache.spark.SparkConf in
class SparkConf, tree.tpe=org.apache.spark.SparkConf

[error] new org.apache.spark.SparkConf.init // def
init(): org.apache.spark.SparkConf in class SparkConf,
tree.tpe=()org.apache.spark.SparkConf

[error] Nil

[error]   )

[error] )

[error]   )

[error]   ()

[error] )

[error]   )

[error] )

[error]

[error] == Expanded type of tree ==

[error]

[error] ConstantType(

[error]   value =
Constant(org.apache.spark.sql.catalyst.types.PrimitiveType)

[error] )

[error]

[error] uncaught exception during compilation: java.lang.AssertionError

java.lang.AssertionError: assertion failed: List(object package$DebugNode,
object package$DebugNode)

at scala.reflect.internal.Symbols$Symbol.suchThat(Symbols.scala:1678)

at

scala.reflect.internal.Symbols$ClassSymbol.companionModule0(Symbols.scala:2988)

at

scala.reflect.internal.Symbols$ClassSymbol.companionModule(Symbols.scala:2991)

at
scala.tools.nsc.backend.jvm.GenASM$JPlainBuilder.genClass(GenASM.scala:1371)

at scala.tools.nsc.backend.jvm.GenASM$AsmPhase.run(GenASM.scala:120)

at scala.tools.nsc.Global$Run.compileUnitsInternal(Global.scala:1583)

at scala.tools.nsc.Global$Run.compileUnits(Global.scala:1557)

at scala.tools.nsc.Global$Run.compileSources(Global.scala:1553)

at scala.tools.nsc.Global$Run.compile(Global.scala:1662)

at xsbt.CachedCompiler0.run(CompilerInterface.scala:123)

at xsbt.CachedCompiler0.run(CompilerInterface.scala:99)

at xsbt.CompilerInterface.run(CompilerInterface.scala:27)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at


[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1173#issuecomment-46789151
  
Thanks. I've merged this in master.




[GitHub] spark pull request: SPARK-1316. Remove use of Commons IO

2014-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1173




[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...

2014-06-22 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1082#issuecomment-46789198
  
Yup looks good to me.




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46789575
  
I think these are failing because our tests assume that in local mode we 
enforce the frame size limit (which we actually don't need to). I'll make the 
appropriate adjustments in a bit. 




[GitHub] spark pull request: add a materialize method to materialize Vertex...

2014-06-22 Thread bxshi
GitHub user bxshi opened a pull request:

https://github.com/apache/spark/pull/1177

add a materialize method to materialize VertexRDD by calling RDD's count

It seems one cannot materialize a VertexRDD by simply calling its count method, 
which is overridden by VertexRDD. Calling RDD's count, however, does 
materialize it. 

Is this a feature designed to get the count without materializing the 
VertexRDD? If so, do you think it is necessary to add a materialize method 
to VertexRDD? 

By the way, is count() the cheapest way to materialize an RDD, or does it 
cost the same resources as other actions? 

Best,

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bxshi/spark materialize_vertexRDD

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1177


commit 3be5d6a6f6285c6276d80210bf477c483c09c2f9
Author: bxshi baoxu@gmail.com
Date:   2014-06-22T20:39:52Z

add a materialize method to materialize VertexRDD by calling RDD's count 
method






[GitHub] spark pull request: add a materialize method to materialize Vertex...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1177#issuecomment-46792651
  
Can one of the admins verify this patch?




[GitHub] spark pull request: add a materialize method to materialize Vertex...

2014-06-22 Thread bxshi
Github user bxshi commented on the pull request:

https://github.com/apache/spark/pull/1177#issuecomment-46792759
  
Here's a simple snippet that reproduces the problem:

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._

val conf = new SparkConf().setAppName("HDTM")
  .setMaster("local[4]")

val sc = new SparkContext(conf)

sc.setCheckpointDir("./checkpoint")
val v = sc.parallelize(Seq[(VertexId, Long)]((0L, 0L), (1L, 1L), (2L, 2L)))
val e = sc.parallelize(Seq[Edge[Long]](Edge(0L, 1L, 0L), Edge(1L, 2L, 1L), Edge(2L, 0L, 2L)))
val g = Graph(v, e)
// Mark both RDDs for checkpointing, then run an action on each.
g.vertices.checkpoint()
g.edges.checkpoint()
g.vertices.count()
g.numEdges
println(s"${g.vertices.isCheckpointed} ${g.edges.isCheckpointed}")

// materialize() is the method proposed in this pull request.
g.vertices.materialize()
println(s"${g.vertices.isCheckpointed} ${g.edges.isCheckpointed}")
```

The first output is `false true`, and after calling `materialize` the output 
is `true true`, which means the VertexRDD is correctly checkpointed.
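
One way to picture the behaviour reported above is a Spark-free analogy. The sketch below is only an illustration under stated assumptions: `LazyDataset`, `IndexedDataset`, and their members are hypothetical names, and this is not how VertexRDD is implemented. It shows how an overridden `count` that answers from already-known metadata can skip the computation that the base `count` would have forced, and how a `materialize` method that delegates to the base `count` restores that side effect.

```scala
// Hypothetical stand-in for a lazily computed dataset (not a Spark class).
class LazyDataset(compute: () => Seq[Int]) {
  private var cached: Option[Seq[Int]] = None

  // True once the underlying computation has actually run.
  def isMaterialized: Boolean = cached.isDefined

  protected def data: Seq[Int] = {
    if (cached.isEmpty) cached = Some(compute())
    cached.get
  }

  // The base count walks the data, which forces the computation.
  def count(): Long = data.size.toLong
}

// Hypothetical subclass that knows its size up front.
class IndexedDataset(compute: () => Seq[Int], knownSize: Long)
    extends LazyDataset(compute) {

  // Shortcut: answers from metadata and never touches the data.
  override def count(): Long = knownSize

  // Delegates to the base count to force the computation.
  def materialize(): Unit = {
    super.count()
    ()
  }
}
```

With `new IndexedDataset(() => Seq(1, 2, 3), 3L)`, calling `count()` leaves `isMaterialized` false, while calling `materialize()` flips it to true, mirroring the `false true` then `true true` outputs in the reproduction.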




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46793729
  
Jenkins, retest this please.




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/1172#discussion_r14056893
  
--- Diff: 
core/src/main/scala/org/apache/spark/executor/ExecutorBackend.scala ---
@@ -26,4 +26,7 @@ import org.apache.spark.TaskState.TaskState
  */
 private[spark] trait ExecutorBackend {
   def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer)
+
+  // Exists as a work around for SPARK-1112. This only exists in 
branch-1.x of Spark.
+  def akkaFrameSize(): Long = Long.MaxValue
--- End diff --

So that change actually alters the expectations of the unit tests, so I 
went ahead and just enforced the limit in the LocalBackend anyway.
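
As a rough illustration of what enforcing such a limit can look like, here is a minimal sketch. Only the method name `akkaFrameSize` and its `Long.MaxValue` default come from the diff above; the trait `FrameLimited`, the object `FrameSizeCheck`, and `checkResult` are hypothetical names, and the actual check in the pull request may be structured differently.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for the backend trait; the default mirrors the diff.
trait FrameLimited {
  def akkaFrameSize(): Long = Long.MaxValue
}

object FrameSizeCheck {
  // Returns Right(data) when the serialized result fits within the frame
  // size, or Left(message) describing the violation otherwise.
  def checkResult(backend: FrameLimited, data: ByteBuffer): Either[String, ByteBuffer] = {
    val size = data.remaining().toLong
    val limit = backend.akkaFrameSize()
    if (size > limit) {
      Left(s"Serialized result of $size bytes exceeds the frame size of $limit bytes")
    } else {
      Right(data)
    }
  }
}
```

A backend that keeps the default never rejects a result, while one that overrides `akkaFrameSize()` with a finite value, as described for the LocalBackend, rejects oversized results.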




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46793847
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46793850
  
Merged build started. 




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46794756
  
Merged build finished. All automated tests passed.




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46794757
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16014/




[GitHub] spark pull request: [SPARK-2124] Move aggregation into shuffle imp...

2014-06-22 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/1064#issuecomment-46798422
  
Hi Matei, thanks for your review, I will update the code soon.




[GitHub] spark pull request: SPARK-2229: FileAppender throw an llegalArgume...

2014-06-22 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1174#issuecomment-46799166
  
Thanks. I'm merging this in master.

@pwendell - we probably want to run tests on JDK6 ... (if possible both in 
the build matrix)




[GitHub] spark pull request: SPARK-2229: FileAppender throw an llegalArgume...

2014-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1174




[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-22 Thread YanjieGao
Github user YanjieGao commented on the pull request:

https://github.com/apache/spark/pull/1151#issuecomment-46800258
  
Hi marmbrus,
I updated these files as your comments suggested, but I think I may have made some 
mistakes in the code. Could you help me and give me some tips? I will continue 
to work on it and debug it to make it better.
Thanks a lot!




[GitHub] spark pull request: [SQL][SPARK-2212]HashJoin(Shuffled)

2014-06-22 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/1147#issuecomment-46800787
  
Thank you all for the comments; I will change some of the code accordingly.
This PR actually contains 2 relevant parts:
- Code refactoring for Join
  - Removed `FilteredOperation` from patterns.scala, because the 
filters (WHERE condition and JOIN condition) have already been pushed down via 
`PushPredicateThroughJoin` in logical.Optimizer.scala. Discarding the 
combination of filters (WHERE and JOIN conditions) seems to make the join pattern 
matching cleaner and simpler.
  - Pattern matching order is actually very critical for the Join operator 
selection in SparkStrategies.scala, hence I merged the 3 Join strategies into 1.
  - Added the trait `BinaryJoinNode`, which can be used by `HashJoin` / 
`SortMergeJoin` (to be implemented soon) / `CartesionProduct` (InnerJoin) / `MapSide 
Join` (Left/Inner/LeftSemi, assuming the right table is the build table) for all 
of the join types; if we want to add code gen for the join condition, the only thing we 
need to modify is the trait `BinaryJoinNode`.
- Add Outer Join Support for HashJoin
  - With `BinaryJoinNode`, adding hash-based outer join support is easy.
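
For readers unfamiliar with the technique, a hash-based left outer join can be sketched over plain collections. This is not the `BinaryJoinNode` or `HashJoin` code from the PR: the object `HashOuterJoinSketch` and the function `leftOuterHashJoin` are hypothetical, and the sketch assumes, as in the description above, that the right side is the build table.

```scala
// Hypothetical illustration of a hash-based left outer join on in-memory rows.
object HashOuterJoinSketch {
  def leftOuterHashJoin[K, L, R](
      left: Seq[(K, L)],
      right: Seq[(K, R)]): Seq[(K, L, Option[R])] = {
    // Build phase: hash the right (build) side by join key.
    val buildTable: Map[K, Seq[R]] =
      right.groupBy(_._1).map { case (k, rows) => k -> rows.map(_._2) }

    // Probe phase: stream the left side; unmatched rows are kept with None.
    left.flatMap { case (k, l) =>
      buildTable.get(k) match {
        case Some(matches) => matches.map(r => (k, l, Some(r): Option[R]))
        case None          => Seq((k, l, None: Option[R]))
      }
    }
  }
}
```

An inner join falls out of the same structure by dropping the `None` case, which is presumably the kind of sharing a common join trait is meant to enable.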




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46801344
  
@aarondav mind taking a final pass and merging this?




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread aarondav
Github user aarondav commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46801472
  
Absolutely. LGTM, merging into branch-1.0.




[GitHub] spark pull request: Compression should be a setting for individual...

2014-06-22 Thread ScrapCodes
Github user ScrapCodes closed the pull request at:

https://github.com/apache/spark/pull/1091




[GitHub] spark pull request: Compression should be a setting for individual...

2014-06-22 Thread ScrapCodes
Github user ScrapCodes commented on the pull request:

https://github.com/apache/spark/pull/1091#issuecomment-46801657
  
Thanks @rxin 




[GitHub] spark pull request: SPARK-1937: fix issue with task locality

2014-06-22 Thread lirui-intel
Github user lirui-intel commented on the pull request:

https://github.com/apache/spark/pull/892#issuecomment-46801884
  
Sorry about the code style, and thanks @mateiz for pointing it out. I've 
updated the patch.




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread aarondav
Github user aarondav commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46801936
  
You may have to close this manually, @pwendell, I'm not sure github will 
close it if it's not in master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] spark pull request: SPARK-1937: fix issue with task locality

2014-06-22 Thread lirui-intel
Github user lirui-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/892#discussion_r14059200
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -181,16 +181,14 @@ private[spark] class TaskSetManager(
 var hadAliveLocations = false
 for (loc <- tasks(index).preferredLocations) {
   for (execId <- loc.executorId) {
-if (sched.isExecutorAlive(execId)) {
-  addTo(pendingTasksForExecutor.getOrElseUpdate(execId, new ArrayBuffer))
-  hadAliveLocations = true
-}
+addTo(pendingTasksForExecutor.getOrElseUpdate(execId, new ArrayBuffer))
   }
   if (sched.hasExecutorsAliveOnHost(loc.host)) {
-addTo(pendingTasksForHost.getOrElseUpdate(loc.host, new ArrayBuffer))
-for (rack <- sched.getRackForHost(loc.host)) {
-  addTo(pendingTasksForRack.getOrElseUpdate(rack, new ArrayBuffer))
-}
+hadAliveLocations = true
+  }
+  addTo(pendingTasksForHost.getOrElseUpdate(loc.host, new ArrayBuffer))
+  for (rack <- sched.getRackForHost(loc.host)) {
+addTo(pendingTasksForRack.getOrElseUpdate(rack, new ArrayBuffer))
 hadAliveLocations = true
--- End diff --

Do you mean the TaskScheduler should provide something like 
hasHostOnRack, and we have to check that before setting hadAliveLocations to true?




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread pwendell
Github user pwendell closed the pull request at:

https://github.com/apache/spark/pull/1172




[GitHub] spark pull request: [SPARK-1112, 2156] (1.0 edition) Use correct a...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1172#issuecomment-46802157
  
Thanks, closed.




[GitHub] spark pull request: [WIP] [SQL] SPARK-1800 Add broadcast hash join...

2014-06-22 Thread aarondav
Github user aarondav commented on a diff in the pull request:

https://github.com/apache/spark/pull/1163#discussion_r14059227
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetRelation.scala ---
@@ -44,10 +49,21 @@ import 
org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, LeafNode}
  * @param path The path to the Parquet file.
  */
 private[sql] case class ParquetRelation(
-val path: String,
-@transient val conf: Option[Configuration] = None) extends LeafNode 
with MultiInstanceRelation {
+path: String,
+@transient conf: Option[Configuration] = None)
+  extends LeafNode
+  with MultiInstanceRelation
+  with SizeEstimatableRelation[SQLContext] {
+
   self: Product =>
 
+  def estimatedSize(context: SQLContext): Long = {
--- End diff --

Here we could probably estimate the size more accurately if we also had 
some semantic information, like which columns we wanted, as I believe Parquet 
stores stats for each column. Perhaps worthy of a TODO; this seems perfectly 
reasonable for now.
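
The suggestion above could look roughly like the following. This is a sketch under assumptions, not the `estimatedSize` method from the diff: the object `ColumnSizeEstimate`, its signature, and the idea of having per-column byte counts available as a `Map` are all hypothetical, and nothing here reads real Parquet metadata.

```scala
// Hypothetical column-aware size estimate. columnBytes maps a column name
// to its assumed on-disk size in bytes; requested lists the columns a query
// actually needs.
object ColumnSizeEstimate {
  def estimatedSize(columnBytes: Map[String, Long], requested: Seq[String]): Long =
    // Count each requested column once; unknown columns contribute nothing.
    requested.distinct.map(c => columnBytes.getOrElse(c, 0L)).sum
}
```

Compared with using the whole file size, this would let a planner treat a relation as smaller when only a few narrow columns are projected.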




[GitHub] spark pull request: spark-ec2: quote command line args

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1169#issuecomment-46803411
  
Thanks - I merged this into several maintenance branches and I also created 
this JIRA to track it:

https://issues.apache.org/jira/browse/SPARK-2241




[GitHub] spark pull request: SPARK-2166 - Listing of instances to be termin...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/270#issuecomment-46804399
  
This was actually a pretty tough merge since we changed the spacing around 
a lot in `spark_ec2` recently. I went ahead and manually dealt with the merge. 
I also made two minor changes on merge.




[GitHub] spark pull request: SPARK-2166 - Listing of instances to be termin...

2014-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/270




[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...

2014-06-22 Thread rahulsinghaliitd
Github user rahulsinghaliitd commented on the pull request:

https://github.com/apache/spark/pull/1094#issuecomment-46804664
  
@sryza thanks for the thumbs up.

Although I wonder whether the approach in 
https://github.com/apache/spark/pull/1112 is better for passing the UI address 
(it is certainly much cleaner).




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-22 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-46805087
  
Sure - it would be great to add a general heartbeat mechanism that is 
shared between this and the blockmanager.




[GitHub] spark pull request: [SPARK-2242] HOTFIX: Do not mask pyspark stder...

2014-06-22 Thread andrewor14
GitHub user andrewor14 opened a pull request:

https://github.com/apache/spark/pull/1178

[SPARK-2242] HOTFIX: Do not mask pyspark stderr from output

This reverts a change introduced in 
3870248740d83b0292ccca88a494ce19783847f0 that masked stderr from surfacing to 
the `bin/pyspark` shell output. By itself this is not a bug. However, if, for 
example, your `spark.master` is not specified correctly, your Spark jobs just 
hang without any output instead of reporting that they cannot connect to the 
master.

That commit was not merged in branch-1.0, so this fix is for master only.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andrewor14/spark fix-python

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1178.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1178


commit 21c9d7c5af9d1647b496734dcd8fa3901bf8b19a
Author: Andrew Or andrewo...@gmail.com
Date:   2014-06-23T04:10:04Z

Do not mask stderr from output






[GitHub] spark pull request: [SPARK-2242] HOTFIX: Do not mask pyspark stder...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46805167
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-2242] HOTFIX: Do not mask pyspark stder...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46805169
  
Merged build started. 




[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-22 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/1151#discussion_r14060166
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -369,6 +369,17 @@ class SQLQuerySuite extends QueryTest {
 (3, null)))
   }
 
+ test("subtract") {
+checkAnswer(
+  sql("SELECT * FROM lowerCaseData SUBTRACT SELECT * FROM upperCaseData "),
+  (1, "a") ::
+  (2, "b") ::
+  (3, "c") ::
+  4, "d") :: Nil)
--- End diff --

Maybe you missed a '(' here
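
For reference, a minimal sketch of how the expected rows would read with the parenthesis restored. The values are copied from the diff; the string quotes are an assumption, since the mail archive appears to have stripped them, and `SubtractExpected` is just a wrapper name for this example.

```scala
object SubtractExpected {
  // Every row is a parenthesized tuple, including the last one.
  val expected: List[(Int, String)] =
    (1, "a") ::
    (2, "b") ::
    (3, "c") ::
    (4, "d") :: Nil
}
```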




[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-22 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/1151#discussion_r14060224
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -119,6 +119,7 @@ class SqlParser extends StandardTokenParsers with PackratParsers {
   protected val UNCACHE = Keyword("UNCACHE")
   protected val UNION = Keyword("UNION")
   protected val WHERE = Keyword("WHERE")
+  protected val SUBTRACT = Keyword("SUBTRACT")
--- End diff --

I think we'd better use MINUS or EXCEPT instead of SUBTRACT




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46806151
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46806156
  
Merged build started. 




[GitHub] spark pull request: [SPARK-1946] Submit stage after (configured ra...

2014-06-22 Thread li-zhihui
Github user li-zhihui commented on a diff in the pull request:

https://github.com/apache/spark/pull/900#discussion_r14060589
  
--- Diff: yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+
+import org.apache.spark.{Logging, SparkContext}
+import org.apache.spark.deploy.yarn.ApplicationMasterArguments
+import org.apache.spark.scheduler.TaskSchedulerImpl
+
+import scala.collection.mutable.ArrayBuffer
+
+private[spark] class YarnClusterSchedulerBackend(
+scheduler: TaskSchedulerImpl,
+sc: SparkContext)
+  extends CoarseGrainedSchedulerBackend(scheduler, sc.env.actorSystem)
+  with Logging {
+
+  private[spark] def addArg(optionName: String, envVar: String, sysProp: String,
+  arrayBuf: ArrayBuffer[String]) {
+if (System.getenv(envVar) != null) {
+  arrayBuf += (optionName, System.getenv(envVar))
+} else if (sc.getConf.contains(sysProp)) {
+  arrayBuf += (optionName, sc.getConf.get(sysProp))
+}
+  }
+
+  override def start() {
+super.start()
+val argsArrayBuf = new ArrayBuffer[String]()
+List(("--num-executors", "SPARK_EXECUTOR_INSTANCES", "spark.executor.instances"),
+  ("--num-executors", "SPARK_WORKER_INSTANCES", "spark.worker.instances"))
+  .foreach { case (optName, envVar, sysProp) => addArg(optName, envVar, sysProp, argsArrayBuf) }
+val args = new ApplicationMasterArguments(argsArrayBuf.toArray)
+totalExecutors.set(args.numExecutors)
--- End diff --

@kayousterhout Here `ApplicationMasterArguments` is used to get the default 
value of `numExecutors` (currently 2).
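
To make the resolution order concrete, here is a minimal sketch of the logic in the diff, with plain maps standing in for `System.getenv` and `SparkConf`. `ArgResolution` and its `numExecutors` helper are hypothetical stand-ins, not the real `ApplicationMasterArguments` parser, and the default of 2 is simply the value mentioned above.

```scala
import scala.collection.mutable.ArrayBuffer

object ArgResolution {
  // Assumed default, taken from the comment above.
  val DefaultNumExecutors = 2

  // Mirrors addArg from the diff: the environment variable wins over the
  // configuration key, and nothing is appended when neither is set.
  def addArg(optionName: String, envVar: String, confKey: String,
      env: Map[String, String], conf: Map[String, String],
      buf: ArrayBuffer[String]): Unit = {
    env.get(envVar).orElse(conf.get(confKey)).foreach { value =>
      buf += optionName
      buf += value
    }
  }

  // Hypothetical stand-in for the argument parser: the last
  // --num-executors value wins, otherwise the default applies.
  def numExecutors(args: Seq[String]): Int =
    args.sliding(2)
      .collect { case Seq("--num-executors", n) => n.toInt }
      .toSeq
      .lastOption
      .getOrElse(DefaultNumExecutors)
}
```

When neither the environment variable nor the configuration key is set, the argument list stays empty and the default is what ends up in `totalExecutors`.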




[GitHub] spark pull request: SPARK-1729. Make Flume pull data from source, ...

2014-06-22 Thread harishreedharan
Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/807#issuecomment-46806458
  
@tdas - Have you gotten a chance to take a look at this? Thanks!




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46806569
  
Merged build finished. All automated tests passed.




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46806570
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16015/




[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-22 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request:

https://github.com/apache/spark/pull/1151#discussion_r14060925
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -369,6 +369,17 @@ class SQLQuerySuite extends QueryTest {
 (3, null)))
   }
 
+ test("subtract") {
+checkAnswer(
+  sql("SELECT * FROM lowerCaseData SUBTRACT SELECT * FROM upperCaseData "),
+  (1, "a") ::
+  (2, "b") ::
+  (3, "c") ::
+  4, "d") :: Nil)
--- End diff --

Thanks, I have corrected it.




[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-22 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request:

https://github.com/apache/spark/pull/1151#discussion_r14060932
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -119,6 +119,7 @@ class SqlParser extends StandardTokenParsers with PackratParsers {
   protected val UNCACHE = Keyword("UNCACHE")
   protected val UNION = Keyword("UNION")
   protected val WHERE = Keyword("WHERE")
+  protected val SUBTRACT = Keyword("SUBTRACT")
--- End diff --

Thanks, I have corrected it.




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46807454
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16016/




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46807453
  
Merged build finished. 




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-22 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-46807757
  
Jenkins test this please



