date:20151005

[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8959#discussion_r41173657
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1183,11 +1194,22 @@ object Client extends Logging {
   private def getUserClasspath(
--- End diff --

(I see the other one calls this; both could be merged into a single method, 
though.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-10669] [Docs] Link to each language's A...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8977#issuecomment-145609285
  
[Test build #1842 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1842/console) for PR 8977 at commit [`5e298fb`](https://github.com/apache/spark/commit/5e298fb610ba35136cfdc6a2eeea5e9abe1b81fc).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10852][PySpark][SQL] Override built-in ...

2015-10-05 Thread davies

Github user davies commented on the pull request:

https://github.com/apache/spark/pull/8934#issuecomment-145609363
  
@viirya It would be great if we could fix it magically, but I'm worried that the 
current approach will introduce some performance regressions.

Since we always have a way to work around it using `row["count"]` (similar to 
escaping column names in SQL), it's not a blocker for users.
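
To illustrate why the `row["count"]` workaround is needed, here is a minimal sketch. `MiniRow` is a hypothetical stand-in for a tuple-based row class and is not PySpark's actual `Row` implementation; it only assumes that field access by attribute goes through `__getattr__`, which Python calls only after normal attribute lookup fails.

```python
class MiniRow(tuple):
    """Hypothetical tuple-based row; an illustration, not PySpark's Row."""

    def __new__(cls, **fields):
        row = super().__new__(cls, fields.values())
        row._names = list(fields)  # field names, in declaration order
        return row

    def __getattr__(self, name):
        # Reached only when normal lookup fails, so built-in tuple
        # attributes such as `count` and `index` never get here.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return tuple.__getitem__(self, self._names.index(name))
        except ValueError:
            raise AttributeError(name)

    def __getitem__(self, key):
        # String keys look up a field by name; anything else is positional.
        if isinstance(key, str):
            return tuple.__getitem__(self, self._names.index(key))
        return tuple.__getitem__(self, key)


row = MiniRow(name="Alice", count=3)
by_attribute = row.count   # the built-in tuple.count method, not the field
by_key = row["count"]      # the field value
```

In this sketch `row.name` resolves to the field because `tuple` has no `name` attribute, while `row.count` is shadowed by the built-in method; item access sidesteps the clash without changing attribute lookup.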



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145609327
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10585][SQL] only copy data once when ge...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8747#issuecomment-145609517
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10917] [SQL] improve performance of com...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8971#issuecomment-145611558
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10917] [SQL] improve performance of com...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8971#issuecomment-145611452
  
[Test build #43241 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43241/console) for PR 8971 at commit [`59bb2f9`](https://github.com/apache/spark/commit/59bb2f969cef9079606e2918289ce33db3201db4).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10917] [SQL] improve performance of com...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8971#issuecomment-145611560
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43241/
Test FAILed.



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8920#issuecomment-145613690
  
[Test build #43244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43244/consoleFull) for PR 8920 at commit [`05f9009`](https://github.com/apache/spark/commit/05f9009f4ec42b7ccf26519a6bb9746cb9a2ccd6).



[GitHub] spark pull request: [SPARK-1267][PYSPARK] Adds pip installer for p...

2015-10-05 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/8318#discussion_r41175012
  
--- Diff: python/pyspark/__init__.py ---
@@ -36,6 +36,31 @@
   Finer-grained cache persistence levels.
 
 """
+import os
+import sys
+
+import xml.etree.ElementTree as ET
+
+if (os.environ.get("SPARK_HOME", "not found") == "not found"):
+    raise ImportError("Environment variable SPARK_HOME is undefined.")
+
+spark_home = os.environ['SPARK_HOME']
+pom_xml_file_path = os.path.join(spark_home, 'pom.xml')
--- End diff --

There is no pom file inside the released bin package, so I think we should 
look for another way to find out the Spark version.



[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

2015-10-05 Thread travishegner

Github user travishegner commented on a diff in the pull request:

https://github.com/apache/spark/pull/8780#discussion_r41172742
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala ---
@@ -140,7 +140,12 @@ object DecimalType extends AbstractDataType {
   }
 
   private[sql] def bounded(precision: Int, scale: Int): DecimalType = {
-DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
--- End diff --

I will take your word for the risk involved; I am very new to this project.

From a layman's perspective, it seems that doing some basic checks when 
instantiating the type would make the type more robust. If I understand 
correctly, a `precision <= 0` is not allowed, so this patch returns a /default/ 
decimal. Similarly, a `scale > precision` is not allowed, so it returns a 
decimal with the scale truncated to the size of the precision. My thinking is 
that this will catch unexpected inputs and still behave in an expected way. 
Users instantiating these decimals in the intended ways will still get the same 
type back.
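
The checks described above can be sketched roughly as follows. This is an illustration in Python rather than the Scala from the patch, and the patch body is not shown in this thread: the cap values and the `DEFAULT` pair are assumptions chosen for the example, and a decimal type is modeled as a plain `(precision, scale)` tuple.

```python
MAX_PRECISION = 38  # assumed cap; named after the constant in the diff
MAX_SCALE = 38      # assumed cap; named after the constant in the diff
DEFAULT = (10, 0)   # hypothetical "default" decimal as (precision, scale)


def bounded(precision, scale):
    """Return a (precision, scale) pair that is always valid."""
    if precision <= 0:
        # A non-positive precision is not allowed: fall back to the default.
        return DEFAULT
    p = min(precision, MAX_PRECISION)
    # The scale may not exceed the precision, so truncate it if needed.
    s = min(scale, MAX_SCALE, p)
    return (p, s)
```

Inputs that were already valid pass through unchanged, for example `bounded(10, 2)` stays `(10, 2)`, which is the property the comment argues for.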



[GitHub] spark pull request: [SPARK-10917] [SQL] improve performance of com...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8971#issuecomment-145607323
  
[Test build #43241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43241/consoleFull) for PR 8971 at commit [`59bb2f9`](https://github.com/apache/spark/commit/59bb2f969cef9079606e2918289ce33db3201db4).



[GitHub] spark pull request: [SPARK-10852][PySpark][SQL] Override built-in ...

2015-10-05 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/8934#discussion_r41173178
  
--- Diff: python/pyspark/sql/types.py ---
@@ -1189,6 +1189,16 @@ class Row(tuple):
 
 >>> Person("Alice", 11)
 Row(name='Alice', age=11)
+
+Some special column names such as aggregated column count, should
--- End diff --

These kinds of tests should be in sql/tests.py.



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8920#issuecomment-145610834
  
Merged build started.



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8920#issuecomment-145610772
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10585][SQL] only copy data once when ge...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8747#issuecomment-145610401
  
[Test build #43243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43243/consoleFull) for PR 8747 at commit [`d7f941d`](https://github.com/apache/spark/commit/d7f941d4edc6e3165790f2546fc3e7f378f04250).



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145606571
  
Merged build started.



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-145606676
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-145606518
  
[Test build #43238 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43238/console) for PR 8956 at commit [`eb134b9`](https://github.com/apache/spark/commit/eb134b993720a42154c430e508847f852882c5c1).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class StringFilter(`




[GitHub] spark pull request: [SPARK-1267][PYSPARK] Adds pip installer for p...

2015-10-05 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/8318#discussion_r41173642
  
--- Diff: python/pyspark/__init__.py ---
@@ -36,6 +36,31 @@
   Finer-grained cache persistence levels.
 
 """
+import os
+import sys
+
+import xml.etree.ElementTree as ET
+
+if (os.environ.get("SPARK_HOME", "not found") == "not found"):
--- End diff --

`if os.environ.get("SPARK_HOME") is None:`
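
A small sketch of the suggested check in context. The wrapper function `require_spark_home` and its `environ` parameter are hypothetical, added only so the check can be exercised without touching the real environment; the error message is the one from the diff.

```python
import os


def require_spark_home(environ=os.environ):
    """Return SPARK_HOME, or raise ImportError if it is not set."""
    spark_home = environ.get("SPARK_HOME")
    # dict.get returns None for a missing key, so no sentinel string
    # such as "not found" is needed.
    if spark_home is None:
        raise ImportError("Environment variable SPARK_HOME is undefined.")
    return spark_home
```

Testing against `None` also avoids the corner case where the sentinel string happens to be a real value of the variable.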



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread tgravescs

Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145619279
  
Sorry, typo. I'm running scalastyle manually and will post an updated patch shortly.



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8920#issuecomment-145619261
  
[Test build #43244 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43244/console) for PR 8920 at commit [`05f9009`](https://github.com/apache/spark/commit/05f9009f4ec42b7ccf26519a6bb9746cb9a2ccd6).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8920#issuecomment-145619405
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43244/
Test PASSed.



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8920#issuecomment-145619402
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10829][SQL]Filter combine partition key...

2015-10-05 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/8916#discussion_r41177762
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -62,7 +62,30 @@ private[sql] object DataSourceStrategy extends Strategy 
with Logging {
 // Scanning partitioned HadoopFsRelation
 case PhysicalOperation(projects, filters, l @ LogicalRelation(t: 
HadoopFsRelation))
 if t.partitionSpec.partitionColumns.nonEmpty =>
-  val selectedPartitions = prunePartitions(filters, 
t.partitionSpec).toArray
+  // We divide the filter expressions into 3 parts
+  val partitionColumnNames = 
t.partitionSpec.partitionColumns.map(_.name).toSet
+  val filterMap = filters.groupBy { f =>
+// TODO this is case-senstive
+val referencedColumnNames = f.references.map(_.name).toSet
+if (referencedColumnNames.subsetOf(partitionColumnNames)) {
+  // Only reference the partition key
+  0
+} else if 
(referencedColumnNames.intersect(partitionColumnNames).isEmpty) {
+  // Not reference any partition key at all. can be push down
+  1
+} else {
+  // Reference both partition key and attributes
+  2
+}
+  }
+  // Only prunning the partition keys
+  val partitionFilters = filterMap.getOrElse(0, Nil)
+  // Only pushes down predicates that do not reference partition keys.
+  val pushedFilters = filterMap.getOrElse(1, Nil)
+  // Predicates with both partition keys and attributes
+  val combineFilters = filterMap.getOrElse(2, Nil)
--- End diff --

Instead of using `groupBy`, can we just use `filter` 3 times to split these 
3 kinds of filters? I think performance doesn't matter here.
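
The idea of splitting with three separate `filter` passes can be sketched as below. This is a Python illustration of the shape of the suggestion, not the Scala that would go into the patch; modeling each filter as a `(label, referenced_column_names)` pair and the name `split_filters` are assumptions made for the example.

```python
def split_filters(filters, partition_columns):
    """Split filters into partition-only, pushable, and mixed groups.

    Each filter is a (label, referenced_column_names) pair.
    """
    parts = set(partition_columns)

    def refs(f):
        return set(f[1])

    # References only partition keys: usable for partition pruning.
    partition_only = [f for f in filters if refs(f) <= parts]
    # References no partition key at all: can be pushed down.
    pushable = [f for f in filters
                if refs(f) and refs(f).isdisjoint(parts)]
    # References both partition keys and ordinary attributes.
    mixed = [f for f in filters
             if not refs(f) <= parts and not refs(f).isdisjoint(parts)]
    return partition_only, pushable, mixed


example = [
    ("year = 2015", ["year"]),
    ("name = 'a'", ["name"]),
    ("year > age", ["year", "age"]),
]
pruning, pushed, combined = split_filters(example, ["year"])
```

Each group gets a descriptive name at the point where it is computed, at the cost of walking the filter list three times, which matches the remark that performance does not matter here.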



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145620685
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145620682
  
[Test build #43245 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43245/console) for PR 8959 at commit [`e627adc`](https://github.com/apache/spark/commit/e627adc4c5e239ff52ec9c1e33fe57dfe8294b0d).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10917] [SQL] improve performance of com...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8971#issuecomment-145621181
  
[Test build #1843 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1843/consoleFull) for PR 8971 at commit [`59bb2f9`](https://github.com/apache/spark/commit/59bb2f969cef9079606e2918289ce33db3201db4).



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-05 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/8958#discussion_r41178108
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/OutputOperationInfo.scala
 ---
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.streaming.scheduler
+
+import org.apache.spark.annotation.DeveloperApi
+
+/**
+ * :: DeveloperApi ::
+ * Class having information on output operations.
+ * @param id Id of this output operation. Different output operations have 
different ids in a batch.
+ * @param description the description of this output operation.
+ * @param startTime Clock time of when the output operation started 
processing
+ * @param endTime Clock time of when the output operation started 
processing
+ */
+@DeveloperApi
+case class OutputOperationInfo(
+id: Int,
--- End diff --

This probably should have some reference to which batch this output 
operation belongs to. Batch time probably.



[GitHub] spark pull request: [SPARK-8514] LU factorization on BlockMatrix

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8563#issuecomment-145623644
  
[Test build #43247 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43247/consoleFull) for PR 8563 at commit [`e6e5c86`](https://github.com/apache/spark/commit/e6e5c86d0aaae57c043b82734b274382945d52af).



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145624942
  
[Test build #43246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43246/consoleFull) for PR 8959 at commit [`ac3ffcf`](https://github.com/apache/spark/commit/ac3ffcfca047804a2cbdea47064a6234ae6f9fac).



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal

GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/8983

[SPARK-8654][SQL] Fix Analysis exception when using NULL IN (...)

In the analysis phase, while processing the rules for the IN predicate, we
compare the in-list types to the lhs expression type and generate a
cast operation if necessary. In the case of NULL [NOT] IN expr1, we end up
generating a cast from the in-list types to NULL, like cast (1 as NULL), which
is not a valid cast.

The fix is to not generate such a cast if the lhs type is a NullType; instead,
we translate the expression to Literal(Null).
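
As background for why replacing the predicate with a null literal preserves the result, here is a small sketch of SQL's three-valued logic for `IN` over a non-empty list. It is a plain Python model with `None` standing in for SQL NULL; the function `sql_in` is hypothetical and is not Catalyst code.

```python
def sql_in(value, candidates):
    """Evaluate `value IN (candidates...)`; None models SQL NULL."""
    if value is None:
        # A NULL left-hand side compares as unknown against every item.
        return None
    if any(c is not None and c == value for c in candidates):
        return True
    if any(c is None for c in candidates):
        # No match, but a NULL item leaves the answer unknown.
        return None
    return False
```

With a NULL left-hand side the result is NULL regardless of the list contents, so no comparison between the list items and the left-hand side is required in that case.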

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark_8654

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8983


commit 38f973bb124c63c1caabe14ee6e5cca7b764b15a
Author: Dilip Biswal 
Date:   2015-10-02T23:20:56Z

[SPARK-8654] Analysis exception when using NULL IN (...) : invalid cast

In the analysis phase , while processing the rules for IN predicate, we
compare the in-list types to the lhs expression type and generate
cast operation if necessary. In the case of NULL [NOT] IN expr1 , we end up
generating cast between in list types to NULL like cast (1 as NULL) which
is not a valid cast.

The fix is to not generate such a cast if the lhs type is a NullType instead
we translate the expression to Literal(Null).





[GitHub] spark pull request: [SPARK-10709] [SQL] When loading a json datase...

2015-10-05 Thread piggybox

Github user piggybox commented on the pull request:

https://github.com/apache/spark/pull/8899#issuecomment-145629205
  
This is indeed confusing. I once used sqlContext.read.json() to read a path 
to a folder of JSON files and that worked, so I then tried the parent folder to 
read recursively and saw this error. I also tried the parent path ending with 
'/*' or '/*/', a glob I had used in Hadoop, and got the same error. 

I want to add that it's also confusing because in the log I can see things 
like " HadoopRDD: Input split: " followed by the path to one of the actual JSON 
files, so it seems Spark does find the file and starts reading it.



[GitHub] spark pull request: [SPARK-8514] LU factorization on BlockMatrix

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8563#issuecomment-145634505
  
  [Test build #43247 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43247/console)
 for   PR 8563 at commit 
[`e6e5c86`](https://github.com/apache/spark/commit/e6e5c86d0aaae57c043b82734b274382945d52af).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-8514] LU factorization on BlockMatrix

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8563#issuecomment-145634658
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43247/
Test PASSed.



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-145606678
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43238/
Test PASSed.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145606547
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10852][PySpark][SQL] Override built-in ...

2015-10-05 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/8934#discussion_r41172960
  
--- Diff: python/pyspark/sql/types.py ---
@@ -1209,6 +1219,12 @@ def __new__(self, *args, **kwargs):
 else:
 raise ValueError("No args or kwargs")
 
+def __init__(self, *args, **kwargs):
+if hasattr(self, "__fields__") and "count" in self.__fields__:
--- End diff --

Should we check the names of all the methods? `self.__fields__` is a list, so 
the `in` check could be expensive.
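
One possible shape for such a check, as a sketch only (this is not the code in
the pull request; `RESERVED_NAMES` and `shadowed_methods` are hypothetical
names). It assumes the goal is to detect field names that collide with any
public method inherited from `tuple`, using set operations instead of repeated
list scans:

```python
# Public attributes of tuple that a Row field name could shadow.
# For the built-in tuple these are "count" and "index".
RESERVED_NAMES = frozenset(
    name for name in dir(tuple) if not name.startswith("_")
)


def shadowed_methods(fields):
    """Return the field names that collide with tuple methods, sorted.

    The intersection does one hash lookup per field instead of scanning
    the field list once per reserved name.
    """
    return sorted(RESERVED_NAMES.intersection(fields))
```

The set is built once at import time, so the per-row cost is bounded by the
number of fields rather than by fields times reserved names.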



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145609903
  
LGTM pending tests and a minor cleanup.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145619563
  
Merged build started.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145619542
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145620687
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43245/
Test FAILed.



[GitHub] spark pull request: [SPARK-6284][MESOS] Add mesos role, principal ...

2015-10-05 Thread tnachen

Github user tnachen commented on the pull request:

https://github.com/apache/spark/pull/4960#issuecomment-145625978
  
Hi @AndriiOmelianenko, I have a PR out to fix that here 
https://github.com/apache/spark/pull/8872



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145609320
  
  [Test build #43242 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43242/console)
 for   PR 8959 at commit 
[`b860226`](https://github.com/apache/spark/commit/b860226c563fc86a84d2ed80353c6502852829f9).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8959#discussion_r41173518
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1183,11 +1194,22 @@ object Client extends Logging {
   private def getUserClasspath(
--- End diff --

This method is not needed anymore, is it? Only the one that takes a 
`SparkConf` directly.



[GitHub] spark pull request: [SPARK-10585][SQL] only copy data once when ge...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8747#issuecomment-145609556
  
Merged build started.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145609329
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43242/
Test FAILed.



[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

2015-10-05 Thread travishegner

Github user travishegner commented on the pull request:

https://github.com/apache/spark/pull/8780#issuecomment-145609065
  
@cloud-fan @bdolbeare @davies I'm certainly open to doing this in an oracle 
specific way if that is what is required. I was simply hoping to solve my 
problem while simultaneously making the whole project more robust. I completely 
understand if you don't believe that it's the right direction. Thanks for 
looking into it with me!



[GitHub] spark pull request: [SPARK-1267][PYSPARK] Adds pip installer for p...

2015-10-05 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/8318#discussion_r41175819
  
--- Diff: python/pyspark/pyspark_version.py ---
@@ -0,0 +1,17 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+__version__ = '1.5.0'
--- End diff --

I think it's error-prone to have multiple copies of the version in different 
places; if someone forgets to update this, PySpark will break (even within the 
repo).

I'd vote for generating the version while generating the PyPI package. If 
PySpark comes along with Spark, we don't need this check (at least it shouldn't 
fail or be slow).
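
A minimal sketch of generating the version file at packaging time, assuming the
release version is available as a single string to the build step (the helper
names `write_version_file` and `read_version` are hypothetical and not part of
this pull request):

```python
import os
import tempfile


def write_version_file(version, path):
    """Write a one-line version module; meant to run only while packaging."""
    with open(path, "w") as handle:
        handle.write("__version__ = %r\n" % (version,))


def read_version(path):
    """Read __version__ back without importing the package itself."""
    scope = {}
    with open(path) as handle:
        exec(compile(handle.read(), path, "exec"), scope)
    return scope["__version__"]


if __name__ == "__main__":
    # Demonstrate the round trip in a scratch directory.
    target = os.path.join(tempfile.mkdtemp(), "pyspark_version.py")
    write_version_file("1.5.0", target)
    print(read_version(target))
```

With this layout the version lives in one place, and the generated file never
needs to be checked into the repository.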



[GitHub] spark pull request: [SPARK-1267][PYSPARK] Adds pip installer for p...

2015-10-05 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/8318#discussion_r41176383
  
--- Diff: python/setup.py ---
@@ -0,0 +1,18 @@
+#!/usr/bin/env python
+
+from setuptools import setup
+
+exec(compile(open("pyspark/pyspark_version.py").read(), 
+   "pyspark/pyspark_version.py", 'exec'))
+VERSION = __version__
+
+setup(name='pyspark',
+version=VERSION,
+description='Apache Spark Python API',
+author='Spark Developers',
+author_email='d...@spark.apache.org',
+url='https://github.com/apache/spark/tree/master/python',
+packages=['pyspark', 'pyspark.mllib', 'pyspark.ml', 'pyspark.sql', 
'pyspark.streaming'],
+install_requires=['numpy>=1.7', 'py4j==0.8.2.1', 'pandas'],
--- End diff --

pyspark does not depend on numpy and pandas, only pyspark.mllib/ml do. 
pyspark.sql does not require pandas.
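
One way to express this is setuptools' `extras_require`, sketched below as a
plain dictionary of keyword arguments. The extra names (`mllib`, `ml`, `sql`)
and the choice to list pandas as an optional extra for `pyspark.sql` are
assumptions made for illustration, not something this pull request defines:

```python
# Keyword arguments that could be passed to setuptools.setup(**SETUP_KWARGS).
# Only py4j is a hard requirement; numpy and pandas become opt-in extras.
SETUP_KWARGS = dict(
    name="pyspark",
    description="Apache Spark Python API",
    packages=[
        "pyspark",
        "pyspark.mllib",
        "pyspark.ml",
        "pyspark.sql",
        "pyspark.streaming",
    ],
    install_requires=["py4j==0.8.2.1"],
    extras_require={
        "mllib": ["numpy>=1.7"],  # hypothetical extra name
        "ml": ["numpy>=1.7"],     # hypothetical extra name
        "sql": ["pandas"],        # optional, pyspark.sql works without it
    },
)
```

A user who needs MLlib would then install with `pip install pyspark[mllib]`,
while a plain `pip install pyspark` would pull in only py4j.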



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145620136
  
  [Test build #43245 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43245/consoleFull)
 for   PR 8959 at commit 
[`e627adc`](https://github.com/apache/spark/commit/e627adc4c5e239ff52ec9c1e33fe57dfe8294b0d).



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-05 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/8958#discussion_r41178301
  
--- Diff: streaming/src/test/scala/org/apache/spark/streaming/StreamingListenerSuite.scala ---
@@ -140,6 +140,27 @@ class StreamingListenerSuite extends TestSuiteBase with Matchers {
 }
   }
 
+  test("output operation reporting") {
+ssc = new StreamingContext("local[2]", "test", Milliseconds(1000))
+val inputStream = ssc.receiverStream(new StreamingListenerSuiteReceiver)
+inputStream.foreachRDD(_.count())
+inputStream.foreachRDD(_.collect())
+inputStream.foreachRDD(_.count())
+
+val collector = new OutputOperationInfoCollector
+ssc.addStreamingListener(collector)
+
+ssc.start()
+try {
+  eventually(timeout(30 seconds), interval(20 millis)) {
--- End diff --

Makes sense, especially now that there is an output operation info.



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145630551
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-8514] LU factorization on BlockMatrix

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8563#issuecomment-145634657
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-8514] LU factorization on BlockMatrix

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8563#issuecomment-145623267
  
Merged build started.



[GitHub] spark pull request: [SPARK-8514] LU factorization on BlockMatrix

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8563#issuecomment-145623227
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145608874
  
  [Test build #43242 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43242/consoleFull)
 for   PR 8959 at commit 
[`b860226`](https://github.com/apache/spark/commit/b860226c563fc86a84d2ed80353c6502852829f9).



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145623250
  
Merged build started.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-145623218
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/8983#discussion_r41183045
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala ---
@@ -135,4 +135,11 @@ class AnalysisSuite extends AnalysisTest {
 plan = testRelation.select(CreateStructUnsafe(Seq(a, (a + 1).as("a+1"))).as("col"))
 checkAnalysis(plan, plan)
   }
+
+  test("SPARK-8654: invalid CAST in NULL IN(...) expression") {
+val plan = Project(Alias(In(Literal(null), Seq(Literal(1), Literal(2))), "a")() :: Nil,
+  LocalRelation()
+)
+assertAnalysisSuccess(plan, false)
--- End diff --

why change the default value of `caseSensitive`?



[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...

2015-10-05 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/8922#issuecomment-145632836
  
"daily sql query" is not sufficiently descriptive.  Please post actual 
benchmark results with code when making pull requests that claim to improve 
performance.  It would also be good to evaluate the cost in degenerate cases.  
For example, I think you are adding an object allocation per input tuple when 
boxing for any queries that filter by UDF in parquet.  Are you slowing down 
cases where the filter is not selective?

If we want to improve the set of things that we push down, I don't think 
specializing for just UDFs in comparison operations is worth it given how much 
you are widening the API.  Could we just have a single Function filter:

```scala
case class FilterFunction(attribute: String, function: Any => Boolean)
```

or maybe some specialized variants:

```scala
case class IntegerFilter(attribute: String, function: Int => Boolean)
...
```
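
For readers more at home in Python, the single-function filter idea can be
sketched as follows. This is only an illustration of the shape of the proposal
and not part of the data sources API; `apply_filter` and the dictionary-based
rows are hypothetical:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class FilterFunction:
    """A pushed-down predicate: one attribute name plus one boolean function."""

    attribute: str
    function: Callable[[Any], bool]


def apply_filter(rows: List[Dict[str, Any]], flt: FilterFunction) -> List[Dict[str, Any]]:
    """Keep the rows whose value for flt.attribute satisfies flt.function."""
    return [row for row in rows if flt.function(row[flt.attribute])]


if __name__ == "__main__":
    rows = [{"id": 1}, {"id": 5}, {"id": 9}]
    big_ids = FilterFunction("id", lambda value: value > 4)
    print(apply_filter(rows, big_ids))
```

A data source would need to understand only this one filter type, at the cost
of treating the function as opaque, so it could not use column statistics to
skip data the way it can for a plain comparison.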



[GitHub] spark pull request: [SPARK-10337][SQL] fix hive views on non-hive-...

2015-10-05 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/8990#discussion_r41225536
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -1248,4 +1248,12 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
 """.stripMargin), Row("b", 6.0) :: Row("a", 7.0) :: Nil)
 }
   }
+
+  test("SPARK-10337: correctly handle hive views") {
+withSQLConf("spark.sql.hive.nonNativeView" -> "true") {
+  sqlContext.range(1, 10).write.format("json").saveAsTable("jt")
+  sql("CREATE VIEW testView AS SELECT id FROM jt")
+  checkAnswer(sql("SELECT * FROM testView ORDER BY id"), (1 to 9).map(i => Row(i)))
+}
+  }
--- End diff --

Also, do we need to design more cases to test when this flag is true?



[GitHub] spark pull request: [SPARK-8848] [SQL] Refactors Parquet write pat...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8988#issuecomment-145745724
  
  [Test build #43272 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43272/consoleFull)
 for   PR 8988 at commit 
[`6fd20f7`](https://github.com/apache/spark/commit/6fd20f70baa535b1772c1c30a3f651ea673560f2).



[GitHub] spark pull request: [SPARK-10885][Streaming]Display the failed out...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8950#issuecomment-145738954
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10337][SQL] fix hive views on non-hive-...

2015-10-05 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/8990#discussion_r41225160
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala ---
@@ -563,6 +580,77 @@ 
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
 }
   }
 
+ case view @ Token("TOK_CREATEVIEW", children)
+if children.collect { case t @ Token("TOK_QUERY", _) => t 
}.nonEmpty =>
+
+  val Seq(
+Some(viewNameParts),
+Some(query),
+maybeComment,
+allowExisting,
+maybeProperties,
+maybeColumns,
+maybePartCols
+  ) = getClauses(
+Seq(
+  "TOK_TABNAME",
+  "TOK_QUERY",
+  "TOK_TABLECOMMENT",
+  "TOK_IFNOTEXISTS",
+  "TOK_TABLEPROPERTIES",
+  "TOK_TABCOLNAME",
+  "TOK_VIEWPARTCOLS"),
+children)
+
+  // If the view is partitioned, we let hive handle it.
+  if (maybePartCols.isDefined) {
+NativePlaceholder
+  } else {
+val (db, viewName) = extractDbNameTableName(viewNameParts)
+
+val originalText = context.getTokenRewriteStream
+  .toString(query.getTokenStartIndex, query.getTokenStopIndex)
+
+val schema = maybeColumns.map { cols =>
+  BaseSemanticAnalyzer.getColumns(cols, true).asScala.map { field 
=>
+HiveColumn(field.getName, field.getType, field.getComment)
+  }
+}.getOrElse(Seq.empty[HiveColumn])
+
+val properties = scala.collection.mutable.Map.empty[String, String]
+
+maybeProperties.foreach {
+  case Token("TOK_TABLEPROPERTIES", list :: Nil) =>
+properties ++= getProperties(list)
+}
+
+maybeComment.foreach {
+  case Token("TOK_TABLECOMMENT", child :: Nil) =>
+val comment = 
BaseSemanticAnalyzer.unescapeSQLString(child.getText)
+if (comment ne null) {
+  properties += ("comment" -> comment)
+}
+}
+
+val tableDesc = HiveTable(
+  specifiedDatabase = db,
+  name = viewName,
+  schema = schema,
+  partitionColumns = Seq.empty[HiveColumn],
+  properties = properties.toMap,
+  serdeProperties = Map[String, String](),
+  tableType = VirtualView,
+  location = None,
+  inputFormat = None,
+  outputFormat = None,
+  serde = None,
+  viewText = Some(originalText))
+
+val sql = context.getTokenRewriteStream
+  .toString(view.getTokenStartIndex, view.getTokenStopIndex)
--- End diff --

Let's add a comment here.



[GitHub] spark pull request: [SPARK-10941] [SQL] Refactor AggregateFunction...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8973#issuecomment-145743801
  
  [Test build #43271 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43271/consoleFull)
 for   PR 8973 at commit 
[`e34e22e`](https://github.com/apache/spark/commit/e34e22ef25dabcf5ee03fd51631a3d8f1a227070).



[GitHub] spark pull request: [SPARK-10885][Streaming]Display the failed out...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8950#issuecomment-145738897
  
  [Test build #43270 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43270/console)
 for   PR 8950 at commit 
[`ca68ac8`](https://github.com/apache/spark/commit/ca68ac858462fff107d8a5ce7a5af3cf9416aca3).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10885][Streaming]Display the failed out...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8950#issuecomment-145738955
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43270/
Test PASSed.



[GitHub] spark pull request: [SPARK-10337][SQL] fix hive views on non-hive-...

2015-10-05 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/8990#discussion_r41225256
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala ---
@@ -563,6 +580,77 @@ 
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
 }
   }
 
+ case view @ Token("TOK_CREATEVIEW", children)
+if children.collect { case t @ Token("TOK_QUERY", _) => t 
}.nonEmpty =>
+
+  val Seq(
+Some(viewNameParts),
+Some(query),
+maybeComment,
+allowExisting,
+maybeProperties,
+maybeColumns,
+maybePartCols
+  ) = getClauses(
+Seq(
+  "TOK_TABNAME",
+  "TOK_QUERY",
+  "TOK_TABLECOMMENT",
+  "TOK_IFNOTEXISTS",
+  "TOK_TABLEPROPERTIES",
+  "TOK_TABCOLNAME",
+  "TOK_VIEWPARTCOLS"),
+children)
+
+  // If the view is partitioned, we let hive handle it.
+  if (maybePartCols.isDefined) {
+NativePlaceholder
+  } else {
+val (db, viewName) = extractDbNameTableName(viewNameParts)
+
+val originalText = context.getTokenRewriteStream
+  .toString(query.getTokenStartIndex, query.getTokenStopIndex)
+
+val schema = maybeColumns.map { cols =>
+  BaseSemanticAnalyzer.getColumns(cols, true).asScala.map { field 
=>
+HiveColumn(field.getName, field.getType, field.getComment)
--- End diff --

Does hive allow column types specified in the create view command?



[GitHub] spark pull request: [SPARK-10938] [SQL] remove typeId in columnar ...

2015-10-05 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8989#discussion_r41225242
  
--- Diff: project/MimaExcludes.scala ---
@@ -42,7 +42,9 @@ object MimaExcludes {
 excludePackage("org.spark-project.jetty"),
 MimaBuild.excludeSparkPackage("unused"),
 // SQL execution is considered private.
-excludePackage("org.apache.spark.sql.execution")
+excludePackage("org.apache.spark.sql.execution"),
+// SQL columnar is considered private.
+excludePackage("org.apache.spark.sql.columnar")
--- End diff --

Yup. It would be great to minimize the number of top packages we have. You 
can do it in a followup pr too.




[GitHub] spark pull request: [SPARK-10932] [PROJECT INFRA] Port two minor c...

2015-10-05 Thread shivaram

Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8986#issuecomment-145740694
  
Ah I see - didn't know it was failing Jenkins as well. 
Code changes from rxin/spark-utils LGTM.



[GitHub] spark pull request: [SPARK-10941] [SQL] Refactor AggregateFunction...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8973#issuecomment-145743534
  
Merged build started.



[GitHub] spark pull request: [SPARK-10885][Streaming]Display the failed out...

2015-10-05 Thread tdas

Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/8950#issuecomment-145743591
  
This looks cool! Also I like that you used "details" for that. In the case 
of "Failed", unless the details are opened, there is no indication of the failure. 
So it might be better to show "Failed due to error: $exceptionMessage", and the 
full stacktrace in the details. 
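
The suggestion above could be sketched roughly as follows. This is only an illustration, not code from the PR: `FailureDisplay`, `summary`, and `details` are hypothetical names, and the fallback to the exception class name when there is no message is an assumption.

```scala
import java.io.{PrintWriter, StringWriter}

// Hypothetical helper, not part of the PR: splits a failure into a short
// summary (always visible) and the full stack trace (for the "details" part).
object FailureDisplay {
  // One-line summary; falls back to the class name if the message is null.
  def summary(e: Throwable): String =
    s"Failed due to error: ${Option(e.getMessage).getOrElse(e.getClass.getName)}"

  // Full stack trace rendered to a string.
  def details(e: Throwable): String = {
    val sw = new StringWriter()
    e.printStackTrace(new PrintWriter(sw, true))
    sw.toString
  }
}
```

With this split, the summary can be shown in the table cell and the trace placed inside the expandable section.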



[GitHub] spark pull request: [SPARK-10941] [SQL] Refactor AggregateFunction...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8973#issuecomment-145743522
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8848] [SQL] Refactors Parquet write pat...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8988#issuecomment-145745373
  
Merged build started.



[GitHub] spark pull request: [SPARK-8848] [SQL] Refactors Parquet write pat...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8988#issuecomment-145745361
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10337][SQL] fix hive views on non-hive-...

2015-10-05 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/8990#discussion_r41225405
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala 
---
@@ -1248,4 +1248,12 @@ class SQLQuerySuite extends QueryTest with 
SQLTestUtils with TestHiveSingleton {
 """.stripMargin), Row("b", 6.0) :: Row("a", 7.0) :: Nil)
 }
   }
+
+  test("SPARK-10337: correctly handle hive views") {
+withSQLConf("spark.sql.hive.nonNativeView" -> "true") {
+  sqlContext.range(1, 10).write.format("json").saveAsTable("jt")
+  sql("CREATE VIEW testView AS SELECT id FROM jt")
+  checkAnswer(sql("SELECT * FROM testView ORDER BY id"), (1 to 
9).map(i => Row(i)))
+}
+  }
--- End diff --

Do we have view tests (to make sure we are good when this flag is false) in 
the hive compatibility suite?



[GitHub] spark pull request: [SPARK-10941] [SQL] Refactor AggregateFunction...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8973#issuecomment-145744272
  
  [Test build #43271 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43271/console)
 for   PR 8973 at commit 
[`e34e22e`](https://github.com/apache/spark/commit/e34e22ef25dabcf5ee03fd51631a3d8f1a227070).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Average(child: Expression) extends 
ExpressionAggregateFunction `
  * `case class Count(child: Expression) extends 
ExpressionAggregateFunction `
  * `case class First(child: Expression) extends 
ExpressionAggregateFunction `
  * `case class Last(child: Expression) extends ExpressionAggregateFunction 
`
  * `case class Max(child: Expression) extends ExpressionAggregateFunction `
  * `case class Min(child: Expression) extends ExpressionAggregateFunction `
  * `abstract class StddevAgg(child: Expression) extends 
ExpressionAggregateFunction `
  * `case class Sum(child: Expression) extends ExpressionAggregateFunction `
  * `sealed abstract class AggregateFunction2 extends Expression with 
ImplicitCastInputTypes `
  * `abstract class ImperativeAggregateFunction extends AggregateFunction2 `




[GitHub] spark pull request: [SPARK-10941] [SQL] Refactor AggregateFunction...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8973#issuecomment-145744274
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43271/
Test FAILed.



[GitHub] spark pull request: [SPARK-10941] [SQL] Refactor AggregateFunction...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8973#issuecomment-145744273
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10772][Streaming][Scala]: NullPointerEx...

2015-10-05 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/8881#discussion_r41189182
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/TransformedDStream.scala
 ---
@@ -38,6 +39,11 @@ class TransformedDStream[U: ClassTag] (
 
   override def compute(validTime: Time): Option[RDD[U]] = {
 val parentRDDs = parents.map(_.getOrCompute(validTime).orNull).toSeq
-Some(transformFunc(parentRDDs, validTime))
+val transformedRDD = transformFunc(parentRDDs, validTime)
+if (transformedRDD == null) {
+  throw new SparkException("Transform function may not return null. " +
--- End diff --

I would say "must not return null". Saying "may not" is a little ambiguous.
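
For illustration, the check with the suggested wording might look like the sketch below. It is self-contained so it runs without Spark: `TransformException` is a stand-in for `SparkException`, and `TransformCheck.applyChecked` is a hypothetical helper, not the method in the diff.

```scala
// Stand-in for SparkException so the sketch runs without Spark on the classpath.
class TransformException(msg: String) extends RuntimeException(msg)

object TransformCheck {
  // Applies a user-supplied function and rejects a null result with an
  // unambiguous "must not" message.
  def applyChecked[A, B <: AnyRef](f: A => B, input: A): B = {
    val result = f(input)
    if (result == null) {
      throw new TransformException("Transform function must not return null.")
    }
    result
  }
}
```

"Must not" states a prohibition, whereas "may not" can also be read as "might not", which is the ambiguity being pointed out.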



[GitHub] spark pull request: [SPARK-10917] [SQL] improve performance of com...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8971#issuecomment-145662420
  
  [Test build #1843 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1843/console)
 for   PR 8971 at commit 
[`59bb2f9`](https://github.com/apache/spark/commit/59bb2f969cef9079606e2918289ce33db3201db4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10585][SQL] only copy data once when ge...

2015-10-05 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8747#issuecomment-145645748
  
  [Test build #43243 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43243/console)
 for   PR 8747 at commit 
[`d7f941d`](https://github.com/apache/spark/commit/d7f941d4edc6e3165790f2546fc3e7f378f04250).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class BufferHolder `
  * `public class UnsafeArrayWriter `
  * `public class UnsafeRowWriter `




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/8983#discussion_r41188938
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -305,12 +305,17 @@ object HiveTypeCoercion {
 
   /**
* Convert all expressions in in() list to the left operator type
+   * except when the left operator type is NullType. In case when left hand
+   * operator type is NullType create a Literal(Null).
*/
   object InConversion extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
   // Skip nodes who's children have not been resolved yet.
   case e if !e.childrenResolved => e
 
+  case i @ In(a, b) if (a.dataType == NullType) =>
+Literal.create(null, BooleanType)
--- End diff --

Instead of just casting null to boolean, can we come up with a better idea 
based on the data types of `b`?
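
As background for this question, here is a small model of SQL's three-valued semantics for `IN`, under which a NULL left operand yields NULL whatever the list contains. It is a plain-Scala sketch, not Catalyst code: `ThreeValuedIn` is a hypothetical name, `None` stands for SQL NULL, and a non-empty list is assumed. It does not address how the types of `b` could be used.

```scala
object ThreeValuedIn {
  // None models SQL NULL; Some(v) models a non-null value.
  // A None result models UNKNOWN (NULL). Assumes a non-empty list.
  def in[T](value: Option[T], list: Seq[Option[T]]): Option[Boolean] =
    value match {
      // NULL IN (...) is NULL regardless of the list contents.
      case None => None
      case Some(v) =>
        if (list.contains(Some(v))) Some(true)   // a definite match
        else if (list.contains(None)) None       // no match, but a NULL could have matched
        else Some(false)                         // no match and no NULLs
    }
}
```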



[GitHub] spark pull request: [SPARK-10585][SQL] only copy data once when ge...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8747#issuecomment-145645883
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43243/
Test PASSed.



[GitHub] spark pull request: [SPARK-10585][SQL] only copy data once when ge...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8747#issuecomment-145645874
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-145648518
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-10772][Streaming][Scala]: NullPointerEx...

2015-10-05 Thread tdas

Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/8881#issuecomment-145650573
  
@jhu-chang Could you fix the style issues and the one minor issue that I pointed 
out?


Style issues:
```
[error] /home/jenkins/workspace/NewSparkPullRequestBuilder/streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala:213:0: Whitespace at end of line
[error] /home/jenkins/workspace/NewSparkPullRequestBuilder/streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala:221:10: Whitespace at end of line
```
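
A quick local check for this kind of error could look like the sketch below. It is only an approximation of what the style checker reports: `TrailingWhitespace.offendingLines` is a hypothetical helper, and it assumes Unix line endings (a trailing `\r` would also be flagged).

```scala
object TrailingWhitespace {
  // Returns the 1-based numbers of lines that end in a space, tab,
  // or other whitespace character.
  def offendingLines(source: String): Seq[Int] =
    source.split("\n", -1).toSeq.zipWithIndex.collect {
      case (line, idx) if line.nonEmpty && line.last.isWhitespace => idx + 1
    }
}
```

Feeding it the contents of a source file gives the line numbers to clean up before pushing again.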



[GitHub] spark pull request: [SPARK-10585][SQL] only copy data once when ge...

2015-10-05 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8747



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread NarineK

Github user NarineK commented on the pull request:

https://github.com/apache/spark/pull/8920#issuecomment-145657747
  
@sun-rui Thanks!
I've made the changes; please take a look.



[GitHub] spark pull request: [SPARK-10913][SPARKR] attach() function suppor...

2015-10-05 Thread adrian555

GitHub user adrian555 opened a pull request:

https://github.com/apache/spark/pull/8985

[SPARK-10913][SPARKR] attach() function support



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/adrian555/spark attach_and_with

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8985.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8985


commit 319905f7d9db44ce326dee9ff2aa16f6e350ab50
Author: adrian555 
Date:   2015-10-02T22:33:22Z

attach() function support

commit 0aa94a2f4b1fbab975d6061db111e01fb9ccce2c
Author: adrian555 
Date:   2015-10-03T05:46:07Z

[SPARK-10913] [SPARKR] attach() function support





[GitHub] spark pull request: [SPARK-10913][SPARKR] attach() function suppor...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8985#issuecomment-145661230
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-10669] [Docs] Link to each language's A...

2015-10-05 Thread keypointt

GitHub user keypointt opened a pull request:

https://github.com/apache/spark/pull/8977

[SPARK-10669] [Docs] Link to each language's API in codetabs in ML docs: 
spark.mllib

In the Markdown docs for the spark.mllib Programming Guide, we have code 
examples with codetabs for each language. We should link to each language's API 
docs within the corresponding codetab, but we are inconsistent about this. For 
an example of what we want to do, see the "ChiSqSelector" section in 
https://github.com/apache/spark/blob/64743870f23bffb8d96dcc8a0181c1452782a151/docs/mllib-feature-extraction.md
This JIRA is just for spark.mllib, not spark.ml.

Please let me know if more work is needed, thanks a lot.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/keypointt/spark SPARK-10669

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8977.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8977


commit f8289891d5b32fffdc6a4ce077d8d206e015119f
Author: Xin Ren 
Date:   2015-10-02T07:00:36Z

[SPARK-10669] test modify

commit 67c67158c93a4a8c5a963ecf821f5e85e1228bf3
Author: Xin Ren 
Date:   2015-10-04T16:54:35Z

[SPARK-10669] Link to each language API in codetabs in spark mllib docs

commit 31960f6790a75fd037ffd879a8d17e546c5fa6fa
Author: Xin Ren 
Date:   2015-10-04T16:59:56Z

[SPARK-10669] minor correction

commit 82528d238a52a07c9e47c86050abb471775d2b20
Author: Xin Ren 
Date:   2015-10-05T05:37:45Z

[SPARK-10669] re-commit, Link to each language API in codetabs in spark 
mllib docs

commit ce92c03a3ea688e35560c7411fecf56971138c2b
Author: Xin Ren 
Date:   2015-10-05T05:48:10Z

[SPARK-10669] add up API links

commit 5fa2ef77aa1d04c4d2e210062bafd9bca0b48bd9
Author: Xin Ren 
Date:   2015-10-05T06:27:39Z

[SPARK-10669] undo previous wrong changes

commit c75d79da41edae8c08b838bbe44a96fb88bc5f71
Author: Xin Ren 
Date:   2015-10-05T06:32:02Z

[SPARK-10669] undo previous wrong changes for mllib-ensembles.md

commit 5beb6ddc5eb6e90c531e326335a797205eb6a505
Author: Xin Ren 
Date:   2015-10-05T06:34:06Z

[SPARK-10669] undo previous wrong changes for 
mllib-frequent-pattern-mining.md





[GitHub] spark pull request: [SPARK-10669] [Docs] Link to each language's A...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8977#issuecomment-145446371
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-10856][SQL] Mapping TimestampType to DA...

2015-10-05 Thread viirya

GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/8978

[SPARK-10856][SQL] Mapping TimestampType to DATETIME for SQL Server jdbc 
dialect

JIRA: https://issues.apache.org/jira/browse/SPARK-10856

For Microsoft SQL Server, TimestampType should be mapped to DATETIME 
instead of TIMESTAMP.

Related information for the datatype mapping: 
https://msdn.microsoft.com/en-us/library/ms378878(v=sql.110).aspx
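
The mapping change described above can be sketched in isolation. The following is a minimal, library-free Python illustration, not Spark's actual dialect code: the names `DEFAULT_TYPES`, `SQLSERVER_OVERRIDES`, and `column_type` are hypothetical, and the default type strings are assumed for the example.

```python
# Hypothetical sketch: per-dialect overrides on top of a default type mapping.
# None of these names come from Spark; they only illustrate the idea.

# Assumed generic defaults, keyed by Catalyst type name.
DEFAULT_TYPES = {
    "StringType": "TEXT",
    "IntegerType": "INTEGER",
    "TimestampType": "TIMESTAMP",
}

# Override from this pull request: in SQL Server, TIMESTAMP is a row-version
# type rather than a date-and-time type, so DATETIME is used instead.
SQLSERVER_OVERRIDES = {"TimestampType": "DATETIME"}


def column_type(catalyst_type, url):
    """Return the column type to use for catalyst_type on the database at url."""
    if url.startswith("jdbc:sqlserver"):
        override = SQLSERVER_OVERRIDES.get(catalyst_type)
        if override is not None:
            return override
    # Fall back to the generic mapping for every other case.
    return DEFAULT_TYPES[catalyst_type]
```

Under these assumptions, only connections whose URL starts with `jdbc:sqlserver` see the DATETIME mapping; every other database keeps the default.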

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 mysql-jdbc-timestamp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8978.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8978


commit 4e82ef152dca875b4e1523052ba130a6f6853d75
Author: Liang-Chi Hsieh 
Date:   2015-10-05T07:41:47Z

For Microsoft SQL Server, TimestampType should be mapped to DATETIME 
instead of TIMESTAMP.





[GitHub] spark pull request: [SPARK-10856][SQL] Mapping TimestampType to DA...

2015-10-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8978#issuecomment-145454237
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread sun-rui

Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/8920#discussion_r41112866
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1304,24 +1306,62 @@ setClassUnion("characterOrColumn", c("character", 
"Column"))
 #' path <- "path/to/file.json"
 #' df <- jsonFile(sqlContext, path)
 #' arrange(df, df$col1)
-#' arrange(df, "col1")
 #' arrange(df, asc(df$col1), desc(abs(df$col2)))
+#' arrange(df, "col1")
+#' arrange(df, "col2", FALSE)
--- End diff --

remove this line



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread sun-rui

Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/8920#discussion_r41112848
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1304,24 +1306,62 @@ setClassUnion("characterOrColumn", c("character", 
"Column"))
 #' path <- "path/to/file.json"
 #' df <- jsonFile(sqlContext, path)
 #' arrange(df, df$col1)
-#' arrange(df, "col1")
 #' arrange(df, asc(df$col1), desc(abs(df$col2)))
+#' arrange(df, "col1")
+#' arrange(df, "col2", FALSE)
+#' arrange(df, "col1", decreasing=TRUE)
+#' arrange(df, "col1", "col2", c(TRUE, FALSE))
 #' }
 setMethod("arrange",
-  signature(x = "DataFrame", col = "characterOrColumn"),
+  signature(x = "DataFrame", col="Column"),
   function(x, col, ...) {
-if (class(col) == "character") {
-  sdf <- callJMethod(x@sdf, "sort", col, list(...))
-} else if (class(col) == "Column") {
   jcols <- lapply(list(col, ...), function(c) {
 c@jc
   })
-  sdf <- callJMethod(x@sdf, "sort", jcols)
-}
+
+sdf <- callJMethod(x@sdf, "sort", jcols)
 dataFrame(sdf)
   })
 
 #' @rdname arrange
+#' @export
+setMethod("arrange",
+  signature(x = "DataFrame", col="character"),
+  function(x, col, ..., decreasing=FALSE) {
+
+# all sorting columns
+by <- list(col, ...)
+
+# extracting the last element and uses it as decreasing if it 
is boolean
--- End diff --

Remove this block of code. The sorting direction must be specified by 
"decreasing = ". There is no need to check whether the last element in "..." 
is a boolean direction flag.
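
The requested behaviour, where the direction is taken only from a named "decreasing" argument and never inferred from a trailing positional value, can be sketched as follows. This is a hypothetical Python analogue over plain dictionaries, not SparkR code; the function `arrange` below and its support for one flag per column are assumptions made for illustration.

```python
def arrange(rows, *cols, decreasing=False):
    """Sort a list of dicts by the named columns.

    `decreasing` is keyword-only, so a trailing positional value is always
    treated as a column name and never as a direction flag.
    """
    if not cols:
        raise ValueError("at least one column name is required")
    for col in cols:
        if not isinstance(col, str):
            raise TypeError(f"column names must be strings, got {col!r}")

    # Broadcast a single flag to every column; otherwise require one per column.
    if isinstance(decreasing, bool):
        flags = [decreasing] * len(cols)
    else:
        flags = list(decreasing)
        if len(flags) != len(cols):
            raise ValueError("need one decreasing flag per column")

    # Python's sort is stable, so sorting by the last key first and the
    # first key last yields a correct multi-column ordering.
    result = list(rows)
    for col, desc in reversed(list(zip(cols, flags))):
        result.sort(key=lambda row: row[col], reverse=desc)
    return result
```

With this signature, a call such as `arrange(rows, "col2", False)` raises a `TypeError` instead of silently reading `False` as a direction, while `arrange(rows, "col1", decreasing=True)` sorts in descending order.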



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread sun-rui

Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/8920#discussion_r41112869
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1304,24 +1306,62 @@ setClassUnion("characterOrColumn", c("character", 
"Column"))
 #' path <- "path/to/file.json"
 #' df <- jsonFile(sqlContext, path)
 #' arrange(df, df$col1)
-#' arrange(df, "col1")
 #' arrange(df, asc(df$col1), desc(abs(df$col2)))
+#' arrange(df, "col1")
+#' arrange(df, "col2", FALSE)
+#' arrange(df, "col1", decreasing=TRUE)
--- End diff --

coding style. decreasing = TRUE



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-05 Thread sun-rui

Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/8920#discussion_r41112888
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1304,24 +1306,62 @@ setClassUnion("characterOrColumn", c("character", 
"Column"))
 #' path <- "path/to/file.json"
 #' df <- jsonFile(sqlContext, path)
 #' arrange(df, df$col1)
-#' arrange(df, "col1")
 #' arrange(df, asc(df$col1), desc(abs(df$col2)))
+#' arrange(df, "col1")
+#' arrange(df, "col2", FALSE)
+#' arrange(df, "col1", decreasing=TRUE)
+#' arrange(df, "col1", "col2", c(TRUE, FALSE))
 #' }
 setMethod("arrange",
-  signature(x = "DataFrame", col = "characterOrColumn"),
+  signature(x = "DataFrame", col="Column"),
--- End diff --

coding style: col = "Column"


