[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203789698
  
**[Test build #54605 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54605/consoleFull)**
 for PR 12081 at commit 
[`7cd839e`](https://github.com/apache/spark/commit/7cd839e81f8c5b157c832a16ef5355378cc7b7c0).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203789700
  
**[Test build #54606 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54606/consoleFull)**
 for PR 11984 at commit 
[`a310bfc`](https://github.com/apache/spark/commit/a310bfc960f34f0ac6a61937a5519ac05ca52f94).





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11517#discussion_r58008649
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
@@ -194,6 +194,71 @@ case class Sample(
       child.execute().randomSampleWithRange(lowerBound, upperBound, seed)
     }
   }
+
+  override def upstreams(): Seq[RDD[InternalRow]] = {
+    child.asInstanceOf[CodegenSupport].upstreams()
+  }
+
+  private var rowBuffer: String = _
+
+  protected override def doProduce(ctx: CodegenContext): String = {
+    child.asInstanceOf[CodegenSupport].produce(ctx, this)
+  }
+
+  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String = {
+    val sampler = ctx.freshName("sampler")
+
+    if (withReplacement) {
+      val samplerClass = classOf[PoissonSampler[UnsafeRow]].getName
+      val classTag = ctx.freshName("classTag")
--- End diff --

Ah, right: with the latest change to `PoissonSampler`, we don't need this 
anymore. Let me remove it.





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11517#issuecomment-203790582
  
@viirya Could you also add `numOutputRow` for Sample?





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11517#discussion_r58008863
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
@@ -194,6 +194,71 @@ case class Sample(
       child.execute().randomSampleWithRange(lowerBound, upperBound, seed)
     }
   }
+
+  override def upstreams(): Seq[RDD[InternalRow]] = {
+    child.asInstanceOf[CodegenSupport].upstreams()
+  }
+
+  private var rowBuffer: String = _
+
+  protected override def doProduce(ctx: CodegenContext): String = {
+    child.asInstanceOf[CodegenSupport].produce(ctx, this)
+  }
+
+  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String = {
+    val sampler = ctx.freshName("sampler")
+
+    if (withReplacement) {
+      val samplerClass = classOf[PoissonSampler[UnsafeRow]].getName
+      val classTag = ctx.freshName("classTag")
+      val classTagClass = "scala.reflect.ClassTag"
+      ctx.addMutableState(classTagClass, classTag,
+        s"$classTag = ($classTagClass)scala.reflect.ClassTag$$.MODULE$$.apply(UnsafeRow.class);")
+
+      val initSampler = ctx.freshName("initSampler")
+      ctx.addMutableState(s"$samplerClass", sampler,
+        s"$initSampler();")
+
+      val random = ctx.freshName("random")
--- End diff --

Since these are used in a separate function, we don't need to generate 
fresh names for them.
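
One way to read this comment, sketched in plain Scala with no Spark dependency: names that end up in class scope of the generated Java (fields, method names) need to be unique, while a local variable inside a separately generated method can keep a fixed name. `FreshNameSketch`, `freshName`, and `emitInit` are hypothetical stand-ins for illustration, not Spark's `CodegenContext` API.

```scala
import java.util.concurrent.atomic.AtomicInteger

object FreshNameSketch {
  private val counter = new AtomicInteger(0)

  /** Hypothetical fresh-name generator: appends a unique numeric suffix. */
  def freshName(prefix: String): String = s"$prefix${counter.getAndIncrement()}"

  /**
   * Emits Java source for an init method and returns (method name, source).
   * The method name lives in class scope, so it gets a fresh name.
   * `random` is a local of the emitted method, so a fixed name is safe.
   */
  def emitInit(field: String): (String, String) = {
    val fn = freshName("initSampler")
    val code =
      s"""private void $fn() {
         |  java.util.Random random = new java.util.Random();
         |  $field = random.nextLong();
         |}""".stripMargin
    (fn, code)
  }
}
```

Calling `emitInit` twice yields two distinct method names, and both bodies reuse the local name `random` without clashing, because each Java method has its own local scope.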





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11517#discussion_r58009018
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/BufferedRowIterator.java ---
@@ -61,6 +63,14 @@ public long durationMs() {
   public abstract void init(Iterator iters[]);
 
   /**
+   * Initializes from array of iterators of InternalRow.
+   */
+  public void init(int index, Iterator iters[]) {
+    partitionIndex = index;
--- End diff --

Could you move this into generated code? (to save one API)





[GitHub] spark pull request: [SPARK-12864][YARN] initialize executorIdCount...

2016-03-31 Thread zhonghaihua
Github user zhonghaihua commented on a diff in the pull request:

https://github.com/apache/spark/pull/10794#discussion_r58009178
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala ---
@@ -30,6 +30,8 @@ private[spark] object CoarseGrainedClusterMessages {
 
   case object RetrieveSparkProps extends CoarseGrainedClusterMessage
 
+  case object RetrieveMaxExecutorId extends CoarseGrainedClusterMessage
--- End diff --

@tgravescs OK, I see what you mean. Thanks a lot.
Would the name `RetrieveLastAllocatedExecutorId` be OK? @vanzin What's your 
opinion?





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread kiszk
Github user kiszk commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203792626
  
Again, I checked whether these methods are compiled.

```
hotspot_pid30419.log:
```

```
hotspot_pid30419.log:
hotspot_pid30419.log:
hotspot_pid30419.log:
hotspot_pid30419.log:
hotspot_pid30419.log:
hotspot_pid30419.log:
```

```
hotspot_pid30419.log:
hotspot_pid30419.log:
hotspot_pid30419.log:
hotspot_pid30419.log:
hotspot_pid30419.log:
hotspot_pid30419.log:
```





[GitHub] spark pull request: [SPARK-14182] [SQL] Parse DDL Command: Alter V...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11987#issuecomment-203793109
  
**[Test build #54598 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54598/consoleFull)**
 for PR 11987 at commit 
[`48aec92`](https://github.com/apache/spark/commit/48aec92480ec59ed4a965941d56126d9222cb853).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14182] [SQL] Parse DDL Command: Alter V...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11987#issuecomment-203793600
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14182] [SQL] Parse DDL Command: Alter V...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11987#issuecomment-203793602
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54598/
Test PASSed.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203794122
  
**[Test build #54607 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54607/consoleFull)**
 for PR 11984 at commit 
[`c1acf82`](https://github.com/apache/spark/commit/c1acf825cc0dfdefc1074ad4ce3309743a3240b5).





[GitHub] spark pull request: [SPARK-12343][YARN] Simplify Yarn client and c...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11603#issuecomment-203794596
  
**[Test build #54596 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54596/consoleFull)**
 for PR 11603 at commit 
[`3bb44b4`](https://github.com/apache/spark/commit/3bb44b4b1b84f9a972ad8ea4876b70369ba07d0c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12343][YARN] Simplify Yarn client and c...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11603#issuecomment-203795173
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54596/
Test PASSed.





[GitHub] spark pull request: [SPARK-14182] [SQL] Parse DDL Command: Alter V...

2016-03-31 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11987#issuecomment-203794945
  
@andrewor14 @rxin Based on your comments, the latest commits remove all 
the changes to the ANTLR3 code, which will be removed soon. Now the only 
changes in this PR are against ANTLR4. 

@hvanhovell @viirya Please check whether the changes to the Parser are clean 
and clear. 

Thank you! :)





[GitHub] spark pull request: [SPARK-12343][YARN] Simplify Yarn client and c...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11603#issuecomment-203795168
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11984#discussion_r58010571
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/GenerateColumnAccessor.scala ---
@@ -114,6 +117,42 @@ object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarItera
       (createCode, extract + patch)
     }.unzip
 
+    /*
+     * 200 = 6000 bytes / 30 (up to 30 bytes per one call))
+     * the maximum byte code size to be compiled for HotSpot is 8000.
+     * We should keep less than 8000
+     */
+    val numberOfStatementsThreshold = 200
+    val (initializerAccessorFuncs, initializerAccessorCalls, extractorFuncs, extractorCalls) =
--- End diff --

We could use `ctx.addFunction` to simplify these.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11984#discussion_r58010624
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/GenerateColumnAccessor.scala ---
@@ -114,6 +117,42 @@ object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarItera
       (createCode, extract + patch)
     }.unzip
 
+    /*
+     * 200 = 6000 bytes / 30 (up to 30 bytes per one call))
+     * the maximum byte code size to be compiled for HotSpot is 8000.
+     * We should keep less than 8000
+     */
+    val numberOfStatementsThreshold = 200
+    val (initializerAccessorFuncs, initializerAccessorCalls, extractorFuncs, extractorCalls) =
+      if (initializeAccessors.length <= numberOfStatementsThreshold) {
+        ("", initializeAccessors.mkString("\n"), "", extractors.mkString("\n"))
+      } else {
+        val groupedAccessorsItr = initializeAccessors.grouped(numberOfStatementsThreshold)
+        var groupedAccessorsLength = 0
+        val groupedExtractorsItr = extractors.grouped(numberOfStatementsThreshold)
+        var groupedExtractorsLength = 0
+        (
+          groupedAccessorsItr.zipWithIndex.map { case (body, i) =>
+            groupedAccessorsLength += 1
+            s"""
+               |private void accessors$i() {
+               |  ${body.mkString("\n")}
+               |}
+             """.stripMargin
+          }.mkString(""),
+          (0 to groupedAccessorsLength - 1).map { i => s"accessors$i();" }.mkString("\n"),
+          groupedExtractorsItr.zipWithIndex.map { case (body, i) =>
+            groupedExtractorsLength += 1
--- End diff --

This should be the same as `groupedAccessorsLength`, right?
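
The reasoning can be checked in isolation. Both sequences come from a single `.unzip` in the diff, so they have equal length, and grouping two equal-length sequences by the same size yields the same number of groups. A minimal sketch, in which `GroupCountSketch`, `groupCount`, and the 450 stand-in statements are illustrative assumptions rather than code from this PR:

```scala
object GroupCountSketch {
  /** Number of groups produced by `xs.grouped(size)` for n elements: ceil(n / size). */
  def groupCount(n: Int, size: Int): Int = (n + size - 1) / size

  def main(args: Array[String]): Unit = {
    val threshold = 200
    // Stand-ins for the unzipped pairs; only their (equal) lengths matter here.
    val pairs = (0 until 450).map(i => (s"init$i;", s"extract$i;"))
    val (initializeAccessors, extractors) = pairs.unzip

    val accessorGroups = initializeAccessors.grouped(threshold).size
    val extractorGroups = extractors.grouped(threshold).size

    // Equal lengths and an equal group size give an equal number of groups.
    assert(accessorGroups == extractorGroups)
    assert(accessorGroups == groupCount(pairs.length, threshold))
    println(s"groups: $accessorGroups")
  }
}
```

Under that assumption a single count is enough, and the second mutable counter is redundant.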





[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203796029
  
To verify that `org.apache.hadoop.fs.Path` works properly, I ran the 
following check:

```scala
val m = new Path(new Path("/t/"), "a.db").toString
val n = new Path(new Path("/t"), "a.db").toString
val p = new Path(new Path("t"), "a.db").toString
```

In all three cases, exactly one slash is placed between the two path 
components.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11984#discussion_r58010740
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/GenerateColumnAccessor.scala ---
@@ -114,6 +117,42 @@ object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarItera
       (createCode, extract + patch)
     }.unzip
 
+    /*
+     * 200 = 6000 bytes / 30 (up to 30 bytes per one call))
+     * the maximum byte code size to be compiled for HotSpot is 8000.
+     * We should keep less than 8000
+     */
+    val numberOfStatementsThreshold = 200
+    val (initializerAccessorFuncs, initializerAccessorCalls, extractorFuncs, extractorCalls) =
+      if (initializeAccessors.length <= numberOfStatementsThreshold) {
--- End diff --

We could always put them in groups.
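
A minimal sketch of the "always group" idea in plain Scala, with no Spark dependency. `GroupedCodegenSketch` and `splitIntoFunctions` are hypothetical names for illustration, not code from this PR; the default threshold of 200 is taken from the diff above. Materializing the groups once removes both the special case for short inputs and the mutable counters.

```scala
object GroupedCodegenSketch {
  /**
   * Splits `statements` into helper methods of at most `threshold` statements each.
   * Returns (method definitions, calls to those methods), both as Java source text.
   * Hypothetical helper for illustration only.
   */
  def splitIntoFunctions(
      prefix: String,
      statements: Seq[String],
      threshold: Int = 200): (String, String) = {
    // Materialize the groups once so definitions and calls share one count.
    val groups = statements.grouped(threshold).toSeq

    val functions = groups.zipWithIndex.map { case (body, i) =>
      val indented = body.map(s => s"  $s").mkString("\n")
      s"private void $prefix$i() {\n$indented\n}\n"
    }.mkString("\n")

    // One call per group, in the same order as the definitions.
    val calls = groups.indices.map(i => s"$prefix$i();").mkString("\n")

    (functions, calls)
  }
}
```

With fewer statements than the threshold this produces a single helper method and a single call, so the caller does not need a separate branch for the small case.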





[GitHub] spark pull request: [SPARK-14270][SQL] whole stage codegen support...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12061#issuecomment-203797122
  
**[Test build #54600 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54600/consoleFull)**
 for PR 12061 at commit 
[`bf9f5b5`](https://github.com/apache/spark/commit/bf9f5b53c28566674e47f8146ed9f53248dfcea6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `s\"Unable to generate an encoder for inner class `$`





[GitHub] spark pull request: [SPARK-14270][SQL] whole stage codegen support...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12061#issuecomment-203797660
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54600/
Test PASSed.





[GitHub] spark pull request: [SPARK-14270][SQL] whole stage codegen support...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12061#issuecomment-203797655
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14191][SQL] Remove invalid Expand opera...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11995#issuecomment-203802466
  
**[Test build #54601 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54601/consoleFull)**
 for PR 11995 at commit 
[`ab89e62`](https://github.com/apache/spark/commit/ab89e620883f581b2104fc60ffb32f77501f94c4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14191][SQL] Remove invalid Expand opera...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11995#issuecomment-203802686
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54601/
Test PASSed.





[GitHub] spark pull request: [SPARK-14191][SQL] Remove invalid Expand opera...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11995#issuecomment-203802681
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

2016-03-31 Thread lw-lin
Github user lw-lin closed the pull request at:

https://github.com/apache/spark/pull/11643





[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

2016-03-31 Thread lw-lin
Github user lw-lin commented on the pull request:

https://github.com/apache/spark/pull/11643#issuecomment-203804471
  
This PR has many conflicts to resolve, so I'm closing this for now and will 
re-open later, thanks.





[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

2016-03-31 Thread lw-lin
Github user lw-lin closed the pull request at:

https://github.com/apache/spark/pull/11633





[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

2016-03-31 Thread lw-lin
Github user lw-lin commented on the pull request:

https://github.com/apache/spark/pull/11633#issuecomment-203804570
  
This PR has many conflicts to resolve, so I'm closing this for now and will 
re-open later, thanks.





[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

2016-03-31 Thread lw-lin
Github user lw-lin commented on the pull request:

https://github.com/apache/spark/pull/11470#issuecomment-203804602
  
This PR has many conflicts to resolve, so I'm closing this for now and will 
re-open later, thanks.





[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

2016-03-31 Thread lw-lin
Github user lw-lin closed the pull request at:

https://github.com/apache/spark/pull/11470





[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203808183
  
**[Test build #54605 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54605/consoleFull)**
 for PR 12081 at commit 
[`7cd839e`](https://github.com/apache/spark/commit/7cd839e81f8c5b157c832a16ef5355378cc7b7c0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203808605
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54605/
Test FAILed.





[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203808598
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203816545
  
**[Test build #54606 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54606/consoleFull)**
 for PR 11984 at commit 
[`a310bfc`](https://github.com/apache/spark/commit/a310bfc960f34f0ac6a61937a5519ac05ca52f94).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14182] [SQL] Parse DDL Command: Alter V...

2016-03-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11987#discussion_r58016351
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -253,6 +256,12 @@ case class AlterTableSkewedLocation(
     skewedMap: Map[String, String])(sql: String)
   extends NativeDDLCommand(sql) with Logging
 
+/**
+ * Add Partition in ALTER TABLE/VIEW: add the table/view partitions.
+ * 'partitionSpecsAndLocs': the syntax of ALTER VIEW is identical to ALTER TABLE,
+ * EXCEPT that it is ILLEGAL to specify a LOCATION clause.
+ * An error message will be issued if the partition exists, unless 'ifNotExists' is false.
--- End diff --

Should `unless 'ifNotExists' is false` be `unless 'ifNotExists' is true`?
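
To make the semantics in question concrete, here is a minimal sketch in plain Python (not Spark code; `add_partition` and the in-memory `partitions` set are hypothetical stand-ins). It assumes the conventional `IF NOT EXISTS` behavior, where adding an existing partition is an error only when the flag is false:

```python
def add_partition(partitions, spec, if_not_exists=False):
    """Add `spec` to the set `partitions` (hypothetical stand-in for ADD PARTITION)."""
    if spec in partitions:
        if if_not_exists:
            # IF NOT EXISTS was given: adding an existing partition is a no-op.
            return False
        raise ValueError("partition already exists: %r" % (spec,))
    partitions.add(spec)
    return True


partitions = {("ds", "2016-03-31")}
# With if_not_exists=True the duplicate is silently ignored.
assert add_partition(partitions, ("ds", "2016-03-31"), if_not_exists=True) is False
```

Under that assumption the error is suppressed when the flag is true, which is what the suggested wording says.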





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203816692
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203816695
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54606/
Test PASSed.





[GitHub] spark pull request: [SPARK-14182] [SQL] Parse DDL Command: Alter V...

2016-03-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11987#discussion_r58016427
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -271,6 +280,14 @@ case class AlterTableExchangePartition(
     spec: TablePartitionSpec)(sql: String)
   extends NativeDDLCommand(sql) with Logging
 
+/**
+ * Drop Partition in ALTER TABLE/VIEW: to drop a particular partition for a table/view.
+ * This removes the data and metadata for this partition.
+ * The data is actually moved to the .Trash/Current directory if Trash is configured,
+ * unless 'purge' is true, but the metadata is completely lost.
+ * An error message will be issued if the partition does not exist, unless 'ifExists' is false.
--- End diff --

Same as above.





[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203817230
  
The failed test cases show that the value of the `java.io.tmpdir` property 
on the test system does not end with a trailing slash, whereas on my local 
MacBook (OS X) it does. 

In the test cases, I do not want to use `org.apache.hadoop.fs.Path`, so 
that we are not hiding any issues. Thanks!
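
A small illustration of why the trailing slash matters, sketched in plain Python rather than the Scala test code. The two directory strings are made-up examples, not the actual values from either machine, and `explicit_child` is just one possible way to normalize by hand without a path library:

```python
def naive_child(tmp_dir, name):
    # Plain concatenation: correct only if tmp_dir already ends with "/".
    return tmp_dir + name


def explicit_child(tmp_dir, name):
    # Strip any trailing separators, then add exactly one back.
    return tmp_dir.rstrip("/") + "/" + name


with_slash = "/var/tmp/"     # hypothetical value with a trailing slash
without_slash = "/var/tmp"   # hypothetical value without one

assert naive_child(with_slash, "db1") == "/var/tmp/db1"
assert naive_child(without_slash, "db1") == "/var/tmpdb1"   # the mismatch
assert explicit_child(with_slash, "db1") == explicit_child(without_slash, "db1")
```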





[GitHub] spark pull request: [SPARK-14267] [SQL] [PYSPARK] execute multiple...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12057#issuecomment-203817436
  
**[Test build #54602 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54602/consoleFull)**
 for PR 12057 at commit 
[`8597bba`](https://github.com/apache/spark/commit/8597bbaf7d29de234f3f29e2929ce79c1ec99075).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  // enable memo iff we serialize the row with schema (schema and class should be memorized)`





[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203817473
  
**[Test build #54608 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54608/consoleFull)**
 for PR 12081 at commit 
[`a9ddedc`](https://github.com/apache/spark/commit/a9ddedc6b735bd104ee45674e30486fd27a6f6f6).





[GitHub] spark pull request: [SPARK-14267] [SQL] [PYSPARK] execute multiple...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12057#issuecomment-203817674
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54602/
Test PASSed.





[GitHub] spark pull request: [SPARK-14267] [SQL] [PYSPARK] execute multiple...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12057#issuecomment-203817672
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/12036#discussion_r58017522
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala ---
@@ -55,61 +57,133 @@ private[hive] class HiveFunctionRegistry(
     }
   }
 
-  override def lookupFunction(name: String, children: Seq[Expression]): Expression = {
-    Try(underlying.lookupFunction(name, children)).getOrElse {
-      // We only look it up to see if it exists, but do not include it in the HiveUDF since it is
-      // not always serializable.
-      val functionInfo: FunctionInfo =
-        Option(getFunctionInfo(name.toLowerCase)).getOrElse(
-          throw new AnalysisException(s"undefined function $name"))
+  def loadHivePermanentFunction(name: String): Option[CatalogFunction] = {
+    val databaseName = sessionStage.catalog.getCurrentDatabase
+    val func = FunctionIdentifier(name, Option(databaseName))
+    val catalogFunc =
+      if (sessionStage.catalog.listFunctions(databaseName, name).size != 0) {
+        Some(sessionStage.catalog.getFunction(func))
+      } else {
+        None
+      }
+    catalogFunc.map(_.resources.foreach { resource =>
+      resource._1.toLowerCase match {
+        case "jar" => sessionStage.ctx.addJar(resource._2)
+        case _ =>
+          sessionStage.ctx.runSqlHive(s"ADD FILE ${resource._2}")
+          sessionStage.ctx.sparkContext.addFile(resource._2)
+      }
+    })
+    catalogFunc
+  }
 
-      val functionClassName = functionInfo.getFunctionClass.getName
+  override def makeFunctionBuilderAndInfo(
+      name: String,
+      functionClassName: String): (ExpressionInfo, FunctionBuilder) = {
+    val hiveUDFWrapper = new HiveFunctionWrapper(functionClassName)
+    val hiveUDFClass = hiveUDFWrapper.createFunction().getClass
+    val info = new ExpressionInfo(functionClassName, name)
+    val builder = makeHiveUDFBuilder(name, functionClassName, hiveUDFClass, hiveUDFWrapper)
+    (info, builder)
+  }
 
-      // When we instantiate hive UDF wrapper class, we may throw exception if the input expressions
-      // don't satisfy the hive UDF, such as type mismatch, input number mismatch, etc. Here we
-      // catch the exception and throw AnalysisException instead.
+  /**
+   * Generates a Spark FunctionBuilder for a Hive UDF which is specified by a given classname.
+   */
+  def makeHiveUDFBuilder(
+      name: String,
+      functionClassName: String,
+      hiveUDFClass: Class[_],
+      hiveUDFWrapper: HiveFunctionWrapper): FunctionBuilder = {
+    val builder = (children: Seq[Expression]) => {
       try {
-        if (classOf[GenericUDFMacro].isAssignableFrom(functionInfo.getFunctionClass)) {
+        if (classOf[GenericUDFMacro].isAssignableFrom(hiveUDFClass)) {
           val udf = HiveGenericUDF(
-            name, new HiveFunctionWrapper(functionClassName, functionInfo.getGenericUDF), children)
-          udf.dataType // Force it to check input data types.
+            name, hiveUDFWrapper, children)
+          if (udf.resolved) {
+            udf.dataType // Force it to check input data types.
+          }
           udf
-        } else if (classOf[UDF].isAssignableFrom(functionInfo.getFunctionClass)) {
-          val udf = HiveSimpleUDF(name, new HiveFunctionWrapper(functionClassName), children)
-          udf.dataType // Force it to check input data types.
+        } else if (classOf[UDF].isAssignableFrom(hiveUDFClass)) {
+          val udf = HiveSimpleUDF(name, hiveUDFWrapper, children)
+          if (udf.resolved) {
+            udf.dataType // Force it to check input data types.
+          }
           udf
-        } else if (classOf[GenericUDF].isAssignableFrom(functionInfo.getFunctionClass)) {
-          val udf = HiveGenericUDF(name, new HiveFunctionWrapper(functionClassName), children)
-          udf.dataType // Force it to check input data types.
+        } else if (classOf[GenericUDF].isAssignableFrom(hiveUDFClass)) {
+          val udf = HiveGenericUDF(name, hiveUDFWrapper, children)
+          if (udf.resolved) {
+            udf.dataType // Force it to check input data types.
+          }
           udf
         } else if (
-          classOf[AbstractGenericUDAFResolver].isAssignableFrom(functionInfo.getFunctionClass)) {
-          val udaf = HiveUDAFFunction(name, new HiveFunctionWrapper(functionClassName), children)
-          udaf.dataType // Force it to check input data types.
+          classOf[AbstractGenericUDAFResolver].isAssignableFrom(hiveUDFClass)) {
+          val udaf = HiveUDAFFunction(name, hiveUDFWrapper, children)
+  

[GitHub] spark pull request: [SPARK-14267] [SQL] [PYSPARK] execute multiple...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12057#issuecomment-203821060
  
**[Test build #54603 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54603/consoleFull)**
 for PR 12057 at commit 
[`72a5ec0`](https://github.com/apache/spark/commit/72a5ec08123f5f7b8c515256a09cd0a87e05cc9f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14267] [SQL] [PYSPARK] execute multiple...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12057#issuecomment-203821281
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14267] [SQL] [PYSPARK] execute multiple...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12057#issuecomment-203821286
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54603/
Test PASSed.





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/12036#discussion_r58018075
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala ---
@@ -55,61 +57,133 @@ private[hive] class HiveFunctionRegistry(
     }
   }
 
-  override def lookupFunction(name: String, children: Seq[Expression]): Expression = {
-    Try(underlying.lookupFunction(name, children)).getOrElse {
-      // We only look it up to see if it exists, but do not include it in the HiveUDF since it is
-      // not always serializable.
-      val functionInfo: FunctionInfo =
-        Option(getFunctionInfo(name.toLowerCase)).getOrElse(
-          throw new AnalysisException(s"undefined function $name"))
+  def loadHivePermanentFunction(name: String): Option[CatalogFunction] = {
+    val databaseName = sessionStage.catalog.getCurrentDatabase
+    val func = FunctionIdentifier(name, Option(databaseName))
+    val catalogFunc =
+      if (sessionStage.catalog.listFunctions(databaseName, name).size != 0) {
+        Some(sessionStage.catalog.getFunction(func))
+      } else {
+        None
+      }
+    catalogFunc.map(_.resources.foreach { resource =>
+      resource._1.toLowerCase match {
+        case "jar" => sessionStage.ctx.addJar(resource._2)
+        case _ =>
+          sessionStage.ctx.runSqlHive(s"ADD FILE ${resource._2}")
+          sessionStage.ctx.sparkContext.addFile(resource._2)
+      }
+    })
+    catalogFunc
+  }
 
-      val functionClassName = functionInfo.getFunctionClass.getName
+  override def makeFunctionBuilderAndInfo(
+      name: String,
+      functionClassName: String): (ExpressionInfo, FunctionBuilder) = {
+    val hiveUDFWrapper = new HiveFunctionWrapper(functionClassName)
+    val hiveUDFClass = hiveUDFWrapper.createFunction().getClass
+    val info = new ExpressionInfo(functionClassName, name)
+    val builder = makeHiveUDFBuilder(name, functionClassName, hiveUDFClass, hiveUDFWrapper)
+    (info, builder)
+  }
 
-      // When we instantiate hive UDF wrapper class, we may throw exception if the input expressions
-      // don't satisfy the hive UDF, such as type mismatch, input number mismatch, etc. Here we
-      // catch the exception and throw AnalysisException instead.
+  /**
+   * Generates a Spark FunctionBuilder for a Hive UDF which is specified by a given classname.
+   */
+  def makeHiveUDFBuilder(
+      name: String,
+      functionClassName: String,
+      hiveUDFClass: Class[_],
+      hiveUDFWrapper: HiveFunctionWrapper): FunctionBuilder = {
+    val builder = (children: Seq[Expression]) => {
       try {
-        if (classOf[GenericUDFMacro].isAssignableFrom(functionInfo.getFunctionClass)) {
+        if (classOf[GenericUDFMacro].isAssignableFrom(hiveUDFClass)) {
           val udf = HiveGenericUDF(
-            name, new HiveFunctionWrapper(functionClassName, functionInfo.getGenericUDF), children)
-          udf.dataType // Force it to check input data types.
+            name, hiveUDFWrapper, children)
+          if (udf.resolved) {
+            udf.dataType // Force it to check input data types.
+          }
           udf
-        } else if (classOf[UDF].isAssignableFrom(functionInfo.getFunctionClass)) {
-          val udf = HiveSimpleUDF(name, new HiveFunctionWrapper(functionClassName), children)
-          udf.dataType // Force it to check input data types.
+        } else if (classOf[UDF].isAssignableFrom(hiveUDFClass)) {
+          val udf = HiveSimpleUDF(name, hiveUDFWrapper, children)
+          if (udf.resolved) {
+            udf.dataType // Force it to check input data types.
+          }
           udf
-        } else if (classOf[GenericUDF].isAssignableFrom(functionInfo.getFunctionClass)) {
-          val udf = HiveGenericUDF(name, new HiveFunctionWrapper(functionClassName), children)
-          udf.dataType // Force it to check input data types.
+        } else if (classOf[GenericUDF].isAssignableFrom(hiveUDFClass)) {
+          val udf = HiveGenericUDF(name, hiveUDFWrapper, children)
+          if (udf.resolved) {
+            udf.dataType // Force it to check input data types.
+          }
           udf
         } else if (
-          classOf[AbstractGenericUDAFResolver].isAssignableFrom(functionInfo.getFunctionClass)) {
-          val udaf = HiveUDAFFunction(name, new HiveFunctionWrapper(functionClassName), children)
-          udaf.dataType // Force it to check input data types.
+          classOf[AbstractGenericUDAFResolver].isAssignableFrom(hiveUDFClass)) {
+          val udaf = HiveUDAFFunction(name, hiveUDFWrapper, children)
+  

[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203822187
  
**[Test build #54607 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54607/consoleFull)**
 for PR 11984 at commit 
[`c1acf82`](https://github.com/apache/spark/commit/c1acf825cc0dfdefc1074ad4ce3309743a3240b5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203822344
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11984#issuecomment-203822347
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54607/
Test PASSed.





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/12036#discussion_r58020732
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala ---
@@ -55,6 +55,7 @@ class HiveQuerySuite extends HiveComparisonTest with BeforeAndAfter {
     TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))
     // Add Locale setting
     Locale.setDefault(Locale.US)
+    sql("DROP TEMPORARY FUNCTION IF EXISTS udtf_count2")
--- End diff --

Yes. I think it is because `SQLQuerySuite` creates `udtf_count2` but never 
drops it.
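
As a toy model of that leak (plain Python, not Spark; `TempFunctionRegistry` is a hypothetical stand-in for session-wide temporary function state shared by both suites), dropping defensively before creating makes a suite's setup independent of whatever ran earlier:

```python
class TempFunctionRegistry:
    """Toy stand-in for a session-wide temporary function registry."""

    def __init__(self):
        self._functions = {}

    def create(self, name, impl):
        if name in self._functions:
            raise ValueError("function already exists: " + name)
        self._functions[name] = impl

    def drop(self, name, if_exists=False):
        if name not in self._functions and not if_exists:
            raise KeyError("function not found: " + name)
        self._functions.pop(name, None)


registry = TempFunctionRegistry()          # shared by both "suites"
registry.create("udtf_count2", object())   # first suite creates it, never drops it

# Second suite: DROP ... IF EXISTS first, so the create below cannot collide.
registry.drop("udtf_count2", if_exists=True)
registry.create("udtf_count2", object())
```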





[GitHub] spark pull request: [Minor][MLLIB] Add args-checking for LDA and S...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12062#issuecomment-203835649
  
**[Test build #54609 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54609/consoleFull)**
 for PR 12062 at commit 
[`45b61a5`](https://github.com/apache/spark/commit/45b61a55feb52cce86e3eb69d91b6903bbd87acf).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [Minor][DOC] Add python examples for DCT,MinMa...

2016-03-31 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request:

https://github.com/apache/spark/pull/12063#discussion_r58021533
  
--- Diff: examples/src/main/python/ml/dct_example.py ---
@@ -0,0 +1,46 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import print_function
+
+from pyspark import SparkContext
+from pyspark.sql import SQLContext
+# $example on$
+from pyspark.ml.feature import DCT
+from pyspark.mllib.linalg import Vectors
+# $example off$
+
+if __name__ == "__main__":
+    sc = SparkContext(appName="DCTExample")
+    sqlContext = SQLContext(sc)
+
+    # $example on$
+    df = sqlContext\
+        .createDataFrame([(Vectors.dense([0.0, 1.0, -2.0, 3.0]),),
+                          (Vectors.dense([-1.0, 2.0, 4.0, -7.0]),),
+                          (Vectors.dense([14.0, -2.0, -5.0, 1.0]),)],
+                         ["features"])
--- End diff --

Thanks, I will fix it.





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/12036#discussion_r58021704
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -557,7 +558,10 @@ private[hive] class HiveClientImpl(
   override def getFunctionOption(
       db: String,
       name: String): Option[CatalogFunction] = withHiveState {
-    Option(client.getFunction(db, name)).map(fromHiveFunction)
+    client.getFunction(db, name)
+    Try {
+      Option(client.getFunction(db, name)).map(fromHiveFunction)
--- End diff --

Yes, it is for local testing. I should remove it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14211][SQL] Remove ANTLR3 based parser

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12071#issuecomment-203839495
  
**[Test build #54610 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54610/consoleFull)**
 for PR 12071 at commit 
[`2a93f3d`](https://github.com/apache/spark/commit/2a93f3dadab5f0e23f4927c6bf9643f286657e3e).





[GitHub] spark pull request: [Minor][DOC] Add python examples for DCT,MinMa...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12063#issuecomment-203842374
  
**[Test build #54611 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54611/consoleFull)**
 for PR 12063 at commit 
[`e374e58`](https://github.com/apache/spark/commit/e374e58ac3f2e04d1327b3c265017adebbc92cd0).





[GitHub] spark pull request: [Minor][DOC] Add python examples for DCT,MinMa...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12063#issuecomment-203850484
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54611/
Test PASSed.





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12036#issuecomment-203850041
  
**[Test build #54612 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54612/consoleFull)**
 for PR 12036 at commit 
[`2cab41c`](https://github.com/apache/spark/commit/2cab41cdc254764ed976842475ce2c2547b0ddcc).





[GitHub] spark pull request: [Minor][DOC] Add python examples for DCT,MinMa...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12063#issuecomment-203850383
  
**[Test build #54611 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54611/consoleFull)**
 for PR 12063 at commit 
[`e374e58`](https://github.com/apache/spark/commit/e374e58ac3f2e04d1327b3c265017adebbc92cd0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [Minor][DOC] Add python examples for DCT,MinMa...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12063#issuecomment-203850483
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [Minor][MLLIB] Add args-checking for LDA and S...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12062#issuecomment-203855687
  
**[Test build #54609 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54609/consoleFull)**
 for PR 12062 at commit 
[`45b61a5`](https://github.com/apache/spark/commit/45b61a55feb52cce86e3eb69d91b6903bbd87acf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [Minor][MLLIB] Add args-checking for LDA and S...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12062#issuecomment-203855811
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54609/
Test PASSed.





[GitHub] spark pull request: [Minor][MLLIB] Add args-checking for LDA and S...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12062#issuecomment-203855809
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12036#issuecomment-203855868
  
**[Test build #54612 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54612/consoleFull)**
 for PR 12036 at commit 
[`2cab41c`](https://github.com/apache/spark/commit/2cab41cdc254764ed976842475ce2c2547b0ddcc).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Resource(resourceType: String, path: String)`





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12036#issuecomment-203855901
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54612/
Test FAILed.





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12036#issuecomment-203855898
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13063] [YARN] Make the SPARK YARN STAGI...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12082#issuecomment-203856261
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-13063] [YARN] Make the SPARK YARN STAGI...

2016-03-31 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/12082

[SPARK-13063] [YARN] Make the SPARK YARN STAGING DIR as configurable

## What changes were proposed in this pull request?
Made the Spark YARN staging directory configurable through the 
'spark.yarn.staging-dir' configuration.

## How was this patch tested?

I verified it manually by running applications on YARN. If 
'spark.yarn.staging-dir' is configured, its value is used as the staging 
directory; otherwise the default value is used, i.e. the file system’s home 
directory for the user.
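
As a rough illustration of the fallback described above (not the patch itself), the lookup could be sketched as follows. The class `StagingDirSketch`, the helper `resolveStagingDir`, the plain `Map` standing in for the Spark configuration, and the example paths are all hypothetical; only the key 'spark.yarn.staging-dir' comes from this pull request. Treating a blank value as unset is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class StagingDirSketch {
  // Configuration key named in this pull request.
  static final String STAGING_DIR_KEY = "spark.yarn.staging-dir";

  // Returns the configured staging directory if one is set,
  // otherwise falls back to the user's home directory.
  // Assumption: a blank value is treated the same as an unset one.
  static String resolveStagingDir(Map<String, String> conf, String userHomeDir) {
    String configured = conf.get(STAGING_DIR_KEY);
    if (configured != null && !configured.trim().isEmpty()) {
      return configured.trim();
    }
    return userHomeDir;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<String, String>();
    // Nothing configured: the home directory is used.
    System.out.println(resolveStagingDir(conf, "/user/alice"));
    // Key configured: its value is used instead.
    conf.put(STAGING_DIR_KEY, "/tmp/spark-staging");
    System.out.println(resolveStagingDir(conf, "/user/alice"));
  }
}
```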




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-13063

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12082.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12082


commit c3f02fdbdeb9c9dbe3d2a7361414005eed987509
Author: Devaraj K 
Date:   2016-03-31T09:41:22Z

[SPARK-13063] [YARN] Make the SPARK YARN STAGING DIR as configurable







[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203857283
  
**[Test build #54608 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54608/consoleFull)**
 for PR 12081 at commit 
[`a9ddedc`](https://github.com/apache/spark/commit/a9ddedc6b735bd104ee45674e30486fd27a6f6f6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203857633
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54608/
Test PASSed.





[GitHub] spark pull request: [SPARK-14290][CORE][Network] avoid significant...

2016-03-31 Thread liyezhang556520
GitHub user liyezhang556520 opened a pull request:

https://github.com/apache/spark/pull/12083

[SPARK-14290][CORE][Network] avoid significant memory copy in netty's 
transferTo

## What changes were proposed in this pull request?
When netty transfers data that is not a `FileRegion`, the data is in the 
form of a `ByteBuf`. If the data is large, there is a significant performance 
issue because of the underlying memory copy in `sun.nio.ch.IOUtil.write`: CPU 
usage reaches 100% while network throughput is very low.

In this PR, if the data size is large, we split it into small chunks for 
each call to `WritableByteChannel.write()`, which avoids wasteful memory 
copying. Because the data cannot be written within a single write, 
`transferTo` is called multiple times.
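
A minimal sketch of the chunking idea, assuming a cap of 512 KB per write: each call offers the channel only a bounded view of the buffer, so whatever copying the channel does internally is limited to one chunk. The class and method names are hypothetical and the sketch uses only `java.nio`; it is not the code in this patch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

public class ChunkedWriteSketch {
  // Assumed cap on the bytes offered to a single write() call.
  static final int CHUNK_LIMIT = 512 * 1024;

  // Number of bytes to offer in the next write, given what is left.
  static int chunkLength(int remaining) {
    return Math.min(remaining, CHUNK_LIMIT);
  }

  // Offers at most CHUNK_LIMIT bytes of src to the channel and
  // returns how many bytes the channel accepted.
  static int writeChunk(ByteBuffer src, WritableByteChannel target) throws IOException {
    // A duplicate shares the content but has its own position and limit.
    ByteBuffer chunk = src.duplicate();
    chunk.limit(chunk.position() + chunkLength(src.remaining()));
    int written = target.write(chunk);
    // Advance the source only by what was actually written.
    src.position(src.position() + written);
    return written;
  }

  // Writes a zero-filled buffer of the given size to an in-memory sink,
  // one chunk at a time, and returns the total number of bytes written.
  static long writeToMemory(int size) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    WritableByteChannel channel = Channels.newChannel(sink);
    ByteBuffer data = ByteBuffer.allocate(size);
    long total = 0;
    try {
      while (data.hasRemaining()) {
        total += writeChunk(data, channel);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(writeToMemory(3 * CHUNK_LIMIT + 17) + " bytes written");
  }
}
```

The loop in `writeToMemory` is only for demonstration against a blocking in-memory sink. With a non-blocking socket channel, a write may accept zero bytes, so a caller would normally return after each attempt and retry later rather than spin.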

## How was this patch tested?
Spark unit test and manual test.
Manual test:
sc.parallelize(Array(1,2,3),3).mapPartitions(a=>Array(new Array[Double](1024 * 1024 * 50)).iterator).reduce((a,b)=> a).length

For more details, please refer to 
[SPARK-14290](https://issues.apache.org/jira/browse/SPARK-14290)




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liyezhang556520/spark spark-14290

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12083.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12083


commit 63ca85a5548858b4fe46a4ade062776cb6747cba
Author: Zhang, Liye 
Date:   2016-03-31T09:44:41Z

spark-14290 avoid significant memory copy in netty's transferTo







[GitHub] spark pull request: [SPARK-14124] [SQL] [FOLLOWUP] Implement Datab...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12081#issuecomment-203857632
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14290][CORE][Network] avoid significant...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12083#issuecomment-203860710
  
**[Test build #54613 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54613/consoleFull)**
 for PR 12083 at commit 
[`63ca85a`](https://github.com/apache/spark/commit/63ca85a5548858b4fe46a4ade062776cb6747cba).





[GitHub] spark pull request: [SPARK-13538][ML] Add GaussianMixture to ML

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11419#issuecomment-203864758
  
**[Test build #54615 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54615/consoleFull)**
 for PR 11419 at commit 
[`a6dc5af`](https://github.com/apache/spark/commit/a6dc5afe8898621b4b29c9d257f2699058bdc6df).





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11517#issuecomment-203864696
  
**[Test build #54614 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54614/consoleFull)**
 for PR 11517 at commit 
[`fa51f62`](https://github.com/apache/spark/commit/fa51f62244e45e5cc3fea24806ccd5a6eff51464).





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11517#discussion_r58030135
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/BufferedRowIterator.java 
---
@@ -61,6 +63,14 @@ public long durationMs() {
   public abstract void init(Iterator iters[]);
 
   /**
+   * Initializes from array of iterators of InternalRow.
+   */
+  public void init(int index, Iterator iters[]) {
+partitionIndex = index;
--- End diff --

The problem is the case of two RDDs. In that case, we will use the 
original API.





[GitHub] spark pull request: [SPARK-13538][ML] Add GaussianMixture to ML

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11419#issuecomment-203866803
  
**[Test build #54616 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54616/consoleFull)**
 for PR 11419 at commit 
[`39dc264`](https://github.com/apache/spark/commit/39dc26499d2ee3c407d684254eff4d500632e20a).





[GitHub] spark pull request: [SPARK-13538][ML] Add GaussianMixture to ML

2016-03-31 Thread zhengruifeng
Github user zhengruifeng commented on the pull request:

https://github.com/apache/spark/pull/11419#issuecomment-203867515
  
@jkbradley 
I have fixed those issues according to your comments, and I am willing to 
follow up on this PR.
Something went wrong in my force-push operation... I have recreated this 
commit, but your comments are lost. 





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11517#issuecomment-203868077
  
**[Test build #54614 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54614/consoleFull)**
 for PR 11517 at commit 
[`fa51f62`](https://github.com/apache/spark/commit/fa51f62244e45e5cc3fea24806ccd5a6eff51464).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class PoissonSampler[T](`





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11517#issuecomment-203868097
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54614/
Test FAILed.





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11517#issuecomment-203868096
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14098][SQL][WIP] Generate Java code tha...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11956#issuecomment-203869653
  
**[Test build #54604 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54604/consoleFull)**
 for PR 11956 at commit 
[`35a352a`](https://github.com/apache/spark/commit/35a352a2ab24ed31a8c5a5c54e940ba32f807601).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14098][SQL][WIP] Generate Java code tha...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11956#issuecomment-203870144
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54604/
Test FAILed.





[GitHub] spark pull request: [SPARK-14098][SQL][WIP] Generate Java code tha...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11956#issuecomment-203870136
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [Minor][MLLIB] Add args-checking for LDA and S...

2016-03-31 Thread zhengruifeng
Github user zhengruifeng commented on the pull request:

https://github.com/apache/spark/pull/12062#issuecomment-203870258
  
@yanboliang Thanks a lot





[GitHub] spark pull request: [SPARK-14123][SQL] Implement function related ...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12036#issuecomment-203870738
  
**[Test build #54617 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54617/consoleFull)**
 for PR 12036 at commit 
[`65d9dbd`](https://github.com/apache/spark/commit/65d9dbdfbb26c0a1e917767aae08f73df4f9763c).





[GitHub] spark pull request: [SPARK-14290][CORE][Network] avoid significant...

2016-03-31 Thread liyezhang556520
Github user liyezhang556520 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12083#discussion_r58032868
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java
 ---
@@ -44,6 +45,14 @@
   private long totalBytesTransferred;
 
   /**
+   * When the write buffer size is larger than this limit, I/O will be done in chunks of this size.
+   * The size should not be too large as it will waste underlying memory copy. e.g. If network
+   * avaliable buffer is smaller than this limit, the data cannot be sent within one single write
+   * operation while it still will make memory copy with this size.
+   */
+  private static final int NIO_BUFFER_LIMIT = 512 * 1024;
--- End diff --

I set this limit to 512K because in my tests each 
`WritableByteChannel.write()` call could successfully write about 600KB to 
1.5MB of data. This size needs to be decided after more tests by someone else.





[GitHub] spark pull request: [SPARK-12864][YARN] initialize executorIdCount...

2016-03-31 Thread zhonghaihua
Github user zhonghaihua commented on a diff in the pull request:

https://github.com/apache/spark/pull/10794#discussion_r58032934
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -155,6 +158,9 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, val rpcEnv: Rp
   // in this block are read when requesting executors
   CoarseGrainedSchedulerBackend.this.synchronized {
 executorDataMap.put(executorId, data)
+if (currentExecutorIdCounter < Integer.parseInt(executorId)) {
+  currentExecutorIdCounter = Integer.parseInt(executorId)
+}
--- End diff --

@vanzin Thanks for your comments. I will optimize it.
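
The review comment being answered is not shown in this message, so the following is only one guess at what such an optimization could look like: parse the executor ID once and keep the maximum. It is written in Java for illustration, and the class `ExecutorIdTracker` and its methods are hypothetical; only the field name `currentExecutorIdCounter` comes from the diff.

```java
// Hypothetical illustration; the real change lives in the Scala
// CoarseGrainedSchedulerBackend and may differ.
final class ExecutorIdTracker {
  private int currentExecutorIdCounter = 0;

  /** Records a registered executor ID, keeping the largest one seen so far. */
  synchronized void onExecutorRegistered(String executorId) {
    // Parse once instead of twice, then keep the maximum.
    int id = Integer.parseInt(executorId);
    currentExecutorIdCounter = Math.max(currentExecutorIdCounter, id);
  }

  synchronized int current() {
    return currentExecutorIdCounter;
  }
}
```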





[GitHub] spark pull request: [SPARK-14290][CORE][Network] avoid significant...

2016-03-31 Thread liyezhang556520
Github user liyezhang556520 commented on the pull request:

https://github.com/apache/spark/pull/12083#issuecomment-203872402
  
cc @rxin 





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11517#issuecomment-203874596
  
**[Test build #54618 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54618/consoleFull)**
 for PR 11517 at commit 
[`76be6cf`](https://github.com/apache/spark/commit/76be6cf6e0ffd5c3b39feede8703175751806e76).





[GitHub] spark pull request: [SPARK-13538][ML] Add GaussianMixture to ML

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11419#issuecomment-203875174
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54615/
Test FAILed.





[GitHub] spark pull request: [SPARK-13538][ML] Add GaussianMixture to ML

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11419#issuecomment-203875170
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13538][ML] Add GaussianMixture to ML

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11419#issuecomment-203875134
  
**[Test build #54615 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54615/consoleFull)**
 for PR 11419 at commit 
[`a6dc5af`](https://github.com/apache/spark/commit/a6dc5afe8898621b4b29c9d257f2699058bdc6df).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [Spark-14138][SQL] Fix generated SpecificColum...

2016-03-31 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/11984#discussion_r58035019
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/GenerateColumnAccessor.scala
 ---
@@ -114,6 +117,42 @@ object GenerateColumnAccessor extends 
CodeGenerator[Seq[DataType], ColumnarItera
   (createCode, extract + patch)
 }.unzip
 
+/*
+ * 200 = 6000 bytes / 30 (up to 30 bytes per one call))
+ * the maximum byte code size to be compiled for HotSpot is 8000.
+ * We should keep less than 8000
+ */
+val numberOfStatementsThreshold = 200
+val (initializerAccessorFuncs, initializerAccessorCalls, 
extractorFuncs, extractorCalls) =
+  if (initializeAccessors.length <= numberOfStatementsThreshold) {
--- End diff --

Yes, we could. But [another 
PR](https://github.com/apache/spark/pull/7076#issuecomment-122176653) 
intentionally avoids putting them into groups when they all fit into one group.
Which is better?
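
To make the threshold arithmetic concrete: 6000 bytes / 30 bytes per call = 200 statements per group, which stays under the 8000-byte limit mentioned in the diff comment. Below is a rough sketch of such grouping, written in Java for illustration. The class `StatementGrouper` and the method `group` are hypothetical; the actual generator is Scala and emits code strings differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitting generated statements into groups.
final class StatementGrouper {
  // 200 = 6000 bytes / 30 bytes per call, as in the diff comment.
  static final int NUMBER_OF_STATEMENTS_THRESHOLD = 200;

  /** Splits statements into consecutive groups of at most threshold entries. */
  static List<List<String>> group(List<String> statements, int threshold) {
    List<List<String>> groups = new ArrayList<>();
    for (int i = 0; i < statements.size(); i += threshold) {
      int end = Math.min(i + threshold, statements.size());
      groups.add(new ArrayList<>(statements.subList(i, end)));
    }
    return groups;
  }
}
```

With this shape, a list that already fits under the threshold comes back as a single group, so the open question in this thread is only whether that single group should still be wrapped in its own function or emitted inline.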





[GitHub] spark pull request: [SPARK-13538][ML] Add GaussianMixture to ML

2016-03-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11419#issuecomment-203877075
  
**[Test build #54616 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54616/consoleFull)**
 for PR 11419 at commit 
[`39dc264`](https://github.com/apache/spark/commit/39dc26499d2ee3c407d684254eff4d500632e20a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class GaussianMixture @Since(\"2.0.0\") (`





[GitHub] spark pull request: [SPARK-13674][SQL] Add wholestage codegen supp...

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11517#issuecomment-203877205
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13538][ML] Add GaussianMixture to ML

2016-03-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11419#issuecomment-203877191
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54616/
Test PASSed.




