[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #88928 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88928/testReport)**
 for PR 19222 at commit 
[`e4a7016`](https://github.com/apache/spark/commit/e4a701681290b824241bc18f4d897e435d108693).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88929 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88929/testReport)**
 for PR 20756 at commit 
[`573db59`](https://github.com/apache/spark/commit/573db595ee84e894b7d7093884d7fdef2c040150).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88927 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88927/testReport)**
 for PR 20756 at commit 
[`96a8fe6`](https://github.com/apache/spark/commit/96a8fe6102d8d175003392a14d67cd74c0b86319).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88929/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20981
  
**[Test build #88926 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88926/testReport)**
 for PR 20981 at commit 
[`35539be`](https://github.com/apache/spark/commit/35539be14bc79647409bfcdc5dac1fb99d00f5ec).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88928/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88927/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88926/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20981
  
retest this please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20756
  
retest this please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19082: [SPARK-21870][SQL] Split aggregation code into sm...

2018-04-05 Thread rednaxelafx
Github user rednaxelafx commented on a diff in the pull request:

https://github.com/apache/spark/pull/19082#discussion_r179367236
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -257,6 +259,78 @@ case class HashAggregateExec(
  """.stripMargin
   }
 
+  // Extracts all the input variable references for a given `aggExpr`. 
This result will be used
+  // to split aggregation into small functions.
+  private def getInputVariableReferences(
+  context: CodegenContext,
+  aggregateExpression: Expression,
+  subExprs: Map[Expression, SubExprEliminationState]): Set[(String, 
String)] = {
+// `argSet` collects all the pairs of variable names and their types, 
the first in the pair is
+// a type name and the second is a variable name.
+val argSet = mutable.Set[(String, String)]()
+val stack = mutable.Stack[Expression](aggregateExpression)
+while (stack.nonEmpty) {
+  stack.pop() match {
+case e if subExprs.contains(e) =>
+  val exprCode = subExprs(e)
+  if (CodegenContext.isJavaIdentifier(exprCode.value)) {
--- End diff --

Once we have @viirya 's https://github.com/apache/spark/pull/20043 merged 
we won't need the ugly `CodegenContext.isJavaIdentifier` hack any more >_<|||


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/19222
  
retest this please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1995/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88931 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88931/testReport)**
 for PR 20756 at commit 
[`573db59`](https://github.com/apache/spark/commit/573db595ee84e894b7d7093884d7fdef2c040150).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #88932 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88932/testReport)**
 for PR 19222 at commit 
[`e4a7016`](https://github.com/apache/spark/commit/e4a701681290b824241bc18f4d897e435d108693).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20981
  
**[Test build #88930 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88930/testReport)**
 for PR 20981 at commit 
[`35539be`](https://github.com/apache/spark/commit/35539be14bc79647409bfcdc5dac1fb99d00f5ec).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1996/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1997/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...

2018-04-05 Thread barrenlake
Github user barrenlake commented on the issue:

https://github.com/apache/spark/pull/20975
  
@HyukjinKwon  Hello, the unit test describes the scene where this problem 
occurred. You can run the unit test in the original system. Spark2.3 introduced 
`wholetext` ( default `false`): If true, read a file as a single row and not 
split by "\n". So, read the textual content in wholetext mode and  write the 
content to a new file directly, there is no need to add "\n".


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1998/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88933 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88933/testReport)**
 for PR 20890 at commit 
[`6ef1ea3`](https://github.com/apache/spark/commit/6ef1ea35e0f20cce037b3edd273d42c410cb68de).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88931 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88931/testReport)**
 for PR 20756 at commit 
[`573db59`](https://github.com/apache/spark/commit/573db595ee84e894b7d7093884d7fdef2c040150).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88931/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20913: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20913
  
**[Test build #88934 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88934/testReport)**
 for PR 20913 at commit 
[`984faf5`](https://github.com/apache/spark/commit/984faf50ad055a770d190c8116fa427fec1cb983).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179385907
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +45,164 @@
*/
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;
 
-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;
 
   /**
* Optional page number; used when this MemoryBlock represents a page 
allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by 
the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber 
method so that
+   * this can be modified by the TaskMemoryManager, which lives in a 
different package.
*/
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;
 
-  public MemoryBlock(@Nullable Object obj, long offset, long length) {
-super(obj, offset);
+  protected MemoryBlock(@Nullable Object obj, long offset, long length) {
+if (offset < 0 || length < 0) {
+  throw new IllegalArgumentException(
+"Length " + length + " and offset " + offset + "must be 
non-negative");
+}
+this.obj = obj;
+this.offset = offset;
 this.length = length;
   }
 
+  protected MemoryBlock() {
+this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+return obj;
+  }
+
+  public final long getBaseOffset() {
+return offset;
+  }
+
+  public void resetObjAndOffset() {
+this.obj = null;
+this.offset = 0;
+  }
+
   /**
* Returns the size of the memory block.
*/
-  public long size() {
+  public final long size() {
 return length;
   }
 
-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length 
* 8L);
+  public final void setPageNumber(int pageNum) {
+pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+return pageNumber;
   }
 
   /**
* Fills the memory block with the specified byte value.
*/
-  public void fill(byte value) {
+  public final void fill(byte value) {
 Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long 
offset, long length) {
+MemoryBlock mb = null;
+if (obj instanceof byte[]) {
+  byte[] array = (byte[])obj;
+  mb = new ByteArrayMemoryBlock(array, offset, length);
+} else if (obj instanceof long[]) {
+  long[] array = (long[])obj;
+  mb = new OnHeapMemoryBlock(array, offset, length);
+} else if (obj == null) {
+  // we assume that to pass null pointer means off-heap
+  mb = new OffHeapMemoryBlock(offset, length);
+} else {
+  throw new UnsupportedOperationException(
+"Instantiate MemoryBlock for type " + obj.getClass() + " is not 
supported now");
+}
+return mb;
+  }
+
+  /**
+   * Just instantiate the same type of MemoryBlock with the new absolute 
offset and size. The data
--- End diff --

This seems a dangerous API to me, we should not allow users to "expand" a 
memory block, do we have a use case for it? I'd like to make the `offset` 
parameter an increment of the current offset, i.e. `this.subBlock(0, 
this.size)` returns the same block.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-05 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20979
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20979
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1999/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20979
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20979
  
**[Test build #88935 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88935/testReport)**
 for PR 20979 at commit 
[`352777b`](https://github.com/apache/spark/commit/352777b162752397efac0556a87c924935000e4a).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-05 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179391228
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +45,164 @@
*/
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;
 
-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;
 
   /**
* Optional page number; used when this MemoryBlock represents a page 
allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by 
the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber 
method so that
+   * this can be modified by the TaskMemoryManager, which lives in a 
different package.
*/
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;
 
-  public MemoryBlock(@Nullable Object obj, long offset, long length) {
-super(obj, offset);
+  protected MemoryBlock(@Nullable Object obj, long offset, long length) {
+if (offset < 0 || length < 0) {
+  throw new IllegalArgumentException(
+"Length " + length + " and offset " + offset + "must be 
non-negative");
+}
+this.obj = obj;
+this.offset = offset;
 this.length = length;
   }
 
+  protected MemoryBlock() {
+this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+return obj;
+  }
+
+  public final long getBaseOffset() {
+return offset;
+  }
+
+  public void resetObjAndOffset() {
+this.obj = null;
+this.offset = 0;
+  }
+
   /**
* Returns the size of the memory block.
*/
-  public long size() {
+  public final long size() {
 return length;
   }
 
-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length 
* 8L);
+  public final void setPageNumber(int pageNum) {
+pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+return pageNumber;
   }
 
   /**
* Fills the memory block with the specified byte value.
*/
-  public void fill(byte value) {
+  public final void fill(byte value) {
 Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long 
offset, long length) {
+MemoryBlock mb = null;
+if (obj instanceof byte[]) {
+  byte[] array = (byte[])obj;
+  mb = new ByteArrayMemoryBlock(array, offset, length);
+} else if (obj instanceof long[]) {
+  long[] array = (long[])obj;
+  mb = new OnHeapMemoryBlock(array, offset, length);
+} else if (obj == null) {
+  // we assume that to pass null pointer means off-heap
+  mb = new OffHeapMemoryBlock(offset, length);
+} else {
+  throw new UnsupportedOperationException(
+"Instantiate MemoryBlock for type " + obj.getClass() + " is not 
supported now");
+}
+return mb;
+  }
+
+  /**
+   * Just instantiate the same type of MemoryBlock with the new absolute 
offset and size. The data
--- End diff --

First of all, this API does not allow users to *expand* a memory block due 
to `checkSubBlockRange()`. This check does not accept `offset < 0` or `offset + 
size > length`.

I have to update this assert as `this.offset + offset + size > length`. 
Also, I will add *subset* in the comment, and add some tests in 
`MemoryBlockSuite`

```
  public MemoryBlock subBlock(long offset, long size) {
checkSubBlockRange(offset, size);
return new OnHeapMemoryBlock(array, this.offset + offset, size);
  }
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19082: [SPARK-21870][SQL] Split aggregation code into sm...

2018-04-05 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/19082#discussion_r179391181
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -257,6 +259,78 @@ case class HashAggregateExec(
  """.stripMargin
   }
 
+  // Extracts all the input variable references for a given `aggExpr`. 
This result will be used
+  // to split aggregation into small functions.
+  private def getInputVariableReferences(
+  context: CodegenContext,
+  aggregateExpression: Expression,
+  subExprs: Map[Expression, SubExprEliminationState]): Set[(String, 
String)] = {
+// `argSet` collects all the pairs of variable names and their types, 
the first in the pair is
+// a type name and the second is a variable name.
+val argSet = mutable.Set[(String, String)]()
+val stack = mutable.Stack[Expression](aggregateExpression)
+while (stack.nonEmpty) {
+  stack.pop() match {
+case e if subExprs.contains(e) =>
+  val exprCode = subExprs(e)
+  if (CodegenContext.isJavaIdentifier(exprCode.value)) {
--- End diff --

hey, good news! Thanks for letting me know ;)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20043: [SPARK-22856][SQL] Add wrappers for codegen output and n...

2018-04-05 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20043
  
This pr will be merge soon? I'd like to use this in my pr: 
https://github.com/apache/spark/pull/20965


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20756
  
retest this please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88936 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88936/testReport)**
 for PR 20756 at commit 
[`573db59`](https://github.com/apache/spark/commit/573db595ee84e894b7d7093884d7fdef2c040150).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2000/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-05 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179397907
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +45,164 @@
*/
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;
 
-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;
 
   /**
* Optional page number; used when this MemoryBlock represents a page 
allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by 
the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber 
method so that
+   * this can be modified by the TaskMemoryManager, which lives in a 
different package.
*/
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;
 
-  public MemoryBlock(@Nullable Object obj, long offset, long length) {
-super(obj, offset);
+  protected MemoryBlock(@Nullable Object obj, long offset, long length) {
+if (offset < 0 || length < 0) {
+  throw new IllegalArgumentException(
+"Length " + length + " and offset " + offset + "must be 
non-negative");
+}
+this.obj = obj;
+this.offset = offset;
 this.length = length;
   }
 
+  protected MemoryBlock() {
+this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+return obj;
+  }
+
+  public final long getBaseOffset() {
+return offset;
+  }
+
+  public void resetObjAndOffset() {
+this.obj = null;
+this.offset = 0;
+  }
+
   /**
* Returns the size of the memory block.
*/
-  public long size() {
+  public final long size() {
 return length;
   }
 
-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length 
* 8L);
+  public final void setPageNumber(int pageNum) {
+pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+return pageNumber;
   }
 
   /**
* Fills the memory block with the specified byte value.
*/
-  public void fill(byte value) {
+  public final void fill(byte value) {
 Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long 
offset, long length) {
+MemoryBlock mb = null;
+if (obj instanceof byte[]) {
+  byte[] array = (byte[])obj;
+  mb = new ByteArrayMemoryBlock(array, offset, length);
+} else if (obj instanceof long[]) {
+  long[] array = (long[])obj;
+  mb = new OnHeapMemoryBlock(array, offset, length);
+} else if (obj == null) {
+  // we assume that to pass null pointer means off-heap
+  mb = new OffHeapMemoryBlock(offset, length);
+} else {
+  throw new UnsupportedOperationException(
+"Instantiate MemoryBlock for type " + obj.getClass() + " is not 
supported now");
+}
+return mb;
+  }
+
+  /**
+   * Just instantiate the same type of MemoryBlock with the new absolute 
offset and size. The data
--- End diff --

As the implementation of `subBlock` returns something like `new 
OffHeapMemoryBlock(this.offset + offset, size)`, doesn't it be relative offset 
to original offset?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-05 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179402310
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +45,164 @@
*/
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;
 
-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;
 
   /**
* Optional page number; used when this MemoryBlock represents a page 
allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by 
the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber 
method so that
+   * this can be modified by the TaskMemoryManager, which lives in a 
different package.
*/
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;
 
-  public MemoryBlock(@Nullable Object obj, long offset, long length) {
-super(obj, offset);
+  protected MemoryBlock(@Nullable Object obj, long offset, long length) {
+if (offset < 0 || length < 0) {
+  throw new IllegalArgumentException(
+"Length " + length + " and offset " + offset + "must be 
non-negative");
+}
+this.obj = obj;
+this.offset = offset;
 this.length = length;
   }
 
+  protected MemoryBlock() {
+this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+return obj;
+  }
+
+  public final long getBaseOffset() {
+return offset;
+  }
+
+  public void resetObjAndOffset() {
+this.obj = null;
+this.offset = 0;
+  }
+
   /**
* Returns the size of the memory block.
*/
-  public long size() {
+  public final long size() {
 return length;
   }
 
-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length 
* 8L);
+  public final void setPageNumber(int pageNum) {
+pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+return pageNumber;
   }
 
   /**
* Fills the memory block with the specified byte value.
*/
-  public void fill(byte value) {
+  public final void fill(byte value) {
 Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long 
offset, long length) {
+MemoryBlock mb = null;
+if (obj instanceof byte[]) {
+  byte[] array = (byte[])obj;
+  mb = new ByteArrayMemoryBlock(array, offset, length);
+} else if (obj instanceof long[]) {
+  long[] array = (long[])obj;
+  mb = new OnHeapMemoryBlock(array, offset, length);
+} else if (obj == null) {
+  // we assume that to pass null pointer means off-heap
+  mb = new OffHeapMemoryBlock(offset, length);
+} else {
+  throw new UnsupportedOperationException(
+"Instantiate MemoryBlock for type " + obj.getClass() + " is not 
supported now");
+}
+return mb;
+  }
+
+  /**
+   * Just instantiate the same type of MemoryBlock with the new absolute 
offset and size. The data
--- End diff --

I may misread your comment. I thought that you wanted to say "absolute".


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #88937 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88937/testReport)**
 for PR 19222 at commit 
[`3d03f92`](https://github.com/apache/spark/commit/3d03f92d4cb0158fa608d8a3dc35962d3e0f0861).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2001/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20981: [SPARK-23873][SQL] Use accessors in interpreted L...

2018-04-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/20981#discussion_r179418102
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -550,11 +550,33 @@ case class LambdaVariable(
 dataType: DataType,
 nullable: Boolean = true) extends LeafExpression with NonSQLExpression 
{
 
+  private lazy val accessor: InternalRow => Any = dataType match {
--- End diff --

We could make this a bit more generic and also use this for 
`BoundReference`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20982: [SPARK-23859][ML] Initial PR for Instrumentation ...

2018-04-05 Thread WeichenXu123
GitHub user WeichenXu123 opened a pull request:

https://github.com/apache/spark/pull/20982

[SPARK-23859][ML] Initial PR for Instrumentation improvements: UUID and 
logging levels

## What changes were proposed in this pull request?

Initial PR for Instrumentation improvements: UUID and logging levels.
This PR takes over #20837 

Closes #20837

## How was this patch tested?

Manual.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WeichenXu123/spark better-instrumentation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20982.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20982


commit 42c3eca78a58ddca059b150681a0deaf462981f8
Author: Bago Amirbekian 
Date:   2018-03-15T18:14:48Z

Log data about logistic regression.

commit bef1084283f66fcd3ec5be3d14f30b206b7c45a5
Author: Bago Amirbekian 
Date:   2018-03-29T21:48:48Z

Have instrumentation use uuid & standard logger tags.

commit 8a3ce3e4958420dc5e92e7c1d31d5eab1735bb9e
Author: Bago Amirbekian 
Date:   2018-03-30T23:34:15Z

Add logWarning to instrumentation class.

commit f05bcdf32f5709a1a4b1b4fc433ac88dedc5d1ae
Author: WeichenXu 
Date:   2018-04-05T10:29:29Z

add logError and fix typo




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20982: [SPARK-23859][ML] Initial PR for Instrumentation improve...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20982
  
**[Test build #88938 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88938/testReport)**
 for PR 20982 at commit 
[`f05bcdf`](https://github.com/apache/spark/commit/f05bcdf32f5709a1a4b1b4fc433ac88dedc5d1ae).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20982: [SPARK-23859][ML] Initial PR for Instrumentation improve...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20982
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2002/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20982: [SPARK-23859][ML] Initial PR for Instrumentation improve...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20982
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20981: [SPARK-23873][SQL] Use accessors in interpreted L...

2018-04-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/20981#discussion_r179420439
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala
 ---
@@ -277,4 +278,31 @@ class ObjectExpressionsSuite extends SparkFunSuite 
with ExpressionEvalHelper {
   checkEvaluation(decodeUsingSerializer, null, 
InternalRow.fromSeq(Seq(null)))
 }
   }
+
+  test("LambdaVariable should support interpreted execution") {
+val elementTypes = Seq(BooleanType, ByteType, ShortType, IntegerType, 
LongType, FloatType,
+  DoubleType, DecimalType.USER_DEFAULT, StringType, BinaryType, 
DateType, TimestampType,
+  CalendarIntervalType)
+val arrayTypes = elementTypes.flatMap { elementType =>
+  Seq(ArrayType(elementType, containsNull = false), 
ArrayType(elementType, containsNull = true))
+}
+val mapTypes = elementTypes.flatMap { elementType =>
+  Seq(MapType(elementType, elementType, false), MapType(elementType, 
elementType, true))
+}
+val structTypes = elementTypes.flatMap { elementType =>
+  Seq(StructType(StructField("col1", elementType, false) :: Nil),
+StructType(StructField("col1", elementType, true) :: Nil))
+}
+
+val acceptedTypes = elementTypes ++ arrayTypes ++ mapTypes ++ 
structTypes
+val random = new Random()
--- End diff --

Please set a seed or use `withClue()` otherwise it will be very annoying to 
debug these when something goes south.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/20981
  
A few minor comments. Looks good otherwise.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #88932 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88932/testReport)**
 for PR 19222 at commit 
[`e4a7016`](https://github.com/apache/spark/commit/e4a701681290b824241bc18f4d897e435d108693).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88932/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20904
  
**[Test build #88939 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88939/testReport)**
 for PR 20904 at commit 
[`02a1409`](https://github.com/apache/spark/commit/02a1409bc4b9fc9aa42faa6e816235184a55a8dc).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20945: [SPARK-23790][Mesos] fix metastore connection issue

2018-04-05 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/20945
  
@vanzin Sure we will try to comply the thing is pure mesos does not have an 
api for secrets only DC/OS has one and we cannot bring that api in the Spark 
project, otherwise I would just implement option 1) as with yarn and everyone 
would be happy and secure ;)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20904
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20904
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2003/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20981
  
**[Test build #88930 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88930/testReport)**
 for PR 20981 at commit 
[`35539be`](https://github.com/apache/spark/commit/35539be14bc79647409bfcdc5dac1fb99d00f5ec).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88930/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...

2018-04-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20975
  
I am sorry, the unit test uses many private classes, and I think I am not 
getting the point of this change. Do you mind if I ask some code snippets? 

I think `wholetext` is not related in write path and wonder why it matters 
here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20979: [SPARK-23588][SQL] CatalystToExternalMap should s...

2018-04-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/20979#discussion_r179430079
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -988,8 +988,20 @@ case class CatalystToExternalMap private(
   override def children: Seq[Expression] =
 keyLambdaFunction :: valueLambdaFunction :: inputData :: Nil
 
-  override def eval(input: InternalRow): Any =
-throw new UnsupportedOperationException("Only code-generated 
evaluation is supported")
+  private lazy val toScalaValue: Any => Any = {
+assert(inputData.dataType.isInstanceOf[MapType])
+val mapType = inputData.dataType.asInstanceOf[MapType]
+CatalystTypeConverters.createToScalaConverter(mapType)
--- End diff --

I am not sure if I buy that argument. This expression can produce a map 
type other than `scala.collection.immutable.Map`, so we should either implement 
this, or we should remove the ability to produce different map types.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88936 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88936/testReport)**
 for PR 20756 at commit 
[`573db59`](https://github.com/apache/spark/commit/573db595ee84e894b7d7093884d7fdef2c040150).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20982: [SPARK-23859][ML] Initial PR for Instrumentation improve...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20982
  
**[Test build #88938 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88938/testReport)**
 for PR 20982 at commit 
[`f05bcdf`](https://github.com/apache/spark/commit/f05bcdf32f5709a1a4b1b4fc433ac88dedc5d1ae).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20982: [SPARK-23859][ML] Initial PR for Instrumentation improve...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20982
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20913: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20913
  
**[Test build #88934 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88934/testReport)**
 for PR 20913 at commit 
[`984faf5`](https://github.com/apache/spark/commit/984faf50ad055a770d190c8116fa427fec1cb983).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20982: [SPARK-23859][ML] Initial PR for Instrumentation improve...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20982
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88938/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88936/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20756: [SPARK-23593][SQL] Add interpreted execution for ...

2018-04-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20756


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20913: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20913
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88934/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20913: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20913
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20976: [SPARK-23835][SQL] Add not-null check to Tuples' argumen...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20976
  
**[Test build #88940 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88940/testReport)**
 for PR 20976 at commit 
[`b78ff6c`](https://github.com/apache/spark/commit/b78ff6c480f25a080586402fcea51ce5732fa9ac).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20976: [SPARK-23835][SQL] Add not-null check to Tuples' argumen...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20976
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20976: [SPARK-23835][SQL] Add not-null check to Tuples' argumen...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20976
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2004/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/20756
  
@viirya I thought it passed tests, but your last change is causing errors. 
I am going to revert this for now. Can you reopen? Sorry about this.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20753: [SPARK-23582][SQL] StaticInvoke should support in...

2018-04-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20753


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20904
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20904
  
**[Test build #88939 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88939/testReport)**
 for PR 20904 at commit 
[`02a1409`](https://github.com/apache/spark/commit/02a1409bc4b9fc9aa42faa6e816235184a55a8dc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20904
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88939/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88933 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88933/testReport)**
 for PR 20890 at commit 
[`6ef1ea3`](https://github.com/apache/spark/commit/6ef1ea35e0f20cce037b3edd273d42c410cb68de).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88933/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20979
  
**[Test build #88935 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88935/testReport)**
 for PR 20979 at commit 
[`352777b`](https://github.com/apache/spark/commit/352777b162752397efac0556a87c924935000e4a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20979
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88935/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20979
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20965: [SPARK-21870][SQL] Split aggregation code into sm...

2018-04-05 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20965#discussion_r179448598
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -254,6 +256,80 @@ case class HashAggregateExec(
  """.stripMargin
   }
 
+  // Extracts all the input variable references for a given `aggExpr`. 
This result will be used
+  // to split aggregation into small functions.
+  private def getInputVariableReferences(
--- End diff --

hopefully this method will become easier after @viirya's PR gets merged. 
Anyway, this seems to me quite a generic method: shall we move it to 
`CodeGenerator`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20965: [SPARK-21870][SQL] Split aggregation code into sm...

2018-04-05 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20965#discussion_r179448442
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -266,28 +342,53 @@ case class HashAggregateExec(
   
e.aggregateFunction.asInstanceOf[DeclarativeAggregate].mergeExpressions
   }
 }
-ctx.currentVars = bufVars ++ input
+
+// We need to copy the aggregation buffer to local variables first 
because each aggregate
+// function directly updates the buffer when it finishes.
+val localBufVars = bufVars.zip(updateExpr).map { case (ev, e) =>
--- End diff --

why is this needed now and it was not before?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20965: [SPARK-21870][SQL] Split aggregation code into sm...

2018-04-05 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20965#discussion_r179447746
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -254,6 +256,80 @@ case class HashAggregateExec(
  """.stripMargin
   }
 
+  // Extracts all the input variable references for a given `aggExpr`. 
This result will be used
+  // to split aggregation into small functions.
+  private def getInputVariableReferences(
+  context: CodegenContext,
+  aggregateExpression: Expression,
+  subExprs: Map[Expression, SubExprEliminationState]): Seq[(String, 
String, Expression)] = {
+// `argMap` collects all the pairs of variable names and their types, 
the first in the pair
+// is a type name and the second is a variable name.
+val argMap = mutable.Map[(String, String), Expression]()
+val stack = mutable.Stack[Expression](aggregateExpression)
+while (stack.nonEmpty) {
+  stack.pop() match {
+case e if subExprs.contains(e) =>
+  val exprCode = subExprs(e)
+  if (CodeGenerator.isJavaIdentifier(exprCode.value)) {
+argMap += (CodeGenerator.javaType(e.dataType), exprCode.value) 
-> e
+  }
+  if (CodeGenerator.isJavaIdentifier(exprCode.isNull)) {
+argMap += ("boolean", exprCode.isNull) -> e
+  }
+  // Since the children possibly has common expressions, we push 
them here
+  stack.pushAll(e.children)
+case ref: BoundReference
+if context.currentVars != null && 
context.currentVars(ref.ordinal) != null =>
+  val value = context.currentVars(ref.ordinal).value
+  val isNull = context.currentVars(ref.ordinal).isNull
+  if (CodeGenerator.isJavaIdentifier(value)) {
+argMap += (CodeGenerator.javaType(ref.dataType), value) -> ref
+  }
+  if (CodeGenerator.isJavaIdentifier(isNull)) {
+argMap += ("boolean", isNull) -> ref
+  }
+case ref: BoundReference =>
+  argMap += ("InternalRow", context.INPUT_ROW) -> ref
+case e =>
+  stack.pushAll(e.children)
+  }
+}
+
+argMap.map { case ((tpe, name), e) => (tpe, name, e) }.toSeq
+  }
+
+  // Splits aggregate code into small functions because JVMs does not 
compile too long functions
+  private def splitAggregateExpressions(
--- End diff --

is there any reason why we are not using `CodeGenerator.splitExpressions` 
instead of this?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20756
  
@hvanhovell Looks like we have tests using null as value for setters. So 
the added null check fails.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19675: [SPARK-14540][BUILD] Support Scala 2.12 closures and Jav...

2018-04-05 Thread ekrich
Github user ekrich commented on the issue:

https://github.com/apache/spark/pull/19675
  
Interestingly, when searching for spores I ran across this post - 
https://www.reddit.com/r/scala/comments/702lct/what_ever_happened_to_sip21_aka_scala_spores/


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...

2018-04-05 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/20933
  
retest this please.




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-04-05 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/20756
  
I think the semantic should be that we do not invoke the setter if the 
input is null.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   3   4   5   >