[GitHub] [spark] SparkQA commented on issue #25251: [MINOR] Trivial cleanups
SparkQA commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515335905 **[Test build #108191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108191/testReport)** for PR 25251 at commit [`5cbeaf0`](https://github.com/apache/spark/commit/5cbeaf02cafdd627d6f31c8a41396a37ef971a7d). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
SparkQA commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515335903 **[Test build #108196 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108196/testReport)** for PR 25259 at commit [`8158d5e`](https://github.com/apache/spark/commit/8158d5e27fce8e4bc5877ed7bb4f7c3876007c13). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type
SparkQA commented on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type URL: https://github.com/apache/spark/pull/25085#issuecomment-515335910 **[Test build #108194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108194/testReport)** for PR 25085 at commit [`e71c4d2`](https://github.com/apache/spark/commit/e71c4d2878fff642d34abbc71b9dff65354dafe5). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307610800 ## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out ## @@ -328,16 +328,891 @@ struct -- !query 25 -DROP VIEW IF EXISTS t +WITH r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 25 schema struct<> -- !query 25 output - +org.apache.spark.sql.AnalysisException +Table or view not found: r; line 4 pos 24 -- !query 26 -DROP VIEW IF EXISTS t2 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 26 schema -struct<> +struct -- !query 26 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 27 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r +-- !query 27 schema +struct<> +-- !query 27 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 28 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r LIMIT 10 +-- !query 28 schema +struct +-- !query 28 output +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 29 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r LIMIT 10 +-- !query 29 schema +struct +-- !query 29 output +0 0 +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 + + +-- !query 30 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r ORDER BY level LIMIT 10 +-- !query 30 schema +struct<> +-- !query 30 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 31 +WITH RECURSIVE r(c) AS ( + SELECT 'a' + UNION ALL + 
SELECT c || ' b' FROM r WHERE LENGTH(c) < 10 +) +SELECT * FROM r +-- !query 31 schema +struct +-- !query 31 output +a +a b +a b b +a b b b +a b b b b +a b b b b b + + +-- !query 32 +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 10 + UNION ALL + VALUES (0) +) +SELECT * FROM r +-- !query 32 schema +struct +-- !query 32 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 33 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 33 schema +struct +-- !query 33 output +0 A +0 B +1 AC +1 BC +2 ACC +2 BCC +3 ACCC +3 BCCC + + +-- !query 34 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + SELECT level + 1, data || 'B' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 34 schema +struct +-- !query 34 output +0 A +1 AB +1 AC +2 ABB +2 ABC +2 ACB +2 ACC +3 ABBC +3 ABCC +3 ACBC +3 ACCC + + +-- !query 35 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'D' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 35 schema +struct +-- !query 35 output +0 A +0 B +1 AC +1 AD +1 BC +1 BD +2 ACC +2 ACD +2 ADC +2 ADD +2 BCC +2 BCD +2 BDC +2 BDD +3 ACCD +3 ACDD +3 ADCD +3 ADDD +3 BCCD +3 BCDD +3 BDCD +3 BDDD + + +-- !query 36 +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 36 schema +struct<> +-- !query 36 output +org.apache.spark.sql.AnalysisException +Recursive query r should contain UNION or UNION ALL statements only. 
This error can also be caused by ORDER BY or LIMIT keywords used on result of UNION or UNION ALL.; + + +-- !query 37 +WITH RECURSIVE r(level) AS ( + VALUES (0), (0) + UNION + SELECT (level + 1) % 10 FROM r +) +SELECT * FROM r +-- !query 37 schema +struct +-- !query 37 output +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 38 +WITH RECURSIVE r(level) AS ( + VALUES (0) + INTERSECT + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r +-- !query 38 schema +struct<> +-- !query 38 output +org.apache.spark.sql.AnalysisException +Recursive query r should contain UNION or UNION ALL statements only. This error can also be caused by ORDER BY or LIMIT keywords used on result of UNION or UNION ALL.; + + +-- !query 39 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE (SELECT SUM(level) FROM r) < 10 +) +SELECT * FROM r +-- !query 39 schema +struct<> +
[GitHub] [spark] SparkQA commented on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance
SparkQA commented on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance URL: https://github.com/apache/spark/pull/22290#issuecomment-515335911 **[Test build #108198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108198/testReport)** for PR 22290 at commit [`45cfa21`](https://github.com/apache/spark/commit/45cfa2146dbb3ca6f6530c0147246dafd4ada762). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException
AmplabJenkins commented on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException URL: https://github.com/apache/spark/pull/24947#issuecomment-515335972 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException
AmplabJenkins commented on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException URL: https://github.com/apache/spark/pull/24947#issuecomment-515335978 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108197/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance
AmplabJenkins commented on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance URL: https://github.com/apache/spark/pull/22290#issuecomment-515335941 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108198/ Test FAILed.
[GitHub] [spark] s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API
s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API URL: https://github.com/apache/spark/pull/24851#discussion_r307610854 ## File path: graph/api/src/main/scala/org/apache/spark/graph/api/GraphElementFrame.scala ## @@ -0,0 +1,260 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.graph.api + +import scala.collection.JavaConverters._ + +import org.apache.spark.sql.DataFrame + +/** + * A [[PropertyGraph]] is created from GraphElementFrames. + * + * A graph element is either a node or a relationship. + * A GraphElementFrame wraps a DataFrame and describes how it maps to graph elements. + * + * @since 3.0.0 + */ +abstract class GraphElementFrame { + + /** + * Initial DataFrame that can still contain unmapped, arbitrarily ordered columns. + * + * @since 3.0.0 + */ + def df: DataFrame + + /** + * Name of the column that contains the graph element identifier. + * + * @since 3.0.0 + */ + def idColumn: String + + /** + * Name of all columns that contain graph element identifiers. 
+ * + * @since 3.0.0 + */ + def idColumns: Seq[String] = Seq(idColumn) + + /** + * Mapping from graph element property keys to the columns that contain the corresponding property + * values. + * + * @since 3.0.0 + */ + def properties: Map[String, String] + +} + +object NodeFrame { + + /** + * Describes how to map an initial DataFrame to nodes. + * + * All columns apart from the given `idColumn` are mapped to node properties. + * + * @param df DataFrame containing a single node in each row + * @param idColumn column that contains the node identifier + * @param labelSet labels that are assigned to all nodes + * @since 3.0.0 + */ + def create(df: DataFrame, idColumn: String, labelSet: Set[String]): NodeFrame = { + val properties = (df.columns.toSet - idColumn) + .map(columnName => columnName -> columnName) + .toMap + create(df, idColumn, labelSet, properties) + } + + /** + * Describes how to map an initial DataFrame to nodes. + * + * All columns apart from the given `idColumn` are mapped to node properties. + * + * @param df DataFrame containing a single node in each row + * @param idColumn column that contains the node identifier + * @param labelSet labels that are assigned to all nodes + * @param properties mapping from property keys to corresponding columns + * @since 3.0.0 + */ + def create( + df: DataFrame, + idColumn: String, + labelSet: Set[String], + properties: Map[String, String]): NodeFrame = { Review comment: Mh, I think having convenience methods for that is actually very helpful for the user. Imho, the documentation makes the purpose of that Map clear; we could however rename the field to `propertiesToColumns` or `propertyColumns` (I prefer the latter).
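To make the default mapping under discussion concrete, here is a minimal sketch (plain Python, not the Spark Graph API itself; the function name is hypothetical) of the rule the Scaladoc describes: every column apart from the id column becomes a property whose key equals the column name.

```python
# Hypothetical illustration of NodeFrame.create's default property mapping:
# all columns except the id column map to properties, keyed by column name.
def default_property_columns(columns, id_column):
    return {name: name for name in columns if name != id_column}

print(default_property_columns(["id", "name", "age"], "id"))
# → {'name': 'name', 'age': 'age'}
```

Renaming the field to something like `propertyColumns`, as suggested in the review, would make it clearer that the map values are column names.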
[GitHub] [spark] AmplabJenkins commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
AmplabJenkins commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515335948 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108196/ Test FAILed.
[GitHub] [spark] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307606058 ## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out ## @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 27 +-- Number of queries: 63 Review comment: All right, then I believe only the simplest case should be allowed in this PR: one non-recursive term, then `UNION ALL`, then one recursive term within a recursive CTE. (And maybe follow-up PRs can deal with advanced constructs.)
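The "simplest case" proposed above can be sketched with a runnable example. This uses SQLite via Python's sqlite3 module purely as an illustration of the standard recursive-CTE shape (it is not Spark, and Spark's eventual syntax support is exactly what the PR is deciding): a single non-recursive anchor term, `UNION ALL`, and a single recursive term referencing the CTE name.

```python
import sqlite3

# Minimal recursive CTE: one anchor term, UNION ALL, one recursive term.
# (Shown in SQLite for illustration; Spark's recursive CTE support is the
# subject of the PR under review.)
conn = sqlite3.connect(":memory:")
rows = conn.execute(
    """
    WITH RECURSIVE r(level) AS (
        SELECT 0                                  -- non-recursive anchor term
        UNION ALL
        SELECT level + 1 FROM r WHERE level < 10  -- recursive term
    )
    SELECT level FROM r
    """
).fetchall()
print([level for (level,) in rows])  # → [0, 1, 2, ..., 10]
```

Restricting the initial implementation to this shape sidesteps the harder cases exercised in the diff above (multiple anchor or recursive terms, `UNION` with deduplication, subqueries over the recursive relation).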
[GitHub] [spark] SparkQA commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
SparkQA commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515335908 **[Test build #108192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108192/testReport)** for PR 25074 at commit [`65d4100`](https://github.com/apache/spark/commit/65d41002f19acbced86c1cf49f0a443d6450ac74). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait BooleanTest extends UnaryExpression with Predicate with ExpectsInputTypes ` * `case class IsTrue(child: Expression) extends BooleanTest ` * `case class IsNotTrue(child: Expression) extends BooleanTest ` * `case class IsFalse(child: Expression) extends BooleanTest ` * `case class IsNotFalse(child: Expression) extends BooleanTest `
[GitHub] [spark] gatorsmile commented on issue #25217: [SPARK-28463][SQL][test-hadoop3.2] Thriftserver throws BigDecimal incompatible with HiveDecimal
gatorsmile commented on issue #25217: [SPARK-28463][SQL][test-hadoop3.2] Thriftserver throws BigDecimal incompatible with HiveDecimal URL: https://github.com/apache/spark/pull/25217#issuecomment-515334020 cc @gengliangwang @juliuszsompolski
[GitHub] [spark] AmplabJenkins commented on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type
AmplabJenkins commented on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type URL: https://github.com/apache/spark/pull/25085#issuecomment-515336058 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108194/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance
AmplabJenkins commented on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance URL: https://github.com/apache/spark/pull/22290#issuecomment-515335933 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type
AmplabJenkins commented on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type URL: https://github.com/apache/spark/pull/25085#issuecomment-515336051 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
SparkQA commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515335907 **[Test build #108193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108193/testReport)** for PR 25074 at commit [`ebd2dcf`](https://github.com/apache/spark/commit/ebd2dcfd63560758dda8407c0ee1e17a2eb77bdc). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515336116 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108192/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
AmplabJenkins commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515335940 Merged build finished. Test FAILed.
[GitHub] [spark] s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API
s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API URL: https://github.com/apache/spark/pull/24851#discussion_r307612518

## File path: graph/api/src/main/scala/org/apache/spark/graph/api/GraphElementFrame.scala

## @@ -0,0 +1,260 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.graph.api
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.sql.DataFrame
+
+/**
+ * A [[PropertyGraph]] is created from GraphElementFrames.
+ *
+ * A graph element is either a node or a relationship.
+ * A GraphElementFrame wraps a DataFrame and describes how it maps to graph elements.
+ *
+ * @since 3.0.0
+ */
+abstract class GraphElementFrame {
+
+  /**
+   * Initial DataFrame that can still contain unmapped, arbitrarily ordered columns.
+   *
+   * @since 3.0.0
+   */
+  def df: DataFrame
+
+  /**
+   * Name of the column that contains the graph element identifier.
+   *
+   * @since 3.0.0
+   */
+  def idColumn: String
+
+  /**
+   * Names of all columns that contain graph element identifiers.
+   *
+   * @since 3.0.0
+   */
+  def idColumns: Seq[String] = Seq(idColumn)
+
+  /**
+   * Mapping from graph element property keys to the columns that contain the corresponding
+   * property values.
+   *
+   * @since 3.0.0
+   */
+  def properties: Map[String, String]
+
+}
+
+object NodeFrame {
+
+  /**
+   * Describes how to map an initial DataFrame to nodes.
+   *
+   * All columns apart from the given `idColumn` are mapped to node properties.
+   *
+   * @param df       DataFrame containing a single node in each row
+   * @param idColumn column that contains the node identifier
+   * @param labelSet labels that are assigned to all nodes
+   * @since 3.0.0
+   */
+  def create(df: DataFrame, idColumn: String, labelSet: Set[String]): NodeFrame = {
+    val properties = (df.columns.toSet - idColumn)
+      .map(columnName => columnName -> columnName)
+      .toMap
+    create(df, idColumn, labelSet, properties)
+  }
+
+  /**
+   * Describes how to map an initial DataFrame to nodes.
+   *
+   * All columns apart from the given `idColumn` are mapped to node properties.
+   *
+   * @param df         DataFrame containing a single node in each row
+   * @param idColumn   column that contains the node identifier
+   * @param labelSet   labels that are assigned to all nodes
+   * @param properties mapping from property keys to corresponding columns
+   * @since 3.0.0
+   */
+  def create(
+      df: DataFrame,
+      idColumn: String,
+      labelSet: Set[String],
+      properties: Map[String, String]): NodeFrame = {
+    NodeFrame(df, idColumn, labelSet, properties)
+  }
+
+  /**
+   * Describes how to map an initial DataFrame to nodes.
+   *
+   * All columns apart from the given `idColumn` are mapped to node properties.
+   *
+   * @param df       DataFrame containing a single node in each row
+   * @param idColumn column that contains the node identifier
+   * @param labelSet labels that are assigned to all nodes
+   * @since 3.0.0
+   */
+  def create(df: DataFrame, idColumn: String, labelSet: java.util.Set[String]): NodeFrame = {
+    create(df, idColumn, labelSet.asScala.toSet)
+  }
+
+  /**
+   * Describes how to map an initial DataFrame to nodes.
+   *
+   * All columns apart from the given `idColumn` are mapped to node properties.
+   *
+   * @param df         DataFrame containing a single node in each row
+   * @param idColumn   column that contains the node identifier
+   * @param labelSet   labels that are assigned to all nodes
+   * @param properties mapping from property keys to corresponding columns
+   * @since 3.0.0
+   */
+  def create(
+      df: DataFrame,
+      idColumn: String,
+      labelSet: java.util.Set[String],
+      properties: java.util.Map[String, String]): NodeFrame = {
+    val scalaLabelSet = labelSet.asScala.toSet
+    val scalaProperties = properties.asScala.toMap
+    NodeFrame(df, idColumn, scalaLabelSet, scalaProperties)
+  }
+
+}
+
+/**
+ * Describes how to map a DataFrame to nodes.
+ *
+ * Each row in the DataFrame represents a node which has exactly the labels defined by the given
+ * label set.
+ *
+ * @param df
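The default property mapping the Javadoc above describes (every column apart from `idColumn` becomes a property whose key and source column share the same name) can be sketched without a Spark dependency. This is an illustration only; `DefaultPropertySketch` and `defaultProperties` are hypothetical names, not part of the PR:

```scala
// Standalone sketch of NodeFrame.create's default property derivation:
// drop the id column, then map each remaining column name to itself.
object DefaultPropertySketch {
  def defaultProperties(columns: Set[String], idColumn: String): Map[String, String] =
    (columns - idColumn).map(columnName => columnName -> columnName).toMap

  def main(args: Array[String]): Unit = {
    val props = defaultProperties(Set("id", "name", "age"), "id")
    // "name" and "age" become properties; the id column is excluded.
    assert(props == Map("name" -> "name", "age" -> "age"))
  }
}
```

In the real `create(df, idColumn, labelSet)` overload quoted above, the same transformation runs over `df.columns.toSet` before delegating to the four-argument overload.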
[GitHub] [spark] AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515336073 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515336079 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108193/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515336106 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException
SparkQA commented on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException URL: https://github.com/apache/spark/pull/24947#issuecomment-515335913 **[Test build #108197 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108197/testReport)** for PR 24947 at commit [`a0363de`](https://github.com/apache/spark/commit/a0363de8932ad6b886d9deaa043354543da0ed56).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
AmplabJenkins removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515335940 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance
AmplabJenkins removed a comment on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance URL: https://github.com/apache/spark/pull/22290#issuecomment-515335933 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25251: [MINOR] Trivial cleanups
SparkQA removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515293715 **[Test build #108191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108191/testReport)** for PR 25251 at commit [`5cbeaf0`](https://github.com/apache/spark/commit/5cbeaf02cafdd627d6f31c8a41396a37ef971a7d).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515336106 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance
SparkQA removed a comment on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance URL: https://github.com/apache/spark/pull/22290#issuecomment-515332546 **[Test build #108198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108198/testReport)** for PR 22290 at commit [`45cfa21`](https://github.com/apache/spark/commit/45cfa2146dbb3ca6f6530c0147246dafd4ada762).
[GitHub] [spark] SparkQA removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
SparkQA removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515302840 **[Test build #108192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108192/testReport)** for PR 25074 at commit [`65d4100`](https://github.com/apache/spark/commit/65d41002f19acbced86c1cf49f0a443d6450ac74).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515336073 Merged build finished. Test FAILed.
[GitHub] [spark] s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API
s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API URL: https://github.com/apache/spark/pull/24851#discussion_r307612864

## File path: graph/api/src/main/scala/org/apache/spark/graph/api/GraphElementFrame.scala

## @@ -0,0 +1,260 @@
[GitHub] [spark] AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515336363 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108191/ Test FAILed.
[GitHub] [spark] s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API
s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API URL: https://github.com/apache/spark/pull/24851#discussion_r307612780

## File path: graph/api/src/main/scala/org/apache/spark/graph/api/GraphElementFrame.scala

## @@ -0,0 +1,260 @@
[GitHub] [spark] AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515336358 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type
SparkQA removed a comment on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type URL: https://github.com/apache/spark/pull/25085#issuecomment-515309221 **[Test build #108194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108194/testReport)** for PR 25085 at commit [`e71c4d2`](https://github.com/apache/spark/commit/e71c4d2878fff642d34abbc71b9dff65354dafe5).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type
AmplabJenkins removed a comment on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type URL: https://github.com/apache/spark/pull/25085#issuecomment-515336051 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515336079 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108193/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
SparkQA removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515305438 **[Test build #108193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108193/testReport)** for PR 25074 at commit [`ebd2dcf`](https://github.com/apache/spark/commit/ebd2dcfd63560758dda8407c0ee1e17a2eb77bdc).
[GitHub] [spark] SparkQA removed a comment on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException
SparkQA removed a comment on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException URL: https://github.com/apache/spark/pull/24947#issuecomment-515332547 **[Test build #108197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108197/testReport)** for PR 24947 at commit [`a0363de`](https://github.com/apache/spark/commit/a0363de8932ad6b886d9deaa043354543da0ed56).
[GitHub] [spark] SparkQA removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
SparkQA removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515328592 **[Test build #108196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108196/testReport)** for PR 25259 at commit [`8158d5e`](https://github.com/apache/spark/commit/8158d5e27fce8e4bc5877ed7bb4f7c3876007c13).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException
AmplabJenkins removed a comment on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException URL: https://github.com/apache/spark/pull/24947#issuecomment-515335972 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515336358 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance
AmplabJenkins removed a comment on issue #22290: [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance URL: https://github.com/apache/spark/pull/22290#issuecomment-515335941 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108198/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
AmplabJenkins removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515335948 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108196/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException
AmplabJenkins removed a comment on issue #24947: [SPARK-28143][SQL] Expressions without proper constructors should throw AnalysisException URL: https://github.com/apache/spark/pull/24947#issuecomment-515335978 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108197/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type
AmplabJenkins removed a comment on issue #25085: [SPARK-28313][SQL] Spark sql null type incompatible with hive void type URL: https://github.com/apache/spark/pull/25085#issuecomment-515336058 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108194/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515336116 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108192/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515336363 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108191/ Test FAILed.
[GitHub] [spark] gengliangwang commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
gengliangwang commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515337307 retest this please.
[GitHub] [spark] gengliangwang commented on issue #25251: [MINOR] Trivial cleanups
gengliangwang commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515337208 retest this please.
[GitHub] [spark] s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API
s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API URL: https://github.com/apache/spark/pull/24851#discussion_r307614116 ## File path: graph/api/src/main/scala/org/apache/spark/graph/api/CypherSession.scala ## @@ -0,0 +1,186 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.graph.api + +import scala.collection.JavaConverters._ + +import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession} + +object CypherSession { + val ID_COLUMN = "$ID" + val SOURCE_ID_COLUMN = "$SOURCE_ID" + val TARGET_ID_COLUMN = "$TARGET_ID" + val LABEL_COLUMN_PREFIX = ":" +} + +/** + * The entry point for using property graphs in Spark. + * + * Provides factory methods for creating [[PropertyGraph]] instances. + * + * Wraps a [[org.apache.spark.sql.SparkSession]]. + * + * @since 3.0.0 + */ +trait CypherSession { + + def sparkSession: SparkSession + + /** + * Executes a Cypher query on the given input graph. + * + * @param graph [[PropertyGraph]] on which the query is executed + * @param query Cypher query to execute + * @since 3.0.0 + */ + def cypher(graph: PropertyGraph, query: String): CypherResult + + /** + * Executes a Cypher query on the given input graph. 
+ * + * @param graph [[PropertyGraph]] on which the query is executed + * @param query Cypher query to execute + * @param parameters parameters used by the Cypher query + * @since 3.0.0 + */ + def cypher(graph: PropertyGraph, query: String, parameters: Map[String, Any]): CypherResult + + /** + * Executes a Cypher query on the given input graph. + * + * @param graph [[PropertyGraph]] on which the query is executed + * @param query Cypher query to execute + * @param parameters parameters used by the Cypher query + * @since 3.0.0 + */ + def cypher(graph: PropertyGraph, + query: String, + parameters: java.util.Map[String, Object]): CypherResult = { +cypher(graph, query, parameters.asScala.toMap) + } + + /** + * Creates a [[PropertyGraph]] from a sequence of [[NodeFrame]]s and [[RelationshipFrame]]s. + * At least one [[NodeFrame]] has to be provided. + * + * For each label set and relationship type there can be at most one [[NodeFrame]] and at most one + * [[RelationshipFrame]], respectively. + * + * @param nodes NodeFrames that define the nodes in the graph + * @param relationships RelationshipFrames that define the relationships in the graph + * @since 3.0.0 + */ + def createGraph(nodes: Seq[NodeFrame], relationships: Seq[RelationshipFrame]): PropertyGraph + + /** + * Creates a [[PropertyGraph]] from a sequence of [[NodeFrame]]s and [[RelationshipFrame]]s. + * At least one [[NodeFrame]] has to be provided. + * + * For each label set and relationship type there can be at most one [[NodeFrame]] and at most one + * [[RelationshipFrame]], respectively. + * + * @param nodes NodeFrames that define the nodes in the graph + * @param relationships RelationshipFrames that define the relationships in the graph + * @since 3.0.0 + */ + def createGraph( + nodes: java.util.List[NodeFrame], + relationships: java.util.List[RelationshipFrame]): PropertyGraph = { +createGraph(nodes.asScala, relationships.asScala) + } + + /** + * Creates a [[PropertyGraph]] from nodes and relationships. 
+ * + * The given DataFrames need to adhere to the following column naming conventions: + * + * {{{ + * Id column: `$ID` (nodes and relationships) + * SourceId column: `$SOURCE_ID` (relationships) + * TargetId column: `$TARGET_ID` (relationships) + * + * Label columns: `:{LABEL_NAME}` (nodes) + * RelType columns: `:{REL_TYPE}` (relationships) + * + * Property columns: `{Property_Key}` (nodes and relationships) + * }}} + * + * @see [[CypherSession]] + * @param nodes node DataFrame + * @param relationships relationship DataFrame + * @since 3.0.0 + */ + def createGraph(nodes: DataFrame, relationships: DataFrame): PropertyGraph = { Review comment: This is a big change. `spa
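The column naming conventions quoted in the scaladoc above can be sketched with a pair of DataFrames. This is an illustrative example only: the data, the local session setup, and the `cypherSession` value are hypothetical; only the column names (`$ID`, `$SOURCE_ID`, `$TARGET_ID`, label/rel-type columns) come from the diff.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical local session for illustration.
val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Nodes: id column `$ID`, boolean label column `:Person`, property column `name`.
val nodes = Seq(
  (0L, true, "Alice"),
  (1L, true, "Bob")
).toDF("$ID", ":Person", "name")

// Relationships: `$ID`, `$SOURCE_ID`, `$TARGET_ID`, rel-type column `:KNOWS`.
val relationships = Seq(
  (0L, 0L, 1L, true)
).toDF("$ID", "$SOURCE_ID", "$TARGET_ID", ":KNOWS")

// With a CypherSession in scope (hypothetical value), the wide-table factory
// method from the diff would then be invoked as:
// val graph = cypherSession.createGraph(nodes, relationships)
```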
[GitHub] [spark] cloud-fan commented on a change in pull request #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor
cloud-fan commented on a change in pull request #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor URL: https://github.com/apache/spark/pull/25249#discussion_r307614296 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PullupCorrelatedPredicatesSuite.scala ## @@ -27,6 +27,8 @@ import org.apache.spark.sql.catalyst.rules.RuleExecutor class PullupCorrelatedPredicatesSuite extends PlanTest { object Optimize extends RuleExecutor[LogicalPlan] { +override protected val blacklistedOnceBatches = Set("PullupCorrelatedPredicates") Review comment: shall we also change `Once` to `FixedPoint(1)` here instead of adding blacklist?
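The distinction behind the `Once` vs `FixedPoint(1)` suggestion above can be sketched with a toy executor. This is not Spark's `RuleExecutor`; it is a minimal illustration, under the assumption (from SPARK-28237) that a `Once` batch additionally verifies idempotence, while `FixedPoint(1)` merely caps the iteration count without any such check.

```scala
// Toy sketch of the two batch strategies; names are illustrative only.
sealed trait Strategy
case object Once extends Strategy                      // one pass, result must be a fixed point
case class FixedPoint(maxIters: Int) extends Strategy  // up to maxIters passes, no check

def execute[T](plan: T, rule: T => T, strategy: Strategy): T = strategy match {
  case Once =>
    val result = rule(plan)
    // Idempotence enforcement: re-applying the rule must not change the plan.
    require(rule(result) == result, "Once batch is not idempotent")
    result
  case FixedPoint(n) =>
    // Iterate until the plan is stable or the budget is exhausted; a plan
    // that is still changing after n passes is NOT an error here.
    var current = plan
    var changed = true
    var i = 0
    while (changed && i < n) {
      val next = rule(current)
      changed = next != current
      current = next
      i += 1
    }
    current
}
```

Under this reading, a rule that is deliberately non-idempotent can either be moved to a `FixedPoint(1)` batch (the reviewer's suggestion) or be exempted via the blacklist the patch introduces.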
[GitHub] [spark] AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515338231 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13301/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
AmplabJenkins removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515338206 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13300/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
AmplabJenkins commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515338206 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13300/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515338226 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
AmplabJenkins commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515338199 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
AmplabJenkins removed a comment on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515338199 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515338226 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515338231 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13301/ Test PASSed.
[GitHub] [spark] dongjoon-hyun closed pull request #25237: [SPARK-28489][SS] Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets
dongjoon-hyun closed pull request #25237: [SPARK-28489][SS] Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets URL: https://github.com/apache/spark/pull/25237
[GitHub] [spark] SparkQA commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize
SparkQA commented on issue #25259: [SPARK-28518][SQL][TEST] Refer to ChecksumFileSystem#isChecksumFile to fix StatisticsCollectionTestBase#getDataSize URL: https://github.com/apache/spark/pull/25259#issuecomment-515338730 **[Test build #108199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108199/testReport)** for PR 25259 at commit [`8158d5e`](https://github.com/apache/spark/commit/8158d5e27fce8e4bc5877ed7bb4f7c3876007c13).
[GitHub] [spark] SparkQA commented on issue #25251: [MINOR] Trivial cleanups
SparkQA commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-515338754 **[Test build #108200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108200/testReport)** for PR 25251 at commit [`5cbeaf0`](https://github.com/apache/spark/commit/5cbeaf02cafdd627d6f31c8a41396a37ef971a7d).
[GitHub] [spark] beliefer commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
beliefer commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515339020 Retest this please.
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515340394 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13302/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515340414 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515340389 Build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515340419 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13303/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515340389 Build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515340414 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515340394 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13302/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins removed a comment on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515340419 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13303/ Test PASSed.
[GitHub] [spark] s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API
s1ck commented on a change in pull request #24851: [SPARK-27303][GRAPH] Add Spark Graph API URL: https://github.com/apache/spark/pull/24851#discussion_r307617547 ## File path: graph/api/src/main/scala/org/apache/spark/graph/api/PropertyGraph.scala ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.graph.api + +import org.apache.spark.sql.{DataFrame, SaveMode} + +/** + * A Property Graph as defined by the openCypher Property Graph Data Model. + * + * A graph is always tied to and managed by a [[CypherSession]]. + * The lifetime of a graph is bound by the session lifetime. + * + * @see <a href="http://www.opencypher.org/">openCypher project</a> Review comment: I added one initially, but it exceeds the line length. The [Databricks Scala Style Guide](https://github.com/databricks/scala-style-guide#linelength) mentions that more than 100 chars are fine for URLs, but scalastyle still complains. Probably because it's behind an `@see` tag. I re-added it and let's see what CI says.
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515341051 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108201/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515341044 Build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #24829: [WIP][SPARK-27988][SQL][TEST] Port AGGREGATES.sql [Part 3]
SparkQA commented on issue #24829: [WIP][SPARK-27988][SQL][TEST] Port AGGREGATES.sql [Part 3] URL: https://github.com/apache/spark/pull/24829#issuecomment-515341095 **[Test build #108203 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108203/testReport)** for PR 24829 at commit [`0a425c4`](https://github.com/apache/spark/commit/0a425c41b26225512cb9d0e8cb58986d76513f6c).
[GitHub] [spark] SparkQA commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
SparkQA commented on issue #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-515341067 **[Test build #108202 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108202/testReport)** for PR 25074 at commit [`ebd2dcf`](https://github.com/apache/spark/commit/ebd2dcfd63560758dda8407c0ee1e17a2eb77bdc).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] always create a fresh copy of the SparkSession before each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515341044 Build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515342730 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13304/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515342718 Build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515342730 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13304/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515342718 Build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515341051 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108201/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
SparkQA commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515343305 **[Test build #108204 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108204/testReport)** for PR 25243 at commit [`094e65b`](https://github.com/apache/spark/commit/094e65b2c571461cc0ef652f53f3004001ab6bdd).
[GitHub] [spark] sarutak opened a new pull request #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec
sarutak opened a new pull request #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec URL: https://github.com/apache/spark/pull/25260

Code is not generated for LocalTableScanExec even in situations where it should be. If a LocalTableScanExec plan has a direct parent plan that supports WholeStageCodegen, the LocalTableScanExec plan should also be within a WholeStageCodegen domain. But currently no code is generated for LocalTableScanExec and an InputAdapter is inserted instead.

```
val df1 = spark.createDataset(1 to 10).toDF
val df2 = spark.createDataset(1 to 10).toDF
val df3 = df1.join(df2, df1("value") === df2("value"))
df3.explain(true)
...
== Physical Plan ==
*(1) BroadcastHashJoin [value#1], [value#6], Inner, BuildRight
:- LocalTableScan [value#1]   // LocalTableScanExec is not within a WholeStageCodegen domain
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
   +- LocalTableScan [value#6]
```

```
scala> df3.queryExecution.executedPlan.children.head.children.head.getClass
res4: Class[_ <: org.apache.spark.sql.execution.SparkPlan] = class org.apache.spark.sql.execution.InputAdapter
```

In the current implementation of LocalTableScanExec, codegen is enabled when `parent` is not null, but `parent` is only set in `consume`, which is called after `insertInputAdapter`, so it doesn't work as intended. After applying this change, we get the following plan, which means LocalTableScanExec is within a WholeStageCodegen domain:

```
== Physical Plan ==
*(1) BroadcastHashJoin [value#63], [value#68], Inner, BuildRight
:- *(1) LocalTableScan [value#63]
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
   +- LocalTableScan [value#68]
```

## How was this patch tested?

New test cases are added to WholeStageCodegenSuite.
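The ordering problem the PR describes (a codegen gate that checks `parent`, which is only assigned later in `consume`) can be sketched without Spark. The class and method names below are illustrative stand-ins, not Spark's actual API; this is a minimal simulation of the bug, not the fix.

```scala
// Simplified sketch of the ordering bug: the planner queries the codegen
// gate during insertInputAdapter, before any parent has called consume,
// so the gate is always false at decision time even though it would be
// true later. Names are hypothetical, not Spark's real classes.

class Node(val name: String) {
  var parent: Node = null

  // Flawed gate: reports codegen support only once a parent is wired up.
  def supportCodegen: Boolean = parent != null

  // consume is where the parent link is finally established.
  def consume(child: Node): Unit = child.parent = this
}

object CodegenOrderingDemo {
  // Returns (wrappedInAdapter, gateAfterConsume). The scan node gets
  // wrapped because the gate is queried too early.
  def demo(): (Boolean, Boolean) = {
    val scan = new Node("LocalTableScan")
    val join = new Node("BroadcastHashJoin")
    val wrappedInAdapter = !scan.supportCodegen // queried before consume runs
    join.consume(scan)                          // parent is set too late
    (wrappedInAdapter, scan.supportCodegen)
  }
}
```

Running `CodegenOrderingDemo.demo()` yields `(true, true)`: the node is wrapped in an adapter even though the gate passes after `consume`, which is exactly why the gate cannot depend on `parent` at `insertInputAdapter` time.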
[GitHub] [spark] AmplabJenkins commented on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec
AmplabJenkins commented on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec URL: https://github.com/apache/spark/pull/25260#issuecomment-515345052 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec
AmplabJenkins commented on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec URL: https://github.com/apache/spark/pull/25260#issuecomment-515345057 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13305/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec
SparkQA commented on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec URL: https://github.com/apache/spark/pull/25260#issuecomment-515345720 **[Test build #108205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108205/testReport)** for PR 25260 at commit [`24d51ba`](https://github.com/apache/spark/commit/24d51ba1c30472cffc4e44641ece2c4e76e54139).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec
AmplabJenkins removed a comment on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec URL: https://github.com/apache/spark/pull/25260#issuecomment-515345057 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13305/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec
AmplabJenkins removed a comment on issue #25260: [SPARK-28520][SQL] WholeStageCodegen does not work properly for LocalTableScanExec URL: https://github.com/apache/spark/pull/25260#issuecomment-515345052 Merged build finished. Test PASSed.
[GitHub] [spark] mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307624090

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala ##

```scala
@@ -228,6 +234,156 @@ case class FilterExec(condition: Expression, child: SparkPlan)
   override def outputPartitioning: Partitioning = child.outputPartitioning
 }

+/**
+ * Physical plan node for a recursive table that encapsulates the physical plans of the anchor
+ * terms, the logical plans of the recursive terms, and the maximum number of rows to return.
+ *
+ * Anchor terms are physical plans; they are used to initialize the query in the first run.
+ * Recursive terms are used to extend the result with new rows. They are logical plans and contain
+ * references to the result of the previous iteration or to the so-far accumulated result. These
+ * references are updated with new statistics, compiled to physical plans, and then updated to
+ * reflect the appropriate RDD before execution.
+ *
+ * The execution terminates once the anchor terms or the current iteration of the recursive terms
+ * return no rows, or the number of accumulated rows reaches the limit.
+ *
+ * During the execution of a recursive query the previously computed results are reused multiple
+ * times. To avoid massive recomputation of these pieces of the final result, they are cached.
+ *
+ * @param name the name of the recursive table
+ * @param anchorTerms these children are used for initializing the query
+ * @param recursiveTerms these children are used for extending the set of results with new rows
+ *                       based on the results of the previous iteration (or the anchor in the
+ *                       first iteration)
+ * @param limit the maximum number of rows to return
+ */
+case class RecursiveTableExec(
+    name: String,
+    anchorTerms: Seq[SparkPlan],
+    @transient val recursiveTerms: Seq[LogicalPlan],
+    limit: Option[Long]) extends SparkPlan {
+
+  override def children: Seq[SparkPlan] = anchorTerms
+
+  override def output: Seq[Attribute] = anchorTerms.head.output.map(_.withNullability(true))
+
+  override def simpleString(maxFields: Int): String =
+    s"RecursiveTable $name${limit.map(", " + _).getOrElse("")}"
+
+  override def innerChildren: Seq[QueryPlan[_]] = recursiveTerms ++ super.innerChildren
+
+  override protected def doExecute(): RDD[InternalRow] = {
+    val storageLevel = StorageLevel.fromString(conf.getConf(SQLConf.RECURSION_CACHE_STORAGE_LEVEL))
```

Review comment: I remember that in the past, many people suggested using the conf directly when it is used only once, to avoid the proliferation of these methods; that was the reason for my comment.
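The execution loop the scaladoc describes (anchor rows seed the result, each iteration feeds the previous iteration's new rows to the recursive step, and execution stops when an iteration yields no new rows or the accumulated row count reaches the optional limit) can be sketched in plain Scala. This is a self-contained illustration with no Spark dependency; `RecursiveFixpoint` and its signature are hypothetical, not Spark's API.

```scala
// Minimal fixpoint loop mirroring the termination conditions described in
// the RecursiveTableExec scaladoc: stop on an empty iteration or on reaching
// the row limit. Rows are Ints here for simplicity; Spark would iterate over
// RDDs of InternalRow instead.

object RecursiveFixpoint {
  def run(anchor: Seq[Int], step: Seq[Int] => Seq[Int], limit: Option[Long]): Seq[Int] = {
    var result = anchor
    var previous = anchor
    while (previous.nonEmpty && limit.forall(result.size < _)) {
      // Carry only genuinely new rows into the next iteration; otherwise a
      // cyclic step function would never terminate.
      previous = step(previous).distinct.filterNot(result.contains)
      result = result ++ previous
    }
    limit.fold(result)(n => result.take(n.toInt))
  }
}
```

For example, seeding with `Seq(1)` and a step that adds 1 up to a ceiling of 5 accumulates `Seq(1, 2, 3, 4, 5)`, while an unbounded step with `limit = Some(3)` stops after three rows.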
[GitHub] [spark] mgaido91 commented on a change in pull request #25253: [SPARK-28470][SQL] Cast to decimal throws ArithmeticException on overflow
mgaido91 commented on a change in pull request #25253: [SPARK-28470][SQL] Cast to decimal throws ArithmeticException on overflow URL: https://github.com/apache/spark/pull/25253#discussion_r307625834

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ##

```scala
@@ -498,22 +499,34 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String]
     b => x.numeric.asInstanceOf[Numeric[Any]].toInt(b).toByte
   }

+  private val nullOnOverflow = SQLConf.get.decimalOperationsNullOnOverflow
+
   /**
    * Change the precision / scale in a given decimal to those set in `decimalType` (if any),
    * returning null if it overflows or modifying `value` in-place and returning it if successful.
    *
    * NOTE: this modifies `value` in-place, so don't call it on external data.
    */
   private[this] def changePrecision(value: Decimal, decimalType: DecimalType): Decimal = {
-    if (value.changePrecision(decimalType.precision, decimalType.scale)) value else null
+    if (value.changePrecision(decimalType.precision, decimalType.scale)) {
+      value
+    } else {
```

Review comment: I agree with @gengliangwang, but I am fine with changing it. Please @HyukjinKwon let me know if you think we should change it, and I'll do it. Thanks.
[GitHub] [spark] mgaido91 commented on a change in pull request #25253: [SPARK-28470][SQL] Cast to decimal throws ArithmeticException on overflow
mgaido91 commented on a change in pull request #25253: [SPARK-28470][SQL] Cast to decimal throws ArithmeticException on overflow URL: https://github.com/apache/spark/pull/25253#discussion_r307626380

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ##

```scala
@@ -498,22 +499,34 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String]
     b => x.numeric.asInstanceOf[Numeric[Any]].toInt(b).toByte
   }

+  private val nullOnOverflow = SQLConf.get.decimalOperationsNullOnOverflow
+
   /**
    * Change the precision / scale in a given decimal to those set in `decimalType` (if any),
    * returning null if it overflows or modifying `value` in-place and returning it if successful.
    *
    * NOTE: this modifies `value` in-place, so don't call it on external data.
    */
   private[this] def changePrecision(value: Decimal, decimalType: DecimalType): Decimal = {
-    if (value.changePrecision(decimalType.precision, decimalType.scale)) value else null
+    if (value.changePrecision(decimalType.precision, decimalType.scale)) {
+      value
+    } else {
+      if (nullOnOverflow) {
+        null
+      } else {
+        throw new ArithmeticException(s"${value.toDebugString} cannot be represented as " +
+          s"Decimal(${decimalType.precision}, ${decimalType.scale}).")
```

Review comment: this is consistent with other similar error messages. We should change it in all cases, then. WDYT?
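The behavior the diff introduces (return null on overflow when `nullOnOverflow` is set, otherwise throw `ArithmeticException`) can be sketched with `java.math.BigDecimal` standing in for Spark's `Decimal`. This is a simplified, self-contained illustration: `DecimalOverflow.changePrecision` is a hypothetical stand-in, and the HALF_UP rounding is an assumption, not necessarily what `Decimal.changePrecision` does in every mode.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DecimalOverflow {
  // Rescale `value` to the target scale, then check whether it still fits
  // the target precision. On overflow, either return null or throw,
  // mirroring the nullOnOverflow switch in the diff above.
  def changePrecision(
      value: JBigDecimal,
      precision: Int,
      scale: Int,
      nullOnOverflow: Boolean): JBigDecimal = {
    val rescaled = value.setScale(scale, RoundingMode.HALF_UP)
    if (rescaled.precision <= precision) {
      rescaled
    } else if (nullOnOverflow) {
      null
    } else {
      throw new ArithmeticException(
        s"$value cannot be represented as Decimal($precision, $scale).")
    }
  }
}
```

For instance, `123.45` fits `Decimal(4, 1)` after rounding to `123.5`, while `12345.6` does not: with `nullOnOverflow = true` the call returns null, and with `nullOnOverflow = false` it raises `ArithmeticException`.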
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515350300 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515350306 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13306/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515350306 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13306/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515350300 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
SparkQA commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-515343305 **[Test build #108206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108206/testReport)** for PR 25243 at commit [`ec7b7bd`](https://github.com/apache/spark/commit/ec7b7bd0e58a06ac2800a824c25b051938db9b67).