[GitHub] spark issue #17346: [SPARK-19965][SS] DataFrame batch reader may fail to inf...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17346
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17346: [SPARK-19965][SS] DataFrame batch reader may fail to inf...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17346
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76408/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17346: [SPARK-19965][SS] DataFrame batch reader may fail to inf...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17346
  
**[Test build #76408 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76408/testReport)**
 for PR 17346 at commit 
[`49ee54d`](https://github.com/apache/spark/commit/49ee54d7a644b916e5c1c2c58f4cd1e011c7abc6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17838


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566][SQL] ColumnVector should support `appendFl...

2017-05-02 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/17836
  
cc @michal-databricks 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/17838
  
LGTM - merging to master/2.2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17770
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17770
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76407/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17770
  
**[Test build #76407 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76407/testReport)**
 for PR 17770 at commit 
[`a076d83`](https://github.com/apache/spark/commit/a076d83cfc9e87f8234eda639957d663d87eaac4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class AnalysisBarrier(child: LogicalPlan) extends LeafNode `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add a Bucketizer that can bin mul...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17819
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add a Bucketizer that can bin mul...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17819
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76406/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add a Bucketizer that can bin mul...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17819
  
**[Test build #76406 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76406/testReport)**
 for PR 17819 at commit 
[`6ff9c79`](https://github.com/apache/spark/commit/6ff9c7998688107a835875ea41e6fe9576a1558c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17832: [SPARK-20557][SQL] Support for db column type TIM...

2017-05-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/17832#discussion_r114472869
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
 ---
@@ -223,6 +223,9 @@ object JdbcUtils extends Logging {
   case java.sql.Types.STRUCT=> StringType
   case java.sql.Types.TIME  => TimestampType
   case java.sql.Types.TIMESTAMP => TimestampType
+  case java.sql.Types.TIMESTAMP_WITH_TIMEZONE
+=> TimestampType
+  case -101 => TimestampType
--- End diff --

Hi, @JannikArndt 
Could you add a comment describing about `-101` here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17838
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76405/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17838
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17838
  
**[Test build #76405 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76405/testReport)**
 for PR 17838 at commit 
[`7c86b0e`](https://github.com/apache/spark/commit/7c86b0e997e87bce77cdf6064975ff5cab245c08).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17736: [SPARK-20399][SQL] Can't use same regex pattern between ...

2017-05-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17736
  
>  This also seems unreasonable to me because so many backslashes are 
confusing and it seems to me that no other systems have similar behavior

Like I said before, it's because java string literal plays a role here, try 
to use `"""string"""` and it can be much better. If we wanna compare with other 
systems, we should compare the SQL shell.

Migration is a real problem, but it's also a problem for string literals. 
We can add a config to fallback to old SQL parser behavior.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17100
  
Build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17100
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76401/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17100
  
**[Test build #76401 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76401/testReport)**
 for PR 17100 at commit 
[`4ac8143`](https://github.com/apache/spark/commit/4ac8143cf63f5b4777a66f236824671b0bb05933).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread lw-lin
Github user lw-lin commented on a diff in the pull request:

https://github.com/apache/spark/pull/17346#discussion_r114468906
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
 ---
@@ -36,20 +37,27 @@ import org.apache.spark.util.SerializableConfiguration
  * A [[FileIndex]] that generates the list of files to process by 
recursively listing all the
  * files present in `paths`.
  *
- * @param rootPaths the list of root table paths to scan
+ * @param rootPathsSpecified the list of root table paths to scan (some of 
which might be
+ *   filtered out later)
  * @param parameters as set of options to control discovery
  * @param partitionSchema an optional partition schema that will be use to 
provide types for the
  *discovered partitions
  */
 class InMemoryFileIndex(
 sparkSession: SparkSession,
-override val rootPaths: Seq[Path],
+rootPathsSpecified: Seq[Path],
 parameters: Map[String, String],
 partitionSchema: Option[StructType],
 fileStatusCache: FileStatusCache = NoopCache)
   extends PartitioningAwareFileIndex(
 sparkSession, parameters, partitionSchema, fileStatusCache) {
 
+  // Filter out streaming metadata dirs or files such as 
"/.../_spark_metadata" (the metadata dir)
+  // or "/.../_spark_metadata/0" (a file in the metadata dir). 
`rootPathsSpecified` might contain
+  // such streaming metadata dir or files, e.g. when after globbing 
"basePath/*" where "basePath"
+  // is the output of a streaming query.
+  override val rootPaths = 
rootPathsSpecified.filterNot(FileStreamSink.ancestorIsMetadataDirectory)
--- End diff --

Yea your are quite correct! They will be filted by 
`InMemoryFileIndex.shouldFilterOut`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread lw-lin
Github user lw-lin commented on a diff in the pull request:

https://github.com/apache/spark/pull/17346#discussion_r114468833
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSinkSuite.scala
 ---
@@ -145,6 +147,41 @@ class FileStreamSinkSuite extends StreamTest {
 }
   }
 
+  test("partitioned writing and batch reading with 'basePath'") {
+val inputData = MemoryStream[Int]
+val ds = inputData.toDS()
+
+val outputDir = Utils.createTempDir(namePrefix = 
"stream.output").getCanonicalPath
--- End diff --

done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread lw-lin
Github user lw-lin commented on a diff in the pull request:

https://github.com/apache/spark/pull/17346#discussion_r114468801
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSink.scala
 ---
@@ -53,6 +53,29 @@ object FileStreamSink extends Logging {
   case _ => false
 }
   }
+
+  /**
+   * Returns true if the path is the metadata dir or its ancestor is the 
metadata dir.
+   * E.g.:
+   *  - ancestorIsMetadataDirectory(/.../_spark_metadata) => true
+   *  - ancestorIsMetadataDirectory(/.../_spark_metadata/0) => true
+   *  - ancestorIsMetadataDirectory(/a/b/c) => false
+   */
+  def ancestorIsMetadataDirectory(path: Path): Boolean = {
+require(path.isAbsolute, s"$path is required to be absolute")
--- End diff --

switched to `makeQualified`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread lw-lin
Github user lw-lin commented on a diff in the pull request:

https://github.com/apache/spark/pull/17346#discussion_r114468821
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSink.scala
 ---
@@ -53,6 +53,29 @@ object FileStreamSink extends Logging {
   case _ => false
 }
   }
+
+  /**
+   * Returns true if the path is the metadata dir or its ancestor is the 
metadata dir.
+   * E.g.:
+   *  - ancestorIsMetadataDirectory(/.../_spark_metadata) => true
+   *  - ancestorIsMetadataDirectory(/.../_spark_metadata/0) => true
+   *  - ancestorIsMetadataDirectory(/a/b/c) => false
+   */
+  def ancestorIsMetadataDirectory(path: Path): Boolean = {
+require(path.isAbsolute, s"$path is required to be absolute")
+var currentPath = path
+var finished = false
+while (!finished) {
--- End diff --

fixed. good point!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17346: [SPARK-19965][SS] DataFrame batch reader may fail to inf...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17346
  
**[Test build #76408 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76408/testReport)**
 for PR 17346 at commit 
[`49ee54d`](https://github.com/apache/spark/commit/49ee54d7a644b916e5c1c2c58f4cd1e011c7abc6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17251
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17251
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76403/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17251
  
**[Test build #76403 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76403/testReport)**
 for PR 17251 at commit 
[`2150ce5`](https://github.com/apache/spark/commit/2150ce552a7a02d656329761e04a7fcb38e5e648).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566][SQL] ColumnVector should support `appendFl...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17836
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566][SQL] ColumnVector should support `appendFl...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17836
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76404/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566][SQL] ColumnVector should support `appendFl...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17836
  
**[Test build #76404 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76404/testReport)**
 for PR 17836 at commit 
[`d979d0f`](https://github.com/apache/spark/commit/d979d0f482758d2d763fc88f456388ba9caf9274).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17736: [SPARK-20399][SQL] Can't use same regex pattern between ...

2017-05-02 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/17736
  
For the regex, currently users need to write something like 
`df.filter("value rlike '^x20[x20-x23]+$'")`. This seems 
unreasonable to me and it is this patch tries to fix.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add a Bucketizer that can bin mul...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17819
  
**[Test build #76406 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76406/testReport)**
 for PR 17819 at commit 
[`6ff9c79`](https://github.com/apache/spark/commit/6ff9c7998688107a835875ea41e6fe9576a1558c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17770
  
**[Test build #76407 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76407/testReport)**
 for PR 17770 at commit 
[`a076d83`](https://github.com/apache/spark/commit/a076d83cfc9e87f8234eda639957d663d87eaac4).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17736: [SPARK-20399][SQL] Can't use same regex pattern between ...

2017-05-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17736
  
yea, much clearer now, and the string literal in Spark 2.0 looks more 
reasonable.

For the regex, I think it's unfair to compare `df.filter("value rlike 
'^\\x20[\\x20-\\x23]+$'")` with 
`df.filter($"value".rlike("^\\x20[\\x20-\\x23]+$"))`, because java string 
literal also plays a role here.

Think about a SQL shell, users can write `SELECT ... WHERE value RLIKE 
'^\\x20[\\x20-\\x23]+$'`, which is consistent with the java version, so I think 
the current SQL parser is corrected.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566][SQL] ColumnVector should support `appendFl...

2017-05-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/17836
  
Thank you for review, @kiszk 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566][SQL] ColumnVector should support `appendFl...

2017-05-02 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/17836
  
LGTM
cc: @sameeragarwal @cloud-fan 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17698: [SPARK-20403][SQL][Documentation]Modify the instructions...

2017-05-02 Thread 10110346
Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/17698
  
@rxin, would you help me review it again?thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17838
  
**[Test build #76405 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76405/testReport)**
 for PR 17838 at commit 
[`7c86b0e`](https://github.com/apache/spark/commit/7c86b0e997e87bce77cdf6064975ff5cab245c08).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17736: [SPARK-20399][SQL] Can't use same regex pattern between ...

2017-05-02 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/17736
  
@cloud-fan I've updated the example. Please check if it is better for you. 
Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread brkyvz
Github user brkyvz commented on the issue:

https://github.com/apache/spark/pull/17838
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17838
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76399/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17838
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17838
  
**[Test build #76399 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76399/testReport)**
 for PR 17838 at commit 
[`7c86b0e`](https://github.com/apache/spark/commit/7c86b0e997e87bce77cdf6064975ff5cab245c08).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17735: [SPARK-20441][SPARK-20432][SS] Within the same streaming...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17735
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17735: [SPARK-20441][SPARK-20432][SS] Within the same streaming...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17735
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76398/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17735: [SPARK-20441][SPARK-20432][SS] Within the same streaming...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17735
  
**[Test build #76398 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76398/testReport)**
 for PR 17735 at commit 
[`63ed28a`](https://github.com/apache/spark/commit/63ed28ac1f9062b0f7d88f91a8eada601df6f6e9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17794: [SPARK-20518][CORE]Supplement the new blockidsuite unit ...

2017-05-02 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/17794
  
@jerryshao 
I  have been updated it.
please review it again.
thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17833: [SPARK-20558][CORE] clear InheritableThreadLocal ...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17833


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17833: [SPARK-20558][CORE] clear InheritableThreadLocal variabl...

2017-05-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17833
  
merging to master/2.2/2.1/2.0


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17833: [SPARK-20558][CORE] clear InheritableThreadLocal variabl...

2017-05-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17833
  
> I think it only cleans localProperties in the current thread. 
localProperties overrides childValue and always clones a new Properties for 
child threads.

Yea, that's true. If some child threads are already there and cloned the 
local properties, we can't clean them. But we can avoid future child threads to 
inherit this local properties, which can reduce the memory footprint a lot if 
users create new `SparkContext` and stop it, and repeat this many times.

Anyway, I'll merge this PR and see if it can fix the flaky test. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17100
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76402/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17100
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17100
  
**[Test build #76402 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76402/testReport)**
 for PR 17100 at commit 
[`766a033`](https://github.com/apache/spark/commit/766a033f19602dbd7da6eff96947236c8c0fd2a2).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17556: [SPARK-16957][MLlib] Use midpoints for split valu...

2017-05-02 Thread sethah
Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/17556#discussion_r114457816
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1037,7 +1042,8 @@ private[spark] object RandomForest extends Logging {
   // makes the gap between currentCount and targetCount smaller,
   // previous value is a split threshold.
   if (previousGap < currentGap) {
-splitsBuilder += valueCounts(index - 1)._1
+// perhaps weighted mean will be used later, see SPARK-16957 
and Github PR 17556.
--- End diff --

Comments like these tend to just get left around and sit there forever. 
Unless we file a _new_ JIRA that intends to decide on future behavior, I would 
like to remove this comment altogether. Otherwise, no one will follow up on 
this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17835: [SPARK-20557] [SQL] Improve the error message for...

2017-05-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/17835#discussion_r114460827
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
 ---
@@ -230,7 +230,9 @@ object JdbcUtils extends Logging {
   // scalastyle:on
 }
 
-if (answer == null) throw new SQLException("Unsupported type " + 
sqlType)
+if (answer == null) {
+  throw new SQLException("Unsupported type " + 
JDBCType.valueOf(sqlType).getName)
--- End diff --

Hi, @gatorsmile .
Then, it seems that we need to consider `IllegalArgumentException` from 
`JDBCType.valueOf` then.
Can we do that here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...

2017-05-02 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/17723
  

@vanzin:
> @mridulm the main argument for just dealing with Hadoop security is that 
it's been sufficient since the inception of Spark. I have never seen anyone ask 
for integration with any other type of system. 

Support for long running applications (which require token renewal, etc) 
was added much later in spark - in 1.4 IIRC : 
https://issues.apache.org/jira/browse/SPARK-5342

> Would it make you more comfortable if the new API were kept 
private[spark]? It would limit extensibility in the case of Mesos (would be 
restricted to built-in providers), but would free us from this discussion and 
allow progress to happen.

If we are not exposing an api for spark core, while maintaining backward 
compatibility - I am fine with the change. (Please see [1] below too)
Either we move implementations from yarn to core - or a new module which 
yarn and mesos depends on (if that helps).


@mgummelt:
> @mridulm You keep mentioning hadoop-security as if it's a library. It's 
not. UserGroupInformation and Credentials, for example, are are security 
classes in hadoop-commons, which core already depends on. So this coupling 
already exists. Is your concern that we're increasing this coupling?

When I refer to hadoop-security, I do not mean a maven package - but use of 
`org.apache.hadoop.security` for handling authentication/authorization. 
hadoop-common contains a collection of libraries & utilities, bundled together 
for convenience; `security` package implements classes in context of security 
design of hadoop.

[1] If we want to base security in spark on hadoop-security, and think it 
is sufficient for our current & anticipated needs -  we should be explicit 
about the (design) dependency.
We should solicit opinion about it in dev@ and proceed based on feedback 
from from the list (this PR discussion might not be followed by many interested 
parties).
As @vanzin mentioned above, there have not integration requests for other 
systems - so perhaps it is sufficient for our needs and I might be being overly 
paranoid (due to my past experiences with api design).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistices to improve...

2017-05-02 Thread watermen
Github user watermen commented on the issue:

https://github.com/apache/spark/pull/16677
  
When we create a DataSource table like below
```sql
create table t1 using parquet select * from src limit 1000;
```
It will call `CollectLimitExec.doExecute`, it also use SinglePartition, so 
we should cover this case.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17251
  
**[Test build #76403 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76403/testReport)**
 for PR 17251 at commit 
[`2150ce5`](https://github.com/apache/spark/commit/2150ce552a7a02d656329761e04a7fcb38e5e648).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-05-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/17251
  
Retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566][SQL] ColumnVector should support `appendFl...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17836
  
**[Test build #76404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76404/testReport)**
 for PR 17836 at commit 
[`d979d0f`](https://github.com/apache/spark/commit/d979d0f482758d2d763fc88f456388ba9caf9274).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17540
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17540
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76396/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17825: [SPARK-20550][SPARKR] R wrapper for Dataset.alias

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17825
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76400/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17540
  
**[Test build #76396 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76396/testReport)**
 for PR 17540 at commit 
[`69ed59e`](https://github.com/apache/spark/commit/69ed59e2d723fff9756671b390765f9106b5b720).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17825: [SPARK-20550][SPARKR] R wrapper for Dataset.alias

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17825
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17825: [SPARK-20550][SPARKR] R wrapper for Dataset.alias

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17825
  
**[Test build #76400 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76400/testReport)**
 for PR 17825 at commit 
[`32fd836`](https://github.com/apache/spark/commit/32fd8361d693d106658020d89dc6563aade6abb7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17821
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76392/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17821
  
**[Test build #76392 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76392/testReport)**
 for PR 17821 at commit 
[`f4699ad`](https://github.com/apache/spark/commit/f4699add54bb3fbe40d489aeabbf8f192e6ecb1f).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17821
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17825: [SPARK-20550][SPARKR] R wrapper for Dataset.alias

2017-05-02 Thread zero323
Github user zero323 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17825#discussion_r114457011
  
--- Diff: R/pkg/R/column.R ---
@@ -132,17 +132,24 @@ createMethods()
 
 #' alias
 #'
-#' Set a new name for a column
+#' Set a new name for an object. Equivalent to SQL "AS" keyword.
--- End diff --

Moving to `generics.R` sounds good.  "Column or SparkDataFrame" in place of 
"object" as well.

Regarding "AS"... In SQL it can be used with both expressions and tables so 
I deliberately didn't quantify this with `Column`.

I am not sure if we really need to state that it returns a new object. 
Maybe  _Return a new Column or SparkDataFrame with an alias. Equivalent to SQL 
"AS" keyword._? But it doesn't sound great.
 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17100
  
**[Test build #76402 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76402/testReport)**
 for PR 17100 at commit 
[`766a033`](https://github.com/apache/spark/commit/766a033f19602dbd7da6eff96947236c8c0fd2a2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17825: [SPARK-20550][SPARKR] R wrapper for Dataset.alias

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17825
  
**[Test build #76400 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76400/testReport)**
 for PR 17825 at commit 
[`32fd836`](https://github.com/apache/spark/commit/32fd8361d693d106658020d89dc6563aade6abb7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17100
  
**[Test build #76401 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76401/testReport)**
 for PR 17100 at commit 
[`4ac8143`](https://github.com/apache/spark/commit/4ac8143cf63f5b4777a66f236824671b0bb05933).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread brkyvz
Github user brkyvz commented on the issue:

https://github.com/apache/spark/pull/17838
  
LGTM!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17818: [SPARK-20544] R wrapper for input_file_name

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17818
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76397/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17818: [SPARK-20544] R wrapper for input_file_name

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17818
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17838
  
**[Test build #76399 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76399/testReport)**
 for PR 17838 at commit 
[`7c86b0e`](https://github.com/apache/spark/commit/7c86b0e997e87bce77cdf6064975ff5cab245c08).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17818: [SPARK-20544] R wrapper for input_file_name

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17818
  
**[Test build #76397 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76397/testReport)**
 for PR 17818 at commit 
[`72f3fb7`](https://github.com/apache/spark/commit/72f3fb739240b9f27fcab47cbb9d82aff3272f93).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17838: [SPARK-20567] Lazily bind in GenerateExec

2017-05-02 Thread marmbrus
GitHub user marmbrus opened a pull request:

https://github.com/apache/spark/pull/17838

[SPARK-20567] Lazily bind in GenerateExec

It is not valid to eagerly bind with the child's output as this causes 
failures when we attempt to canonicalize the plan (replacing the attribute 
references with dummies).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/marmbrus/spark fixBindExplode

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17838.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17838


commit 7c86b0e997e87bce77cdf6064975ff5cab245c08
Author: Michael Armbrust 
Date:   2017-05-03T00:13:52Z

[SPARK-20567] Lazily bind in GenerateExec




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17735: [SPARK-20441][SPARK-20432][SS] Within the same st...

2017-05-02 Thread lw-lin
Github user lw-lin commented on a diff in the pull request:

https://github.com/apache/spark/pull/17735#discussion_r114453379
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala ---
@@ -120,6 +141,32 @@ class StreamSuite extends StreamTest {
 assertDF(df)
   }
 
+  test("Within the same streaming query, one StreamingRelation should only 
be transformed to one " +
+"StreamingExecutionRelation") {
+val df = 
spark.readStream.format(classOf[FakeDefaultSource].getName).load()
+var query: StreamExecution = null
+try {
+  query =
+df.union(df)
+  .writeStream
+  .format("memory")
+  .queryName("memory")
+  .start()
+  .asInstanceOf[StreamingQueryWrapper]
+  .streamingQuery
+  val executionRelations =
+query
+  .logicalPlan
--- End diff --

ah, i see.
fixed. thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17735: [SPARK-20441][SPARK-20432][SS] Within the same streaming...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17735
  
**[Test build #76398 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76398/testReport)**
 for PR 17735 at commit 
[`63ed28a`](https://github.com/apache/spark/commit/63ed28ac1f9062b0f7d88f91a8eada601df6f6e9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #9699: [SPARK-2344] [MLlib] Add fuzzifier (m) parameter to KMean...

2017-05-02 Thread LuciferWong
Github user LuciferWong commented on the issue:

https://github.com/apache/spark/pull/9699
  
This is what I wrote it myself : https://github.com/LuciferWong/spark 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17835: [SPARK-20557] [SQL] Improve the error message for unsupp...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17835
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17835: [SPARK-20557] [SQL] Improve the error message for unsupp...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17835
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76393/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17835: [SPARK-20557] [SQL] Improve the error message for unsupp...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17835
  
**[Test build #76393 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76393/testReport)**
 for PR 17835 at commit 
[`d199f7b`](https://github.com/apache/spark/commit/d199f7bdfcdda3859efc0d021b42ab65915db009).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17818: [SPARK-20544] R wrapper for input_file_name

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17818
  
**[Test build #76397 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76397/testReport)**
 for PR 17818 at commit 
[`72f3fb7`](https://github.com/apache/spark/commit/72f3fb739240b9f27fcab47cbb9d82aff3272f93).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17818: [SPARK-20544] R wrapper for input_file_name

2017-05-02 Thread zero323
Github user zero323 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17818#discussion_r114450054
  
--- Diff: R/pkg/R/functions.R ---
@@ -3890,3 +3890,23 @@ setMethod("not",
 jc <- callJStatic("org.apache.spark.sql.functions", "not", 
x@jc)
 column(jc)
   })
+
+#' input_file_name
+#'
+#' Creates a string column for the file name of the current Spark task.
--- End diff --

How about the new one?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17818: [SPARK-20544] R wrapper for input_file_name

2017-05-02 Thread zero323
Github user zero323 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17818#discussion_r114450006
  
--- Diff: R/pkg/R/functions.R ---
@@ -3974,3 +3974,23 @@ setMethod("grouping_id",
 jc <- callJStatic("org.apache.spark.sql.functions", 
"grouping_id", jcols)
 column(jc)
   })
+
+#' input_file_name
+#'
+#' Creates a string column for the file name of the current Spark task.
+#'
+#' @rdname input_file_name
+#' @name input_file_name
+#' @aliases input_file_name,missing-method
--- End diff --

Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17831: [SPARK-18777][PYTHON][SQL] Return UDF from udf.register

2017-05-02 Thread zero323
Github user zero323 commented on the issue:

https://github.com/apache/spark/pull/17831
  
@gatorsmile This sounds reasonable but I am not sure if I fully understand 
your concerns.

If anything this brings PySpark closer to the Scala API. At this moment we 
have

```
 registerFunction(self, name: str, f: Callable[[T], U], returnType: 
DataType) -> None: ...
```

and we would move to:

```
 registerFunction(self, name: str, f: Callable[[T], U], returnType: 
DataType) -> Callable[[Column, ...], Column]: ...
```

This, as pointed out by @holdenk, matches `register` API for `Function0` .. 
`Function22`.

If you're planning breaking changes in the Scala API, it may render this PR 
obsolete, but we don't commit here to any particular implementation. The only 
promise here is that registering udf for SQL applications, returns an object, 
which can be used with `DataFrame` API.  I believe this sounds like a 
reasonable requirement for any upcoming API. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17540
  
**[Test build #76396 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76396/testReport)**
 for PR 17540 at commit 
[`69ed59e`](https://github.com/apache/spark/commit/69ed59e2d723fff9756671b390765f9106b5b720).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566] ColumnVector should support `appendFloats`...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17836
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76394/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566] ColumnVector should support `appendFloats`...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17836
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17836: [SPARK-20566] ColumnVector should support `appendFloats`...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17836
  
**[Test build #76394 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76394/testReport)**
 for PR 17836 at commit 
[`a03f927`](https://github.com/apache/spark/commit/a03f92740aa36c85519c170112473a4bae89ae99).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17821#discussion_r114446352
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala 
---
@@ -266,7 +289,8 @@ private[deploy] class Worker(
 if (registerMasterFutures != null) {
   registerMasterFutures.foreach(_.cancel(true))
 }
-val masterAddress = masterRef.address
+val masterAddress =
+  if (preferConfiguredMasterAddress) 
masterAddressToConnect.get else masterRef.address
--- End diff --

Right now `masterRef` and `masterAddressToConnect` are set at the same 
time. It's impossible unless we break something in future. It's better to fail 
rather than hiding the broken change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17540
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17540
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76395/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17540
  
**[Test build #76395 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76395/testReport)**
 for PR 17540 at commit 
[`7131c32`](https://github.com/apache/spark/commit/7131c329e34b7961ea478532faa4f202255da246).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17100: [SPARK-13947][SQL] PySpark DataFrames: The error message...

2017-05-02 Thread rberenguel
Github user rberenguel commented on the issue:

https://github.com/apache/spark/pull/17100
  
@gatorsmile Thanks for the pointers, finally found some time to come back 
to this. I'm not sure if my approach to get the `SQLConf` into `checkAnalysis` 
is the correct one in my current local changes (since it seems to change a 
possible API endpoint). I changed the current implementation in the trait to be 
named instead `def checkAnalysisWithConf(plan: LogicalPlan, conf: SQLConf): 
Unit` and added an abstract method `def checkAnalysis(plan: LogicalPlan): Unit` 
that is then implemented in `Analyzer` (where we have a `conf` we can pass 
around). I haven't fixed all the rest yet, was puzzled enough with the 
correctness of this for now ;) Thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17831: [SPARK-18777][PYTHON][SQL] Return UDF from udf.register

2017-05-02 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/17831
  
I feel like that's an unrelated challenge. I'm happy to see other 
improvements but I'm worried that we will hold up changes for things which 
aren't happening soon - is there a JIRA for these changes?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   3   4   >