[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14678

Looks good, but I didn't look super carefully. @gatorsmile do you have time to take a more careful look at this?
[GitHub] spark issue #14680: [SPARK-17101][SQL] Provide format identifier for TextFil...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14680

Can you show the before/after comparison in the PR description?
[GitHub] spark issue #14680: [SPARK-17101][SQL] Provide format identifier for TextFil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14680

**[Test build #63904 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63904/consoleFull)** for PR 14680 at commit [`133e5de`](https://github.com/apache/spark/commit/133e5deff497ce1be92497344bb7e0d4e7d57c21).
[GitHub] spark pull request #14680: [SPARK-17101][SQL] Provide format identifier for ...
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/14680

[SPARK-17101][SQL] Provide format identifier for TextFileFormat

## What changes were proposed in this pull request?

Define the format identifier that is used in the Optimized Logical Plan shown by `explain` for the text file format (following the CSV and JSON formats).

```
scala> spark.read.text("people.csv").cache.explain(extended = true)
== Parsed Logical Plan ==
Relation[value#0] text

== Analyzed Logical Plan ==
value: string
Relation[value#0] text

== Optimized Logical Plan ==
InMemoryRelation [value#0], true, 1, StorageLevel(disk, memory, deserialized, 1 replicas)
   +- *FileScan text [value#0] Batched: false, Format: TEXT, InputPaths: file:/Users/jacek/dev/oss/spark/people.csv, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<value:string>

== Physical Plan ==
InMemoryTableScan [value#0]
   +- InMemoryRelation [value#0], true, 1, StorageLevel(disk, memory, deserialized, 1 replicas)
         +- *FileScan text [value#0] Batched: false, Format: TEXT, InputPaths: file:/Users/jacek/dev/oss/spark/people.csv, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<value:string>
```

## How was this patch tested?

Local build.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark SPARK-17101

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14680.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14680

commit 133e5deff497ce1be92497344bb7e0d4e7d57c21
Author: Jacek Laskowski
Date: 2016-08-17T06:43:34Z

    [SPARK-17101][SQL] Provide format identifier for TextFileFormat
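For context, a minimal sketch of the kind of change being proposed, assuming the `Format:` label in the `FileScan` node is rendered from the `FileFormat`'s `toString` (as the CSV and JSON sources already override it); the exact string is taken from the explain output above, and the elided members are noted in the comment:

```scala
import org.apache.spark.sql.execution.datasources.TextBasedFileFormat
import org.apache.spark.sql.sources.DataSourceRegister

// Hedged sketch of the change to the existing
// org.apache.spark.sql.execution.datasources.text.TextFileFormat class
// (the real class also implements inferSchema/prepareWrite/buildReader,
// elided here).
class TextFileFormat extends TextBasedFileFormat with DataSourceRegister {

  override def shortName(): String = "text"

  // Surfaces as "Format: TEXT" in the FileScan node of the explain
  // output shown above, matching what CSV and JSON already do.
  override def toString: String = "TEXT"
}
```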
[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13704

ping @liancheng
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14676

Merged build finished. Test PASSed.
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14676

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63897/
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14676

**[Test build #63897 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63897/consoleFull)** for PR 14676 at commit [`092605b`](https://github.com/apache/spark/commit/092605be786adae0aa241da43e25d1f1be5de492).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14576: [SPARK-16391][SQL] Support partial aggregation for reduc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14576

**[Test build #63903 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63903/consoleFull)** for PR 14576 at commit [`1b57b74`](https://github.com/apache/spark/commit/1b57b7452557ec34821159ca35c8ff5189d825c0).
[GitHub] spark issue #14679: [SPARK-17102][SQL] bypass UserDefinedGenerator for json ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14679

**[Test build #63902 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63902/consoleFull)** for PR 14679 at commit [`c04084d`](https://github.com/apache/spark/commit/c04084d29c3d049053a5929d6860006549c96573).
[GitHub] spark issue #14576: [SPARK-16391][SQL] Support partial aggregation for reduc...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14576

cc @cloud-fan can you review this? This should target branch-2.0 too.
[GitHub] spark issue #14679: [SPARK-17102][SQL] bypass UserDefinedGenerator for json ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14679

cc @yhuai
[GitHub] spark pull request #14679: [SPARK-17102][SQL] bypass UserDefinedGenerator fo...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/14679

[SPARK-17102][SQL] bypass UserDefinedGenerator for json format check

## What changes were proposed in this pull request?

We use reflection to convert `TreeNode` to a json string, and currently don't support arbitrary objects. `UserDefinedGenerator` takes a function object, so we should skip the json format test for it, or the tests can be flaky, e.g. `DataFrameSuite.simple explode`. That test always fails in branch 1.6 but passes in master, because of the different Scala versions.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark json

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14679.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14679

commit c04084d29c3d049053a5929d6860006549c96573
Author: Wenchen Fan
Date: 2016-08-17T06:32:17Z

    bypass UserDefinedGenerator for json format check
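A hedged sketch of what such a bypass can look like in the test helper, assuming the round-trip check walks the analyzed plan (import paths and the helper's exact name are approximate, not the merged patch):

```scala
import org.apache.spark.sql.catalyst.expressions.UserDefinedGenerator
import org.apache.spark.sql.catalyst.plans.logical.{Generate, LogicalPlan}

// Sketch: skip the TreeNode -> JSON -> TreeNode round-trip assertion when
// the plan embeds a UserDefinedGenerator, whose function object cannot be
// reconstructed reflectively; failures there depend on the Scala version
// rather than on a real regression.
private def checkJsonFormat(plan: LogicalPlan): Unit = {
  val hasUserDefinedGenerator = plan.collectFirst {
    case g: Generate if g.generator.isInstanceOf[UserDefinedGenerator] => g
  }.isDefined

  if (!hasUserDefinedGenerator) {
    val json = plan.toJSON
    // ... parse `json` back into a plan and compare it with `plan` ...
  }
}
```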
[GitHub] spark pull request #14576: [SPARK-16391][SQL] ReduceAggregator and partial a...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14576#discussion_r75069508

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/ReduceAggregator.scala ---
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.expressions
+
+import org.apache.spark.annotation.Experimental
+import org.apache.spark.sql.Encoder
+import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
+
+/**
+ * :: Experimental ::
+ * An aggregator that uses a single associative and commutative reduce function. This reduce
+ * function can be used to go through all input values and reduces them to a single value.
+ * If there is no input, a null value is returned.
+ *
+ * @since 2.1.0
+ */
+@Experimental
+abstract class ReduceAggregator[T] extends Aggregator[T, (Boolean, T), T] {
--- End diff --

I think you are right -- I will keep it private to begin with.
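For readers following along, a minimal sketch of how the `(Boolean, T)` buffer in the signature above can work: the flag records whether any input has been seen, so `merge` can tell an empty partition apart from a real value. This is an illustration under the quoted doc comment's contract (null on empty input), not the final merged implementation:

```scala
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

// Sketch: a reduce-based Aggregator whose buffer is (seenAnyInput, value).
private[sql] class ReduceAggregator[T: Encoder](func: (T, T) => T)
  extends Aggregator[T, (Boolean, T), T] {

  // No input seen yet; the T slot is only a placeholder.
  override def zero: (Boolean, T) = (false, null.asInstanceOf[T])

  override def reduce(b: (Boolean, T), a: T): (Boolean, T) =
    if (b._1) (true, func(b._2, a)) else (true, a)

  override def merge(b1: (Boolean, T), b2: (Boolean, T)): (Boolean, T) =
    if (!b1._1) b2
    else if (!b2._1) b1
    else (true, func(b1._2, b2._2))

  // Per the doc comment above: null when there was no input at all.
  override def finish(reduction: (Boolean, T)): T = reduction._2

  override def bufferEncoder: Encoder[(Boolean, T)] =
    Encoders.tuple(Encoders.scalaBoolean, implicitly[Encoder[T]])

  override def outputEncoder: Encoder[T] = implicitly[Encoder[T]]
}
```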
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14676

Build finished. Test FAILed.
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14676

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63896/
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14676

**[Test build #63896 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63896/consoleFull)** for PR 14676 at commit [`fcc3caf`](https://github.com/apache/spark/commit/fcc3cafe08c9549be51b33b5ac993fbb3fa46d37).

* This patch **fails Spark unit tests**.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14537

cc @cloud-fan @gatorsmile can you also take a look at this?
[GitHub] spark pull request #14537: [SPARK-16948][SQL] Querying empty partitioned orc...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14537#discussion_r75067717

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -287,21 +287,21 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Logging {
         new Path(metastoreRelation.catalogTable.storage.locationUri.get),
         partitionSpec)

-    val inferredSchema = if (fileType.equals("parquet")) {
-      val inferredSchema =
-        defaultSource.inferSchema(sparkSession, options, fileCatalog.allFiles())
-      inferredSchema.map { inferred =>
-        ParquetFileFormat.mergeMetastoreParquetSchema(metastoreSchema, inferred)
+    val inferredSchema =
+      defaultSource.inferSchema(sparkSession, options, fileCatalog.allFiles())
--- End diff --

can we be more specific here? e.g. doing it only if it is parquet or orc.
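One hedged reading of that suggestion, keeping the `fileType`, `defaultSource`, and `fileCatalog` names from the surrounding diff; this is a sketch, not the patch that was eventually merged:

```scala
// Sketch: run footer-based schema inference only for the self-describing
// formats the metastore hack actually targets, instead of for every source.
val inferredSchema = fileType match {
  case "parquet" =>
    defaultSource
      .inferSchema(sparkSession, options, fileCatalog.allFiles())
      .map(ParquetFileFormat.mergeMetastoreParquetSchema(metastoreSchema, _))
  case "orc" =>
    defaultSource.inferSchema(sparkSession, options, fileCatalog.allFiles())
  case _ =>
    None
}
```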
[GitHub] spark pull request #14537: [SPARK-16948][SQL] Querying empty partitioned orc...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14537#discussion_r75067639

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala ---
@@ -56,10 +59,11 @@ private[sql] class OrcFileFormat
       sparkSession: SparkSession,
       options: Map[String, String],
       files: Seq[FileStatus]): Option[StructType] = {
-    OrcFileOperator.readSchema(
-      files.map(_.getPath.toUri.toString),
-      Some(sparkSession.sessionState.newHadoopConf())
-    )
+    val schema = Try(OrcFileOperator.readSchema(
+        files.map(_.getPath.toUri.toString),
+        Some(sparkSession.sessionState.newHadoopConf())))
+      .recover { case _: FileNotFoundException => None }
--- End diff --

why are we ignoring file not found exception here?
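For contrast, a hedged sketch of a narrower alternative that sidesteps the question by not catching anything: return `None` up front when the (empty) partitioned table contributes no files, so a genuinely missing file still fails loudly. Names follow the surrounding diff; this is illustrative only:

```scala
// Sketch: empty partitioned table => no files => nothing to infer a schema
// from, without masking real FileNotFoundException failures via recover.
if (files.isEmpty) {
  None
} else {
  OrcFileOperator.readSchema(
    files.map(_.getPath.toUri.toString),
    Some(sparkSession.sessionState.newHadoopConf()))
}
```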
[GitHub] spark pull request #14624: Fix PySpark DataFrameWriter JDBC method docstring...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14624
[GitHub] spark pull request #14677: [MINOR][DOC] Fix the descriptions for `properties...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14677
[GitHub] spark issue #14677: [MINOR][DOC] Fix the descriptions for `properties` argum...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14677

Thanks - merging in master/2.0.
[GitHub] spark pull request #14657: [SPARK-17068][SQL] Make view-usage visible during...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14657
[GitHub] spark issue #14657: [SPARK-17068][SQL] Make view-usage visible during analys...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14657

(I didn't merge this in 2.0)
[GitHub] spark issue #14657: [SPARK-17068][SQL] Make view-usage visible during analys...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14657

LGTM - merging in master/2.0.
[GitHub] spark pull request #14678: [MINOR][SQL] Add missing functions for some optio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14678#discussion_r75066526

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -322,11 +322,6 @@ object SQLConf {
       .intConf
       .createWithDefault(4000)

-  val PARTITION_DISCOVERY_ENABLED = SQLConfigBuilder("spark.sql.sources.partitionDiscovery.enabled")
--- End diff --

It seems this was removed while refactoring.
[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14678

cc @rxin, could you check whether this makes sense, please?
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75066432

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configuration)
     assert(tableDefinition.identifier.database.isDefined)
     val db = tableDefinition.identifier.database.get
     requireDbExists(db)
+    verifyTableProperties(tableDefinition)
+
+    if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) {
+      client.createTable(tableDefinition, ignoreIfExists)
+    } else {
+      val tableProperties = tableMetadataToProperties(tableDefinition)
+
+      def newSparkSQLSpecificMetastoreTable(): CatalogTable = {
+        tableDefinition.copy(
+          schema = new StructType,
+          partitionColumnNames = Nil,
+          bucketSpec = None,
+          properties = tableDefinition.properties ++ tableProperties)
+      }
+
+      def newHiveCompatibleMetastoreTable(serde: HiveSerDe, path: String): CatalogTable = {
+        tableDefinition.copy(
+          storage = tableDefinition.storage.copy(
+            locationUri = Some(new Path(path).toUri.toString),
+            inputFormat = serde.inputFormat,
+            outputFormat = serde.outputFormat,
+            serde = serde.serde
+          ),
+          properties = tableDefinition.properties ++ tableProperties)
+      }
+
+      val qualifiedTableName = tableDefinition.identifier.quotedString
+      val maybeSerde = HiveSerDe.sourceToSerDe(tableDefinition.provider.get)
+      val maybePath = new CaseInsensitiveMap(tableDefinition.storage.properties).get("path")
+      val skipHiveMetadata = tableDefinition.storage.properties
+        .getOrElse("skipHiveMetadata", "false").toBoolean
+
+      val (hiveCompatibleTable, logMessage) = (maybeSerde, maybePath) match {
+        case _ if skipHiveMetadata =>
+          val message =
+            s"Persisting data source table $qualifiedTableName into Hive metastore in" +
+              "Spark SQL specific format, which is NOT compatible with Hive."
+          (None, message)
+
+        // our bucketing is un-compatible with hive(different hash function)
+        case _ if tableDefinition.bucketSpec.nonEmpty =>
+          val message =
+            s"Persisting bucketed data source table $qualifiedTableName into " +
+              "Hive metastore in Spark SQL specific format, which is NOT compatible with Hive. "
+          (None, message)
+
+        case (Some(serde), Some(path)) =>
+          val message =
+            s"Persisting data source table $qualifiedTableName with a single input path " +
+              s"into Hive metastore in Hive compatible format."
+          (Some(newHiveCompatibleMetastoreTable(serde, path)), message)
+
+        case (Some(_), None) =>
+          val message =
+            s"Data source table $qualifiedTableName is not file based. Persisting it into " +
+              s"Hive metastore in Spark SQL specific format, which is NOT compatible with Hive."
+          (None, message)
+
+        case _ =>
+          val provider = tableDefinition.provider.get
+          val message =
+            s"Couldn't find corresponding Hive SerDe for data source provider $provider. " +
+              s"Persisting data source table $qualifiedTableName into Hive metastore in " +
+              s"Spark SQL specific format, which is NOT compatible with Hive."
+          (None, message)
+      }
+
+      (hiveCompatibleTable, logMessage) match {
+        case (Some(table), message) =>
+          // We first try to save the metadata of the table in a Hive compatible way.
+          // If Hive throws an error, we fall back to save its metadata in the Spark SQL
+          // specific way.
+          try {
+            logInfo(message)
+            saveTableIntoHive(table, ignoreIfExists)
+          } catch {
+            case NonFatal(e) =>
+              val warningMessage =
+                s"Could not persist ${tableDefinition.identifier.quotedString} in a Hive " +
+                  "compatible way. Persisting it into Hive metastore in Spark SQL specific format."
+              logWarning(warningMessage, e)
+              saveTableIntoHive(newSparkSQLSpecificMetastoreTable(), ignoreIfExists)
+          }
+
+        case (None, message) =>
+          logWarning(message)
+          saveTableIntoHive(newSparkSQLSpecificMetastoreTable(), ignoreIfExists)
+      }
+    }
+  }
+
+  private def tableMetadataToProperties
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75066449

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configuration)
+      val qualifiedTableName = tableDefinition.identifier.quotedString
+      val maybeSerde = HiveSerDe.sourceToSerDe(tableDefinition.provider.get)
+      val maybePath = new CaseInsensitiveMap(tableDefinition.storage.properties).get("path")
--- End diff --

yup
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75066364

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -200,22 +348,73 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configuration)
    * Alter a table whose name that matches the one specified in `tableDefinition`,
    * assuming the table exists.
    *
-   * Note: As of now, this only supports altering table properties, serde properties,
-   * and num buckets!
+   * Note: As of now, this only supports altering table properties and serde properties.
    */
   override def alterTable(tableDefinition: CatalogTable): Unit = withClient {
     assert(tableDefinition.identifier.database.isDefined)
     val db = tableDefinition.identifier.database.get
     requireTableExists(db, tableDefinition.identifier.table)
-    client.alterTable(tableDefinition)
+    verifyTableProperties(tableDefinition)
+
+    if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) {
+      client.alterTable(tableDefinition)
+    } else {
+      val oldDef = client.getTable(db, tableDefinition.identifier.table)
+      // Sets the `schema`, `partitionColumnNames` and `bucketSpec` from the old table definition,
+      // to retain the spark specific format if it is.
+      // Also add table meta properties to table properties, to retain the data source table format.
+      val newDef = tableDefinition.copy(
+        schema = oldDef.schema,
+        partitionColumnNames = oldDef.partitionColumnNames,
+        bucketSpec = oldDef.bucketSpec,
+        properties = tableMetadataToProperties(tableDefinition) ++ tableDefinition.properties)
--- End diff --

The comment in this method says we only support altering table properties and storage format. Maybe we should add an assert to make this assumption clear?
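A hedged sketch of what that assertion could look like, reusing the `oldDef`/`tableDefinition` names from the diff; whether schema, partition columns, and bucketing should be compared exactly like this is left open by the thread, so treat it as illustrative:

```scala
// Sketch: make the "alterTable only changes table/serde properties" contract
// explicit by rejecting definitions that try to change anything structural.
assert(
  tableDefinition.schema == oldDef.schema &&
    tableDefinition.partitionColumnNames == oldDef.partitionColumnNames &&
    tableDefinition.bucketSpec == oldDef.bucketSpec,
  s"alterTable on ${tableDefinition.identifier.quotedString} may only change " +
    "table properties and serde properties")
```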
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75066296

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -200,22 +348,73 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configuration)
   override def getTable(db: String, table: String): CatalogTable = withClient {
-    client.getTable(db, table)
+    restoreTableMetadata(client.getTable(db, table))
   }

   override def getTableOption(db: String, table: String): Option[CatalogTable] = withClient {
-    client.getTableOption(db, table)
+    client.getTableOption(db, table).map(restoreTableMetadata)
+  }
+
+  /**
+   * Restores table metadata from the table properties if it's a datasource table. This method is
+   * kind of an opposite version of [[createTable]].
+   */
+  private def restoreTableMetadata(table: CatalogTable): CatalogTable = {
+    if (table.tableType == VIEW) {
+      table
+    } else {
+      getProviderFromTableProperties(table).map { provider =>
--- End diff --

No — for `hive` we won't go through the table-meta-to-properties branch, so if we have a provider in the table properties, it must not be hive.
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75066185

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -81,6 +86,18 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configuration)
     withClient { getTable(db, table) }
   }

+  /**
+   * If the given table properties contains datasource properties, throw an exception.
+   */
+  private def verifyTableProperties(table: CatalogTable): Unit = {
+    val datasourceKeys = table.properties.keys.filter(_.startsWith(DATASOURCE_PREFIX))
+    if (datasourceKeys.nonEmpty) {
+      throw new AnalysisException(s"Cannot persist ${table.qualifiedName} into hive metastore " +
+        s"as table property keys may not start with '$DATASOURCE_PREFIX': " +
+        datasourceKeys.mkString("[", ", ", "]"))
+    }
+  }
--- End diff --

yea, e.g. `CreateTableLike`. But `restoreTableMetadata` generates a table without the data source table properties, so it should be OK here.
[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/8880

Merged build finished. Test PASSed.
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14676

**[Test build #63901 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63901/consoleFull)** for PR 14676 at commit [`4723902`](https://github.com/apache/spark/commit/47239020ac7008d8630a5c810df3db29a77795cf).
[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/8880

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63894/
[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/8880

**[Test build #63894 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63894/consoleFull)** for PR 8880 at commit [`77122bb`](https://github.com/apache/spark/commit/77122bb3662c65ffa5596d740efab41f5dfc3a0f).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #14678: [MINOR][SQL] Add missing functions for some optio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14678#discussion_r75065275

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -322,11 +322,6 @@ object SQLConf {
       .intConf
       .createWithDefault(4000)

-  val PARTITION_DISCOVERY_ENABLED = SQLConfigBuilder("spark.sql.sources.partitionDiscovery.enabled")
--- End diff --

It seems we always enable this, and the option is not referenced anywhere.
[GitHub] spark issue #14670: [SPARK-15285][SQL] Generated SpecificSafeProjection.appl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14670

**[Test build #63900 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63900/consoleFull)** for PR 14670 at commit [`86258eb`](https://github.com/apache/spark/commit/86258eb3ca13284ad5593a105b2ddc6d0cde58e4).
[GitHub] spark pull request #14676: [SPARK-16947][SQL] Support type coercion and fold...
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14676#discussion_r75065140

--- Diff: sql/core/src/test/resources/sql-tests/inputs/inline-table.sql ---
@@ -0,0 +1,39 @@
+
+-- single row, without table and column alias
+select * from values ("one", 1);
+
+-- single row, without column alias
+select * from values ("one", 1) as data;
+
+-- single row
+select * from values ("one", 1) as data(a, b);
--- End diff --

added
[GitHub] spark issue #14660: [SPARK-17071][SQL] Add an option to support for reading ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14660

Merged build finished. Test PASSed.
[GitHub] spark issue #14155: [SPARK-16498][SQL] move hive hack for data source table ...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14155

I have taken a close look at `HiveExternalCatalog`. My overall feeling is that the current version is still not very clear, and people may have a hard time understanding the code. Let me also think about it and see how to improve the code.
[GitHub] spark issue #14660: [SPARK-17071][SQL] Add an option to support for reading ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14660

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63895/
[GitHub] spark issue #14660: [SPARK-17071][SQL] Add an option to support for reading ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14660

**[Test build #63895 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63895/consoleFull)** for PR 14660 at commit [`0130c39`](https://github.com/apache/spark/commit/0130c39450c7bcecfb3a14db1e581c3f3a9f6a20).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14678

**[Test build #63899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63899/consoleFull)** for PR 14678 at commit [`a57dd5e`](https://github.com/apache/spark/commit/a57dd5ebaef470dbc311e31e35b2f457321b7a9f).
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75064724

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -81,6 +86,18 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configuration)
     withClient { getTable(db, table) }
   }

+  /**
+   * If the given table properties contains datasource properties, throw an exception.
+   */
+  private def verifyTableProperties(table: CatalogTable): Unit = {
+    val datasourceKeys = table.properties.keys.filter(_.startsWith(DATASOURCE_PREFIX))
+    if (datasourceKeys.nonEmpty) {
+      throw new AnalysisException(s"Cannot persist ${table.qualifiedName} into hive metastore " +
+        s"as table property keys may not start with '$DATASOURCE_PREFIX': " +
+        datasourceKeys.mkString("[", ", ", "]"))
+    }
+  }
--- End diff --

Just realized one thing. Is it possible that we somehow create `table` based on a `CatalogTable` generated from `restoreTableMetadata`?
[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r75064596

--- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala ---
@@ -1082,57 +1343,62 @@ private class LogisticCostFun(
     fitIntercept: Boolean,
     standardization: Boolean,
     bcFeaturesStd: Broadcast[Array[Double]],
-    regParamL2: Double) extends DiffFunction[BDV[Double]] {
+    regParamL2: Double,
+    multinomial: Boolean) extends DiffFunction[BDV[Double]] {

   val featuresStd = bcFeaturesStd.value

   override def calculate(coefficients: BDV[Double]): (Double, BDV[Double]) = {
-    val numFeatures = featuresStd.length
     val coeffs = Vectors.fromBreeze(coefficients)
     val bcCoeffs = instances.context.broadcast(coeffs)
-    val n = coeffs.size
+    val localFeaturesStd = featuresStd
+    val numFeatures = localFeaturesStd.length
+    val numFeaturesPlusIntercept = if (fitIntercept) numFeatures + 1 else numFeatures

     val logisticAggregator = {
-      val seqOp = (c: LogisticAggregator, instance: Instance) => c.add(instance)
+      val seqOp = (c: LogisticAggregator, instance: Instance) =>
+        c.add(instance)
       val combOp = (c1: LogisticAggregator, c2: LogisticAggregator) => c1.merge(c2)

       instances.treeAggregate(
-        new LogisticAggregator(bcCoeffs, bcFeaturesStd, numFeatures, numClasses, fitIntercept)
+        new LogisticAggregator(bcCoeffs, bcFeaturesStd, numFeatures, numClasses, fitIntercept,
+          multinomial)
       )(seqOp, combOp)
     }

     val totalGradientArray = logisticAggregator.gradient.toArray
-    // regVal is the sum of coefficients squares excluding intercept for L2 regularization.
     val regVal = if (regParamL2 == 0.0) {
       0.0
     } else {
+      val K = if (multinomial) numClasses else numClasses - 1
       var sum = 0.0
-      coeffs.foreachActive { (index, value) =>
-        // If `fitIntercept` is true, the last term which is intercept doesn't
-        // contribute to the regularization.
-        if (index != numFeatures) {
+      (0 until K).foreach { k =>
+        var j = 0
+        while (j < numFeatures) {
           // The following code will compute the loss of the regularization; also
           // the gradient of the regularization, and add back to totalGradientArray.
+          val value = coeffs(k * numFeaturesPlusIntercept + j)
--- End diff --

Why are you not using `foreachActive`? We know that `coeffs` is a dense array today, but if we implement the strong rule, which can tell which columns of `coeffs` will be zeros before the optimization, we may store it as a sparse vector. As a result, `foreachActive` would be a good abstraction.
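A hedged sketch of the `foreachActive`-based alternative being suggested, reusing `coeffs`, `numFeatures`, `numFeaturesPlusIntercept`, `fitIntercept`, `totalGradientArray`, and `regParamL2` from the diff; the `standardization == false` branch is omitted to keep the shape of the idea visible:

```scala
// Sketch: iterate only the active (non-zero) entries, so the same code keeps
// working if coeffs later becomes a sparse vector (e.g. via a strong rule).
var sum = 0.0
coeffs.foreachActive { (index, value) =>
  // Intercepts sit at the end of each class's coefficient block and are
  // excluded from L2 regularization.
  val isIntercept = fitIntercept && (index % numFeaturesPlusIntercept) == numFeatures
  if (!isIntercept) {
    totalGradientArray(index) += regParamL2 * value
    sum += value * value
  }
}
val regVal = 0.5 * regParamL2 * sum
```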
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75064560

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -200,22 +348,73 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configuration)
   override def getTable(db: String, table: String): CatalogTable = withClient {
-    client.getTable(db, table)
+    restoreTableMetadata(client.getTable(db, table))
   }

   override def getTableOption(db: String, table: String): Option[CatalogTable] = withClient {
-    client.getTableOption(db, table)
+    client.getTableOption(db, table).map(restoreTableMetadata)
+  }
+
+  /**
+   * Restores table metadata from the table properties if it's a datasource table. This method is
+   * kind of an opposite version of [[createTable]].
+   */
+  private def restoreTableMetadata(table: CatalogTable): CatalogTable = {
+    if (table.tableType == VIEW) {
+      table
+    } else {
+      getProviderFromTableProperties(table).map { provider =>
--- End diff --

`provider` can be `hive`, right?
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75064362

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -200,22 +348,73 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configuration)
    * Alter a table whose name that matches the one specified in `tableDefinition`,
    * assuming the table exists.
    *
-   * Note: As of now, this only supports altering table properties, serde properties,
-   * and num buckets!
+   * Note: As of now, this only supports altering table properties and serde properties.
    */
   override def alterTable(tableDefinition: CatalogTable): Unit = withClient {
     assert(tableDefinition.identifier.database.isDefined)
     val db = tableDefinition.identifier.database.get
     requireTableExists(db, tableDefinition.identifier.table)
-    client.alterTable(tableDefinition)
+    verifyTableProperties(tableDefinition)
+
+    if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) {
+      client.alterTable(tableDefinition)
+    } else {
+      val oldDef = client.getTable(db, tableDefinition.identifier.table)
+      // Sets the `schema`, `partitionColumnNames` and `bucketSpec` from the old table definition,
+      // to retain the spark specific format if it is.
+      // Also add table meta properties to table properties, to retain the data source table format.
+      val newDef = tableDefinition.copy(
+        schema = oldDef.schema,
+        partitionColumnNames = oldDef.partitionColumnNames,
+        bucketSpec = oldDef.bucketSpec,
+        properties = tableMetadataToProperties(tableDefinition) ++ tableDefinition.properties)
--- End diff --

If we only look at this method, it is not clear whether the new `tableDefinition` changes other fields like `storage`. Also, we are using the existing `bucketSpec`. But is it possible that we have a new `bucketSpec` in `tableDefinition`?
[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r75064380 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -1082,57 +1343,62 @@ private class LogisticCostFun( fitIntercept: Boolean, standardization: Boolean, bcFeaturesStd: Broadcast[Array[Double]], -regParamL2: Double) extends DiffFunction[BDV[Double]] { +regParamL2: Double, +multinomial: Boolean) extends DiffFunction[BDV[Double]] { val featuresStd = bcFeaturesStd.value override def calculate(coefficients: BDV[Double]): (Double, BDV[Double]) = { -val numFeatures = featuresStd.length val coeffs = Vectors.fromBreeze(coefficients) val bcCoeffs = instances.context.broadcast(coeffs) -val n = coeffs.size +val localFeaturesStd = featuresStd +val numFeatures = localFeaturesStd.length +val numFeaturesPlusIntercept = if (fitIntercept) numFeatures + 1 else numFeatures val logisticAggregator = { - val seqOp = (c: LogisticAggregator, instance: Instance) => c.add(instance) + val seqOp = (c: LogisticAggregator, instance: Instance) => +c.add(instance) val combOp = (c1: LogisticAggregator, c2: LogisticAggregator) => c1.merge(c2) instances.treeAggregate( -new LogisticAggregator(bcCoeffs, bcFeaturesStd, numFeatures, numClasses, fitIntercept) +new LogisticAggregator(bcCoeffs, bcFeaturesStd, numFeatures, numClasses, fitIntercept, + multinomial) )(seqOp, combOp) } val totalGradientArray = logisticAggregator.gradient.toArray - // regVal is the sum of coefficients squares excluding intercept for L2 regularization. val regVal = if (regParamL2 == 0.0) { 0.0 } else { + val K = if (multinomial) numClasses else numClasses - 1 --- End diff -- just make the `else` branch `1` for clarity. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
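To spell out the suggestion: in the binary case there is exactly one coefficient vector, so `numClasses - 1` (with `numClasses == 2`) equals `1`, and writing `1` states that directly. A simplified sketch of the L2 term being computed, assuming a flat coefficient array laid out as `K` contiguous blocks of `numFeatures` values plus an optional trailing intercept per block, with intercepts excluded from the penalty (the layout assumption is ours, not a quote of Spark's code):

```
// Simplified sketch of the regVal computation discussed above.
def l2RegValue(
    coefficients: Array[Double],
    numFeatures: Int,
    numClasses: Int,
    fitIntercept: Boolean,
    multinomial: Boolean,
    regParamL2: Double): Double = {
  val K = if (multinomial) numClasses else 1 // binary LoR: one vector
  val stride = if (fitIntercept) numFeatures + 1 else numFeatures
  var sum = 0.0
  for (k <- 0 until K; j <- 0 until numFeatures) { // j stops before the intercept slot
    val w = coefficients(k * stride + j)
    sum += w * w
  }
  0.5 * regParamL2 * sum
}
```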
[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r75064330 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -1082,57 +1343,62 @@ private class LogisticCostFun( fitIntercept: Boolean, standardization: Boolean, bcFeaturesStd: Broadcast[Array[Double]], -regParamL2: Double) extends DiffFunction[BDV[Double]] { +regParamL2: Double, +multinomial: Boolean) extends DiffFunction[BDV[Double]] { val featuresStd = bcFeaturesStd.value override def calculate(coefficients: BDV[Double]): (Double, BDV[Double]) = { -val numFeatures = featuresStd.length val coeffs = Vectors.fromBreeze(coefficients) val bcCoeffs = instances.context.broadcast(coeffs) -val n = coeffs.size +val localFeaturesStd = featuresStd +val numFeatures = localFeaturesStd.length +val numFeaturesPlusIntercept = if (fitIntercept) numFeatures + 1 else numFeatures val logisticAggregator = { - val seqOp = (c: LogisticAggregator, instance: Instance) => c.add(instance) + val seqOp = (c: LogisticAggregator, instance: Instance) => +c.add(instance) val combOp = (c1: LogisticAggregator, c2: LogisticAggregator) => c1.merge(c2) instances.treeAggregate( -new LogisticAggregator(bcCoeffs, bcFeaturesStd, numFeatures, numClasses, fitIntercept) +new LogisticAggregator(bcCoeffs, bcFeaturesStd, numFeatures, numClasses, fitIntercept, + multinomial) )(seqOp, combOp) } val totalGradientArray = logisticAggregator.gradient.toArray - --- End diff -- revert this if it's not needed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r75064275 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -1082,57 +1343,62 @@ private class LogisticCostFun( fitIntercept: Boolean, standardization: Boolean, bcFeaturesStd: Broadcast[Array[Double]], -regParamL2: Double) extends DiffFunction[BDV[Double]] { +regParamL2: Double, +multinomial: Boolean) extends DiffFunction[BDV[Double]] { val featuresStd = bcFeaturesStd.value override def calculate(coefficients: BDV[Double]): (Double, BDV[Double]) = { -val numFeatures = featuresStd.length val coeffs = Vectors.fromBreeze(coefficients) val bcCoeffs = instances.context.broadcast(coeffs) -val n = coeffs.size +val localFeaturesStd = featuresStd --- End diff -- Where is `localFeaturesStd` being used? Why not move `val featuresStd = bcFeaturesStd.value` into the `calculate` method? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
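The `localFeaturesStd` line follows a common Spark idiom, which the question probes: referencing a class field inside an RDD closure captures `this` and serializes the whole enclosing object with each task, while copying the field into a local `val` first ships only that value. A generic sketch of the pattern (class and member names are ours, not Spark's actual code):

```
import org.apache.spark.rdd.RDD

// Generic illustration of the local-copy idiom.
class Scaler(@transient val data: RDD[Double]) extends Serializable {
  val factor: Double = 2.0

  // Captures `this`: the whole Scaler object is serialized with the task.
  def scaledBad: RDD[Double] = data.map(_ * factor)

  // Captures only a Double: the enclosing object stays on the driver.
  def scaledGood: RDD[Double] = {
    val localFactor = factor
    data.map(_ * localFactor)
  }
}
```

Moving `val featuresStd = bcFeaturesStd.value` into `calculate`, as the reviewer suggests, achieves the same effect without keeping a field around at all.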
[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r75064097 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -1082,57 +1343,62 @@ private class LogisticCostFun( fitIntercept: Boolean, standardization: Boolean, bcFeaturesStd: Broadcast[Array[Double]], -regParamL2: Double) extends DiffFunction[BDV[Double]] { +regParamL2: Double, +multinomial: Boolean) extends DiffFunction[BDV[Double]] { val featuresStd = bcFeaturesStd.value override def calculate(coefficients: BDV[Double]): (Double, BDV[Double]) = { -val numFeatures = featuresStd.length val coeffs = Vectors.fromBreeze(coefficients) val bcCoeffs = instances.context.broadcast(coeffs) -val n = coeffs.size +val localFeaturesStd = featuresStd +val numFeatures = localFeaturesStd.length +val numFeaturesPlusIntercept = if (fitIntercept) numFeatures + 1 else numFeatures val logisticAggregator = { - val seqOp = (c: LogisticAggregator, instance: Instance) => c.add(instance) + val seqOp = (c: LogisticAggregator, instance: Instance) => +c.add(instance) --- End diff -- revert this if there is no change. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14678 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75063920 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { + val tableProperties = tableMetadataToProperties(tableDefinition) + + def newSparkSQLSpecificMetastoreTable(): CatalogTable = { +tableDefinition.copy( + schema = new StructType, + partitionColumnNames = Nil, + bucketSpec = None, + properties = tableDefinition.properties ++ tableProperties) + } + + def newHiveCompatibleMetastoreTable(serde: HiveSerDe, path: String): CatalogTable = { +tableDefinition.copy( + storage = tableDefinition.storage.copy( +locationUri = Some(new Path(path).toUri.toString), +inputFormat = serde.inputFormat, +outputFormat = serde.outputFormat, +serde = serde.serde + ), + properties = tableDefinition.properties ++ tableProperties) + } + + val qualifiedTableName = tableDefinition.identifier.quotedString + val maybeSerde = HiveSerDe.sourceToSerDe(tableDefinition.provider.get) + val maybePath = new CaseInsensitiveMap(tableDefinition.storage.properties).get("path") + val skipHiveMetadata = tableDefinition.storage.properties +.getOrElse("skipHiveMetadata", "false").toBoolean + + val (hiveCompatibleTable, logMessage) = (maybeSerde, maybePath) match { +case _ if skipHiveMetadata => + val message = +s"Persisting data source table $qualifiedTableName into Hive metastore in " + + "Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + +// our bucketing is incompatible with Hive (different hash function) +case _ if tableDefinition.bucketSpec.nonEmpty => + val message = +s"Persisting bucketed data source table $qualifiedTableName into " + + "Hive metastore in Spark SQL specific format, which is NOT compatible with Hive. " + (None, message) + +case (Some(serde), Some(path)) => + val message = +s"Persisting data source table $qualifiedTableName with a single input path " + + s"into Hive metastore in Hive compatible format." + (Some(newHiveCompatibleMetastoreTable(serde, path)), message) + +case (Some(_), None) => + val message = +s"Data source table $qualifiedTableName is not file based. Persisting it into " + + s"Hive metastore in Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + +case _ => + val provider = tableDefinition.provider.get + val message = +s"Couldn't find corresponding Hive SerDe for data source provider $provider. " + + s"Persisting data source table $qualifiedTableName into Hive metastore in " + + s"Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + } + + (hiveCompatibleTable, logMessage) match { +case (Some(table), message) => + // We first try to save the metadata of the table in a Hive compatible way. + // If Hive throws an error, we fall back to saving its metadata in the Spark SQL + // specific way. + try { +logInfo(message) +saveTableIntoHive(table, ignoreIfExists) + } catch { +case NonFatal(e) => + val warningMessage = +s"Could not persist ${tableDefinition.identifier.quotedString} in a Hive " + + "compatible way. Persisting it into Hive metastore in Spark SQL specific format." + logWarning(warningMessage, e) + saveTableIntoHive(newSparkSQLSpecificMetastoreTable(), ignoreIfExists) + } + +case (None, message) => + logWarning(message) + saveTableIntoHive(newSparkSQLSpecificMetastoreTable(), ignoreIfExists) + } +} + } + + private def tableMetadataToProperties
[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14678 **[Test build #63898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63898/consoleFull)** for PR 14678 at commit [`c959f3b`](https://github.com/apache/spark/commit/c959f3b4e9ed23a9cee67db64b35fdc4d0e2301d). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14678 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63898/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75063736 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { + val tableProperties = tableMetadataToProperties(tableDefinition) + + def newSparkSQLSpecificMetastoreTable(): CatalogTable = { +tableDefinition.copy( + schema = new StructType, + partitionColumnNames = Nil, + bucketSpec = None, + properties = tableDefinition.properties ++ tableProperties) + } + + def newHiveCompatibleMetastoreTable(serde: HiveSerDe, path: String): CatalogTable = { +tableDefinition.copy( + storage = tableDefinition.storage.copy( +locationUri = Some(new Path(path).toUri.toString), +inputFormat = serde.inputFormat, +outputFormat = serde.outputFormat, +serde = serde.serde + ), + properties = tableDefinition.properties ++ tableProperties) + } + + val qualifiedTableName = tableDefinition.identifier.quotedString + val maybeSerde = HiveSerDe.sourceToSerDe(tableDefinition.provider.get) + val maybePath = new CaseInsensitiveMap(tableDefinition.storage.properties).get("path") + val skipHiveMetadata = tableDefinition.storage.properties +.getOrElse("skipHiveMetadata", "false").toBoolean + + val (hiveCompatibleTable, logMessage) = (maybeSerde, maybePath) match { +case _ if skipHiveMetadata => + val message = +s"Persisting data source table $qualifiedTableName into Hive metastore in " + + "Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + +// our bucketing is incompatible with Hive (different hash function) +case _ if tableDefinition.bucketSpec.nonEmpty => + val message = +s"Persisting bucketed data source table $qualifiedTableName into " + + "Hive metastore in Spark SQL specific format, which is NOT compatible with Hive. " + (None, message) + +case (Some(serde), Some(path)) => + val message = +s"Persisting data source table $qualifiedTableName with a single input path " + + s"into Hive metastore in Hive compatible format." + (Some(newHiveCompatibleMetastoreTable(serde, path)), message) + +case (Some(_), None) => + val message = +s"Data source table $qualifiedTableName is not file based. Persisting it into " + + s"Hive metastore in Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + +case _ => + val provider = tableDefinition.provider.get + val message = +s"Couldn't find corresponding Hive SerDe for data source provider $provider. " + + s"Persisting data source table $qualifiedTableName into Hive metastore in " + + s"Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + } + + (hiveCompatibleTable, logMessage) match { +case (Some(table), message) => + // We first try to save the metadata of the table in a Hive compatible way. + // If Hive throws an error, we fall back to saving its metadata in the Spark SQL + // specific way. + try { +logInfo(message) +saveTableIntoHive(table, ignoreIfExists) + } catch { +case NonFatal(e) => + val warningMessage = +s"Could not persist ${tableDefinition.identifier.quotedString} in a Hive " + + "compatible way. Persisting it into Hive metastore in Spark SQL specific format." + logWarning(warningMessage, e) + saveTableIntoHive(newSparkSQLSpecificMetastoreTable(), ignoreIfExists) + } + +case (None, message) => + logWarning(message) + saveTableIntoHive(newSparkSQLSpecificMetastoreTable(), ignoreIfExists) + } +} + } + + private def tableMetadataToProperties
[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14678 **[Test build #63898 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63898/consoleFull)** for PR 14678 at commit [`c959f3b`](https://github.com/apache/spark/commit/c959f3b4e9ed23a9cee67db64b35fdc4d0e2301d). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75063676 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { + val tableProperties = tableMetadataToProperties(tableDefinition) + + def newSparkSQLSpecificMetastoreTable(): CatalogTable = { +tableDefinition.copy( + schema = new StructType, + partitionColumnNames = Nil, + bucketSpec = None, + properties = tableDefinition.properties ++ tableProperties) + } + + def newHiveCompatibleMetastoreTable(serde: HiveSerDe, path: String): CatalogTable = { +tableDefinition.copy( + storage = tableDefinition.storage.copy( +locationUri = Some(new Path(path).toUri.toString), +inputFormat = serde.inputFormat, +outputFormat = serde.outputFormat, +serde = serde.serde + ), + properties = tableDefinition.properties ++ tableProperties) + } + + val qualifiedTableName = tableDefinition.identifier.quotedString + val maybeSerde = HiveSerDe.sourceToSerDe(tableDefinition.provider.get) + val maybePath = new CaseInsensitiveMap(tableDefinition.storage.properties).get("path") + val skipHiveMetadata = tableDefinition.storage.properties +.getOrElse("skipHiveMetadata", "false").toBoolean + + val (hiveCompatibleTable, logMessage) = (maybeSerde, maybePath) match { +case _ if skipHiveMetadata => + val message = +s"Persisting data source table $qualifiedTableName into Hive metastore in " + + "Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + +// our bucketing is incompatible with Hive (different hash function) +case _ if tableDefinition.bucketSpec.nonEmpty => + val message = +s"Persisting bucketed data source table $qualifiedTableName into " + + "Hive metastore in Spark SQL specific format, which is NOT compatible with Hive. " + (None, message) + +case (Some(serde), Some(path)) => + val message = +s"Persisting data source table $qualifiedTableName with a single input path " + + s"into Hive metastore in Hive compatible format." + (Some(newHiveCompatibleMetastoreTable(serde, path)), message) + +case (Some(_), None) => + val message = +s"Data source table $qualifiedTableName is not file based. Persisting it into " + + s"Hive metastore in Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + +case _ => + val provider = tableDefinition.provider.get + val message = +s"Couldn't find corresponding Hive SerDe for data source provider $provider. " + + s"Persisting data source table $qualifiedTableName into Hive metastore in " + + s"Spark SQL specific format, which is NOT compatible with Hive." + (None, message) + } + + (hiveCompatibleTable, logMessage) match { +case (Some(table), message) => + // We first try to save the metadata of the table in a Hive compatible way. + // If Hive throws an error, we fall back to saving its metadata in the Spark SQL + // specific way. + try { +logInfo(message) +saveTableIntoHive(table, ignoreIfExists) + } catch { +case NonFatal(e) => + val warningMessage = +s"Could not persist ${tableDefinition.identifier.quotedString} in a Hive " + + "compatible way. Persisting it into Hive metastore in Spark SQL specific format." + logWarning(warningMessage, e) + saveTableIntoHive(newSparkSQLSpecificMetastoreTable(), ignoreIfExists) + } + +case (None, message) => + logWarning(message) + saveTableIntoHive(newSparkSQLSpecificMetastoreTable(), ignoreIfExists) + } +} + } + + private def tableMetadataToProperties
[GitHub] spark pull request #14678: [MINOR][SQL] Add missing functions for some optio...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14678 [MINOR][SQL] Add missing functions for some options in SQLConf and use them where applicable ## What changes were proposed in this pull request? I first thought these were intentionally hidden options, but it seems the functions are simply missing. For example, `spark.sql.parquet.mergeSchema` is documented in [sql-programming-guide.md](https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md) but its function is missing, whereas many options such as `spark.sql.join.preferSortMergeJoin` are not documented but have their own functions. So, this PR makes them consistent by adding the missing functions for some options in `SQLConf` and using them where applicable. ## How was this patch tested? Existing tests should cover this. You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark sqlconf-cleanup Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14678.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14678 commit 27fbbd902dbfca34dd5edd5a219dc64abf9691cf Author: hyukjinkwon Date: 2016-08-17T04:50:15Z Add missing functions for some options and use them where applicable commit c959f3b4e9ed23a9cee67db64b35fdc4d0e2301d Author: hyukjinkwon Date: 2016-08-17T04:59:22Z Fix typos --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
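For context on what "missing functions" means here: SQLConf exposes one typed accessor per option so call sites don't re-parse raw strings. A self-contained sketch of the pattern (the key names match the ones mentioned above, but the class and defaults are a toy, not Spark's SQLConf):

```
// Toy stand-in for SQLConf showing the accessor pattern the PR fills in.
class MiniSQLConf(settings: Map[String, String]) {
  private def getBoolean(key: String, default: Boolean): Boolean =
    settings.get(key).map(_.toBoolean).getOrElse(default)

  // One small def per option; defaults here are illustrative.
  def parquetMergeSchema: Boolean =
    getBoolean("spark.sql.parquet.mergeSchema", default = false)

  def preferSortMergeJoin: Boolean =
    getBoolean("spark.sql.join.preferSortMergeJoin", default = true)
}

// Call sites then read `conf.parquetMergeSchema` instead of duplicating the
// key string and the parsing logic everywhere the option is consulted.
```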
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75063449 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { + val tableProperties = tableMetadataToProperties(tableDefinition) + + def newSparkSQLSpecificMetastoreTable(): CatalogTable = { +tableDefinition.copy( + schema = new StructType, + partitionColumnNames = Nil, + bucketSpec = None, + properties = tableDefinition.properties ++ tableProperties) + } + + def newHiveCompatibleMetastoreTable(serde: HiveSerDe, path: String): CatalogTable = { +tableDefinition.copy( + storage = tableDefinition.storage.copy( +locationUri = Some(new Path(path).toUri.toString), +inputFormat = serde.inputFormat, +outputFormat = serde.outputFormat, +serde = serde.serde + ), + properties = tableDefinition.properties ++ tableProperties) + } + + val qualifiedTableName = tableDefinition.identifier.quotedString + val maybeSerde = HiveSerDe.sourceToSerDe(tableDefinition.provider.get) + val maybePath = new CaseInsensitiveMap(tableDefinition.storage.properties).get("path") --- End diff -- I think this path will be set by the ddl command (e.g. `CreateDataSourceTableAsSelectCommand`). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75063156 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { + val tableProperties = tableMetadataToProperties(tableDefinition) + + def newSparkSQLSpecificMetastoreTable(): CatalogTable = { +tableDefinition.copy( + schema = new StructType, + partitionColumnNames = Nil, + bucketSpec = None, + properties = tableDefinition.properties ++ tableProperties) + } + + def newHiveCompatibleMetastoreTable(serde: HiveSerDe, path: String): CatalogTable = { +tableDefinition.copy( + storage = tableDefinition.storage.copy( +locationUri = Some(new Path(path).toUri.toString), +inputFormat = serde.inputFormat, +outputFormat = serde.outputFormat, +serde = serde.serde + ), + properties = tableDefinition.properties ++ tableProperties) + } + + val qualifiedTableName = tableDefinition.identifier.quotedString + val maybeSerde = HiveSerDe.sourceToSerDe(tableDefinition.provider.get) + val maybePath = new CaseInsensitiveMap(tableDefinition.storage.properties).get("path") --- End diff -- If the create table command does not specify the location, does this `maybePath` contain the default location? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
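Aside for readers: the `CaseInsensitiveMap` wrapper quoted above makes the `path` lookup insensitive to how the user spelled the option key. A minimal sketch of the idea (Spark has its own implementation; this toy version only demonstrates the behavior):

```
// Toy case-insensitive option map: keys are normalized to lower case once.
class ToyCaseInsensitiveMap(original: Map[String, String]) {
  private val normalized = original.map { case (k, v) => k.toLowerCase -> v }
  def get(key: String): Option[String] = normalized.get(key.toLowerCase)
}

val opts = new ToyCaseInsensitiveMap(Map("PATH" -> "/warehouse/t"))
assert(opts.get("path").contains("/warehouse/t")) // the key's spelling no longer matters
```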
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75063094 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { + val tableProperties = tableMetadataToProperties(tableDefinition) --- End diff -- Let's explain what will be put into this `tableProperties`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75062850 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { + val tableProperties = tableMetadataToProperties(tableDefinition) + + def newSparkSQLSpecificMetastoreTable(): CatalogTable = { +tableDefinition.copy( + schema = new StructType, + partitionColumnNames = Nil, + bucketSpec = None, + properties = tableDefinition.properties ++ tableProperties) + } + + def newHiveCompatibleMetastoreTable(serde: HiveSerDe, path: String): CatalogTable = { --- End diff -- Add a comment here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75062846 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { + val tableProperties = tableMetadataToProperties(tableDefinition) + + def newSparkSQLSpecificMetastoreTable(): CatalogTable = { --- End diff -- Add a comment here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14539: [SPARK-16947][SQL] Improve type coercion for inline tabl...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14539 How do we specify `null` when creating an inline table? ``` sql( """ |create temporary view src as select * from values |(201, null), |(86, "val_86") """.stripMargin) ``` Is that supported? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75062743 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -144,16 +161,147 @@ private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: Configu assert(tableDefinition.identifier.database.isDefined) val db = tableDefinition.identifier.database.get requireDbExists(db) +verifyTableProperties(tableDefinition) + +if (tableDefinition.provider == Some("hive") || tableDefinition.tableType == VIEW) { + client.createTable(tableDefinition, ignoreIfExists) +} else { --- End diff -- Let's add a comment to explain what we are doing here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14676: [SPARK-16947][SQL] Support type coercion and fold...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14676#discussion_r75062469 --- Diff: sql/core/src/test/resources/sql-tests/inputs/inline-table.sql --- @@ -0,0 +1,39 @@ + +-- single row, without table and column alias +select * from values ("one", 1); + +-- single row, without column alias +select * from values ("one", 1) as data; + +-- single row +select * from values ("one", 1) as data(a, b); --- End diff -- Could you add a case for `NULL`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
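A plausible shape for such a case, hedged because whether a bare `null` resolves here is exactly what this PR's type coercion decides; the `cast` form pins the column type explicitly. Wrapped in `spark.sql` for a self-contained sketch (assumes an active SparkSession named `spark`):

```
// null coerced from the sibling row's integer value:
spark.sql("""select * from values ("one", null), ("two", 2) as data(a, b)""").show()

// null with an explicit type:
spark.sql("""select * from values ("one", cast(null as int)) as data(a, b)""").show()
```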
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14676 **[Test build #63897 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63897/consoleFull)** for PR 14676 at commit [`092605b`](https://github.com/apache/spark/commit/092605be786adae0aa241da43e25d1f1be5de492). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14676 **[Test build #63896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63896/consoleFull)** for PR 14676 at commit [`fcc3caf`](https://github.com/apache/spark/commit/fcc3cafe08c9549be51b33b5ac993fbb3fa46d37). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14665: [SPARK-17084][SQL] Rename ParserUtils.assert to v...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14665 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14551: [SPARK-16961][CORE] Fixed off-by-one error that biased r...
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/14551 `model.gaussiansDF.show()` displays the `mean` and `variance` of the Gaussians, which are exposed as a DataFrame. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14665: [SPARK-17084][SQL] Rename ParserUtils.assert to validate
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14665 Merging in master/2.0. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14648: [SPARK-16995][SQL] TreeNodeException when flat mapping R...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14648 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63892/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14648: [SPARK-16995][SQL] TreeNodeException when flat mapping R...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14648 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14648: [SPARK-16995][SQL] TreeNodeException when flat mapping R...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14648 **[Test build #63892 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63892/consoleFull)** for PR 14648 at commit [`0008c3e`](https://github.com/apache/spark/commit/0008c3e11dfb85523f9f4606d5dec714339d5f43). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14660: [SPARK-17071][SQL] Add an option to support for reading ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14660 **[Test build #63895 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63895/consoleFull)** for PR 14660 at commit [`0130c39`](https://github.com/apache/spark/commit/0130c39450c7bcecfb3a14db1e581c3f3a9f6a20). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/8880 **[Test build #63894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63894/consoleFull)** for PR 8880 at commit [`77122bb`](https://github.com/apache/spark/commit/77122bb3662c65ffa5596d740efab41f5dfc3a0f). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14677: [MINOR][DOC] Fix the descriptions for `properties` argum...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14677 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14677: [MINOR][DOC] Fix the descriptions for `properties` argum...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14677 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63893/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14677: [MINOR][DOC] Fix the descriptions for `properties` argum...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14677 **[Test build #63893 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63893/consoleFull)** for PR 14677 at commit [`f501861`](https://github.com/apache/spark/commit/f5018616eee50544c22432c2256c75325b537e82). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14677: [MINOR][DOC] Fix the descriptions for `properties` argum...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14677 **[Test build #63893 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63893/consoleFull)** for PR 14677 at commit [`f501861`](https://github.com/apache/spark/commit/f5018616eee50544c22432c2256c75325b537e82). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14597: [SPARK-17017][MLLIB] add a chiSquare Selector bas...
Github user mpjlu closed the pull request at: https://github.com/apache/spark/pull/14597 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14620: [SPARK-17032][SQL] Add test cases for methods in ParserU...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14620 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14620: [SPARK-17032][SQL] Add test cases for methods in ParserU...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14620 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63888/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14620: [SPARK-17032][SQL] Add test cases for methods in ParserU...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14620 **[Test build #63888 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63888/consoleFull)** for PR 14620 at commit [`36e049a`](https://github.com/apache/spark/commit/36e049a8fdafb3d311029b292e7d2de5efce4c6a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14677: [MINOR][DOC] Fix the descriptions for `properties` argum...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14677 cc @srowen and @mvervuurt. I opened this just to run the verification again, as I already verified it once. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14677: [MINOR][DOC] Fix the descriptions for `properties...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14677 [MINOR][DOC] Fix the descriptions for `properties` argument in the documentation for jdbc APIs ## What changes were proposed in this pull request? This should be credited to @mvervuurt. The main purposes of this PR are - simply to include the change for the same instance in `DataFrameReader`, so the two match up. - to avoid verifying the PR twice (I already verified it once). The documentation for both should be the same because both assume the `properties` should be the same `dict`. ## How was this patch tested? Manually building Python documentation. This will produce the output as below: - `DataFrameReader` ![2016-08-17 11 12 00](https://cloud.githubusercontent.com/assets/6477701/17722764/b3f6568e-646f-11e6-8b75-4fb672f3f366.png) - `DataFrameWriter` ![2016-08-17 11 12 10](https://cloud.githubusercontent.com/assets/6477701/17722765/b58cb308-646f-11e6-841a-32f19800d139.png) Closes #14624 You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark typo-python Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14677.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14677 commit b0864ecb51a28452f6e33f4bdcd30795c7c2ec99 Author: mvervuurt Date: 2016-08-12T18:42:40Z Fix docstring of method jdbc of PySpark DataFrameWriter because a dictionary of JDBC connection arguments is used instead of a list. commit f5018616eee50544c22432c2256c75325b537e82 Author: hyukjinkwon Date: 2016-08-17T02:17:48Z Fix the descriptions for `properties` argument in the documentation for jdbc APIs --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
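For readers outside this PR: in PySpark the `properties` argument is a dict of JDBC connection arguments, and the Scala API takes a `java.util.Properties` with the same keys. A hedged sketch of the Scala call (the URL, table names, and credentials are placeholders, and `spark` is assumed to be an active SparkSession):

```
import java.util.Properties

// Placeholder connection arguments -- the same key/value pairs the fixed
// docstring describes as a dict in PySpark ({"user": ..., "password": ...}).
val connectionProperties = new Properties()
connectionProperties.put("user", "username")
connectionProperties.put("password", "secret")

val jdbcUrl = "jdbc:postgresql://localhost:5432/testdb" // placeholder URL
val df = spark.read.jdbc(jdbcUrl, "public.people", connectionProperties)
df.write.jdbc(jdbcUrl, "public.people_copy", connectionProperties)
```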
[GitHub] spark issue #14648: [SPARK-16995][SQL] TreeNodeException when flat mapping R...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14648 **[Test build #63892 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63892/consoleFull)** for PR 14648 at commit [`0008c3e`](https://github.com/apache/spark/commit/0008c3e11dfb85523f9f4606d5dec714339d5f43). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14676 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14676 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63889/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14676 **[Test build #63889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63889/consoleFull)** for PR 14676 at commit [`2327b79`](https://github.com/apache/spark/commit/2327b7971a845ce01b0b18fd5ccd2b1f0bb99be0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost [WIP]
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14547 **[Test build #3224 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3224/consoleFull)** for PR 14547 at commit [`a040da5`](https://github.com/apache/spark/commit/a040da5ea64778d766720ecd6a8859893d7204f0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14676 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63887/ Test FAILed.
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14676 Merged build finished. Test FAILed.
[GitHub] spark issue #14676: [SPARK-16947][SQL] Support type coercion and foldable ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14676 **[Test build #63887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63887/consoleFull)** for PR 14676 at commit [`d7acae5`](https://github.com/apache/spark/commit/d7acae55034d4ff5da3e7579cf44acb7b704b4a1).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `case class UnresolvedInlineTable(`
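For context on the feature under test: SPARK-16947 targets SQL inline tables (the `VALUES` clause). A minimal sketch of what the change is intended to allow once it passes, assuming an existing `SparkSession` named `spark` (the column names and literals are illustrative): literals of different but compatible types in one column should be coerced to a common type, and foldable expressions such as `2 + 1` should be accepted and evaluated.
```
# Inline table mixing an int literal, a foldable expression (2 + 1),
# and a decimal (2.5); the first column should be coerced to a common type.
spark.sql(
    "SELECT * FROM VALUES (1, 'a'), (2 + 1, 'b'), (2.5, 'c') AS t(num, letter)"
).show()
```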
[GitHub] spark issue #14639: [SPARK-17054][SPARKR] SparkR can not run in yarn-cluster...
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/14639 Thanks, @sun-rui. Another commit resolved the downloading issue.
[GitHub] spark issue #14639: [SPARK-17054][SPARKR] SparkR can not run in yarn-cluster...
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/14639 This is not only about using the correct cache dir under Mac OS; in yarn-cluster mode, there should also be no downloading of Spark at all.
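To make that reasoning concrete, here is a rough, hypothetical sketch of the decision being described (Python purely for illustration; SparkR's actual implementation is in R, and the helper names and directory paths below are assumptions, not the PR's code):
```
import os
import sys

def spark_cache_dir():
    # Hypothetical helper: pick an OS-appropriate user cache directory
    # (on Mac OS, per-user caches conventionally live under ~/Library/Caches).
    if sys.platform == "darwin":
        return os.path.expanduser("~/Library/Caches/spark")
    return os.path.expanduser("~/.cache/spark")

def maybe_download_spark(deploy_mode):
    # Hypothetical entry point mirroring the comment's reasoning:
    # in yarn-cluster mode the driver already runs inside the cluster,
    # so the Spark distribution must not be downloaded at all.
    if deploy_mode == "yarn-cluster":
        return None
    target = spark_cache_dir()
    # ... download and unpack the Spark distribution into `target` ...
    return target
```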
[GitHub] spark issue #14675: [SPARK-17096][SQL][STREAMING] Improve exception string r...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14675 Merged build finished. Test PASSed.