[GitHub] spark issue #16404: [SPARK-18969][SQL] Support grouping by nondeterministic ...

2017-01-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16404
  
MySQL treats them differently... 
```SQL
mysql> select c1, concat(rand(), c1) from t1 group by c1;
+------+----------------------+
| c1   | concat(rand(), c1)   |
+------+----------------------+
|    1 | 0.084388771172974981 |
|    3 | 0.116890648488784823 |
+------+----------------------+
2 rows in set (0.00 sec)

mysql> select c1, concat(rand(), c1) from t1 group by c1, concat(rand(), c1);
+------+----------------------+
| c1   | concat(rand(), c1)   |
+------+----------------------+
|    1 | 0.16241911441313021  |
|    1 | 0.461423657332941551 |
|    3 | 0.81986097415896223  |
+------+----------------------+
3 rows in set (0.00 sec)
```
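
To make the difference concrete, here is a minimal plain-Scala simulation of the two queries above. It is only a sketch, not MySQL's or Spark's implementation. The contents of `t1` (values 1, 1, 3) are an assumption inferred from the three-row result, and `NondeterministicGrouping` and its methods are hypothetical names.

```scala
import scala.util.Random

object NondeterministicGrouping {
  // Assumed contents of t1.c1, inferred from the three-row result above.
  val t1: Seq[Int] = Seq(1, 1, 3)

  // GROUP BY c1: rows are grouped on c1 alone, and one random prefix is
  // produced per resulting group.
  def groupByC1(rng: Random): Seq[(Int, String)] =
    t1.distinct.map(c1 => (c1, s"${rng.nextDouble()}$c1"))

  // GROUP BY c1, concat(rand(), c1): the grouping key is computed per input
  // row, so two rows with the same c1 almost never share a key.
  def groupByC1AndRand(rng: Random): Seq[(Int, String)] =
    t1.map(c1 => (c1, s"${rng.nextDouble()}$c1")).distinct
}
```

With a seeded `Random`, `groupByC1` yields two groups and `groupByC1AndRand` yields three, which matches the row counts MySQL reports.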



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16464: [SPARK-19066][SparkR]:SparkR LDA doesn't set optimizer c...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16464
  
**[Test build #70863 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70863/testReport)**
 for PR 16464 at commit 
[`14bafc1`](https://github.com/apache/spark/commit/14bafc1bd8b2c621cfd2f83f543182a2e38f8fd6).





[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16460#discussion_r94537072
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala
 ---
@@ -74,12 +69,29 @@ case class InsertIntoHadoopFsRelationCommand(
 val fs = outputPath.getFileSystem(hadoopConf)
 val qualifiedOutputPath = outputPath.makeQualified(fs.getUri, fs.getWorkingDirectory)
 
+val partitionsTrackedByCatalog = catalogTable.isDefined &&
+  catalogTable.get.partitionColumnNames.nonEmpty &&
+  catalogTable.get.tracksPartitionsInCatalog
--- End diff --

Do you mean that when the flag is off we should completely ignore the 
partition information in the metastore, and therefore also ignore the data in 
custom partition paths?





[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15880
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15880
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70860/
Test PASSed.





[GitHub] spark issue #16417: [SPARK-19014][SQL] support complex aggregate buffer in H...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16417
  
**[Test build #70862 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70862/testReport)**
 for PR 16417 at commit 
[`32e527d`](https://github.com/apache/spark/commit/32e527d902318c9e81e8586f592968ee08416acd).





[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15880
  
**[Test build #70860 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70860/testReport)**
 for PR 15880 at commit 
[`821cca6`](https://github.com/apache/spark/commit/821cca6cd836f11ea917c89938f288f126d633ab).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread ericl
Github user ericl commented on a diff in the pull request:

https://github.com/apache/spark/pull/16460#discussion_r94536423
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala
 ---
@@ -74,12 +69,29 @@ case class InsertIntoHadoopFsRelationCommand(
 val fs = outputPath.getFileSystem(hadoopConf)
 val qualifiedOutputPath = outputPath.makeQualified(fs.getUri, fs.getWorkingDirectory)
 
+val partitionsTrackedByCatalog = catalogTable.isDefined &&
+  catalogTable.get.partitionColumnNames.nonEmpty &&
+  catalogTable.get.tracksPartitionsInCatalog
--- End diff --

Hm, in other parts of the code we assume that the feature is completely 
disabled when the flag is off. This is probably needed since there is no way to 
revert a table otherwise.





[GitHub] spark issue #16404: [SPARK-18969][SQL] Support grouping by nondeterministic ...

2017-01-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16404
  
Oracle allows it. It appears to treat `(username || dbms_random.string('a', 10))` 
in the select list and in the GROUP BY clause as the same expression.

```SQL
SQL> select (username || dbms_random.string('a', 10)) from all_users group by (username || dbms_random.string('a', 10));

(USERNAME||DBMS_RANDOM.STRING('A',10))


APEX_04cklbMYhekl
FLOWS_FILESVmTbIIeiUs
CTXSYSPmgqeRFPry
SYSTEMxQLrzXxHth
XDBRRTfatsLlU
SYSoLDWRKMvlZ
XS$NULLXAaOykZCDH
APEX_PUBLIC_USERvcLswvpbcw
ANONYMOUSgupWiktQKh
OUTLNjLdKOTZoFI
MDSYSxEOhwTwQqa

(USERNAME||DBMS_RANDOM.STRING('A',10))


HRkovpxQztYU

12 rows selected.
```
If I change the order of the operands in the select list, I get an error:
```SQL
SQL> select (dbms_random.string('a', 10) || username) from all_users group by (username || dbms_random.string('a', 10))
  2  ;
select (dbms_random.string('a', 10) || username) from all_users group by (username || dbms_random.string('a', 10))
   *
ERROR at line 1:
ORA-00979: not a GROUP BY expression
```
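
Seen from the outside, the Oracle behavior resembles a structural match between the select expression and the grouping expression. The sketch below models only that observation. The expression tree and the `GroupByExpressionCheck` object are hypothetical, and nothing here is taken from Oracle's or Spark's internals.

```scala
object GroupByExpressionCheck {
  // A tiny, hypothetical expression tree used only for this illustration.
  sealed trait Expr
  final case class Column(name: String) extends Expr
  final case class RandomString(kind: String, length: Int) extends Expr
  final case class Concat(left: Expr, right: Expr) extends Expr

  // A select expression is accepted only when it is structurally identical
  // to one of the grouping expressions.
  def isGroupByExpression(selected: Expr, grouping: Seq[Expr]): Boolean =
    grouping.contains(selected)

  val username: Expr = Column("username")
  val randomSuffix: Expr = RandomString("a", 10)
  val groupingKey: Expr = Concat(username, randomSuffix)
}
```

Under this model, `Concat(username, randomSuffix)` is accepted against `groupingKey`, while the swapped `Concat(randomSuffix, username)` is rejected, mirroring the `ORA-00979` error above.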





[GitHub] spark issue #14284: [SPARK-16633] [SPARK-16642] [SPARK-16721] [SQL] Fixes th...

2017-01-03 Thread chengat1314
Github user chengat1314 commented on the issue:

https://github.com/apache/spark/pull/14284
  
Is it possible to add a feature to support `IGNORE NULLS`? 
For example: 
LAG (value_expr [, offset ])
[ IGNORE NULLS | RESPECT NULLS ]
OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering )

thanks
Cheng Feng
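
For readers unfamiliar with the option, here is a small plain-Scala sketch of the semantics being requested, assuming the usual reading in which `IGNORE NULLS` skips null rows when counting back `offset` positions. It works on a single, already ordered partition, represents SQL NULL as `None`, and the names `LagIgnoreNulls`, `lagRespectNulls` and `lagIgnoreNulls` are hypothetical. It is not how Spark implements window functions.

```scala
object LagIgnoreNulls {
  // RESPECT NULLS: look exactly `offset` rows back, whatever is stored there.
  def lagRespectNulls[A](values: Seq[Option[A]], offset: Int = 1): Seq[Option[A]] = {
    require(offset >= 1, "offset must be at least 1")
    values.indices.map(i => values.lift(i - offset).flatten)
  }

  // IGNORE NULLS: count back `offset` non-null rows, skipping nulls.
  def lagIgnoreNulls[A](values: Seq[Option[A]], offset: Int = 1): Seq[Option[A]] = {
    require(offset >= 1, "offset must be at least 1")
    values.indices.map { i =>
      val earlier = values.take(i).flatten // non-null values before row i, in order
      earlier.lift(earlier.length - offset)
    }
  }
}
```

For the input `10, NULL, 30, NULL, NULL, 60`, the first function returns `NULL, 10, NULL, 30, NULL, NULL`, and the second returns `NULL, 10, 10, 30, 30, 30`.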





[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16460#discussion_r94535816
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala
 ---
@@ -152,4 +190,29 @@ case class InsertIntoHadoopFsRelationCommand(
   }
 }
   }
+
+  /**
+   * Given a set of input partitions, returns those that have locations that differ from the
+   * Hive default (e.g. /k1=v1/k2=v2). These partitions were manually assigned locations by
+   * the user.
+   *
+   * @return a mapping from partition specs to their custom locations
+   */
+  private def getCustomPartitionLocations(
+  fs: FileSystem,
+  qualifiedOutputPath: Path,
+  partitions: Seq[CatalogTablePartition]): Map[TablePartitionSpec, String] = {
+val table = catalogTable.get
--- End diff --

yea good idea
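
The doc comment in the diff above describes the filtering rule, which can be sketched in plain Scala as follows. This is only an illustration under simplifying assumptions: paths are plain strings, partition values are not escaped, and `Partition`, `PartitionSpec`, `defaultLocation` and `customLocations` are hypothetical stand-ins rather than the Spark classes named in the diff.

```scala
object CustomPartitionLocations {
  // Hypothetical stand-in for TablePartitionSpec.
  type PartitionSpec = Map[String, String]

  // Partition columns are kept in order so the default path can be rebuilt.
  final case class Partition(spec: Seq[(String, String)], location: String)

  // The Hive-style default location, e.g. <tablePath>/k1=v1/k2=v2.
  def defaultLocation(tablePath: String, spec: Seq[(String, String)]): String =
    tablePath + spec.map { case (k, v) => s"/$k=$v" }.mkString

  // Keep only partitions whose location differs from the default one.
  def customLocations(tablePath: String, partitions: Seq[Partition]): Map[PartitionSpec, String] =
    partitions
      .filter(p => p.location != defaultLocation(tablePath, p.spec))
      .map(p => p.spec.toMap -> p.location)
      .toMap
}
```

A partition stored at `/warehouse/t/k1=a/k2=b` under table path `/warehouse/t` is dropped from the result, while one stored at `/data/custom` is returned with its location.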





[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16460#discussion_r94535760
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ---
@@ -473,22 +473,26 @@ case class DataSource(
   s"Unable to resolve $name given [${plan.output.map(_.name).mkString(", ")}]")
   }.asInstanceOf[Attribute]
 }
+val fileIndex = catalogTable.map(_.identifier).map { tableIdent =>
+  sparkSession.table(tableIdent).queryExecution.analyzed.collect {
+case LogicalRelation(t: HadoopFsRelation, _, _) => t.location
+  }.head
+}
 // For partitioned relation r, r.schema's column ordering can be different from the column
 // ordering of data.logicalPlan (partition columns are all moved after data column).  This
 // will be adjusted within InsertIntoHadoopFsRelation.
 val plan =
   InsertIntoHadoopFsRelationCommand(
 outputPath = outputPath,
 staticPartitions = Map.empty,
-customPartitionLocations = Map.empty,
 partitionColumns = columns,
 bucketSpec = bucketSpec,
 fileFormat = format,
-refreshFunction = _ => Unit, // No existing table needs to be refreshed.
--- End diff --

Previously, we did not refresh anything here, but we repaired the 
partitions in 
[`CreateDataSourceTableAsSelectCommand`](https://github.com/apache/spark/pull/16460/files#diff-945e51801b84b92da242fcb42f83f5f5L171).
 After this PR, we only repair the partitions in 
`CreateDataSourceTableAsSelectCommand` when we are creating a new table.





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15178
  
**[Test build #70861 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70861/testReport)**
 for PR 15178 at commit 
[`1b499d1`](https://github.com/apache/spark/commit/1b499d1f7b5689fd544d7adc4aac709ff74fe684).





[GitHub] spark pull request #16469: [SPARK-19072][SQL] codegen of Literal should not ...

2017-01-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16469





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/15178
  
retest this please.





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/15178
  
`org.apache.spark.rdd.AsyncRDDActionsSuite.async failure handling` passes 
locally.





[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...

2017-01-03 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/16469
  
LGTM. Merging to master.





[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2017-01-03 Thread merlintang
Github user merlintang commented on the issue:

https://github.com/apache/spark/pull/15819
  
@gatorsmile can you retest the patch so that we can merge it? Sorry to ping 
you multiple times; several users are asking for this. 





[GitHub] spark issue #16308: [SPARK-18936][SQL] Infrastructure for session local time...

2017-01-03 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16308
  
@hvanhovell anything else to do here other than bringing it up to date?






[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16451
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16451
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70858/
Test PASSed.





[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16451
  
**[Test build #70858 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70858/testReport)**
 for PR 16451 at commit 
[`d50d10c`](https://github.com/apache/spark/commit/d50d10cf1456137f69ca13a686c3fa67a46bc707).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16469
  
Some more explanation: when `Literal` codegen produces boxed values, the 
double equality check breaks, because the generated code is 
`(Double.isNaN(d1) && Double.isNaN(d2)) || d1 == d2`
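
One way to see the failure, assuming both operands end up as boxed `java.lang.Double` objects in the generated Java code: Java's `==` on two boxed values compares references, which Scala spells `eq`. The sketch below contrasts the primitive and boxed forms. `BoxedDoubleEquality` and its methods are hypothetical names, and this is not the actual generated code.

```scala
object BoxedDoubleEquality {
  // The comparison as intended, on primitive doubles.
  def primitiveEquals(d1: Double, d2: Double): Boolean =
    (java.lang.Double.isNaN(d1) && java.lang.Double.isNaN(d2)) || d1 == d2

  // The same expression when both operands are boxed: the right-hand side
  // becomes a reference comparison (Java `==`, Scala `eq`).
  def boxedReferenceEquals(d1: java.lang.Double, d2: java.lang.Double): Boolean =
    (d1.isNaN() && d2.isNaN()) || (d1 eq d2)
}
```

Two distinct boxed objects that both hold `1.5` compare equal with `primitiveEquals` (after unboxing) but not with `boxedReferenceEquals`, which is the kind of breakage described above.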





[GitHub] spark issue #16404: [SPARK-18969][SQL] Support grouping by nondeterministic ...

2017-01-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16404
  
DB2 has such a limit. See the error message `SQL -583`: 
http://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.messages.sql.doc/doc/msql00583n.html

> The routine (function or method) or expression is defined as 
non-deterministic or as having external action. This is not supported in the 
context in which it is used. The contexts in which these are not valid are:

> in an expression of a GROUP BY clause

It documents the same workaround: 
> Remove the non-deterministic or external action routine or expression 
from the GROUP BY clause. If grouping is desired on a column of the result that 
is based on a non-deterministic or external action routine or expression use a 
nested table expression or a common table expression to first provide a result 
table with the expression as a column of the result.
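
The documented workaround is a two-step shape: first materialize the non-deterministic expression as an ordinary column, then group on that column. In SQL this would look roughly like `select k, count(*) from (select concat(rand(), c1) as k from t1) t group by k`. The plain-Scala sketch below models the same two steps. `MaterializeThenGroup`, `Row` and both methods are hypothetical names, and the random key is deliberately coarse so that collisions can occur.

```scala
import scala.util.Random

object MaterializeThenGroup {
  final case class Row(c1: Int, key: String)

  // Step 1 (the nested table expression): evaluate the non-deterministic
  // expression exactly once per input row and store it as a column.
  def materialize(c1Values: Seq[Int], rng: Random): Seq[Row] =
    c1Values.map(c1 => Row(c1, s"${rng.nextInt(2)}-$c1"))

  // Step 2 (the outer GROUP BY): group on the stored column, which is now
  // just data and no longer re-evaluated.
  def countByKey(rows: Seq[Row]): Map[String, Int] =
    rows.groupBy(_.key).map { case (key, group) => key -> group.size }
}
```

Because the key is fixed before grouping, every row lands in exactly one group and the group sizes always add up to the number of input rows.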





[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16460#discussion_r94533881
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ---
@@ -473,22 +473,26 @@ case class DataSource(
   s"Unable to resolve $name given [${plan.output.map(_.name).mkString(", ")}]")
   }.asInstanceOf[Attribute]
 }
+val fileIndex = catalogTable.map(_.identifier).map { tableIdent =>
+  sparkSession.table(tableIdent).queryExecution.analyzed.collect {
+case LogicalRelation(t: HadoopFsRelation, _, _) => t.location
+  }.head
+}
 // For partitioned relation r, r.schema's column ordering can be different from the column
 // ordering of data.logicalPlan (partition columns are all moved after data column).  This
 // will be adjusted within InsertIntoHadoopFsRelation.
 val plan =
   InsertIntoHadoopFsRelationCommand(
 outputPath = outputPath,
 staticPartitions = Map.empty,
-customPartitionLocations = Map.empty,
 partitionColumns = columns,
 bucketSpec = bucketSpec,
 fileFormat = format,
-refreshFunction = _ => Unit, // No existing table needs to be refreshed.
--- End diff --

Previously, in this case, we did not call `refreshPartitionsCallback`. After 
this PR, we always refresh. Is my understanding right?

How did it work without the changes in this PR? Does that mean we just relied 
on Hive to implicitly call `AlterTableAddPartitionCommand`/`createPartition` 
when the table does not already exist? 





[GitHub] spark issue #16439: [SPARK-19026]SPARK_LOCAL_DIRS(multiple directories on di...

2017-01-03 Thread zuotingbing
Github user zuotingbing commented on the issue:

https://github.com/apache/spark/pull/16439
  
@srowen I do not think this should be a fatal error, since some of the 
`SPARK_LOCAL_DIRS` can still be written to. Even if only one of the 
`SPARK_LOCAL_DIRS` is writable, the application can run successfully.
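
The policy argued for here (skip unusable directories and fail only when none is left) could look like the sketch below. It is a plain-Scala illustration with hypothetical names (`LocalDirs`, `usableDirs`, `requireAtLeastOne`), not the code in this PR or in Spark.

```scala
import java.io.File

object LocalDirs {
  // Keep the directories that exist (or can be created) and are writable.
  def usableDirs(candidates: Seq[String]): Seq[File] =
    candidates.map(new File(_)).filter { dir =>
      (dir.isDirectory || dir.mkdirs()) && dir.canWrite
    }

  // Fail only when every configured directory is unusable.
  def requireAtLeastOne(candidates: Seq[String]): Seq[File] = {
    val usable = usableDirs(candidates)
    require(usable.nonEmpty,
      s"None of the local directories is writable: ${candidates.mkString(",")}")
    usable
  }
}
```

With one bad and one good directory configured, `requireAtLeastOne` returns just the good one. It throws only when the list contains no usable directory at all.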





[GitHub] spark issue #16461: [SPARK-19060][SQL] remove the supportsPartial flag in Ag...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16461
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70856/
Test PASSed.





[GitHub] spark issue #16461: [SPARK-19060][SQL] remove the supportsPartial flag in Ag...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16461
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16461: [SPARK-19060][SQL] remove the supportsPartial flag in Ag...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16461
  
**[Test build #70856 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70856/testReport)**
 for PR 16461 at commit 
[`e213cbb`](https://github.com/apache/spark/commit/e213cbb87618e51e9dfa171eacbfeab4a5874552).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16469
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16469
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70855/
Test PASSed.





[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16469
  
**[Test build #70855 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70855/testReport)**
 for PR 16469 at commit 
[`b382117`](https://github.com/apache/spark/commit/b382117566006034007040b0925504a2c1a70ea0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...

2017-01-03 Thread kayousterhout
Github user kayousterhout commented on the issue:

https://github.com/apache/spark/pull/16469
  
Thanks for the quick fix!





[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16460#discussion_r94532732
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
@@ -152,4 +190,29 @@ case class InsertIntoHadoopFsRelationCommand(
   }
 }
   }
+
+  /**
+   * Given a set of input partitions, returns those that have locations that differ from the
+   * Hive default (e.g. /k1=v1/k2=v2). These partitions were manually assigned locations by
+   * the user.
+   *
+   * @return a mapping from partition specs to their custom locations
+   */
+  private def getCustomPartitionLocations(
+      fs: FileSystem,
+      qualifiedOutputPath: Path,
+      partitions: Seq[CatalogTablePartition]): Map[TablePartitionSpec, String] = {
+    val table = catalogTable.get
--- End diff --

Shall we pass `catalogTable` as a function parameter? `.get` looks a little risky.





[GitHub] spark issue #14284: [SPARK-16633] [SPARK-16642] [SPARK-16721] [SQL] Fixes th...

2017-01-03 Thread chengat1314
Github user chengat1314 commented on the issue:

https://github.com/apache/spark/pull/14284
  
Can we enable the ignore-nulls feature in Spark 2.1, e.g. `lag(comm ignore nulls) over (order by empno) prev_comm`?
Thanks.
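
For readers unfamiliar with the requested behaviour, here is a minimal pure-Python sketch of what `lag(comm ignore nulls) over (order by empno)` is usually expected to return: for each row, the most recent non-null `comm` from an earlier row in `empno` order. It assumes the common SQL meaning of `IGNORE NULLS` with the default offset of 1; the function name `lag_ignore_nulls` and the sample rows are made up for illustration and say nothing about how Spark does or would implement it.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, Optional[float]]                      # (empno, comm)
Result = Tuple[int, Optional[float], Optional[float]]  # (empno, comm, prev_comm)


def lag_ignore_nulls(rows: List[Row]) -> List[Result]:
    """Emulate lag(comm IGNORE NULLS) OVER (ORDER BY empno) on in-memory rows."""
    out: List[Result] = []
    last_non_null: Optional[float] = None
    for empno, comm in sorted(rows, key=lambda r: r[0]):
        # prev_comm is the latest non-null comm seen strictly before this row.
        out.append((empno, comm, last_non_null))
        if comm is not None:
            last_non_null = comm
    return out


if __name__ == "__main__":
    sample = [(3, None), (1, 300.0), (2, None), (4, 500.0)]
    for row in lag_ignore_nulls(sample):
        print(row)
```

Without `IGNORE NULLS`, a plain `lag` would instead return the immediately preceding row's `comm` even when it is null.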





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15178
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15178
  
**[Test build #70857 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70857/testReport)** for PR 15178 at commit [`1b499d1`](https://github.com/apache/spark/commit/1b499d1f7b5689fd544d7adc4aac709ff74fe684).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15178
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70857/
Test FAILed.





[GitHub] spark issue #16429: [SPARK-19019][PYTHON] Fix hijacked `collections.namedtup...

2017-01-03 Thread azmras
Github user azmras commented on the issue:

https://github.com/apache/spark/pull/16429
  
Just checked other things (ml, sql, etc.); everything is looking fine. I can safely say goodbye to Python 3.5 now.

Thank you.





[GitHub] spark issue #16455: [MINOR][DOCS] Remove consecutive duplicated words/typo i...

2017-01-03 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/16455
  
whoa. LGTM.





[GitHub] spark issue #16429: [SPARK-19019][PYTHON] Fix hijacked `collections.namedtup...

2017-01-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16429
  
@azmras Thank you for confirming this.





[GitHub] spark issue #16429: [SPARK-19019][PYTHON] Fix hijacked `collections.namedtup...

2017-01-03 Thread azmras
Github user azmras commented on the issue:

https://github.com/apache/spark/pull/16429
  
Python 3.6.0 (default, Dec 24 2016, 08:01:42) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
NoSuchObjectException
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/

Using Python version 3.6.0 (default, Dec 24 2016 08:01:42)
SparkSession available as 'spark'.
>>> sc.parallelize(range(1000), 20).take(5)
[0, 1, 2, 3, 4]


Thanks a lot, it is working now. I had to patch the zipped lib too.





[GitHub] spark pull request #16417: [SPARK-19014][SQL] support complex aggregate buff...

2017-01-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/16417#discussion_r94529892
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java ---
@@ -201,6 +210,25 @@ public void setNullAt(int i) {
 Platform.putLong(baseObject, getFieldOffset(i), 0);
   }
 
+  public void setNullData(int ordinal) {
--- End diff --

Ok. Good for me.





[GitHub] spark pull request #16417: [SPARK-19014][SQL] support complex aggregate buff...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16417#discussion_r94529822
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java ---
@@ -201,6 +210,25 @@ public void setNullAt(int i) {
 Platform.putLong(baseObject, getFieldOffset(i), 0);
   }
 
+  public void setNullData(int ordinal) {
--- End diff --

how about `setNullForFixedLenthNonPrimitive`?





[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16240
  
you need to fix mima:
```
[error]  * method newDoubleSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newDoubleSeqEncoder")
[error]  * method newFloatSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newFloatSeqEncoder")
[error]  * method newByteSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newByteSeqEncoder")
[error]  * method newLongSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newLongSeqEncoder")
[error]  * method newStringSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newStringSeqEncoder")
[error]  * method newIntSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newIntSeqEncoder")
[error]  * method newBooleanSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newBooleanSeqEncoder")
[error]  * method newShortSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newShortSeqEncoder")
```





[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16240
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16240
  
**[Test build #70859 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70859/testReport)** for PR 16240 at commit [`efd0801`](https://github.com/apache/spark/commit/efd0801e24088b90c1157de0cb0bfe8159aeaac5).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class SeqCC(s: Seq[Int])`
  * `case class ListCC(l: List[Int])`
  * `case class QueueCC(q: Queue[Int])`
  * `case class ComplexCC(seq: SeqCC, list: ListCC, queue: QueueCC)`





[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16240
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70859/
Test FAILed.





[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15880
  
**[Test build #70860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70860/testReport)** for PR 15880 at commit [`821cca6`](https://github.com/apache/spark/commit/821cca6cd836f11ea917c89938f288f126d633ab).





[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16240
  
**[Test build #70859 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70859/testReport)** for PR 16240 at commit [`efd0801`](https://github.com/apache/spark/commit/efd0801e24088b90c1157de0cb0bfe8159aeaac5).





[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16240
  
LGTM, please create 2 more tickets: one for the optimization you mentioned in https://github.com/apache/spark/pull/16240#issuecomment-266318016 and one for the nested custom collection.





[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/15880
  
retest this please





[GitHub] spark pull request #16240: [SPARK-16792][SQL] Dataset containing a Case Clas...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16240#discussion_r94528665
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetPrimitiveSuite.scala ---
@@ -17,10 +17,21 @@
 
 package org.apache.spark.sql
 
+import scala.collection.immutable.Queue
+import scala.collection.mutable.ArrayBuffer
+
 import org.apache.spark.sql.test.SharedSQLContext
 
 case class IntClass(value: Int)
 
+case class SeqCC(s: Seq[Int])
--- End diff --

What is `CC` short for? How about `SeqClass`?





[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16240
  
retest this please





[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16460#discussion_r94528412
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
@@ -74,12 +69,29 @@ case class InsertIntoHadoopFsRelationCommand(
     val fs = outputPath.getFileSystem(hadoopConf)
     val qualifiedOutputPath = outputPath.makeQualified(fs.getUri, fs.getWorkingDirectory)
 
+    val partitionsTrackedByCatalog = catalogTable.isDefined &&
+      catalogTable.get.partitionColumnNames.nonEmpty &&
+      catalogTable.get.tracksPartitionsInCatalog
--- End diff --

This is something I want to check with @ericl. What if users create a table with partition management, then turn it off, and read this table? If we treat this table as a normal table, the data in custom partition paths will be ignored.

I think we should respect the partition management flag as it was when the table was created, not as it is when the table is read.





[GitHub] spark pull request #16404: [SPARK-18969][SQL] Support grouping by nondetermi...

2017-01-03 Thread cloud-fan
Github user cloud-fan closed the pull request at:

https://github.com/apache/spark/pull/16404





[GitHub] spark pull request #16404: [SPARK-18969][SQL] Support grouping by nondetermi...

2017-01-03 Thread cloud-fan
GitHub user cloud-fan reopened a pull request:

https://github.com/apache/spark/pull/16404

[SPARK-18969][SQL] Support grouping by nondeterministic expressions

## What changes were proposed in this pull request?

Currently nondeterministic expressions are allowed in `Aggregate` (see the [comment](https://github.com/apache/spark/blob/v2.0.2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L249-L251)), but the `PullOutNondeterministic` analyzer rule fails to handle `Aggregate`. This PR fixes it.
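
As a rough illustration of the idea behind pulling a nondeterministic expression out of an aggregate, the pure-Python sketch below evaluates the expression exactly once per input row in an inner "projection" and then groups on the materialized value. This is only a model of the semantics under that assumption, not the analyzer rule itself; the helper name `group_count_with_pulled_out_key`, its parameters, and the seeded generator are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, Dict, Hashable, Iterable


def group_count_with_pulled_out_key(
    rows: Iterable[int],
    key_expr: Callable[[random.Random, int], Hashable],
    seed: int = 0,
) -> Dict[Hashable, int]:
    """Count rows per grouping key, evaluating key_expr once per row."""
    rng = random.Random(seed)
    # Inner "projection": materialize the possibly nondeterministic key
    # next to each row, so it is never re-evaluated later.
    projected = [(row, key_expr(rng, row)) for row in rows]
    # "Aggregate": group on the materialized column only.
    return dict(Counter(key for _, key in projected))


if __name__ == "__main__":
    # Group ten rows by a random bit, loosely analogous to GROUP BY a rand()-based key.
    counts = group_count_with_pulled_out_key(
        range(10), lambda rng, row: rng.randint(0, 1)
    )
    print(counts)
```

Because the key is computed before grouping, every row lands in exactly one group and the group counts always sum to the number of input rows, whatever values the expression happens to produce.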

close https://github.com/apache/spark/pull/16379

## How was this patch tested?

a new test suite

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark groupby

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16404.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16404


commit f1451883df9077ecbf31f3a86d2427b60262f863
Author: Wenchen Fan 
Date:   2016-12-26T10:24:07Z

Support grouping by nondeterministic expressions







[GitHub] spark issue #16404: [SPARK-18969][SQL] Support grouping by nondeterministic ...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16404
  
How do other databases handle this case? Do they forbid using 
non-deterministic expressions in GROUP BY, or give a better error message?





[GitHub] spark issue #16422: [SPARK-17642] [SQL] support DESC EXTENDED/FORMATTED tabl...

2017-01-03 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/16422
  
@gatorsmile Why is statistics info sensitive? Users can run SQL queries to get each of them (max, min, ndv, etc.) anyway.





[GitHub] spark issue #16466: [SPARK-19070] Clean-up dataset actions

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16466
  
LGTM, if you can pass the test :)





[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16467
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70849/
Test PASSed.





[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16467
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16451
  
**[Test build #70858 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70858/testReport)** for PR 16451 at commit [`d50d10c`](https://github.com/apache/spark/commit/d50d10cf1456137f69ca13a686c3fa67a46bc707).





[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16467
  
**[Test build #70849 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70849/testReport)** for PR 16467 at commit [`de655d0`](https://github.com/apache/spark/commit/de655d0d00693a2bc98fddad7be6f55fb2690555).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...

2017-01-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16451
  
Build started: [TESTS] 
`org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite` 
[![PR-16451](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=E8488472-738C-4ADF-A924-8F858728D120&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/E8488472-738C-4ADF-A924-8F858728D120)





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15178
  
**[Test build #70857 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70857/testReport)**
 for PR 15178 at commit 
[`1b499d1`](https://github.com/apache/spark/commit/1b499d1f7b5689fd544d7adc4aac709ff74fe684).





[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16453
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16453
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70854/
Test PASSed.





[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16453
  
**[Test build #70854 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70854/testReport)**
 for PR 16453 at commit 
[`1b3b5a0`](https://github.com/apache/spark/commit/1b3b5a03236c0c42d8e20f24db339c4e7cdbfcf1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/15178
  
retest this please.





[GitHub] spark issue #16461: [SPARK-19060][SQL] remove the supportsPartial flag in Ag...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16461
  
**[Test build #70856 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70856/testReport)**
 for PR 16461 at commit 
[`e213cbb`](https://github.com/apache/spark/commit/e213cbb87618e51e9dfa171eacbfeab4a5874552).





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15178
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70851/
Test FAILed.





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15178
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15178
  
**[Test build #70851 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70851/testReport)**
 for PR 15178 at commit 
[`1b499d1`](https://github.com/apache/spark/commit/1b499d1f7b5689fd544d7adc4aac709ff74fe684).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16469
  
**[Test build #70855 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70855/testReport)**
 for PR 16469 at commit 
[`b382117`](https://github.com/apache/spark/commit/b382117566006034007040b0925504a2c1a70ea0).





[GitHub] spark issue #16402: [SPARK-18999][SQL][minor] simplify Literal codegen

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16402
  
Sorry, my bad. I should have looked at the test result before retesting 
it. I've sent a PR to fix it.





[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16469
  
cc @kayousterhout @gatorsmile 





[GitHub] spark issue #16465: [SPARK-19064][PySpark]Fix pip installing of sub componen...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16465
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #16469: [SPARK-19072][SQL] codegen of Literal should not ...

2017-01-03 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/16469

[SPARK-19072][SQL] codegen of Literal should not output boxed value

## What changes were proposed in this pull request?

In https://github.com/apache/spark/pull/16402 we made a mistake: when a 
double/float value is infinity, the `Literal` codegen outputs a boxed value, 
which causes wrong results.

This PR fixes this by special-casing infinity so that no boxed value is output.
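
As a rough illustration of what special-casing infinity can look like, the sketch below renders a `Double` as Java source text that always denotes a primitive. The helper name `doubleLiteralSource` is hypothetical and is not taken from the patch; it only assumes that generated code may refer to the standard `java.lang.Double` constants.

```scala
object LiteralSketch {
  // Hypothetical helper (not from the patch): returns Java source text
  // for a primitive double, so no boxed object is ever referenced.
  def doubleLiteralSource(v: Double): String =
    if (v.isNaN) "Double.NaN"
    else if (v == Double.PositiveInfinity) "Double.POSITIVE_INFINITY"
    else if (v == Double.NegativeInfinity) "Double.NEGATIVE_INFINITY"
    else s"${v}D" // finite values print as ordinary literals, e.g. 1.5D
}
```

The point of the sketch is only that every branch yields an expression of primitive type, so later comparisons in generated code operate on values rather than on objects.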

## How was this patch tested?

new regression test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark literal

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16469.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16469


commit b382117566006034007040b0925504a2c1a70ea0
Author: Wenchen Fan 
Date:   2017-01-04T03:37:25Z

codegen of Literal should not output boxed value







[GitHub] spark issue #16465: [SPARK-19064][PySpark]Fix pip installing of sub componen...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16465
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70848/
Test PASSed.





[GitHub] spark issue #16465: [SPARK-19064][PySpark]Fix pip installing of sub componen...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16465
  
**[Test build #70848 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70848/testReport)**
 for PR 16465 at commit 
[`b28d9ca`](https://github.com/apache/spark/commit/b28d9ca5e553e453b34d6199549d845ff5b6e1e2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16462: [SPARK-19062] Utils.writeByteBuffer bug fix

2017-01-03 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/16462
  
LGTM, thanks for fixing this @kayousterhout !





[GitHub] spark issue #16445: [SPARK-19043][SQL]Make SparkSQLSessionManager more confi...

2017-01-03 Thread yaooqinn
Github user yaooqinn commented on the issue:

https://github.com/apache/spark/pull/16445
  
Ping @srowen, would you please take a look at this PR?





[GitHub] spark issue #16457: [SPARK-19057][ML] Instances' weight must be non-negative

2017-01-03 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/16457
  
Agreed. Five algs now inherit `HasWeightCol`: GLR/LoR/LiR/NB/IsotonicReg.
I found that some of them (GLR/LoR/LiR) build an `RDD[Instance]` in `train`:
```
val instances: RDD[Instance] =
  dataset.select(col($(labelCol)), w, col($(featuresCol))).rdd.map {
    case Row(label: Double, weight: Double, features: Vector) =>
      Instance(label, weight, features)
  }
```
NB can also be modified to start with `RDD[Instance]`.
We can create a new API `extractInstance` in `Predictor` and validate the 
weight in it, the same way we check `label` in 
`extractLabeledPoints(dataset: Dataset[_], numClasses: Int)` in `Classifier`.
For IsotonicReg, we can add a validation in `extractWeightedLabeledPoints`.
What do you think of this plan? @srowen @sethah @jkbradley 
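
A minimal sketch of the validation step proposed above, written without Spark so that it runs on its own. `Instance` here is a simplified stand-in for the MLlib class, and `extractInstance` with this signature is an assumption for illustration, not an existing API.

```scala
object WeightValidationSketch {
  // Simplified stand-in for MLlib's Instance (features as a plain array).
  final case class Instance(label: Double, weight: Double, features: Array[Double])

  // Hypothetical extractInstance: builds an Instance and rejects negative
  // weights. A NaN weight is rejected too, because NaN >= 0.0 is false.
  def extractInstance(label: Double, weight: Double, features: Array[Double]): Instance = {
    require(weight >= 0.0, s"Instance weight must be non-negative, but got $weight")
    Instance(label, weight, features)
  }
}
```

Putting the check in one shared extraction function would let every algorithm that starts from `RDD[Instance]` get the same validation without repeating it.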






[GitHub] spark issue #16456: [SPARK-18994] clean up the local directories for applica...

2017-01-03 Thread liujianhuiouc
Github user liujianhuiouc commented on the issue:

https://github.com/apache/spark/pull/16456
  
In practice there is only one executor directory per app. Do you mean 
deleting the child directories in parallel? In my opinion that is 
unnecessary; the deletion could simply be done in a future. To keep other 
messages from blocking the heartbeat, would it be right to send the 
heartbeat in another thread?
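
For illustration only, deleting directories in a future might be sketched as below, so that the thread handling messages is not blocked by disk I/O. The object and method names (`AsyncCleanup`, `cleanupAsync`) are hypothetical and do not come from the PR; symbolic links are not treated specially.

```scala
import java.io.File
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

object AsyncCleanup {
  // Dedicated thread so slow deletes never run on the caller's thread.
  private val pool = Executors.newSingleThreadExecutor()
  private implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)

  private def deleteRecursively(f: File): Unit = {
    // listFiles returns null for plain files, hence the Option wrapper.
    Option(f.listFiles).foreach(_.foreach(deleteRecursively))
    f.delete()
  }

  // Returns immediately; the deletion itself happens on the pool thread.
  def cleanupAsync(dirs: Seq[File]): Future[Unit] = Future {
    dirs.foreach(deleteRecursively)
  }

  // The pool thread is non-daemon, so call this when cleanup is no longer needed.
  def shutdown(): Unit = pool.shutdown()
}
```

A single-threaded pool deletes the directories one after another, which matches the view above that parallel deletion is unnecessary.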





[GitHub] spark issue #16457: [SPARK-19057][ML] Instances' weight must be non-negative

2017-01-03 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/16457
  
@srowen OK. This is the list of algs that deal with weights:





[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16453
  
**[Test build #70854 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70854/testReport)**
 for PR 16453 at commit 
[`1b3b5a0`](https://github.com/apache/spark/commit/1b3b5a03236c0c42d8e20f24db339c4e7cdbfcf1).





[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/12775
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70844/
Test PASSed.





[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/12775
  
Merged build finished. Test PASSed.





[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/12775
  
**[Test build #70844 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70844/testReport)**
 for PR 12775 at commit 
[`9778cef`](https://github.com/apache/spark/commit/9778cefce3e152d559e53cd4e2f5a113e561f0ff).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB

2017-01-03 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/16453
  
Updated. Thanks for reviewing.





[GitHub] spark issue #16402: [SPARK-18999][SQL][minor] simplify Literal codegen

2017-01-03 Thread kayousterhout
Github user kayousterhout commented on the issue:

https://github.com/apache/spark/pull/16402
  
This commit introduced a bug where IN doesn't work right for Infinity / 
-Infinity (JIRA [here](https://issues.apache.org/jira/browse/SPARK-19072)).  
I'm not sure how to fix the underlying bug (or if this PR should just be 
reverted) -- @cloud-fan @gatorsmile can one of you fix this?  

The relevant test also failed for this PR the first time tests were run -- 
remember to make sure that test failures aren't related to the PR!





[GitHub] spark issue #16466: [SPARK-19070] Clean-up dataset actions

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16466
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16466: [SPARK-19070] Clean-up dataset actions

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16466
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70847/
Test FAILed.





[GitHub] spark issue #16466: [SPARK-19070] Clean-up dataset actions

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16466
  
**[Test build #70847 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70847/testReport)**
 for PR 16466 at commit 
[`dca1b56`](https://github.com/apache/spark/commit/dca1b56810cd3c3469f70cc653a985b78519f6c6).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16468: [SPARK-19074][SS][DOCS] Updated Structured Streaming Pro...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16468
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70853/
Test PASSed.





[GitHub] spark issue #16468: [SPARK-19074][SS][DOCS] Updated Structured Streaming Pro...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16468
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16468: [SPARK-19074][SS][DOCS] Updated Structured Streaming Pro...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16468
  
**[Test build #70853 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70853/testReport)**
 for PR 16468 at commit 
[`fbacbf4`](https://github.com/apache/spark/commit/fbacbf4f26afc5bd67a014b2134a5c97cb33cfda).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16451
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16451
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70843/
Test PASSed.




