[GitHub] spark issue #16404: [SPARK-18969][SQL] Support grouping by nondeterministic ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16404

MySQL treats them differently...

```SQL
mysql> select c1, concat(rand(), c1) from t1 group by c1;
+----+----------------------+
| c1 | concat(rand(), c1)   |
+----+----------------------+
|  1 | 0.084388771172974981 |
|  3 | 0.116890648488784823 |
+----+----------------------+
2 rows in set (0.00 sec)

mysql> select c1, concat(rand(), c1) from t1 group by c1, concat(rand(), c1);
+----+----------------------+
| c1 | concat(rand(), c1)   |
+----+----------------------+
|  1 | 0.16241911441313021  |
|  1 | 0.461423657332941551 |
|  3 | 0.81986097415896223  |
+----+----------------------+
3 rows in set (0.00 sec)
```

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16464: [SPARK-19066][SparkR]:SparkR LDA doesn't set optimizer c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16464 **[Test build #70863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70863/testReport)** for PR 16464 at commit [`14bafc1`](https://github.com/apache/spark/commit/14bafc1bd8b2c621cfd2f83f543182a2e38f8fd6).
[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94537072

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
    @@ -74,12 +69,29 @@ case class InsertIntoHadoopFsRelationCommand(
         val fs = outputPath.getFileSystem(hadoopConf)
         val qualifiedOutputPath = outputPath.makeQualified(fs.getUri, fs.getWorkingDirectory)
    +    val partitionsTrackedByCatalog = catalogTable.isDefined &&
    +      catalogTable.get.partitionColumnNames.nonEmpty &&
    +      catalogTable.get.tracksPartitionsInCatalog
--- End diff --

Do you mean we should completely ignore the partition information in the metastore when the flag is off, so that we should also ignore the data in custom partition paths?
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15880 Merged build finished. Test PASSed.
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15880 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70860/ Test PASSed.
[GitHub] spark issue #16417: [SPARK-19014][SQL] support complex aggregate buffer in H...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16417 **[Test build #70862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70862/testReport)** for PR 16417 at commit [`32e527d`](https://github.com/apache/spark/commit/32e527d902318c9e81e8586f592968ee08416acd).
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15880

**[Test build #70860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70860/testReport)** for PR 15880 at commit [`821cca6`](https://github.com/apache/spark/commit/821cca6cd836f11ea917c89938f288f126d633ab).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94536423

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
    @@ -74,12 +69,29 @@ case class InsertIntoHadoopFsRelationCommand(
         val fs = outputPath.getFileSystem(hadoopConf)
         val qualifiedOutputPath = outputPath.makeQualified(fs.getUri, fs.getWorkingDirectory)
    +    val partitionsTrackedByCatalog = catalogTable.isDefined &&
    +      catalogTable.get.partitionColumnNames.nonEmpty &&
    +      catalogTable.get.tracksPartitionsInCatalog
--- End diff --

Hm, in other parts of the code we assume that the feature is completely disabled when the flag is off. This is probably needed since there is no way to revert a table otherwise.
[GitHub] spark issue #16404: [SPARK-18969][SQL] Support grouping by nondeterministic ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16404

Oracle allows it. It sounds like it treats `(username || dbms_random.string('a', 10))` in the aggregate and in the group-by as the same expression.

```SQL
SQL> select (username || dbms_random.string('a', 10)) from all_users group by (username || dbms_random.string('a', 10));

(USERNAME||DBMS_RANDOM.STRING('A',10))
APEX_04cklbMYhekl
FLOWS_FILESVmTbIIeiUs
CTXSYSPmgqeRFPry
SYSTEMxQLrzXxHth
XDBRRTfatsLlU
SYSoLDWRKMvlZ
XS$NULLXAaOykZCDH
APEX_PUBLIC_USERvcLswvpbcw
ANONYMOUSgupWiktQKh
OUTLNjLdKOTZoFI
MDSYSxEOhwTwQqa

(USERNAME||DBMS_RANDOM.STRING('A',10))
HRkovpxQztYU

12 rows selected.
```

If I change the order, I get an error:

```SQL
SQL> select (dbms_random.string('a', 10) || username) from all_users group by (username || dbms_random.string('a', 10))
  2  ;
select (dbms_random.string('a', 10) || username) from all_users group by (username || dbms_random.string('a', 10))
        *
ERROR at line 1:
ORA-00979: not a GROUP BY expression
```
[GitHub] spark issue #14284: [SPARK-16633] [SPARK-16642] [SPARK-16721] [SQL] Fixes th...
Github user chengat1314 commented on the issue: https://github.com/apache/spark/pull/14284

Is it possible to add a feature to support ignoring nulls? For example:

LAG (value_expr [, offset ]) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering )

Thanks,
Cheng Feng
[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94535816

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
    @@ -152,4 +190,29 @@ case class InsertIntoHadoopFsRelationCommand(
           }
         }
       }
    +
    +  /**
    +   * Given a set of input partitions, returns those that have locations that differ from the
    +   * Hive default (e.g. /k1=v1/k2=v2). These partitions were manually assigned locations by
    +   * the user.
    +   *
    +   * @return a mapping from partition specs to their custom locations
    +   */
    +  private def getCustomPartitionLocations(
    +      fs: FileSystem,
    +      qualifiedOutputPath: Path,
    +      partitions: Seq[CatalogTablePartition]): Map[TablePartitionSpec, String] = {
    +    val table = catalogTable.get
--- End diff --

yea good idea
[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94535760

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala ---
    @@ -473,22 +473,26 @@ case class DataSource(
                 s"Unable to resolve $name given [${plan.output.map(_.name).mkString(", ")}]")
             }.asInstanceOf[Attribute]
           }
    +      val fileIndex = catalogTable.map(_.identifier).map { tableIdent =>
    +        sparkSession.table(tableIdent).queryExecution.analyzed.collect {
    +          case LogicalRelation(t: HadoopFsRelation, _, _) => t.location
    +        }.head
    +      }
           // For partitioned relation r, r.schema's column ordering can be different from the column
           // ordering of data.logicalPlan (partition columns are all moved after data column). This
           // will be adjusted within InsertIntoHadoopFsRelation.
           val plan =
             InsertIntoHadoopFsRelationCommand(
               outputPath = outputPath,
               staticPartitions = Map.empty,
    -          customPartitionLocations = Map.empty,
               partitionColumns = columns,
               bucketSpec = bucketSpec,
               fileFormat = format,
    -          refreshFunction = _ => Unit, // No existing table needs to be refreshed.
--- End diff --

Previously, we did not refresh anything here, but we would repair the partitions in [`CreateDataSourceTableAsSelectCommand`](https://github.com/apache/spark/pull/16460/files#diff-945e51801b84b92da242fcb42f83f5f5L171). After this PR, we only repair the partitions in `CreateDataSourceTableAsSelectCommand` when we are creating a new table.
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15178 **[Test build #70861 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70861/testReport)** for PR 15178 at commit [`1b499d1`](https://github.com/apache/spark/commit/1b499d1f7b5689fd544d7adc4aac709ff74fe684).
[GitHub] spark pull request #16469: [SPARK-19072][SQL] codegen of Literal should not ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16469
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15178 retest this please.
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15178 `org.apache.spark.rdd.AsyncRDDActionsSuite.async failure handling` passes locally.
[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16469 LGTM. Merging to master.
[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @gatorsmile can you retest the patch so that we can merge? Sorry to ping you multiple times; several users are asking about this.
[GitHub] spark issue #16308: [SPARK-18936][SQL] Infrastructure for session local time...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16308 @hvanhovell anything else to do here other than bringing it up to date?
[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16451 Merged build finished. Test PASSed.
[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16451 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70858/ Test PASSed.
[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16451

**[Test build #70858 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70858/testReport)** for PR 16451 at commit [`d50d10c`](https://github.com/apache/spark/commit/d50d10cf1456137f69ca13a686c3fa67a46bc707).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16469 Some more explanation: when `Literal` codegen produces boxed values, the double equality check breaks, because the code is `(Double.isNaN(d1) && Double.isNaN(d2)) || d1 == d2`
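To illustrate the point about boxed values: in Java, `==` on two `java.lang.Double` references compares object identity rather than numeric value, so an expression of the quoted shape can report numerically equal operands as different. The sketch below is a standalone illustration that assumes both operands end up boxed. It is not Spark's generated code, and the class and method names are hypothetical.

```java
public class BoxedDoubleEquality {
    // Same shape as the quoted expression, with primitive operands:
    // here == compares numeric values.
    static boolean primitiveEquals(double d1, double d2) {
        return (Double.isNaN(d1) && Double.isNaN(d2)) || d1 == d2;
    }

    // Identical source text, but with boxed operands: == now compares
    // object references, so equal values held in distinct Double objects
    // are reported as different. Only the NaN branch still works, because
    // Double.isNaN unboxes its argument.
    static boolean boxedEquals(Double d1, Double d2) {
        return (Double.isNaN(d1) && Double.isNaN(d2)) || d1 == d2;
    }

    // Unboxing explicitly restores the numeric comparison.
    static boolean unboxedEquals(Double d1, Double d2) {
        return primitiveEquals(d1.doubleValue(), d2.doubleValue());
    }

    public static void main(String[] args) {
        Double a = Double.valueOf(1.5);
        Double b = Double.valueOf(1.5);
        System.out.println("primitive: " + primitiveEquals(1.5, 1.5));
        // Usually false in practice, since a and b are normally distinct objects.
        System.out.println("boxed:     " + boxedEquals(a, b));
        System.out.println("unboxed:   " + unboxedEquals(a, b));
    }
}
```

Keeping the operands primitive, or unboxing them before the comparison, avoids the reference comparison altogether.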
[GitHub] spark issue #16404: [SPARK-18969][SQL] Support grouping by nondeterministic ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16404

DB2 has such a limit. See the error message `SQL -583`: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.messages.sql.doc/doc/msql00583n.html

> The routine (function or method) or expression is defined as non-deterministic or as having external action. This is not supported in the context in which it is used. The contexts in which these are not valid are:
> in an expression of a GROUP BY clause

It documents the same workaround:

> Remove the non-deterministic or external action routine or expression from the GROUP BY clause. If grouping is desired on a column of the result that is based on a non-deterministic or external action routine or expression use a nested table expression or a common table expression to first provide a result table with the expression as a column of the result.
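The workaround quoted above (first produce a result table that holds the nondeterministic expression as a column, then group on that column) can be sketched outside SQL. The Java below is only an illustration of the idea: `randomKey` is a hypothetical stand-in for an expression such as `concat(rand(), c1)`, and none of the names come from DB2 or Spark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class MaterializeThenGroup {
    // Counts how often the nondeterministic expression is evaluated.
    static int evaluations = 0;

    // Hypothetical stand-in for concat(rand(), c1): a new value on every call.
    static String randomKey(Random rng, int c1) {
        evaluations++;
        return rng.nextDouble() + "" + c1;
    }

    // Step 1 (the nested table expression): evaluate the expression exactly
    // once per input row and keep the result as an ordinary column.
    static List<String> materialize(List<Integer> c1Values, Random rng) {
        List<String> keys = new ArrayList<>();
        for (int c1 : c1Values) {
            keys.add(randomKey(rng, c1));
        }
        return keys;
    }

    // Step 2 (the outer GROUP BY): group on the stored column only, so the
    // expression is never re-evaluated while grouping.
    static Map<String, Integer> countByKey(List<String> keys) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Integer> c1Values = Arrays.asList(1, 1, 3);
        Map<String, Integer> groups = countByKey(materialize(c1Values, new Random(42)));
        groups.forEach((key, n) -> System.out.println(key + " -> " + n));
        System.out.println("evaluations: " + evaluations);
    }
}
```

Because the grouping step only sees stored values, the result no longer depends on how many times the engine chooses to evaluate the expression.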
[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94533881

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala ---
    @@ -473,22 +473,26 @@ case class DataSource(
                 s"Unable to resolve $name given [${plan.output.map(_.name).mkString(", ")}]")
             }.asInstanceOf[Attribute]
           }
    +      val fileIndex = catalogTable.map(_.identifier).map { tableIdent =>
    +        sparkSession.table(tableIdent).queryExecution.analyzed.collect {
    +          case LogicalRelation(t: HadoopFsRelation, _, _) => t.location
    +        }.head
    +      }
           // For partitioned relation r, r.schema's column ordering can be different from the column
           // ordering of data.logicalPlan (partition columns are all moved after data column). This
           // will be adjusted within InsertIntoHadoopFsRelation.
           val plan =
             InsertIntoHadoopFsRelationCommand(
               outputPath = outputPath,
               staticPartitions = Map.empty,
    -          customPartitionLocations = Map.empty,
               partitionColumns = columns,
               bucketSpec = bucketSpec,
               fileFormat = format,
    -          refreshFunction = _ => Unit, // No existing table needs to be refreshed.
--- End diff --

Previously, in this case, we did not call `refreshPartitionsCallback`. After this PR, we always refresh it. Is my understanding right? How did it work without this PR's changes? Does that mean we just rely on Hive to implicitly call `AlterTableAddPartitionCommand`/`createPartition` when the table does not already exist?
[GitHub] spark issue #16439: [SPARK-19026]SPARK_LOCAL_DIRS(multiple directories on di...
Github user zuotingbing commented on the issue: https://github.com/apache/spark/pull/16439 @srowen I do not think this should be a fatal error, since some of the SPARK_LOCAL_DIRS can be written successfully. Even if only one of the SPARK_LOCAL_DIRS can be written, the application is able to run successfully.
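The behavior argued for here (skip directories that cannot be written and fail only when none is usable) can be sketched as follows. This is a minimal illustration under that assumption, not Spark's implementation; `usableDirs` and `requireAtLeastOne` are hypothetical helper names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocalDirsSketch {
    // Returns the candidates that exist (or can be created) and are writable.
    // Directories that fail are logged and skipped rather than treated as fatal.
    static List<Path> usableDirs(List<Path> candidates) {
        List<Path> usable = new ArrayList<>();
        for (Path dir : candidates) {
            try {
                Files.createDirectories(dir);
                if (Files.isWritable(dir)) {
                    usable.add(dir);
                }
            } catch (IOException | SecurityException e) {
                System.err.println("Skipping unusable local dir " + dir + ": " + e);
            }
        }
        return usable;
    }

    // Fails only when no directory at all is usable.
    static List<Path> requireAtLeastOne(List<Path> candidates) {
        List<Path> usable = usableDirs(candidates);
        if (usable.isEmpty()) {
            throw new IllegalStateException("No usable local directory among " + candidates);
        }
        return usable;
    }

    public static void main(String[] args) throws IOException {
        Path good = Files.createTempDirectory("local-dir-ok");
        Path blocker = Files.createTempFile("local-dir-blocker", ".tmp");
        // A path nested under a regular file cannot be created as a directory.
        Path bad = blocker.resolve("sub");
        System.out.println(requireAtLeastOne(Arrays.asList(good, bad)));
    }
}
```

Whether skipping silently is acceptable, or whether each failure should at least be logged as a warning, is the kind of policy question the thread is debating; the sketch logs to standard error.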
[GitHub] spark issue #16461: [SPARK-19060][SQL] remove the supportsPartial flag in Ag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16461 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70856/ Test PASSed.
[GitHub] spark issue #16461: [SPARK-19060][SQL] remove the supportsPartial flag in Ag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16461 Merged build finished. Test PASSed.
[GitHub] spark issue #16461: [SPARK-19060][SQL] remove the supportsPartial flag in Ag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16461

**[Test build #70856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70856/testReport)** for PR 16461 at commit [`e213cbb`](https://github.com/apache/spark/commit/e213cbb87618e51e9dfa171eacbfeab4a5874552).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16469 Merged build finished. Test PASSed.
[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16469 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70855/ Test PASSed.
[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16469

**[Test build #70855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70855/testReport)** for PR 16469 at commit [`b382117`](https://github.com/apache/spark/commit/b382117566006034007040b0925504a2c1a70ea0).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/16469 Thanks for the quick fix!
[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94532732

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
    @@ -152,4 +190,29 @@ case class InsertIntoHadoopFsRelationCommand(
           }
         }
       }
    +
    +  /**
    +   * Given a set of input partitions, returns those that have locations that differ from the
    +   * Hive default (e.g. /k1=v1/k2=v2). These partitions were manually assigned locations by
    +   * the user.
    +   *
    +   * @return a mapping from partition specs to their custom locations
    +   */
    +  private def getCustomPartitionLocations(
    +      fs: FileSystem,
    +      qualifiedOutputPath: Path,
    +      partitions: Seq[CatalogTablePartition]): Map[TablePartitionSpec, String] = {
    +    val table = catalogTable.get
--- End diff --

Shall we pass `catalogTable` as a function parameter? `.get` looks a little bit risky.
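The concern about `.get` can be illustrated with Java's `Optional`, which behaves like Scala's `Option` in this respect: unwrapping an empty value throws at run time. The sketch below contrasts a method that unwraps a field internally with one that takes the unwrapped value as a parameter. All names are hypothetical and the strings stand in for the real catalog types.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

public class OptionalParamSketch {
    // Hypothetical stand-in for the optional catalog table held by the command.
    static Optional<String> catalogTable = Optional.empty();

    // Risky shape: unwraps the field internally and throws
    // NoSuchElementException if the caller did not check it first.
    static String locationsRisky() {
        String table = catalogTable.get();
        return "custom locations for " + table;
    }

    // Safer shape: the table is an explicit parameter, so the method can only
    // be called by code that already holds a value.
    static String locationsFor(String table) {
        return "custom locations for " + table;
    }

    // The caller decides what an absent table means.
    static Optional<String> locationsIfTracked(Optional<String> maybeTable) {
        return maybeTable.map(OptionalParamSketch::locationsFor);
    }

    public static void main(String[] args) {
        System.out.println(locationsIfTracked(Optional.of("t1")));
        System.out.println(locationsIfTracked(Optional.empty()));
        try {
            locationsRisky();
        } catch (NoSuchElementException e) {
            System.out.println("risky variant failed: " + e.getMessage());
        }
    }
}
```

With the parameter version, the "is the table defined?" check lives in one place at the call site instead of being an unstated precondition of the helper.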
[GitHub] spark issue #14284: [SPARK-16633] [SPARK-16642] [SPARK-16721] [SQL] Fixes th...
Github user chengat1314 commented on the issue: https://github.com/apache/spark/pull/14284 Can the ignore-nulls feature be enabled in Spark 2.1, e.g. `lag(comm ignore nulls) over (order by empno) prev_comm`? Thanks.
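To make the request above concrete, here is a minimal plain-Python sketch of what `lag(... ignore nulls)` is generally understood to return in databases that support the option, next to the default behaviour. It does not use any Spark API; the function names are hypothetical, and the input list is assumed to be already sorted by the `order by` column.

```python
from typing import List, Optional


def lag_respect_nulls(values: List[Optional[float]]) -> List[Optional[float]]:
    """Default lag(x, 1): the value of the immediately preceding row, nulls included."""
    return [None] + values[:-1] if values else []


def lag_ignore_nulls(values: List[Optional[float]]) -> List[Optional[float]]:
    """lag(x ignore nulls): the most recent non-null value strictly before each row."""
    result: List[Optional[float]] = []
    last_non_null: Optional[float] = None
    for value in values:
        # Emit what was seen before this row, then update the running value.
        result.append(last_non_null)
        if value is not None:
            last_non_null = value
    return result


# Hypothetical `comm` column, already ordered by `empno`.
comm = [300.0, None, 500.0, None, None, 0.0]
prev_comm_default = lag_respect_nulls(comm)
prev_comm_ignore = lag_ignore_nulls(comm)
```

With the default behaviour a null in the previous row is passed through; with ignore nulls the previous non-null value is carried forward across the gap.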
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15178 Merged build finished. Test FAILed.
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15178 **[Test build #70857 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70857/testReport)** for PR 15178 at commit [`1b499d1`](https://github.com/apache/spark/commit/1b499d1f7b5689fd544d7adc4aac709ff74fe684). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15178 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70857/ Test FAILed.
[GitHub] spark issue #16429: [SPARK-19019][PYTHON] Fix hijacked `collections.namedtup...
Github user azmras commented on the issue: https://github.com/apache/spark/pull/16429 Just checked other things (ml, sql, etc.); everything looks fine. I can safely say goodbye to Python 3.5 now. Thank you.
[GitHub] spark issue #16455: [MINOR][DOCS] Remove consecutive duplicated words/typo i...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16455 whoa. LGTM.
[GitHub] spark issue #16429: [SPARK-19019][PYTHON] Fix hijacked `collections.namedtup...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16429 @azmras Thank you for confirming this.
[GitHub] spark issue #16429: [SPARK-19019][PYTHON] Fix hijacked `collections.namedtup...
Github user azmras commented on the issue: https://github.com/apache/spark/pull/16429 Python 3.6.0 (default, Dec 24 2016, 08:01:42) [GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin Type "help", "copyright", "credits" or "license" for more information. NoSuchObjectException Welcome to Spark version 2.1.0 Using Python version 3.6.0 (default, Dec 24 2016 08:01:42) SparkSession available as 'spark'. >>> sc.parallelize(range(1000), 20).take(5) [0, 1, 2, 3, 4] Thanks a lot, it is working now. I had to patch the zipped lib too.
[GitHub] spark pull request #16417: [SPARK-19014][SQL] support complex aggregate buff...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16417#discussion_r94529892 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java --- @@ -201,6 +210,25 @@ public void setNullAt(int i) { Platform.putLong(baseObject, getFieldOffset(i), 0); } + public void setNullData(int ordinal) { --- End diff -- Ok. Good for me.
[GitHub] spark pull request #16417: [SPARK-19014][SQL] support complex aggregate buff...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16417#discussion_r94529822 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java --- @@ -201,6 +210,25 @@ public void setNullAt(int i) { Platform.putLong(baseObject, getFieldOffset(i), 0); } + public void setNullData(int ordinal) { --- End diff -- how about `setNullForFixedLenthNonPrimitive`?
[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16240 you need to fix mima: ``` [error] * method newDoubleSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newDoubleSeqEncoder") [error] * method newFloatSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newFloatSeqEncoder") [error] * method newByteSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newByteSeqEncoder") [error] * method newLongSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newLongSeqEncoder") [error] * method newStringSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newStringSeqEncoder") [error] * method newIntSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newIntSeqEncoder") [error] * method newBooleanSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newBooleanSeqEncoder") [error] * method newShortSeqEncoder()org.apache.spark.sql.Encoder in class org.apache.spark.sql.SQLImplicits does not have a correspondent in current version [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.newShortSeqEncoder") ```
[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16240 Merged build finished. Test FAILed.
[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16240 **[Test build #70859 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70859/testReport)** for PR 16240 at commit [`efd0801`](https://github.com/apache/spark/commit/efd0801e24088b90c1157de0cb0bfe8159aeaac5). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class SeqCC(s: Seq[Int])` * `case class ListCC(l: List[Int])` * `case class QueueCC(q: Queue[Int])` * `case class ComplexCC(seq: SeqCC, list: ListCC, queue: QueueCC)`
[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16240 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70859/ Test FAILed.
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15880 **[Test build #70860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70860/testReport)** for PR 15880 at commit [`821cca6`](https://github.com/apache/spark/commit/821cca6cd836f11ea917c89938f288f126d633ab).
[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16240 **[Test build #70859 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70859/testReport)** for PR 16240 at commit [`efd0801`](https://github.com/apache/spark/commit/efd0801e24088b90c1157de0cb0bfe8159aeaac5).
[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16240 LGTM, please create 2 more tickets for the optimization you mentioned in https://github.com/apache/spark/pull/16240#issuecomment-266318016 and the nested custom collection.
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15880 retest this please
[GitHub] spark pull request #16240: [SPARK-16792][SQL] Dataset containing a Case Clas...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16240#discussion_r94528665 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetPrimitiveSuite.scala --- @@ -17,10 +17,21 @@ package org.apache.spark.sql +import scala.collection.immutable.Queue +import scala.collection.mutable.ArrayBuffer + import org.apache.spark.sql.test.SharedSQLContext case class IntClass(value: Int) +case class SeqCC(s: Seq[Int]) --- End diff -- What is `CC` short for? How about `SeqClass`?
[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16240 retest this please
[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94528412 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala --- @@ -74,12 +69,29 @@ case class InsertIntoHadoopFsRelationCommand( val fs = outputPath.getFileSystem(hadoopConf) val qualifiedOutputPath = outputPath.makeQualified(fs.getUri, fs.getWorkingDirectory) +val partitionsTrackedByCatalog = catalogTable.isDefined && + catalogTable.get.partitionColumnNames.nonEmpty && + catalogTable.get.tracksPartitionsInCatalog --- End diff -- This is something I want to check with @ericl. What if users create a table with partition management, then turn it off, and read this table? If we treat this table as a normal table, then the data in custom partition paths will be ignored. I think we should respect the partition management flag as it was when the table was created, not as it is when the table is read.
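The `partitionsTrackedByCatalog` condition in the diff above combines three checks: a catalog table is present, it has at least one partition column, and it tracks partitions in the catalog. The following is a plain-Python analogue of that condition, not Spark code; the class and its field names are hypothetical stand-ins for Spark's `CatalogTable`, and `None` plays the role of an empty `Option`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CatalogTable:
    """Hypothetical stand-in carrying only the two fields the condition reads."""
    partition_column_names: List[str]
    tracks_partitions_in_catalog: bool


def partitions_tracked_by_catalog(table: Optional[CatalogTable]) -> bool:
    """True only if a table is given, is partitioned, and tracks partitions in the catalog."""
    return (
        table is not None
        and len(table.partition_column_names) > 0
        and table.tracks_partitions_in_catalog
    )


# The table is passed in explicitly, so the absent case is handled in one place.
no_table = partitions_tracked_by_catalog(None)
managed = partitions_tracked_by_catalog(CatalogTable(["k1", "k2"], True))
```

Because the flag is read from the table at call time, this sketch reflects the "when the table is read" behaviour that the comment questions; honouring the flag as of table creation would need that value to be stored separately.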
[GitHub] spark pull request #16404: [SPARK-18969][SQL] Support grouping by nondetermi...
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/16404
[GitHub] spark pull request #16404: [SPARK-18969][SQL] Support grouping by nondetermi...
GitHub user cloud-fan reopened a pull request: https://github.com/apache/spark/pull/16404 [SPARK-18969][SQL] Support grouping by nondeterministic expressions ## What changes were proposed in this pull request? Currently nondeterministic expressions are allowed in `Aggregate` (see the [comment](https://github.com/apache/spark/blob/v2.0.2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L249-L251)), but the `PullOutNondeterministic` analyzer rule fails to handle `Aggregate`. This PR fixes it. close https://github.com/apache/spark/pull/16379 ## How was this patch tested? a new test suite You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloud-fan/spark groupby Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16404.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16404 commit f1451883df9077ecbf31f3a86d2427b60262f863 Author: Wenchen Fan Date: 2016-12-26T10:24:07Z Support grouping by nondeterministic expressions
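As a rough illustration of why a nondeterministic grouping expression needs special handling, the sketch below shows the general "pull out" idea in plain Python: evaluate the expression exactly once per row in a projection step, then group on the materialized value. This is an assumption-laden sketch of the concept, not Spark's `PullOutNondeterministic` implementation; the function and parameter names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Any, Callable, Dict, List


def group_by_pulled_out(
    rows: List[Any],
    nondeterministic_expr: Callable[[Any, random.Random], Any],
    seed: int = 0,
) -> Dict[Any, List[Any]]:
    """Group rows by a nondeterministic expression that is evaluated once per row."""
    rng = random.Random(seed)
    # Step 1 (projection): evaluate the expression once and keep the result
    # next to the row, so later steps never re-evaluate it.
    projected = [(row, nondeterministic_expr(row, rng)) for row in rows]
    # Step 2 (aggregate): group on the materialized column only.
    groups: Dict[Any, List[Any]] = defaultdict(list)
    for row, key in projected:
        groups[key].append(row)
    return dict(groups)


# Hypothetical usage: bucket ten rows by a random 0/1 key.
buckets = group_by_pulled_out(list(range(10)), lambda row, rng: rng.randint(0, 1))
```

If the expression were instead re-evaluated wherever it appears (once for grouping, once for output), the grouping key and the displayed value for the same row could disagree.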
[GitHub] spark issue #16404: [SPARK-18969][SQL] Support grouping by nondeterministic ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16404 How do other databases handle this case? Do they forbid using non-deterministic expressions in GROUP BY, or give a better error message?
[GitHub] spark issue #16422: [SPARK-17642] [SQL] support DESC EXTENDED/FORMATTED tabl...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/16422 @gatorsmile Why is statistics info sensitive? Users can run SQL queries to get each of them (max, min, ndv, etc.) anyway.
[GitHub] spark issue #16466: [SPARK-19070] Clean-up dataset actions
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16466 LGTM, if you can pass the test :)
[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16467 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70849/ Test PASSed.
[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16467 Merged build finished. Test PASSed.
[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16451 **[Test build #70858 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70858/testReport)** for PR 16451 at commit [`d50d10c`](https://github.com/apache/spark/commit/d50d10cf1456137f69ca13a686c3fa67a46bc707).
[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16467 **[Test build #70849 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70849/testReport)** for PR 16467 at commit [`de655d0`](https://github.com/apache/spark/commit/de655d0d00693a2bc98fddad7be6f55fb2690555). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16451 Build started: [TESTS] `org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite` [![PR-16451](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=E8488472-738C-4ADF-A924-8F858728D120&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/E8488472-738C-4ADF-A924-8F858728D120)
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15178 **[Test build #70857 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70857/testReport)** for PR 15178 at commit [`1b499d1`](https://github.com/apache/spark/commit/1b499d1f7b5689fd544d7adc4aac709ff74fe684).
[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16453 Merged build finished. Test PASSed.
[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16453 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70854/ Test PASSed.
[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16453 **[Test build #70854 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70854/testReport)** for PR 16453 at commit [`1b3b5a0`](https://github.com/apache/spark/commit/1b3b5a03236c0c42d8e20f24db339c4e7cdbfcf1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15178 retest this please.
[GitHub] spark issue #16461: [SPARK-19060][SQL] remove the supportsPartial flag in Ag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16461 **[Test build #70856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70856/testReport)** for PR 16461 at commit [`e213cbb`](https://github.com/apache/spark/commit/e213cbb87618e51e9dfa171eacbfeab4a5874552).
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15178 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70851/ Test FAILed.
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15178 Merged build finished. Test FAILed.
[GitHub] spark issue #15178: [SPARK-17556][SQL] Executor side broadcast for broadcast...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15178 **[Test build #70851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70851/testReport)** for PR 15178 at commit [`1b499d1`](https://github.com/apache/spark/commit/1b499d1f7b5689fd544d7adc4aac709ff74fe684). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16469 **[Test build #70855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70855/testReport)** for PR 16469 at commit [`b382117`](https://github.com/apache/spark/commit/b382117566006034007040b0925504a2c1a70ea0).
[GitHub] spark issue #16402: [SPARK-18999][SQL][minor] simplify Literal codegen
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16402 Sorry, my bad. I should have looked at the test result before retesting it. I've sent a PR to fix it.
[GitHub] spark issue #16469: [SPARK-19072][SQL] codegen of Literal should not output ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16469 cc @kayousterhout @gatorsmile
[GitHub] spark issue #16465: [SPARK-19064][PySpark]Fix pip installing of sub componen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16465 Merged build finished. Test PASSed.
[GitHub] spark pull request #16469: [SPARK-19072][SQL] codegen of Literal should not ...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/16469 [SPARK-19072][SQL] codegen of Literal should not output boxed value ## What changes were proposed in this pull request? In https://github.com/apache/spark/pull/16402 we made a mistake: when a double/float value is infinity, the `Literal` codegen outputs a boxed value, which causes wrong results. This PR fixes it by handling infinity specially so that no boxed value is output. ## How was this patch tested? New regression test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloud-fan/spark literal Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16469.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16469 commit b382117566006034007040b0925504a2c1a70ea0 Author: Wenchen Fan Date: 2017-01-04T03:37:25Z codegen of Literal should not output boxed value
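For illustration, the idea of emitting a primitive expression rather than a boxed value for infinite literals might be sketched as below. This is only a sketch under assumptions, not the actual change in the PR: `LiteralCodegenSketch` and `doubleLiteral` are hypothetical names, and the function merely returns the Java source text that generated code could contain for a given `Double`.

```scala
// Hypothetical sketch, not Spark's Literal.doGenCode.
object LiteralCodegenSketch {
  // Returns Java source text that evaluates to a primitive double.
  // NaN and the infinities have no numeric literal syntax in Java, so they
  // are written as the primitive constants on java.lang.Double instead of
  // being passed around as boxed objects.
  def doubleLiteral(v: Double): String =
    if (v.isNaN) "Double.NaN"
    else if (v == Double.PositiveInfinity) "Double.POSITIVE_INFINITY"
    else if (v == Double.NegativeInfinity) "Double.NEGATIVE_INFINITY"
    else s"${v}D" // ordinary finite value, e.g. 1.5 becomes 1.5D
}
```

Calling `LiteralCodegenSketch.doubleLiteral(Double.PositiveInfinity)` yields the text `Double.POSITIVE_INFINITY`, which is a primitive `double` constant in Java, so a comparison in generated code would compare values rather than object references.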
[GitHub] spark issue #16465: [SPARK-19064][PySpark]Fix pip installing of sub componen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16465 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70848/ Test PASSed.
[GitHub] spark issue #16465: [SPARK-19064][PySpark]Fix pip installing of sub componen...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16465 **[Test build #70848 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70848/testReport)** for PR 16465 at commit [`b28d9ca`](https://github.com/apache/spark/commit/b28d9ca5e553e453b34d6199549d845ff5b6e1e2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16462: [SPARK-19062] Utils.writeByteBuffer bug fix
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/16462 LGTM, thanks for fixing this @kayousterhout !
[GitHub] spark issue #16445: [SPARK-19043][SQL]Make SparkSQLSessionManager more confi...
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/16445 ping @srowen would you plz take a look at this pr?
[GitHub] spark issue #16457: [SPARK-19057][ML] Instances' weight must be non-negative
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16457 Agreed. Five algs now inherit `HasWeightCol`: GLR/LoR/LiR/NB/IsotonicReg. I found that some of them use `RDD[Instance]` in `train`: GLR/LoR/LiR
```scala
val instances: RDD[Instance] =
  dataset.select(col($(labelCol)), w, col($(featuresCol))).rdd.map {
    case Row(label: Double, weight: Double, features: Vector) =>
      Instance(label, weight, features)
  }
```
NB can also be modified to start with `RDD[Instance]`. We can create a new API `extractInstance` in `Predictor` and validate the weight in it, the same way we check `label` in `extractLabeledPoints(dataset: Dataset[_], numClasses: Int)` in `Classifier`. For IsotonicReg, we can add a validation in `extractWeightedLabeledPoints`. What about this plan? @srowen @sethah @jkbradley
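As a rough illustration of the proposed weight check, here is a minimal sketch that does not depend on Spark. Everything in it is an assumption made for the example: `Instance` is a simplified stand-in with an `Array[Double]` for the features, and this `extractInstance` is a hypothetical helper, not the API the comment proposes to add to `Predictor`.

```scala
// Simplified stand-in for an ML instance; features are a plain array here.
final case class Instance(label: Double, weight: Double, features: Array[Double])

object WeightValidation {
  // Hypothetical helper: builds an Instance and rejects invalid weights.
  // `weight >= 0.0` is false for negative values and also for NaN, so both
  // make `require` throw an IllegalArgumentException.
  def extractInstance(label: Double, weight: Double, features: Array[Double]): Instance = {
    require(weight >= 0.0, s"Instance weight must be non-negative, but got $weight")
    Instance(label, weight, features)
  }
}
```

With this shape, `WeightValidation.extractInstance(1.0, 0.5, Array(1.0, 2.0))` returns an `Instance`, while a weight of `-1.0` fails fast with an `IllegalArgumentException` instead of silently reaching training.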
[GitHub] spark issue #16456: [SPARK-18994] clean up the local directories for applica...
Github user liujianhuiouc commented on the issue: https://github.com/apache/spark/pull/16456 In practice, there is only one executor directory per app. Do you mean deleting the child directories in parallel? In my opinion, it's unnecessary to delete them in parallel; the deletion could be done in a future. To keep other messages from blocking the heartbeat, would it be right to send the heartbeat in another thread?
[GitHub] spark issue #16457: [SPARK-19057][ML] Instances' weight must be non-negative
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16457 @srowen OK. This is the list of algs that deal with weights:
[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16453 **[Test build #70854 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70854/testReport)** for PR 16453 at commit [`1b3b5a0`](https://github.com/apache/spark/commit/1b3b5a03236c0c42d8e20f24db339c4e7cdbfcf1).
[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12775 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70844/ Test PASSed.
[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12775 Merged build finished. Test PASSed.
[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12775 **[Test build #70844 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70844/testReport)** for PR 12775 at commit [`9778cef`](https://github.com/apache/spark/commit/9778cefce3e152d559e53cd4e2f5a113e561f0ff). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16453: [SPARK-19054][ML] Eliminate extra pass in NB
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16453 Updated. Thanks for reviewing.
[GitHub] spark issue #16402: [SPARK-18999][SQL][minor] simplify Literal codegen
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/16402 This commit introduced a bug where IN doesn't work right for Infinity / -Infinity (JIRA [here](https://issues.apache.org/jira/browse/SPARK-19072)). I'm not sure how to fix the underlying bug (or if this PR should just be reverted) -- @cloud-fan @gatorsmile can one of you fix this? The relevant test also failed for this PR the first time tests were run -- remember to make sure that test failures aren't related to the PR!
[GitHub] spark issue #16466: [SPARK-19070] Clean-up dataset actions
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16466 Merged build finished. Test FAILed.
[GitHub] spark issue #16466: [SPARK-19070] Clean-up dataset actions
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16466 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70847/ Test FAILed.
[GitHub] spark issue #16466: [SPARK-19070] Clean-up dataset actions
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16466 **[Test build #70847 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70847/testReport)** for PR 16466 at commit [`dca1b56`](https://github.com/apache/spark/commit/dca1b56810cd3c3469f70cc653a985b78519f6c6). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16468: [SPARK-19074][SS][DOCS] Updated Structured Streaming Pro...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16468 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70853/ Test PASSed.
[GitHub] spark issue #16468: [SPARK-19074][SS][DOCS] Updated Structured Streaming Pro...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16468 Merged build finished. Test PASSed.
[GitHub] spark issue #16468: [SPARK-19074][SS][DOCS] Updated Structured Streaming Pro...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16468 **[Test build #70853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70853/testReport)** for PR 16468 at commit [`fbacbf4`](https://github.com/apache/spark/commit/fbacbf4f26afc5bd67a014b2134a5c97cb33cfda). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16451 Merged build finished. Test PASSed.
[GitHub] spark issue #16451: [WIP][SPARK-18922][SQL][CORE][STREAMING][TESTS] Fix all ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16451 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70843/ Test PASSed.