[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16753 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16753 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72199/ Test PASSed.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16753 **[Test build #72199 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72199/testReport)** for PR 16753 at commit [`6b2841a`](https://github.com/apache/spark/commit/6b2841a183825aea1d37287b8530bcb37cdee2c5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener callback...
Github user salilsurendran commented on the issue: https://github.com/apache/spark/pull/16664 @yhuai @marmbrus @liancheng Can someone review my PR, please? Thanks.
[GitHub] spark pull request #16757: [SPARK-18609][SQL] Fix redundant Alias removal in...
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/16757 [SPARK-18609][SQL] Fix redundant Alias removal in the optimizer ## What changes were proposed in this pull request? The optimizer tries to remove redundant alias-only projections from the query plan using the `RemoveAliasOnlyProject` rule. The current rule identifies and removes such a project, and rewrites the project's attributes in the **entire** tree. This causes problems when parts of the tree are duplicated (for instance a self join on a temporary view/CTE) and the duplicated part contains the alias-only project; in this case the rewrite will break the tree. [Solution] TODO It was difficult to control the blacklisted attributes and the transformation of the tree, and to keep the rewrite local to a node's parents. I have made a few changes to `TreeNode`, `QueryPlan` and `LogicalPlan` to open up the transformation logic, which gives us the needed, more fine-grained control over tree transformations. ## How was this patch tested? Added a test to `RemoveRedundantAliasAndProjectSuite` and existing tests. I will add some more integration tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hvanhovell/spark SPARK-18609 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16757.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16757 commit 6c89a15ed8eb868b23237bba07498fb2053f4643 Author: Herman van Hovell Date: 2017-01-30T12:11:46Z Open-up TreeNode's transform logic. commit dac7ec99075ce98ebea92e108ad66b05537de396 Author: Herman van Hovell Date: 2017-01-31T16:03:57Z Split RemoveAliasOnlyProject into RemoveRedundantAliases and RemoveRedundantProject.
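The failure mode described above (a tree-wide attribute rewrite corrupting a duplicated subtree, e.g. the second leg of a self join on a view) can be sketched with a toy plan model. This is not Catalyst; all names here are hypothetical, with plain strings standing in for attribute references:

```scala
// Toy model (not Catalyst) of an over-eager alias rewrite.
// `AliasProj` is an alias-only project: SELECT child.output AS output.
object RewriteScopeSketch {
  sealed trait Plan { def output: String }
  final case class Rel(output: String) extends Plan
  final case class AliasProj(output: String, child: Plan) extends Plan
  final case class Filter(attr: String, child: Plan) extends Plan {
    def output: String = child.output
  }

  // The problematic behaviour: rename `from` -> `to` in the *entire* subtree,
  // including duplicated parts that still contain their own alias-only project.
  def rewriteEverywhere(p: Plan, from: String, to: String): Plan = p match {
    case Filter(a, c) =>
      Filter(if (a == from) to else a, rewriteEverywhere(c, from, to))
    case AliasProj(o, c) => AliasProj(o, rewriteEverywhere(c, from, to))
    case other           => other
  }

  // A Filter is only consistent if it references its child's output.
  def resolved(p: Plan): Boolean = p match {
    case Filter(a, c)    => a == c.output && resolved(c)
    case AliasProj(_, c) => resolved(c)
    case _               => true
  }
}
```

Removing the alias-only project from one join leg and rewriting its alias everywhere also rewrites the duplicated leg, which still contains its own `AliasProj`; the leg's `Filter` then references an attribute its child no longer outputs. Keeping the rewrite local to the removed project's parents avoids this.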
[GitHub] spark pull request #16746: [SPARK-15648][SQL] Add teradataDialect for JDBC c...
Github user klinvill commented on a diff in the pull request: https://github.com/apache/spark/pull/16746#discussion_r98710451 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala --- @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.jdbc + +import java.sql.Types --- End diff -- Thanks! Fixed in latest commit.
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16043 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72200/ Test FAILed.
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16043 Merged build finished. Test FAILed.
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16043 **[Test build #72200 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72200/testReport)** for PR 16043 at commit [`1da58aa`](https://github.com/apache/spark/commit/1da58aa799eac582d2ec2d7980fa3c27b6de8180). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #9759: [SPARK-11753][SQL][test-hadoop2.2] Make allowNonNumericNu...
Github user limansky commented on the issue: https://github.com/apache/spark/pull/9759 Hi all. There are security issues in jackson-dataformat-xml prior to 2.7.4 and 2.8.0. Here are the links: FasterXML/jackson-dataformat-xml#199, FasterXML/jackson-dataformat-xml#190. Even though Spark itself doesn't use this module, this dependency forces Spark users to use an affected version in order to keep a consistent set of Jackson libraries.
[GitHub] spark issue #16756: [SPARK-19411][SQL] Remove the metadata used to mark opti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16756 Merged build finished. Test PASSed.
[GitHub] spark pull request #16746: [SPARK-15648][SQL] Add teradataDialect for JDBC c...
Github user klinvill commented on a diff in the pull request: https://github.com/apache/spark/pull/16746#discussion_r98706364 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala --- @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.jdbc + +import java.sql.Types +import org.apache.spark.sql.types._ + + +private case object TeradataDialect extends JdbcDialect { + + override def canHandle(url: String): Boolean = { url.startsWith("jdbc:teradata") } + + override def getJDBCType(dt: DataType): Option[JdbcType] = dt match { +case StringType => Some(JdbcType("VARCHAR(255)", java.sql.Types.VARCHAR)) +case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR)) +case _ => None + } --- End diff -- Hi @dongjoon-hyun, Teradata still doesn't support LIMIT (it uses TOP instead), but the Spark code that was originally using LIMIT has been changed to use "WHERE 1=0" instead.

```
/**
 * Get the SQL query that should be used to find if the given table exists. Dialects can
 * override this method to return a query that works best in a particular database.
 * @param table The name of the table.
 * @return The SQL query to use for checking the table.
 */
def getTableExistsQuery(table: String): String = {
  s"SELECT * FROM $table WHERE 1=0"
}

/**
 * The SQL query that should be used to discover the schema of a table. It only needs to
 * ensure that the result set has the same schema as the table, such as by calling
 * "SELECT * ...". Dialects can override this method to return a query that works best in a
 * particular database.
 * @param table The name of the table.
 * @return The SQL query to use for discovering the schema.
 */
@Since("2.1.0")
def getSchemaQuery(table: String): String = {
  s"SELECT * FROM $table WHERE 1=0"
}
```
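The `WHERE 1=0` probe quoted above is dialect-neutral: it always returns an empty result set (so it needs no `LIMIT`, which Teradata replaces with `TOP`), yet it still fails if the table is missing and still exposes the table's schema in the result-set metadata. A minimal standalone sketch of the two query builders (hypothetical object name, not Spark's actual `JdbcDialect` class):

```scala
// Standalone sketch (hypothetical names, not Spark's JdbcDialect API):
// both probes rely on "WHERE 1=0" returning zero rows on any database.
object ProbeQueries {
  // Errors if the table does not exist; otherwise returns no rows.
  def tableExistsQuery(table: String): String =
    s"SELECT * FROM $table WHERE 1=0"

  // Same trick to fetch only the result-set schema, never any data.
  def schemaQuery(table: String): String =
    s"SELECT * FROM $table WHERE 1=0"
}
```

A dialect that *does* support `LIMIT` could override either query, but the default works everywhere, which is exactly why the original `LIMIT`-based probe was replaced.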
[GitHub] spark issue #16756: [SPARK-19411][SQL] Remove the metadata used to mark opti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16756 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72197/ Test PASSed.
[GitHub] spark issue #16756: [SPARK-19411][SQL] Remove the metadata used to mark opti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16756 **[Test build #72197 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72197/testReport)** for PR 16756 at commit [`ec2bbbf`](https://github.com/apache/spark/commit/ec2bbbf55a99f0fa8fba39569b959e17d24b3243). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98699067 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1212,8 +1223,9 @@ class DAGScheduler( clearCacheLocs() - if (!shuffleStage.isAvailable) { -// Some tasks had failed; let's resubmit this shuffleStage + if (!shuffleStage.isAvailable && noActiveTaskSetManager) { --- End diff -- You need to update this for mapStageJobs -- the `else` branch will now run if the shuffleStage is not available, but there is an active task set manager, which we don't want. Also calling `submitWaitingChildStages(shuffleStage)` is confusing (though it seems to be correct). (or use the other version I suggested)
[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98703486 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -68,6 +68,12 @@ private[scheduler] abstract class Stage( /** Set of jobs that this stage belongs to. */ val jobIds = new HashSet[Int] + /** + * Partitions which there is not yet a task succeeded on. Note that for [[ShuffleMapStage]] + * pendingPartitions.size() == 0 doesn't mean the stage is available. Because the succeeded + * task can be bogus which is out of date and task's epoch is older than corresponding + * executor's failed epoch in [[DAGScheduler]]. + */ --- End diff -- How about: Partitions the DAGScheduler is waiting on before it tries to mark the stage / job as completed and continue. Most commonly, this is the set of tasks that are not successful in the active taskset for this stage, but not always. In particular, when there are multiple attempts for a stage, then this will include late task completions from earlier attempts. Finally, note that when this is empty, it does not *necessarily* mean that the stage is completed -- we may have lost some of the map output from that stage. But the DAGScheduler will check for this condition and resubmit the stage if necessary.
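The distinction the suggested comment draws (an empty `pendingPartitions` set versus the stage actually being available) can be illustrated with a toy model. The field and class names below are illustrative, not the real `Stage`/`ShuffleMapStage` members:

```scala
// Toy model of the invariant discussed above (illustrative names only):
// every partition can have succeeded at least once, yet some map output
// may since have been lost with its executor, so the stage is still
// not "available" and must be resubmitted.
final case class ShuffleStageState(
    numPartitions: Int,
    pendingPartitions: Set[Int],  // partitions with no successful task yet
    availableOutputs: Set[Int]) { // map output currently reachable
  def isAvailable: Boolean = availableOutputs.size == numPartitions
}

object ShuffleStageState {
  def demo: ShuffleStageState =
    // Both tasks finished (nothing pending), but partition 1's output was
    // lost when its executor died: empty pendingPartitions, not available.
    ShuffleStageState(
      numPartitions = 2,
      pendingPartitions = Set.empty,
      availableOutputs = Set(0))
}
```

In this state the scheduler cannot treat `pendingPartitions.isEmpty` as "done"; it has to re-check output availability and resubmit the missing partitions, which is exactly the caveat the proposed comment spells out.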
[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98703683 --- Diff: core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala --- @@ -648,4 +660,70 @@ class BasicSchedulerIntegrationSuite extends SchedulerIntegrationSuite[SingleCor } assertDataStructuresEmpty(noFailure = false) } + + testScheduler("[SPARK-19263] DAGScheduler shouldn't resubmit active taskSet.") { +val a = new MockRDD(sc, 2, Nil) +val b = shuffle(2, a) +val shuffleId = b.shuffleDeps.head.shuffleId + +def runBackend(): Unit = { + val (taskDescription, task) = backend.beginTask() + task.stageId match { +// ShuffleMapTask +case 0 => + val stageAttempt = task.stageAttemptId + val partitionId = task.partitionId + (stageAttempt, partitionId) match { +case (0, 0) => + val fetchFailed = FetchFailed( +DAGSchedulerSuite.makeBlockManagerId("hostA"), shuffleId, 0, 0, "ignored") + backend.taskFailed(taskDescription, fetchFailed) +case (0, 1) => + // Wait until stage resubmission caused by FetchFailed is finished. + waitForCondition(taskScheduler.runningTaskSets.size==2, 5000, --- End diff -- nit: spaces around `==`
[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15009 **[Test build #72201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72201/consoleFull)** for PR 15009 at commit [`6a7ba5b`](https://github.com/apache/spark/commit/6a7ba5bfdd2cb165956992907f681ab3ad85154e).
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16751 Thank you for review and merging, @viirya , @srowen , and @rxin !
[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...
Github user squito commented on the issue: https://github.com/apache/spark/pull/16620 Hi @jinxing64 sorry to go back and forth on this numerous times -- I think I have another alternative, see https://github.com/squito/spark/tree/SPARK-19263_alternate It's mostly your changes, but with one main difference: when we encounter the condition where there are no pending partitions, but there is an active taskset -- we just mark that taskset as inactive and continue as before https://github.com/squito/spark/commit/bec061c8486a681dc16e8b92e553f79e486924e9. I think this makes it easier to follow, as there are fewer states to keep track of. It also can potentially improve performance, since you may submit downstream stages more quickly, rather than waiting for all tasks in the active taskset to complete. I also think it fixes a bug in your version with mapStageJobs (I'll point it out in the code). This passes all tests in `o.a.s.scheduler.*`, including your new test case. (I did come across a race in `ScheduleIntegrationSuite` which I fixed https://github.com/squito/spark/commit/9125e6738269df4e0d7e6292726bad2a294c86c0 -- not directly related to these changes.) Do you see any problems w/ that approach?
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16043 **[Test build #72200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72200/testReport)** for PR 16043 at commit [`1da58aa`](https://github.com/apache/spark/commit/1da58aa799eac582d2ec2d7980fa3c27b6de8180).
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/16043 retest this please
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16753 **[Test build #72199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72199/testReport)** for PR 16753 at commit [`6b2841a`](https://github.com/apache/spark/commit/6b2841a183825aea1d37287b8530bcb37cdee2c5).
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16043 I am just interested in it :). Yes, this one looks unrelated again..
[GitHub] spark issue #16755: [MESOS] Support constraints in spark-dispatcher
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16755 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72198/ Test PASSed.
[GitHub] spark issue #16755: [MESOS] Support constraints in spark-dispatcher
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16755 Merged build finished. Test PASSed.
[GitHub] spark issue #16755: [MESOS] Support constraints in spark-dispatcher
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16755 **[Test build #72198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72198/testReport)** for PR 16755 at commit [`551a593`](https://github.com/apache/spark/commit/551a593949475abcb40414e03d7b01e04c5932f3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16753 Merged build finished. Test FAILed.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16753 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72194/ Test FAILed.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16753 **[Test build #72194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72194/testReport)** for PR 16753 at commit [`cfe258b`](https://github.com/apache/spark/commit/cfe258b283941c8a3a55a111092ce511682fdd1a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16753 Merged build finished. Test FAILed.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16753 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72193/ Test FAILed.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16753 **[Test build #72193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72193/testReport)** for PR 16753 at commit [`d78a7d0`](https://github.com/apache/spark/commit/d78a7d0de980e3af330b95eeb6a9020dfece2ec9). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16755: [MESOS] Support constraints in spark-dispatcher
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16755 **[Test build #72198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72198/testReport)** for PR 16755 at commit [`551a593`](https://github.com/apache/spark/commit/551a593949475abcb40414e03d7b01e04c5932f3).
[GitHub] spark issue #16756: [SPARK-19411][SQL] Remove the metadata used to mark opti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16756 **[Test build #72197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72197/testReport)** for PR 16756 at commit [`ec2bbbf`](https://github.com/apache/spark/commit/ec2bbbf55a99f0fa8fba39569b959e17d24b3243).
[GitHub] spark issue #16603: [SPARK-19244][Core] Sort MemoryConsumers according to th...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16603 @mridulm Ok. Thanks for the review.
[GitHub] spark issue #16756: [SPARK-19411][SQL] Remove the metadata used to mark opti...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16756 cc @rxin @liancheng @cloud-fan
[GitHub] spark pull request #15120: [SPARK-4563][core] Allow driver to advertise a di...
Github user sumitvashistha commented on a diff in the pull request: https://github.com/apache/spark/pull/15120#discussion_r98669126 --- Diff: core/src/main/scala/org/apache/spark/internal/config/ConfigProvider.scala --- @@ -66,7 +66,7 @@ private[spark] class SparkConfigProvider(conf: JMap[String, String]) extends Con findEntry(key) match { case e: ConfigEntryWithDefault[_] => Option(e.defaultValueString) case e: ConfigEntryWithDefaultString[_] => Option(e.defaultValueString) - case e: FallbackConfigEntry[_] => defaultValueString(e.fallback.key) + case e: FallbackConfigEntry[_] => get(e.fallback.key) --- End diff -- We are facing this issue with Spark 1.6. Are we going to backport this?
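For readers outside the Spark codebase, the behavior change in the diff above can be illustrated with a toy model (hypothetical names, not Spark's actual `ConfigProvider` classes): resolving a fallback entry via only its default string ignores a value the user has set for the fallback key, while a full recursive lookup honors it.

```scala
// Toy sketch, not Spark's real config machinery. A "fallback" entry should
// resolve through the full lookup chain (user value first, then defaults),
// which is what the `get(e.fallback.key)` side of the diff does.
object FallbackDemo {
  sealed trait Entry { def key: String }
  case class WithDefault(key: String, default: String) extends Entry
  case class FallbackEntry(key: String, fallback: Entry) extends Entry

  // Hypothetical registry: one key falls back to another.
  private val base = WithDefault("spark.driver.port", "7077")
  private val entries: Map[String, Entry] = Map(
    base.key -> base,
    "spark.driver.advertisedPort" -> FallbackEntry("spark.driver.advertisedPort", base)
  )

  def get(userConf: Map[String, String], key: String): Option[String] =
    userConf.get(key).orElse(entries.get(key).flatMap {
      case WithDefault(_, d) => Some(d)
      // Full recursive lookup: a user-set value for the fallback key wins.
      case FallbackEntry(_, fb) => get(userConf, fb.key)
    })
}
```

With the old `defaultValueString`-style resolution, a user-set `spark.driver.port` would have been ignored when resolving the fallback key; with the recursive lookup it is picked up.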
[GitHub] spark pull request #16756: [SPARK-19411][SQL] Remove the metadata used to ma...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/16756 [SPARK-19411][SQL] Remove the metadata used to mark optional columns in merged Parquet schema for filter predicate pushdown ## What changes were proposed in this pull request? There is a metadata introduced before to mark the optional columns in merged Parquet schema for filter predicate pushdown. As we upgrade to Parquet 1.8.2 which includes the fix for the pushdown of optional columns, we don't need this metadata now. ## How was this patch tested? Jenkins tests. Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 remove-optional-metadata Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16756.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16756 commit ec2bbbf55a99f0fa8fba39569b959e17d24b3243 Author: Liang-Chi Hsieh Date: 2017-01-31T13:40:20Z Remove the metadata used to mark optional columns for merged Parquet schema.
[GitHub] spark pull request #16755: [MESOS] Support constraints in spark-dispatcher
GitHub user philipphoffmann opened a pull request: https://github.com/apache/spark/pull/16755 [MESOS] Support constraints in spark-dispatcher The `MesosClusterScheduler` doesn't handle the `spark.mesos.constraints` setting (as opposed to `MesosCoarseGrainedSchedulerBackend`). ## What changes were proposed in this pull request? This commit introduces the necessary changes to handle the offer constraints. ## How was this patch tested? unit test You can merge this pull request into a Git repository by running: $ git pull https://github.com/philipphoffmann/spark fix-dispatcher-constraints Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16755.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16755 commit 551a593949475abcb40414e03d7b01e04c5932f3 Author: Philipp Hoffmann Date: 2017-01-31T13:42:04Z [MESOS] Support constraints in spark-dispatcher The `MesosClusterScheduler` doesn't handle the `spark.mesos.constraints` setting (as opposed to `MesosCoarseGrainedSchedulerBackend`). This commit introduces the necessary changes to handle the offer constraints.
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user eyalfa commented on the issue: https://github.com/apache/spark/pull/16043 @HyukjinKwon, @hvanhovell, are you familiar with this build failure? seems to be unrelated to my specific build...
[GitHub] spark issue #16754: [SPARK-19410][DOC] Fix brokens links in ml-pipeline and ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16754 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72196/ Test PASSed.
[GitHub] spark issue #16754: [SPARK-19410][DOC] Fix brokens links in ml-pipeline and ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16754 Merged build finished. Test PASSed.
[GitHub] spark issue #16754: [SPARK-19410][DOC] Fix brokens links in ml-pipeline and ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16754 **[Test build #72196 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72196/testReport)** for PR 16754 at commit [`6bbe357`](https://github.com/apache/spark/commit/6bbe357715ffc274988b06131d91a3ca153ab3e9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16754: [SPARK-19410][DOC] Fix brokens links in ml-pipeline and ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16754 **[Test build #72196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72196/testReport)** for PR 16754 at commit [`6bbe357`](https://github.com/apache/spark/commit/6bbe357715ffc274988b06131d91a3ca153ab3e9).
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16043 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72195/ Test FAILed.
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16043 **[Test build #72195 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72195/testReport)** for PR 16043 at commit [`1da58aa`](https://github.com/apache/spark/commit/1da58aa799eac582d2ec2d7980fa3c27b6de8180). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16043 Merged build finished. Test FAILed.
[GitHub] spark pull request #16754: [SPARK-19410][DOC] Fix brokens links in ml-pipeli...
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/16754 [SPARK-19410][DOC] Fix brokens links in ml-pipeline and ml-tuning ## What changes were proposed in this pull request? Fix brokens links in ml-pipeline and ml-tuning `` -> `` ## How was this patch tested? manual tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhengruifeng/spark doc_api_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16754.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16754 commit 6bbe357715ffc274988b06131d91a3ca153ab3e9 Author: Zheng RuiFeng Date: 2017-01-31T12:50:19Z create pr
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16043 (Ugh, that -9 again. It is unknown up to my knowledge. I talked about this before)
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16043 **[Test build #72195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72195/testReport)** for PR 16043 at commit [`1da58aa`](https://github.com/apache/spark/commit/1da58aa799eac582d2ec2d7980fa3c27b6de8180).
[GitHub] spark pull request #16735: [SPARK-19228][SQL] Introduce tryParseDate method ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16735#discussion_r98661043 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala --- @@ -140,12 +137,21 @@ private[csv] object CSVInferSchema { } } + private def tryParseDate(field: String, options: CSVOptions): DataType = { +// This case infers a custom `dateFormat` is set. +if ((allCatch opt options.dateFormat.parse(field)).isDefined) { + DateType +} else { + tryParseTimestamp(field, options) +} + } + private def tryParseTimestamp(field: String, options: CSVOptions): DataType = { -// This case infers a custom `dataFormat` is set. +// This case infers a custom `timestampFormat` is set. if ((allCatch opt options.timestampFormat.parse(field)).isDefined) { TimestampType } else if ((allCatch opt DateTimeUtils.stringToTime(field)).isDefined) { - // We keep this for backwords competibility. + // We keep this for backwards compatibility. TimestampType } else { tryParseBoolean(field, options) --- End diff -- (Maybe, you meant L136)
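The fall-through chain in the diff (try date first, then timestamp, then the next candidate type) can be sketched in plain Scala, with strict `java.time` parsers standing in for the configurable `dateFormat`/`timestampFormat`. All names here are illustrative, not Spark's actual `CSVInferSchema`:

```scala
import java.time.{LocalDate, LocalDateTime}
import scala.util.Try

// Toy sketch of type inference for a single CSV field: each parser tries
// its format and falls through to the next on failure. Returning strings
// instead of Catalyst DataTypes keeps the example self-contained.
object InferDemo {
  def tryParseDate(field: String): String =
    if (Try(LocalDate.parse(field)).isSuccess) "DateType"
    else tryParseTimestamp(field)

  def tryParseTimestamp(field: String): String =
    if (Try(LocalDateTime.parse(field)).isSuccess) "TimestampType"
    else "StringType" // the real code tries boolean, decimal, etc. first
}
```

Ordering matters: dates are tried before timestamps, mirroring the diff, so a date-only field never falls through to the broader timestamp parser.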
[GitHub] spark pull request #16043: [SPARK-18601][SQL] Simplify Create/Get complex ex...
Github user eyalfa commented on a diff in the pull request: https://github.com/apache/spark/pull/16043#discussion_r98660310 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ComplexTypes.scala --- @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.optimizer + +import org.apache.spark.sql.catalyst.expressions._ +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.catalyst.rules.Rule + +/** +* push down operations into [[CreateNamedStructLike]]. +*/ +object SimplifyCreateStructOps extends Rule[LogicalPlan] { + override def apply(plan: LogicalPlan): LogicalPlan = { +plan.transformExpressionsUp { + // push down field extraction + case GetStructField(createNamedStructLike: CreateNamedStructLike, ordinal, _) => +createNamedStructLike.valExprs(ordinal) +} + } +} + +/** +* push down operations into [[CreateArray]]. +*/ +object SimplifyCreateArrayOps extends Rule[LogicalPlan] { + override def apply(plan: LogicalPlan): LogicalPlan = { +plan.transformExpressionsUp { + // push down field selection (array of structs) + case GetArrayStructFields(CreateArray(elems), field, ordinal, numFields, containsNull) => +// instead f selecting the field on the entire array, +// select it from each member of the array. +// pushing down the operation this way open other optimizations opportunities +// (i.e. struct(...,x,...).x) +CreateArray(elems.map(GetStructField(_, ordinal, Some(field.name + // push down item selection. + case ga @ GetArrayItem(CreateArray(elems), IntegerLiteral(idx)) => +// instead of creating the array and then selecting one row, +// remove array creation altgether. +if (idx >= 0 && idx < elems.size) { + // valid index + elems(idx) +} else { + // out of bounds, mimic the runtime behavior and return null + Cast(Literal(null), ga.dataType) --- End diff -- yep
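The `GetArrayItem(CreateArray(...), idx)` rewrite in the diff collapses to the selected element when the index is valid, or to null when it is out of bounds. A toy expression tree (not Spark's real `Expression` API) shows the shape of the rule:

```scala
// Toy sketch of the array-item simplification. The real rule works on
// Catalyst expressions and uses transformExpressionsUp; here a minimal
// ADT is enough to show the rewrite itself.
object ArrayDemo {
  sealed trait Expr
  case class Lit(v: Any) extends Expr
  case class CreateArr(elems: Seq[Expr]) extends Expr
  case class GetItem(arr: Expr, idx: Int) extends Expr

  def simplify(e: Expr): Expr = e match {
    case GetItem(CreateArr(elems), idx) if idx >= 0 && idx < elems.size =>
      elems(idx) // valid index: drop the array creation entirely
    case GetItem(CreateArr(_), _) =>
      Lit(null) // out of bounds: mimic the runtime behavior and return null
    case other => other
  }
}
```

The payoff noted in the diff's comments is that pushing the selection inside opens further rewrites, e.g. a subsequent `struct(..., x, ...).x` simplification on each element.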
[GitHub] spark pull request #16043: [SPARK-18601][SQL] Simplify Create/Get complex ex...
Github user eyalfa commented on a diff in the pull request: https://github.com/apache/spark/pull/16043#discussion_r98660085 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -293,6 +293,12 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper { // from that. Note that CaseWhen.branches should never be empty, and as a result the // headOption (rather than head) added above is just an extra (and unnecessary) safeguard. branches.head._2 + + case e @ CaseWhen(branches, _) if branches.exists(_._1 == Literal(true)) => +// a branc with a TRue condition eliminates all following branches, +// these branches can be pruned away +val (h, t) = branches.span(_._1 != Literal(true)) +CaseWhen( h :+ t.head, None) --- End diff -- sorry, please explain
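The `span` trick in the new case can be seen in isolation: the branch list is split at the first literally-true condition, and every branch after it (and the else clause) is dead. A minimal stand-in, with plain tuples instead of Catalyst expressions:

```scala
// Toy sketch of CaseWhen branch pruning. Branches are (condition, value)
// pairs; a Boolean stands in for "condition is Literal(true)".
object PruneDemo {
  type Branch = (Boolean, String)

  def prune(branches: Seq[Branch]): Seq[Branch] = {
    // span splits at the first always-true branch: everything before it
    // is kept, the true branch itself is kept, the rest is dead code.
    val (before, rest) = branches.span(b => !b._1)
    if (rest.isEmpty) branches else before :+ rest.head
  }
}
```

As in the diff, `rest.head` is safe to take only when an always-true branch exists, which the real rule guarantees with its `branches.exists(...)` guard; the `if (rest.isEmpty)` check plays that role here.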
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user eyalfa commented on the issue: https://github.com/apache/spark/pull/16043 @hvanhovell can you figure out what failed the build? Seems all tests passed
[GitHub] spark issue #16603: [SPARK-19244][Core] Sort MemoryConsumers according to th...
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/16603 LGTM, will wait for @vanzin's comments before committing in case he has any.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16753 **[Test build #72194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72194/testReport)** for PR 16753 at commit [`cfe258b`](https://github.com/apache/spark/commit/cfe258b283941c8a3a55a111092ce511682fdd1a).
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16753 **[Test build #72193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72193/testReport)** for PR 16753 at commit [`d78a7d0`](https://github.com/apache/spark/commit/d78a7d0de980e3af330b95eeb6a9020dfece2ec9).
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16753 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72192/ Test PASSed.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16753 Merged build finished. Test PASSed.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate url and table in JdbcUtil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16753 **[Test build #72192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72192/testReport)** for PR 16753 at commit [`9be8f84`](https://github.com/apache/spark/commit/9be8f84756fb7e5d2a4fe31c08603688edaf998c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate arguments in JdbcUtils.sa...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16753 @srowen, I see. Let me maybe give it a shot to make them consistent and show if it looks good.
[GitHub] spark issue #16747: SPARK-16636 Add CalendarIntervalType to documentation
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16747 I am OK with it, but I remember there were some discussions about whether this type should be exposed or not, and I could not track down the conclusion.
[GitHub] spark issue #16747: SPARK-16636 Add CalendarIntervalType to documentation
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16747 @HyukjinKwon is this OK by you?
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16751 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72191/ Test PASSed.
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16751 Merged build finished. Test PASSed.
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16751 **[Test build #72191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72191/testReport)** for PR 16751 at commit [`92dc3e5`](https://github.com/apache/spark/commit/92dc3e50f136be088357aa7b477ffd79f138be0e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8....
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16751
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16751 Merging in master.
[GitHub] spark issue #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON parsing
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16750 Merged build finished. Test PASSed.
[GitHub] spark issue #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON parsing
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16750 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72190/ Test PASSed.
[GitHub] spark issue #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON parsing
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16750 **[Test build #72190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72190/testReport)** for PR 16750 at commit [`551cff9`](https://github.com/apache/spark/commit/551cff99785927be3ef68c4393dca4dabb3c2ba0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16751 LGTM too.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate arguments in JdbcUtils.sa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16753 **[Test build #72192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72192/testReport)** for PR 16753 at commit [`9be8f84`](https://github.com/apache/spark/commit/9be8f84756fb7e5d2a4fe31c08603688edaf998c).
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate arguments in JdbcUtils.sa...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16753 It's true, though I wonder whether it's still by design that these methods take url and table as important first-class arguments and then also take other options, even though the options contain the same arguments. Or could the other methods, like tableExists, reasonably stop taking these arguments as well? Consistency is probably more important.
[GitHub] spark issue #16753: [SPARK-19296][SQL] Deduplicate arguments in JdbcUtils.sa...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16753 Hi @gatorsmile, could you take a look at this one please? (It might not need a JIRA, but one happened to be opened by someone.)
[GitHub] spark pull request #16753: [SPARK-19296][SQL] Deduplicate arguments in JdbcU...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/16753 [SPARK-19296][SQL] Deduplicate arguments in JdbcUtils.saveTable

## What changes were proposed in this pull request?

This PR deduplicates the `url` and `table` arguments in `JdbcUtils`.

```diff
 def saveTable(
     df: DataFrame,
-    url: String,
-    table: String,
     tableSchema: Option[StructType],
     isCaseSensitive: Boolean,
     options: JDBCOptions): Unit = {
+  val url = options.url
+  val table = options.table
```

This seems to be called only in `JdbcRelationProvider`, where both `url` and `table` originate from `JDBCOptions`.

## How was this patch tested?

Running unit tests in `JdbcSuite`/`JDBCWriteSuite`, and building with Scala 2.10 as below:

```
./dev/change-scala-version.sh 2.10
./build/mvn -Pyarn -Phadoop-2.4 -Dscala-2.10 -DskipTests clean package
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-19296

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16753.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16753

commit 9be8f84756fb7e5d2a4fe31c08603688edaf998c
Author: hyukjinkwon
Date: 2017-01-31T08:09:14Z

Deduplicate arguments in JdbcUtils.saveTable
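The refactoring idea above — dropping parameters that can already be derived from the options object — can be sketched as follows. This is a hypothetical Python illustration (all names are stand-ins; the actual change is to the Scala method `JdbcUtils.saveTable`):

```python
# Hypothetical sketch of the SPARK-19296 deduplication: instead of
# passing `url` and `table` alongside an options object that already
# contains them, derive them from the options inside the function.

class JDBCOptions:
    """Minimal stand-in for Spark's JDBCOptions."""
    def __init__(self, url, table):
        self.url = url
        self.table = table

# Before: redundant parameters that every caller must keep in sync
# with the options object.
def save_table_before(df, url, table, options):
    return f"saving {df} to {table} at {url}"

# After: the options object is the single source of truth.
def save_table_after(df, options):
    url, table = options.url, options.table
    return f"saving {df} to {table} at {url}"

opts = JDBCOptions("jdbc:postgresql://db:5432/app", "users")
assert save_table_before("df1", opts.url, opts.table, opts) == \
    save_table_after("df1", opts)
```

The benefit is exactly the one discussed in the review thread: there is no longer a way for the explicit arguments and the options to disagree.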
[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16750#discussion_r98624418

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---

```diff
@@ -329,7 +332,17 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
    * @since 1.4.0
    */
   def json(jsonRDD: RDD[String]): DataFrame = {
-    val parsedOptions: JSONOptions = new JSONOptions(extraOptions.toMap)
+    val optionsWithTimeZone = {
```

--- End diff --

Could we just pass the time zone into `JSONOptions` as a default, or resemble `columnNameOfCorruptRecord` in `JSONOptions` below? It seems the same logic is duplicated here several times, and the logic introduced in the tests to set default values might be unnecessary or removable.
[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16750#discussion_r98629735

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---

```diff
@@ -329,7 +332,17 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
    * @since 1.4.0
    */
   def json(jsonRDD: RDD[String]): DataFrame = {
-    val parsedOptions: JSONOptions = new JSONOptions(extraOptions.toMap)
+    val optionsWithTimeZone = {
```

--- End diff --

It seems the same comment also applies to `CSVOptions`.
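The review suggestion above — inject the session time zone as a default when the options are constructed, instead of repeating the merge at every reader entry point — can be sketched in Python. Names here are hypothetical; Spark's real `JSONOptions`/`CSVOptions` are Scala classes:

```python
# Hypothetical sketch: centralize the default so each entry point
# (json, csv, ...) does not have to repeat the merge logic.

SESSION_TIMEZONE = "UTC"  # stand-in for the SQL session's configured zone

def make_parse_options(user_options, default_tz=SESSION_TIMEZONE):
    """Build the final option map; an explicit timeZone always wins."""
    merged = {"timeZone": default_tz}
    merged.update(user_options)  # user-provided keys override defaults
    return merged

# With the default centralized, call sites stay one-liners:
assert make_parse_options({})["timeZone"] == "UTC"
assert make_parse_options({"timeZone": "PST"})["timeZone"] == "PST"
```

This mirrors the `columnNameOfCorruptRecord` pattern mentioned in the comment: the option class itself knows its fallback, so duplicated merge blocks (and test-only default setup) can be removed.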
[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16750#discussion_r98625766

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala ---

```diff
@@ -161,12 +163,3 @@ private[csv] class CSVOptions(@transient private val parameters: CaseInsensitive
     settings
   }
 }
-
-object CSVOptions {
```

--- End diff --

Do you mind if I ask the reason for removing this? It apparently required fixing many tests in CSV.
[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16750#discussion_r98623217

--- Diff: python/pyspark/sql/readwriter.py ---

```diff
@@ -297,7 +300,7 @@ def text(self, paths):
     def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=None,
             comment=None, header=None, inferSchema=None, ignoreLeadingWhiteSpace=None,
             ignoreTrailingWhiteSpace=None, nullValue=None, nanValue=None, positiveInf=None,
-            negativeInf=None, dateFormat=None, timestampFormat=None, maxColumns=None,
+            negativeInf=None, dateFormat=None, timestampFormat=None, timeZone=None, maxColumns=None,
```

--- End diff --

(Hi @ueshin, to my knowledge, this should be added at the end to avoid breaking existing code that passes those options as positional arguments.)
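The point about parameter ordering can be demonstrated directly: inserting a new parameter anywhere but the end silently rebinds existing positional calls. The toy functions below are not the real PySpark `csv` signature, just a minimal illustration of the hazard:

```python
# Toy illustration of why new keyword parameters go at the end.

def csv_v1(path, sep=None, header=None):
    return {"path": path, "sep": sep, "header": header}

# Safe evolution: the new parameter is appended after all existing ones.
def csv_v2_safe(path, sep=None, header=None, time_zone=None):
    return {"path": path, "sep": sep, "header": header, "tz": time_zone}

# Unsafe evolution: the new parameter is inserted in the middle.
def csv_v2_unsafe(path, sep=None, time_zone=None, header=None):
    return {"path": path, "sep": sep, "header": header, "tz": time_zone}

# An existing caller using positional arguments: csv("data.csv", ",", True)
safe = csv_v2_safe("data.csv", ",", True)
unsafe = csv_v2_unsafe("data.csv", ",", True)

assert safe["header"] is True    # still bound to header, as intended
assert unsafe["header"] is None  # True was silently swallowed...
assert unsafe["tz"] is True      # ...by the newly inserted parameter
```

This is exactly the breakage the comment warns against: the unsafe version compiles and runs, but existing positional callers get wrong behavior with no error.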
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16751 The dependency change looks clear.
[GitHub] spark pull request #16752: Branch 2.0
Github user kishorbp closed the pull request at: https://github.com/apache/spark/pull/16752
[GitHub] spark issue #16752: Branch 2.0
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16752 Hi @kishorbp, it seems this was opened by mistake. Would you please close it?
[GitHub] spark issue #16752: Branch 2.0
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16752 Can one of the admins verify this patch?
[GitHub] spark pull request #16752: Branch 2.0
GitHub user kishorbp opened a pull request: https://github.com/apache/spark/pull/16752 Branch 2.0

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/spark branch-2.0

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16752.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16752

commit b25a8e6e167717fbe92e6a9b69a8a2510bf926ca
Author: frreiss
Date: 2016-09-22T09:31:15Z

[SPARK-17421][DOCS] Documenting the current treatment of MAVEN_OPTS.

## What changes were proposed in this pull request? Modified the documentation to clarify that `build/mvn` and `pom.xml` always add Java 7-specific parameters to `MAVEN_OPTS`, and that developers can safely ignore warnings about `-XX:MaxPermSize` that may result from compiling or running tests with Java 8.

## How was this patch tested? Rebuilt HTML documentation, made sure that building-spark.html displays correctly in a browser.

Author: frreiss

Closes #15005 from frreiss/fred-17421a. (cherry picked from commit 646f383465c123062cbcce288a127e23984c7c7f) Signed-off-by: Sean Owen

commit f14f47f072a392df0ebe908f1c57b6eb858105b7
Author: Shivaram Venkataraman
Date: 2016-09-22T18:52:42Z

Skip building R vignettes if Spark is not built

## What changes were proposed in this pull request? When we build the docs separately we don't have the JAR files from the Spark build in the same tree. As the SparkR vignettes need to launch a SparkContext to be built, we skip building them if JAR files don't exist.

## How was this patch tested? To test this we can run the following:

```
build/mvn -DskipTests -Psparkr clean
./R/create-docs.sh
```

You should see a line `Skipping R vignettes as Spark JARs not found` at the end.

Author: Shivaram Venkataraman

Closes #15200 from shivaram/sparkr-vignette-skip. (cherry picked from commit 9f24a17c59b1130d97efa7d313c06577f7344338) Signed-off-by: Reynold Xin

commit 243bdb11d89ee379acae1ea1ed78df10797e86d1
Author: Burak Yavuz
Date: 2016-09-22T20:05:41Z

[SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames

Consider you have a bucket as `s3a://some-bucket` and under it you have files:

```
s3a://some-bucket/file1.parquet
s3a://some-bucket/file2.parquet
```

Getting the parent path of `s3a://some-bucket/file1.parquet` yields `s3a://some-bucket/`, and the ListingFileCatalog uses this as the key in the hash map. When catalog.allFiles is called, we use `s3a://some-bucket` (no slash at the end) to get the list of files, and we're left with an empty list! This PR fixes this by adding a `/` at the end of the `URI` iff the given `Path` doesn't have a parent, i.e. is the root. This is a no-op if the path already had a `/` at the end, and is handled through the Hadoop Path merging semantics. Unit test in `FileCatalogSuite`.

Author: Burak Yavuz

Closes #15169 from brkyvz/SPARK-17613. (cherry picked from commit 85d609cf25c1da2df3cd4f5d5aeaf3cbcf0d674c) Signed-off-by: Josh Rosen

commit 47fc0b9f40d814bc8e19f86dad591d4aed467222
Author: Shixiong Zhu
Date: 2016-09-22T21:26:45Z

[SPARK-17638][STREAMING] Stop JVM StreamingContext when the Python process is dead

## What changes were proposed in this pull request? When the Python process is dead, the JVM StreamingContext is still running. Hence we will see a lot of Py4jException before the JVM process exits. It's better to stop the JVM StreamingContext to avoid those annoying logs.

## How was this patch tested? Jenkins

Author: Shixiong Zhu

Closes #15201 from zsxwing/stop-jvm-ssc. (cherry picked from commit 3cdae0ff2f45643df7bc198cb48623526c7eb1a6) Signed-off-by: Shixiong Zhu

commit 0a593db360b3b7771f45f482cf45e8500f0faa76
Author: Herman van Hovell
Date: 2016-09-22T21:29:27Z
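The SPARK-17613 commit quoted above hinges on one small invariant: a root URI like `s3a://some-bucket` must gain a trailing `/` so it matches the listing catalog's key, while non-root paths are left alone. A Python sketch of that normalization (a hypothetical helper, not Spark's actual Scala/Hadoop code):

```python
from urllib.parse import urlparse

def normalize_base_path(uri):
    """Append '/' to a URI only when it is a filesystem root.

    Mirrors the idea in SPARK-17613: 's3a://bucket' and 's3a://bucket/'
    must produce the same listing key, while non-root paths like
    's3a://bucket/dir' are returned untouched.
    """
    parsed = urlparse(uri)
    if parsed.path in ("", "/"):  # no parent component => this is the root
        return f"{parsed.scheme}://{parsed.netloc}/"
    return uri

assert normalize_base_path("s3a://some-bucket") == "s3a://some-bucket/"
assert normalize_base_path("s3a://some-bucket/") == "s3a://some-bucket/"
assert normalize_base_path("s3a://some-bucket/dir") == "s3a://some-bucket/dir"
```

As the commit message notes, making the fix a no-op for paths that already end in `/` is what keeps it safe: only the ambiguous root form is rewritten.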
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281 Hi, all. Now, I'm trying to upgrade Apache Spark to Parquet 1.8.2.
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16751 **[Test build #72191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72191/testReport)** for PR 16751 at commit [`92dc3e5`](https://github.com/apache/spark/commit/92dc3e50f136be088357aa7b477ffd79f138be0e).
[GitHub] spark pull request #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8....
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/16751 [SPARK-19409][BUILD] Bump parquet version to 1.8.2

## What changes were proposed in this pull request?

Apache Parquet 1.8.2 was officially released last week, on 26 Jan. https://lists.apache.org/thread.html/af0c813f1419899289a336d96ec02b3bbeecaea23aa6ef69f435c142@%3Cdev.parquet.apache.org%3E

This PR only aims to bump the Parquet version to 1.8.2. It does not touch other code.

## How was this patch tested?

Pass the existing tests, and also manually by running `./dev/test-dependencies.sh`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-19409

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16751.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16751

commit 92dc3e50f136be088357aa7b477ffd79f138be0e
Author: Dongjoon Hyun
Date: 2017-01-31T08:41:46Z

[SPARK-19409][BUILD] Bump parquet version to 1.8.2
[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16689#discussion_r98615769

--- Diff: R/pkg/R/DataFrame.R ---

```diff
@@ -1138,6 +1138,11 @@ setMethod("collect",
           if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
             vec <- do.call(c, col)
             stopifnot(class(vec) != "list")
+            class(vec) <-
+              if (colType == "timestamp")
+                c("POSIXct", "POSIXt")
```

--- End diff --

Should `PRIMITIVE_TYPES[["timestamp"]]` be changed then? https://github.com/apache/spark/blob/master/R/pkg/R/types.R#L32
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16043 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72189/ Test FAILed.
[GitHub] spark issue #14412: [SPARK-15355] [CORE] Proactive block replication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14412 Merged build finished. Test FAILed.
[GitHub] spark issue #16043: [SPARK-18601][SQL] Simplify Create/Get complex expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16043 Merged build finished. Test FAILed.
[GitHub] spark issue #14412: [SPARK-15355] [CORE] Proactive block replication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14412 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72187/ Test FAILed.
[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13932 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72188/ Test FAILed.
[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13932 Merged build finished. Test FAILed.
[GitHub] spark issue #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON parsing
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16750 **[Test build #72190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72190/testReport)** for PR 16750 at commit [`551cff9`](https://github.com/apache/spark/commit/551cff99785927be3ef68c4393dca4dabb3c2ba0).
[GitHub] spark pull request #16043: [SPARK-18601][SQL] Simplify Create/Get complex ex...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16043#discussion_r98613798

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ComplexTypes.scala ---
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.rules.Rule
+
+/**
+ * Push down operations into [[CreateNamedStructLike]].
+ */
+object SimplifyCreateStructOps extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = {
+    plan.transformExpressionsUp {
+      // push down field extraction
+      case GetStructField(createNamedStructLike: CreateNamedStructLike, ordinal, _) =>
+        createNamedStructLike.valExprs(ordinal)
+    }
+  }
+}
+
+/**
+ * Push down operations into [[CreateArray]].
+ */
+object SimplifyCreateArrayOps extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = {
+    plan.transformExpressionsUp {
+      // push down field selection (array of structs)
+      case GetArrayStructFields(CreateArray(elems), field, ordinal, numFields, containsNull) =>
+        // instead of selecting the field on the entire array,
+        // select it from each member of the array.
+        // pushing down the operation this way opens other optimization
+        // opportunities (i.e. struct(...,x,...).x)
+        CreateArray(elems.map(GetStructField(_, ordinal, Some(field.name))))
+      // push down item selection.
+      case ga @ GetArrayItem(CreateArray(elems), IntegerLiteral(idx)) =>
+        // instead of creating the array and then selecting one row,
+        // remove array creation altogether.
+        if (idx >= 0 && idx < elems.size) {
+          // valid index
+          elems(idx)
+        } else {
+          // out of bounds, mimic the runtime behavior and return null
+          Cast(Literal(null), ga.dataType)
--- End diff --

`Literal(null, ga.dataType)`?
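The rewrite in `SimplifyCreateArrayOps` can be illustrated without Spark at all. The following is a toy sketch (the `Expr`, `Literal`, `CreateArray`, and `GetArrayItem` classes below are simplified stand-ins, not Catalyst's actual expression types): selecting item `idx` from an inline-constructed array collapses to the element itself, and an out-of-bounds index collapses to a null literal, mirroring the runtime behavior.

```scala
// Toy expression tree modeling the GetArrayItem-over-CreateArray rewrite.
// These classes are illustrative stand-ins, not Spark's Catalyst expressions.
object SimplifyDemo {
  sealed trait Expr
  case class Literal(value: Any) extends Expr
  case class CreateArray(elems: Seq[Expr]) extends Expr
  case class GetArrayItem(child: Expr, idx: Int) extends Expr

  def simplify(e: Expr): Expr = e match {
    case GetArrayItem(CreateArray(elems), idx) if idx >= 0 && idx < elems.size =>
      // valid index: skip building the array, use the element directly
      elems(idx)
    case GetArrayItem(CreateArray(_), _) =>
      // out of bounds: mimic the runtime behavior and return null
      Literal(null)
    case other => other
  }
}
```

For example, `SimplifyDemo.simplify(GetArrayItem(CreateArray(Seq(Literal(1), Literal(2))), 1))` reduces to `Literal(2)` with no array ever materialized, which is exactly why pushing the selection down opens further optimizations such as `struct(...,x,...).x`.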
[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...
GitHub user ueshin opened a pull request:

    https://github.com/apache/spark/pull/16750

    [SPARK-18937][SQL] Timezone support in CSV/JSON parsing

## What changes were proposed in this pull request?

This is a follow-up pr of #16308.

This pr enables timezone support in CSV/JSON parsing. We should introduce a `timeZone` option for the CSV/JSON datasources (the default value of the option is the session local timezone). The datasources should use the `timeZone` option to format/parse when writing/reading timestamp values.

Notice that while reading, if the `timestampFormat` has the timezone info, the timezone will not be used because we should respect the timezone in the values.

For example, if you have timestamp `"2016-01-01 00:00:00"` in `GMT`, the values written with the default timezone option, which is `"GMT"` because the session local timezone is `"GMT"` here, are:

```scala
scala> spark.conf.set("spark.sql.session.timeZone", "GMT")

scala> val df = Seq(new java.sql.Timestamp(1451606400000L)).toDF("ts")
df: org.apache.spark.sql.DataFrame = [ts: timestamp]

scala> df.show()
+-------------------+
|                 ts|
+-------------------+
|2016-01-01 00:00:00|
+-------------------+

scala> df.write.json("/path/to/gmtjson")
```

```sh
$ cat /path/to/gmtjson/part-*
{"ts":"2016-01-01T00:00:00.000Z"}
```

whereas setting the option to `"PST"`, they are:

```scala
scala> df.write.option("timeZone", "PST").json("/path/to/pstjson")
```

```sh
$ cat /path/to/pstjson/part-*
{"ts":"2015-12-31T16:00:00.000-08:00"}
```

We can properly read these files even if the timezone option is wrong because the timestamp values have timezone info:

```scala
scala> val schema = new StructType().add("ts", TimestampType)
schema: org.apache.spark.sql.types.StructType = StructType(StructField(ts,TimestampType,true))

scala> spark.read.schema(schema).json("/path/to/gmtjson").show()
+-------------------+
|                 ts|
+-------------------+
|2016-01-01 00:00:00|
+-------------------+

scala> spark.read.schema(schema).option("timeZone", "PST").json("/path/to/gmtjson").show()
+-------------------+
|                 ts|
+-------------------+
|2016-01-01 00:00:00|
+-------------------+
```

And even if the `timestampFormat` doesn't contain timezone info, we can properly read the values by setting the correct timezone option:

```scala
scala> df.write.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").option("timeZone", "JST").json("/path/to/jstjson")
```

```sh
$ cat /path/to/jstjson/part-*
{"ts":"2016-01-01T09:00:00"}
```

```scala
// wrong result
scala> spark.read.schema(schema).option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").json("/path/to/jstjson").show()
+-------------------+
|                 ts|
+-------------------+
|2016-01-01 09:00:00|
+-------------------+

// correct result
scala> spark.read.schema(schema).option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").option("timeZone", "JST").json("/path/to/jstjson").show()
+-------------------+
|                 ts|
+-------------------+
|2016-01-01 00:00:00|
+-------------------+
```

This pr also makes `JsonToStruct` and `StructToJson` `TimeZoneAwareExpression` to be able to evaluate values with the timezone option.

## How was this patch tested?

Existing tests and added some tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ueshin/apache-spark issues/SPARK-18937

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16750.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16750

commit aa052f4d11929192b749752f4b73772664d0460c
Author: Takuya UESHIN
Date: 2017-01-05T09:29:42Z

    Add timeZone option to JSONOptions.

commit 890879e24b3f63509a000585e18b288961a4e5cf
Author: Takuya UESHIN
Date: 2017-01-06T05:11:41Z

    Apply timeZone option to JSON datasources.

commit f08b78c16ac444550e7ea0857d0275b9a91b7561
Author: Takuya UESHIN
Date: 2017-01-06T06:03:34Z

    Apply timeZone option to CSV datasources.

commit 551cff99785927be3ef68c4393dca4dabb3c2ba0
Author: Takuya UESHIN
Date: 2017-01-06T08:39:26Z

    Modify python files.
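The core behavior the PR relies on, that one and the same instant renders differently depending on the formatter's time zone, can be reproduced with plain `java.time`, independent of Spark. This sketch is not Spark code; the `TimezoneDemo` object and `render` helper are illustrative, and `1451606400000L` is the epoch millisecond value of 2016-01-01T00:00:00Z.

```scala
import java.time.{Instant, ZoneId}
import java.time.format.DateTimeFormatter

// One instant, formatted in different zones: the wall-clock text changes,
// the underlying point in time does not. This mirrors what the timeZone
// option does for CSV/JSON writing in the PR above.
object TimezoneDemo {
  val instant: Instant = Instant.ofEpochMilli(1451606400000L) // 2016-01-01T00:00:00Z

  // XXX prints the zone offset ("Z" for zero offset, "-08:00" for PST)
  val fmt: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssXXX")

  def render(zone: String): String = fmt.format(instant.atZone(ZoneId.of(zone)))

  def main(args: Array[String]): Unit = {
    println(render("GMT"))                 // prints 2016-01-01T00:00:00Z
    println(render("America/Los_Angeles")) // prints 2015-12-31T16:00:00-08:00
  }
}
```

Reading the formatted text back with the wrong zone is harmless only when the offset is embedded in the string, which is exactly why the PR notes that a `timestampFormat` carrying timezone info takes precedence over the `timeZone` option.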