[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-01-19 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16578 @mallman Thanks for letting me know. I'll try your patch and check whether it can take over #14957. I also think we need to get feedback from @liancheng; from our last discussion, liancheng may do

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-05 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/14957 [SPARK-4502][SQL]Support parquet nested struct pruning and add releva… ## What changes were proposed in this pull request? Like the description in [SPARK-4502](https

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-06 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r77753006 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala --- @@ -259,8 +259,23 @@ case class StructType(fields: Array

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-06 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r77760264 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala --- @@ -571,6 +571,44 @@ class

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-06 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r77760397 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala --- @@ -280,6 +280,29 @@ case class StructType(fields: Array

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-07 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r77844084 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -97,7 +98,16 @@ object FileSourceStrategy

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-07 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r77845789 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -97,7 +98,16 @@ object FileSourceStrategy

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-07 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r77934557 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -97,7 +98,16 @@ object FileSourceStrategy

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-07 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r77934764 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -97,7 +98,16 @@ object FileSourceStrategy

[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...

2016-09-07 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/14957 (Thank you for your comments, they helped me a lot :) ) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...

2016-09-09 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/14957 @liancheng @rxin

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-09-06 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r77762907 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala --- @@ -280,6 +280,29 @@ case class StructType(fields: Array

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-30 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r85656729 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -126,4 +136,52 @@ object

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-30 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r85656841 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala --- @@ -442,6 +443,79 @@ class

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592821 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -126,4 +136,52 @@ object

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592818 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -126,4 +136,52 @@ object

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592816 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -126,4 +136,52 @@ object

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592883 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -661,6 +666,8 @@ private[sql] class SQLConf extends Serializable

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592888 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala --- @@ -571,6 +571,37 @@ class

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592876 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -212,6 +212,11 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592865 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -126,4 +136,52 @@ object

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592806 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -97,7 +99,15 @@ object FileSourceStrategy

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2016-10-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592805 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -97,7 +99,15 @@ object FileSourceStrategy

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-10 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91849543 --- Diff: core/src/main/scala/org/apache/spark/metrics/source/StaticSources.scala --- @@ -97,6 +97,12 @@ object HiveCatalogMetrics extends Source

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-10 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91849598 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionedTablePerfStatsSuite.scala --- @@ -352,4 +353,28 @@ class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-10 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91849563 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -53,6 +56,18 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-10 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91849557 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -53,6 +56,18 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-10 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91849565 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionedTablePerfStatsSuite.scala --- @@ -352,4 +353,28 @@ class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-10 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91849561 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -53,6 +56,18 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-10 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91849544 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -33,6 +35,7 @@ import

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-11 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91879183 --- Diff: core/src/main/scala/org/apache/spark/metrics/source/StaticSources.scala --- @@ -105,6 +111,7 @@ object HiveCatalogMetrics extends Source

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add StripedLock for each table's rela...

2016-12-12 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16135 Thanks for ericl's review!

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-12 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91904868 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -209,72 +221,79 @@ private[hive] class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-12 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91904915 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionedTablePerfStatsSuite.scala --- @@ -352,4 +353,34 @@ class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-12 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91904834 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -53,6 +53,18 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add StripedLock for each table's rela...

2016-12-15 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16135 cc @rxin thanks for check.

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add ReadWriteLock for each table's re...

2016-12-06 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16135 @rxin @liancheng

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add StripedLock for each table's rela...

2016-12-08 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16135 Hi @ericl, this commit does the 3 things below, thanks for checking: 1. Delete the unnecessary lock use and simplify the lock operation 2. Add UT test

[GitHub] spark pull request #16135: SPARK-18700: add ReadWriteLock for each table's r...

2016-12-03 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/16135 SPARK-18700: add ReadWriteLock for each table's relation in cache ## What changes were proposed in this pull request? As the scenario described in [SPARK-18700][https

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16350 Delete the UT and metrics done. :)

[GitHub] spark pull request #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock ...

2016-12-20 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/16350 [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for each table's relation in cache ## What changes were proposed in this pull request? Backport of #16135 to branch-2.0

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-21 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16350 Thanks!

[GitHub] spark pull request #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock ...

2016-12-21 Thread xuanyuanking
Github user xuanyuanking closed the pull request at: https://github.com/apache/spark/pull/16350

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add ReadWriteLock for each table's re...

2016-12-07 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16135 @ericl Thanks for your review. > Is it sufficient to lock around the catalog.filterPartitions(Nil)? Yes, this patch was ported from 1.6.2 and I missed the diff here. Fixed in next pa

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add StripedLock for each table's rela...

2016-12-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16135 @hvanhovell Sure, I opened a new BACKPORT-2.0. There's a small difference in branch-2.0: the UT of this patch is based on `HiveCatalogMetrics`, which was not added in 2.0, so I added the patch

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-04-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 @marmbrus Can you take a look at this? Thanks :)

[GitHub] spark pull request #17702: [SPARK-20408][SQL] Get the glob path in parallel ...

2017-04-20 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/17702 [SPARK-20408][SQL] Get the glob path in parallel to reduce resolve relation time ## What changes were proposed in this pull request? This PR changes the work of getting glob path

[GitHub] spark pull request #18760: [SPARK-21560][Core] Add hold mode for the LiveLis...

2017-07-28 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/18760 [SPARK-21560][Core] Add hold mode for the LiveListenerBus ## What changes were proposed in this pull request? 1. Add config for hold strategy and the idle capacity. 2. Hold the post

[GitHub] spark issue #18760: [SPARK-21560][Core] Add hold mode for the LiveListenerBu...

2017-08-14 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18760 @jiangxb1987 Hi xingbo, can you give me some advice about this?

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18654 @HyukjinKwon Thanks for your comment. As you mentioned in #18650 and #17395, empty results of parquet can be fixed by leaving the first partition; how about the orc format? The orc format error

[GitHub] spark issue #18650: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18650 Yep, just close this and open #18654

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/18654 [SPARK-21435][SQL] Empty files should be skipped while write to file ## What changes were proposed in this pull request? Add EmptyDirectoryWriteTask for empty task while writing files

[GitHub] spark pull request #18650: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking closed the pull request at: https://github.com/apache/spark/pull/18650

[GitHub] spark pull request #18650: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/18650 [SPARK-21435][SQL] Empty files should be skipped while write to file ## What changes were proposed in this pull request? Add EmptyDirectoryWriteTask for empty task while writing files

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18654 retest this please

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/18654#discussion_r127856498 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala --- @@ -0,0 +1,52

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/18654#discussion_r127872988 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala --- @@ -0,0 +1,52

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-18 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/18654#discussion_r127888746 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala --- @@ -0,0 +1,43

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18654 retest this please

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18654 Yep, an empty result dir needs this meta; otherwise it will throw the exception: ``` org.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/18654#discussion_r127865091 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala --- @@ -0,0 +1,52

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/18654#discussion_r127856720 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala --- @@ -0,0 +1,52

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18654 retest this please

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18654 retest this please...

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18654 ping @cloud-fan @HyukjinKwon

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-04-25 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 cc @zsxwing @tdas, can you review this? I found your related code earlier. Thanks :)

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-04-27 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 @HyukjinKwon Can you help me find an appropriate reviewer for this?

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-04-30 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 ping @cloud-fan and @gatorsmile, could you have a look at this? Thanks :)

[GitHub] spark issue #18760: [SPARK-21560][Core] Add hold mode for the LiveListenerBu...

2017-07-31 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18760 ping @gatorsmile @cloud-fan, can you review this? Thanks :)

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-05-12 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 ping @cloud-fan

[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...

2017-06-26 Thread xuanyuanking
Github user xuanyuanking closed the pull request at: https://github.com/apache/spark/pull/14957

[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...

2017-06-26 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/14957 OK, I'll close this and just use it in our internal env. Thanks, everyone, for the suggestions and review work. Next we may try more complex scenarios of this.

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-06-05 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 @cloud-fan Thanks for your reply! It's possible to consolidate them, but maybe not so necessary? I can consolidate them by replacing the logic in `getGlobbedPaths`, listed below

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-06-16 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 The test failure may be because of the env? `process was terminated by signal 9` in the Jenkins log. retest it please

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-06-16 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 retest this please

[GitHub] spark pull request #17702: [SPARK-20408][SQL] Get the glob path in parallel ...

2017-06-15 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/17702#discussion_r122364493 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -389,6 +389,23 @@ case class DataSource

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-05-07 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 @gatorsmile @cloud-fan, do we need other performance tests?

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-05-02 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 Thanks for your review. @gatorsmile @cloud-fan `Can you show us the performance difference?` No problem, I reproduced our online case offline like below ## Without

[GitHub] spark pull request #17702: [SPARK-20408][SQL] Get the glob path in parallel ...

2017-05-02 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/17702#discussion_r114275475 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -146,6 +146,11 @@ object SQLConf { .longConf

[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-19 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/19287 [SPARK-22074][Core] Task killed by other attempt task should not be resubmitted ## What changes were proposed in this pull request? As the detailed scenario described in [SPARK-22074

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/19287 `signal 9` retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-20 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/19287#discussion_r140143293 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala --- @@ -66,6 +66,12 @@ class TaskInfo( */ var finishTime: Long

[GitHub] spark issue #18760: [SPARK-21560][Core] Add hold mode for the LiveListenerBu...

2017-09-26 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18760 The hold mode is still valid. I resolved the conflict and added the logic to `AsyncEventQueue`; this can be confirmed by the test case added in this [patch](https://github.com/apache/spark/pull/18760

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-29 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/19287 retest this please

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-28 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/19287 @squito Hi Rashid, thanks for your review and advice. In the last commit I moved `killedByOtherAttempt` into `TaskSetManager` as you suggested and added more asserts in UT

[GitHub] spark issue #18760: [SPARK-21560][Core] Add hold mode for the LiveListenerBu...

2017-09-28 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18760 @vanzin Hi Vanzin, thanks a lot for your comments. First, to answer your question about `Why isn't hold mode just calling queue.put (blocking) instead of queue.offer (non-blocking

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-09-30 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 retest this please

[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-10-01 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/17702 retest this please

[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-28 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/19287#discussion_r141784747 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -744,6 +744,100 @@ class TaskSetManagerSuite extends

[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-28 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/19287#discussion_r141784812 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -744,6 +744,100 @@ class TaskSetManagerSuite extends

[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-28 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/19287#discussion_r141784872 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala --- @@ -66,6 +66,13 @@ class TaskInfo( */ var finishTime: Long

[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/19287#discussion_r141513379 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala --- @@ -74,6 +81,10 @@ class TaskInfo( gettingResultTime = time

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-25 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/19287 ping @cloud-fan @gatorsmile

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-28 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/19287 @jerryshao Thanks for your review.

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-10-09 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/19287 Thanks all reviewers!

[GitHub] spark pull request #17702: [SPARK-20408][SQL] Get the glob path in parallel ...

2017-11-13 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/17702#discussion_r150484715 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -432,6 +433,32 @@ case class DataSource

[GitHub] spark pull request #17702: [SPARK-20408][SQL] Get the glob path in parallel ...

2017-11-13 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/17702#discussion_r150484788 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala --- @@ -246,6 +246,18 @@ class SparkHadoopUtil extends Logging

[GitHub] spark pull request #19773: [SPARK-22546][SQL] Supporting for changing column...

2017-11-24 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/19773#discussion_r152957444 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -318,16 +318,26 @@ case class AlterTableChangeColumnCommand

[GitHub] spark pull request #19773: [SPARK-22546][SQL] Supporting for changing column...

2017-11-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/19773#discussion_r152753785 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -318,16 +318,26 @@ case class AlterTableChangeColumnCommand

[GitHub] spark issue #19773: [SPARK-22546][SQL] Supporting for changing column dataTy...

2017-12-04 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/19773 gentle ping @gatorsmile
