[GitHub] spark pull request: [SPARK-4985][SQL Parquet] Parquet date support

2015-04-14 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/3855#issuecomment-93043864 @liancheng, I'll take a look as soon as I can. I'm a little swamped this week though, so I can't guarantee it'll be quick. Sorry! --- If your project is set up

[GitHub] spark pull request: [SPARK-6776] [SPARK-8811] [SQL] Refactors Parq...

2015-07-06 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/7231#issuecomment-118934295 @liancheng, thanks! I'll make some time to look at it this week. Most likely Wednesday.

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-28 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r35718695 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetSchemaSuite.scala --- @@ -109,20 +245,21 @@ class ParquetSchemaSuite extends

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-28 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r35663656 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-28 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r35664000 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetSchemaSuite.scala --- @@ -109,20 +245,21 @@ class ParquetSchemaSuite extends

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-28 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r35663868 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-28 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r35675961 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-9340] [SQL] Fixes converting unannotate...

2015-08-10 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/8070#discussion_r36676679 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystRowConverter.scala --- @@ -446,4 +478,61 @@ private[parquet] class

[GitHub] spark pull request: [SPARK-9340] [SQL] Fixes converting unannotate...

2015-08-10 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/8070#issuecomment-129649565 This looks good to me overall.

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34600221 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34607172 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34614654 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34615119 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34615210 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.scala --- @@ -105,8 +104,7 @@ private[parquet] class RowReadSupport

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34607035 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-15 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34737606 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetSchemaSuite.scala --- @@ -109,20 +245,21 @@ class ParquetSchemaSuite extends

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34598625 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6777] [SQL] Implements backwards compat...

2015-07-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/6617#discussion_r34599675 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #13482: SPARK-15725: Ensure ApplicationMaster sleeps for ...

2016-06-02 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/13482 SPARK-15725: Ensure ApplicationMaster sleeps for the min interval. ## What changes were proposed in this pull request? Update `ApplicationMaster` to sleep for at least the minimum

[GitHub] spark issue #13482: SPARK-15725: Ensure ApplicationMaster sleeps for the min...

2016-06-02 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13482 @yhuai, @rxin, we should consider this work-around for 2.0 if it isn't too late. We see a lot of apps fail because the driver and AM lock up.

[GitHub] spark pull request #12313: [SPARK-14543] [SQL] Improve InsertIntoTable colum...

2016-06-03 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/12313#discussion_r65757163 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -284,8 +284,128 @@ class InsertIntoHiveTableSuite extends

[GitHub] spark pull request #12313: [SPARK-14543] [SQL] Improve InsertIntoTable colum...

2016-06-03 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/12313#discussion_r65758947 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -284,8 +284,128 @@ class InsertIntoHiveTableSuite extends

[GitHub] spark pull request #12313: [SPARK-14543] [SQL] Improve InsertIntoTable colum...

2016-06-03 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/12313#discussion_r65776253 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -284,8 +284,128 @@ class InsertIntoHiveTableSuite extends

[GitHub] spark pull request #12313: [SPARK-14543] [SQL] Improve InsertIntoTable colum...

2016-06-03 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/12313#discussion_r65767128 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -284,8 +284,128 @@ class InsertIntoHiveTableSuite extends

[GitHub] spark pull request #13338: [SPARK-13723] [YARN] Change behavior of --num-exe...

2016-06-10 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13338#discussion_r66683769 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -519,9 +519,9 @@ object YarnSparkHadoopUtil { conf

[GitHub] spark issue #12313: [SPARK-14543] [SQL] Improve InsertIntoTable column resol...

2016-06-10 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/12313 @yhuai, whatever release you want to target is fine with me, but I don't think we should block this on a design doc for cleaning up the `DataFrameWriter`. I'm all for writing one and I plan

[GitHub] spark pull request #13338: [SPARK-13723] [YARN] Change behavior of --num-exe...

2016-06-10 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13338#discussion_r66685637 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2262,21 +2262,39 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #12313: [SPARK-14543] [SQL] Improve InsertIntoTable colum...

2016-06-10 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/12313#discussion_r66688732 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -505,6 +506,117 @@ class Analyzer

[GitHub] spark issue #13338: [SPARK-13723] [YARN] Change behavior of --num-executors ...

2016-06-10 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13338 Thanks for reviewing, everyone! I've made some comments and will update once we have consensus on util methods and semantics.

[GitHub] spark pull request #13338: [SPARK-13723] [YARN] Change behavior of --num-exe...

2016-06-10 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13338#discussion_r66685722 --- Diff: docs/configuration.md --- @@ -1158,6 +1158,10 @@ Apart from these, the following properties are also available, and may be useful For more

[GitHub] spark pull request #13338: [SPARK-13723] [YARN] Change behavior of --num-exe...

2016-06-10 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13338#discussion_r66685237 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2262,21 +2262,39 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #12313: [SPARK-14543] [SQL] Improve InsertIntoTable colum...

2016-06-10 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/12313#discussion_r66687753 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -505,6 +506,117 @@ class Analyzer

[GitHub] spark pull request #12313: [SPARK-14543] [SQL] Improve InsertIntoTable colum...

2016-06-03 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/12313#discussion_r65743060 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -284,8 +284,128 @@ class InsertIntoHiveTableSuite extends

[GitHub] spark pull request #13482: [SPARK-15725][YARN] Ensure ApplicationMaster slee...

2016-06-11 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13482#discussion_r66713080 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala --- @@ -462,10 +464,23 @@ private[spark] class ApplicationMaster

[GitHub] spark pull request #13482: [SPARK-15725][YARN] Ensure ApplicationMaster slee...

2016-06-11 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13482#discussion_r66713200 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala --- @@ -462,10 +464,23 @@ private[spark] class ApplicationMaster

[GitHub] spark issue #13482: [SPARK-15725][YARN] Ensure ApplicationMaster sleeps for ...

2016-06-11 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13482 @andrewor14, I think we should consider two problems here: the fact that the thread will sleep for less than the min interval if something triggers it and whatever is currently triggering it. We
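The first problem named in this comment, a waiting thread that wakes before the minimum interval when something notifies it, can be illustrated with a small sketch. This is not the `ApplicationMaster` code from the pull request; it is a Python stand-in, and the function name `wait_at_least` and its parameters are hypothetical.

```python
import threading
import time


def wait_at_least(cond, min_interval, max_interval):
    """Wait on `cond` for up to `max_interval` seconds, but never
    return before `min_interval` seconds have passed.

    Hypothetical helper for illustration only.
    """
    start = time.monotonic()
    with cond:
        # May return early if another thread calls notify/notify_all.
        cond.wait(timeout=max_interval)
    elapsed = time.monotonic() - start
    if elapsed < min_interval:
        # Woken too soon: sleep off the rest of the minimum interval.
        time.sleep(min_interval - elapsed)
    return time.monotonic() - start


if __name__ == "__main__":
    cond = threading.Condition()

    def poke():
        with cond:
            cond.notify_all()

    # Notify almost immediately; the waiter still honours the minimum.
    threading.Timer(0.01, poke).start()
    print(f"waited {wait_at_least(cond, 0.05, 0.5):.3f}s")
```

The early notification still ends the long wait, so the loop stays responsive, but it can no longer cause back-to-back iterations faster than the minimum interval.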

[GitHub] spark pull request #13338: [SPARK-13723] [YARN] Change behavior of --num-exe...

2016-06-11 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13338#discussion_r66714041 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2262,21 +2262,39 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #13280: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-06-14 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13280#discussion_r67006656 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystSchemaConverter.scala --- @@ -538,6 +538,22 @@ private[parquet

[GitHub] spark pull request #13338: [SPARK-13723] [YARN] Change behavior of --num-exe...

2016-06-13 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13338#discussion_r66858331 --- Diff: docs/configuration.md --- @@ -1158,6 +1158,10 @@ Apart from these, the following properties are also available, and may be useful For more

[GitHub] spark issue #13338: [SPARK-13723] [YARN] Change behavior of --num-executors ...

2016-06-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13338 I just pushed a version that addresses the comments so far (other than the spark-submit help text). Thanks for looking at this, @vanzin, @jerryshao, and @tgravescs!

[GitHub] spark issue #13482: [SPARK-15725][YARN] Ensure ApplicationMaster sleeps for ...

2016-06-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13482 @tgravescs, removing `notifyAll` doesn't solve the problem entirely, it just removes one path that's causing the `allocate` call to be run too many times. (Also, I haven't tested delaying loss

[GitHub] spark pull request #13338: [SPARK-13723] [YARN] Change behavior of --num-exe...

2016-06-13 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13338#discussion_r66858439 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -519,9 +519,9 @@ object YarnSparkHadoopUtil { conf

[GitHub] spark pull request #13338: [SPARK-13723] [YARN] Change behavior of --num-exe...

2016-06-13 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13338#discussion_r66858477 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2262,21 +2262,39 @@ private[spark] object Utils extends Logging

[GitHub] spark issue #13338: [SPARK-13723] [YARN] Change behavior of --num-executors ...

2016-06-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13338 @tgravescs, where is the --help text for spark-submit? I'll update it.

[GitHub] spark issue #13482: SPARK-15725: Ensure ApplicationMaster sleeps for the min...

2016-06-04 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13482 cc @vanzin @tgravescs

[GitHub] spark pull request: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-05-27 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/13280#issuecomment-77478 Thanks @liancheng! It will be great to have predicate push-down for strings in 2.0!

[GitHub] spark pull request: [SPARK-9876][SQL][FOLLOWUP] Enable string and ...

2016-05-30 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13389#discussion_r65089323 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala --- @@ -50,7 +50,6 @@ private[sql] object

[GitHub] spark pull request: [SPARK-9876][SQL][FOLLOWUP] Enable string and ...

2016-05-30 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/13389#issuecomment-222521471 +1 overall, good catch on those tests.

[GitHub] spark pull request: [SPARK-14543] [SQL] Improve InsertIntoTable co...

2016-05-27 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/12313#issuecomment-05450 @yhuai, I'll answer #2 first since it's quick: the column names are used to create a projection of the incoming data frame so any extra columns aren't selected
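The idea in this comment, using the table's column names to build a projection of the incoming data so that extra columns are not selected, can be sketched as follows. This is a plain-Python illustration, not the Spark analyzer rule from the pull request; `project_rows` is a hypothetical name, and failing on a missing column is an assumption made for the example.

```python
def project_rows(rows, table_columns):
    """Reorder each incoming row (a dict keyed by column name) to the
    table's column order, dropping columns the table does not have.

    Hypothetical helper; raises KeyError if a table column is absent
    from a row (an assumption for this sketch).
    """
    return [tuple(row[name] for name in table_columns) for row in rows]


incoming = [{"id": 1, "name": "a", "extra": True}]
projected = project_rows(incoming, ["name", "id"])
# "extra" is not a table column, so it is not part of the projection.
print(projected)
```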

[GitHub] spark pull request: [SPARK-14543] [SQL] Improve InsertIntoTable co...

2016-05-27 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/12313#issuecomment-31124 @yhuai, I'll add an option for strict checking. I agree with you that we need to have a holistic solution. It's also not ideal that some write methods already

[GitHub] spark pull request: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-05-27 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/13280#issuecomment-13718 @liancheng, fixed. Yeah, IntelliJ has a few annoyances like that with scala. Imports are a mess.

[GitHub] spark pull request: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-05-27 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13280#discussion_r64948705 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala --- @@ -1415,6 +1425,18 @@ class

[GitHub] spark pull request: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-05-27 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13280#discussion_r64947459 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala --- @@ -1415,6 +1425,18 @@ class

[GitHub] spark pull request: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-05-26 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/13280#issuecomment-222027030 @liancheng: rebased. Sorry I missed that earlier.

[GitHub] spark pull request: SPARK-13723: Change behavior of --num-executor...

2016-05-26 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/13338 SPARK-13723: Change behavior of --num-executors with dynamic allocation. ## What changes were proposed in this pull request? This changes the behavior of --num-executors

[GitHub] spark pull request: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-05-26 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/13280#issuecomment-222014796 @liancheng, thanks for pointing out that fix, I've added it. I thought that was already committed since it has been a while since we fixed the Parquet side

[GitHub] spark issue #13445: [SPARK-9876] Revert "[SPARK-9876][SQL] Update Parquet to...

2016-06-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13445 Sounds reasonable, but the "regression" wasn't located or even confirmed to exist after this change was reverted the last time. There was also no follow-up on it. If we revert the change

[GitHub] spark issue #13280: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-06-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13280 @yhuai, what started failing?

[GitHub] spark issue #13280: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-06-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13280 As I said on PR #13445: It sounds reasonable, but we should follow up on this. If we revert the change I suggest that we only revert it in 2.0 or add it to master as soon as 2.0 is branched

[GitHub] spark issue #12313: [SPARK-14543] [SQL] Improve InsertIntoTable column resol...

2016-06-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/12313 @yhuai, I've removed the public API additions so we can get the changes in as you suggest. I also rebased on the current master so it can be merged. I'll fix any test failures that come up. Thanks

[GitHub] spark issue #13450: [SPARK-9876] [BRANCH-2.0] Revert "[SPARK-9876][SQL] Upda...

2016-06-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13450 +1, looks fine to me assuming tests pass.

[GitHub] spark issue #13338: [SPARK-13723] [YARN] Change behavior of --num-executors ...

2016-06-23 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13338 @tgravescs, I've updated it. Sorry about the delay, for some reason the notifications for this issue didn't make it to my inbox so I wasn't seeing updates.

[GitHub] spark issue #13482: [SPARK-15725][YARN] Ensure ApplicationMaster sleeps for ...

2016-06-23 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13482 @tgravescs, I've updated it. Sorry about the delay, for some reason the notifications for this issue didn't make it to my inbox so I wasn't seeing updates.

[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

2016-06-20 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/13769#discussion_r67749047 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -43,8 +43,128 @@ import

[GitHub] spark issue #13482: [SPARK-15725][YARN] Ensure ApplicationMaster sleeps for ...

2016-06-23 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13482 @tgravescs, thanks for reviewing! Sorry about the delay!

[GitHub] spark issue #13338: [SPARK-13723] [YARN] Change behavior of --num-executors ...

2016-06-23 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13338 Thanks @tgravescs!

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-24 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13701 @gatorsmile, sorry for the delay, I was evidently not getting notifications until I changed some settings yesterday. There are a few tests in Parquet that generate files with test data

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-24 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/13701 Yeah, Parquet doesn't make a distinction for where filters are applied. If you push a filter, then it will be applied to row groups if possible and individual rows after that. But if you're
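The two-stage behaviour described here (a pushed filter is applied to row groups where possible, then to individual rows) can be sketched in plain Python. The `RowGroup` class, `make_group`, and `scan_greater_than` are hypothetical names for this illustration and are not Parquet APIs; the sketch assumes each row group carries min/max statistics for the filtered column.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """Hypothetical row group: column values plus min/max statistics."""
    rows: List[int]
    min_value: int
    max_value: int


def make_group(rows: List[int]) -> RowGroup:
    return RowGroup(rows=rows, min_value=min(rows), max_value=max(rows))


def scan_greater_than(groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Apply the predicate `value > threshold` in two stages."""
    matched: List[int] = []
    groups_read = 0
    for group in groups:
        # Stage 1: skip the whole group if its statistics rule it out.
        if group.max_value <= threshold:
            continue
        groups_read += 1
        # Stage 2: the same predicate is applied to each surviving row.
        matched.extend(v for v in group.rows if v > threshold)
    return matched, groups_read


groups = [make_group([1, 2, 3]), make_group([4, 9]), make_group([10, 12])]
print(scan_greater_than(groups, 8))  # ([9, 10, 12], 2)
```

Only the row-level stage guarantees exact results; the row-group stage is an optimization that avoids reading groups that cannot contain a match.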

[GitHub] spark pull request #13880: SPARK-16178: Remove unnecessary Hive partition ch...

2016-06-23 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/13880 SPARK-16178: Remove unnecessary Hive partition check. ## What changes were proposed in this pull request? This removes a check that partition names match from the Hive write path, which

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-02-17 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/11242 SPARK-9926: Parallelize partition logic in UnionRDD. This patch has the new logic from #8512 that uses a parallel collection to compute partitions in UnionRDD. The rest of #8512 added

[GitHub] spark pull request: SPARK-13403: Pass hadoopConfiguration to HiveC...

2016-02-19 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/11273 SPARK-13403: Pass hadoopConfiguration to HiveConf constructors. This commit updates the HiveContext so that sc.hadoopConfiguration is used to instantiate its internal instances of HiveConf

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-18 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/11242#issuecomment-198034848 @dbtsai, I updated the PR with your suggestion and rebased on master. Thanks for taking the time to review!

[GitHub] spark pull request: [SPARK-13403] [SQL] Pass hadoopConfiguration t...

2016-03-18 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/11273#issuecomment-197622432 Sorry for the delay, @vanzin and @srowen! I didn't get notified that you had commented. I've rebased this on master and fixed the PR title.

[GitHub] spark pull request: SPARK-13779: Avoid cancelling non-local contai...

2016-03-14 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/11612#issuecomment-196463934 Thanks @vanzin!

[GitHub] spark pull request: SPARK-13779: Avoid cancelling non-local contai...

2016-03-09 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/11612 SPARK-13779: Avoid cancelling non-local container requests. ## What changes were proposed in this pull request? To maximize locality, the YarnAllocator would cancel any requests

[GitHub] spark pull request: SPARK-13688: Add spark.dynamicAllocation.overr...

2016-03-30 Thread rdblue
Github user rdblue closed the pull request at: https://github.com/apache/spark/pull/11528

[GitHub] spark pull request: SPARK-13688: Add spark.dynamicAllocation.overr...

2016-03-30 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/11528#issuecomment-203531625 Close

[GitHub] spark pull request: SPARK-14459: Detect relation partitioning and ...

2016-04-07 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/12239 SPARK-14459: Detect relation partitioning and adjust the logical plan ## What changes were proposed in this pull request? This detects a relation's partitioning and adds checks

[GitHub] spark pull request: SPARK-14459: Detect relation partitioning and ...

2016-04-07 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/12239#issuecomment-206992072 @liancheng, can you look at this? It looks like you're familiar with the SQL/Hive code.

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-07 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r58932013 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-07 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/11242#issuecomment-207062268 I've updated this PR and rebased on the current master. As Sean suggested, I used the default thread pool and a parallel collection to do the work, so the commit
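The approach mentioned here, computing each parent's partitions in parallel and then combining them, can be sketched as below. This is not the `UnionRDD` change itself: Python's `ThreadPoolExecutor` stands in for the parallel collection and default thread pool named in the comment, and `union_partitions` along with the callable "parents" are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def union_partitions(parents, max_workers=4):
    """Compute every parent's partition list concurrently, then flatten.

    `parents` is a list of zero-argument callables, each returning that
    parent's partitions (a stand-in for a potentially slow lookup).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so the union stays deterministic.
        per_parent = list(pool.map(lambda parent: parent(), parents))
    return [
        (parent_index, partition)
        for parent_index, partitions in enumerate(per_parent)
        for partition in partitions
    ]


parents = [lambda: ["a0", "a1"], lambda: ["b0"]]
print(union_partitions(parents))  # [(0, 'a0'), (0, 'a1'), (1, 'b0')]
```

The parallel step only pays off when listing a parent's partitions is expensive; the flattening itself stays sequential and ordered.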

[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...

2016-04-08 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/12239#issuecomment-207637975 Looks like I accidentally pushed a commit I didn't intend to. I've fixed that, but the test failed. This is ok to test (again).

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-08 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r59057410 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,8 +62,14 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-08 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r59058645 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,8 +62,14 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...

2016-04-08 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/12239#issuecomment-207631943 @marmbrus, I fixed the failing test. The problem was a query that didn't supply a value for one of the partitions. This commit actually gives a better error message

[GitHub] spark pull request: [SPARK-14543] [SQL] Fix InsertIntoTable column...

2016-04-11 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/12313 [SPARK-14543] [SQL] Fix InsertIntoTable column resolution. (WIP) WIP: this depends on #12239 and includes its commits for SPARK-14459. ## What changes are proposed in this pull request

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-08 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r59059521 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,8 +62,14 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-08 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r59062676 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,8 +62,14 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...

2016-04-11 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/12239#issuecomment-208530148 I just pushed a fix for the last test failure. Should be ready to test again.

[GitHub] spark pull request: [SPARK-13403] [SQL] Pass hadoopConfiguration t...

2016-03-19 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/11273#issuecomment-197945466 Thanks for the reviews!

[GitHub] spark pull request: [SPARK-13403] [SQL] Pass hadoopConfiguration t...

2016-03-19 Thread rdblue
GitHub user rdblue reopened a pull request: https://github.com/apache/spark/pull/11273 [SPARK-13403] [SQL] Pass hadoopConfiguration to HiveConf constructors. This commit updates the HiveContext so that sc.hadoopConfiguration is used to instantiate its internal instances of HiveConf

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-20 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56727449 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-21 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56893742 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-21 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56890135 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-22 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r57014360 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-23 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r57203120 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-21 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r5688 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-19 Thread rdblue
Github user rdblue commented on the pull request: https://github.com/apache/spark/pull/11242#issuecomment-197591766 @JoshRosen, I've added the test. Sorry for taking so long, my e-mail notifications weren't getting through for a little while.

[GitHub] spark pull request: [SPARK-13403] [SQL] Pass hadoopConfiguration t...

2016-03-19 Thread rdblue
Github user rdblue closed the pull request at: https://github.com/apache/spark/pull/11273
