[GitHub] spark issue #14279: [SPARK-16216][SQL] Write Timestamp and Date in ISO 8601 ...

2016-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14279 @srowen I think it is okay as a separate PR. I believe adding the things I said for a follow-up will mess up the change and make this hard to be reviewed. Nevertheless, I don't mind

[GitHub] spark pull request #14627: [SPARK-16975][SQL] Do not duplicately check file ...

2016-08-12 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14627 [SPARK-16975][SQL] Do not duplicately check file paths in data sources implementing FileFormat and prevent to attempt to list twice in ORC ## What changes were proposed in this pull request

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14627 @dongjoon-hyun Thanks for taking a look! Actually, I intended to test your fix for all data sources just to make sure the issue in `SPARK-16975` is resolved for all. As it is a clean

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14627 I am happy to add some more tests but I am a bit confused about what I should test. Maybe a test for `HadoopFsRelation.listLeafFiles` to make sure directories are not included in the return value

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14627 FYI, in most data sources, it would try to read duplicately or fail to infer the schema (haven't tested this yet) if directories are allowed in `FileFormat.inferSchema`. --- If your project

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14627 Yes, it is not. I am sorry for the confusion. It seems your fix is perfectly fine and correct but it is just a clean-up to get rid of duplicated logic. BTW, I am also okay

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14627 Could you take a look please @rxin and @liancheng ?

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14627 BTW, I am waiting for the Jenkins tests before cc'ing someone. However, as you are already here (I appreciate it), I would appreciate it if you took a look.

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14627 Ah, yes thank you!

[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14593 Also just for reviewers, the inconsistencies I listed in the PR description occur randomly across the documentation. So, this fixes them to be consistent according to the style guidelines

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14102 @cloud-fan Thanks! I think it is ready to be reviewed.

[GitHub] spark pull request #14593: [MINOR][DOCS] Fix style in examples and inconsist...

2016-08-10 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14593 [MINOR][DOCS] Fix style in examples and inconsistent indentation across documentation ## What changes were proposed in this pull request? This PR fixes the documentation as below

[GitHub] spark issue #14588: [SPARK-17005][SQL] fix method tpe in trait AnnotationApi...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14588 Please let me cc @srowen to make sure because I believe it is about building.

[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14593 BTW, this is not fixing some wrong examples and inconsistent indentation codes in `structured-streaming-programming-guide.md` because https://github.com/apache/spark/pull/14564 is handling them

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74374585 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,289 @@ import

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74372103 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,289 @@ import

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74372641 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,289 @@ import

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74375624 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,289 @@ import

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74375976 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,289 @@ import

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74382183 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,296 @@ import

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74382623 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,296 @@ import

[GitHub] spark issue #14593: [MINOR][DOC] Fix style in examples and inconsistent inde...

2016-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14593 @srowen Thank you for review! I can revert non-visible ones but will just leave the visible ones. I am okay with reverting this (anyway now we know there are nits of spacing

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74214974 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,330 @@ import

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r74220035 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,330 @@ import

[GitHub] spark pull request #14172: [SPARK-16516][SQL] Support for pushing down filte...

2016-07-12 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14172 [SPARK-16516][SQL] Support for pushing down filters for decimal and timestamp types in ORC ## What changes were proposed in this pull request? It seems ORC supports all the types

[GitHub] spark issue #14172: [SPARK-16516][SQL] Support for pushing down filters for ...

2016-07-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14172 @liancheng I thought both were omitted in the original codes because `BigDecimal` and `Timestamp` are not supported but it seems they are and work fine. Could you please take a look

[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14181 FYI, if replacement is disabled, it fails when the ratio is more than 1.0. ``` scala> spark.range(10).sample(false, 1.1).withColumn("mid", monotonically_increa

[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14181 FYI, it seems it still happens even if the ratio is less than 1.0 because it is sampling with replacement. ``` scala> spark.range(10).sample(true, 0.5).withColumn("

[GitHub] spark pull request #14181: [SPARK-15382][SQL] Fix a rule to push down projec...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14181#discussion_r70765806 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -1536,21 +1536,26 @@ class Dataset[T] private[sql]( * Returns a new

[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14181 Yea, sampling with replacement expects the results can be duplicated. IMHO, this fix should be enabled always when `replace` is `true`.

[GitHub] spark pull request #14181: [SPARK-15382][SQL] Fix a rule to push down projec...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14181#discussion_r70766221 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -154,8 +154,12 @@ class SimpleTestOptimizer

[GitHub] spark pull request #14181: [SPARK-15382][SQL] Fix a rule to push down projec...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14181#discussion_r70771559 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -1536,21 +1536,26 @@ class Dataset[T] private[sql]( * Returns a new

[GitHub] spark pull request #14181: [SPARK-15382][SQL] Fix a rule to push down projec...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14181#discussion_r70770371 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -154,8 +154,12 @@ class SimpleTestOptimizer

[GitHub] spark issue #14294: [SPARK-16646][SQL] LEAST and GREATEST doesn't accept num...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14294 @liancheng and @davies Will this change be appropriate? Could you please take a look?

[GitHub] spark issue #14117: [SPARK-16461][SQL] Support partition batch pruning with ...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14117 Could you please take a look @rxin ?

[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14035 hm.. I can close this if it looks inappropriate or seems to cause a lot of conflicts across PRs. Could you give some feedback please @mengxr and @yanboliang ?

[GitHub] spark pull request #14294: [SPARK-16646][SQL] LEAST and GREATEST doesn't acc...

2016-07-20 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14294 [SPARK-16646][SQL] LEAST and GREATEST doesn't accept numeric arguments with different data types ## What changes were proposed in this pull request? This PR makes `LEAST

[GitHub] spark issue #14294: [SPARK-16646][SQL] LEAST and GREATEST doesn't accept num...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14294 retest this please

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71099098 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JSONOptions.scala --- @@ -51,7 +53,8 @@ private[sql] class

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71099616 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14102 @yhuai Thank you for your review! I will try to address all your comments first.

[GitHub] spark pull request #13988: [SPARK-16101][SQL] Refactoring CSV data source to...

2016-07-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13988#discussion_r71097989 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityGenerator.scala --- @@ -0,0 +1,83 @@ +/* + * Licensed

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71099344 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import

[GitHub] spark pull request #14217: [SPARK-16562][SQL] Do not allow downcast in INT32...

2016-07-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14217#discussion_r71076055 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala --- @@ -169,6 +169,19 @@ class ParquetIOSuite

[GitHub] spark issue #14236: [SPARK-16588][SQL] Missed API fix for a function name mi...

2016-07-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14236 cc @rxin

[GitHub] spark issue #14236: [SPARK-16588][SQL] Missed API fix for a function name mi...

2016-07-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14236 FYI, there is one more usage in [ColumnExpressionSuite.scala#L517](https://github.com/apache/spark/blob/46395db80e3304e3f3a1ebdc8aadb8f2819b48b4/sql/core/src/test/scala/org/apache/spark/sql

[GitHub] spark pull request #14236: [SPARK-16588][SQL] Missed API fix for a function ...

2016-07-17 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14236 [SPARK-16588][SQL] Missed API fix for a function name mismatched between FunctionRegistry and functions.scala ## What changes were proposed in this pull request? It seems the function

[GitHub] spark pull request #12944: [SPARK-15074][Shuffle] Cache shuffle index file t...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12944#discussion_r71254024 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexRecord.java --- @@ -0,0 +1,39 @@ +/* + * Licensed

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71261575 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/InferSchema.scala --- @@ -60,13 +60,13 @@ private[sql] object

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71276368 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14102 @yhuai the commits I pushed include the changes below: - Reverts the changes in `JSONOptions` about `columnNameOfCorruptRecord` https://github.com/apache/spark/pull/14102

[GitHub] spark issue #13912: [SPARK-16216][SQL] CSV data source supports custom date ...

2016-07-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13912 I am closing this first and then will open a new one only for the default format within today. I will cc you all as soon as I open. Thanks!

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-19 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/13912

[GitHub] spark pull request #14279: [SPARK-16216][SQL] Write Timestamp and Date in IS...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14279#discussion_r71474425 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -62,6 +64,10 @@ object DateTimeUtils

[GitHub] spark pull request #14242: Add a comment

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14242#discussion_r71501991 --- Diff: examples/src/main/scala/org/apache/spark/examples/SparkKMeans.scala --- @@ -33,7 +33,6 @@ object SparkKMeans { def parseVector(line

[GitHub] spark pull request #14279: [SPARK-16216][SQL] Write Timestamp and Date in IS...

2016-07-20 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14279 [SPARK-16216][SQL] Write Timestamp and Date in ISO 8601 formatted string by default for CSV and JSON ## What changes were proposed in this pull request? Currently, CSV datasource

[GitHub] spark pull request #14279: [SPARK-16216][SQL] Write Timestamp and Date in IS...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14279#discussion_r71474579 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -477,15 +478,15 @@ class CSVSuite extends

[GitHub] spark issue #14279: [SPARK-16216][SQL] Write Timestamp and Date in ISO 8601 ...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14279 cc @rxin, @srowen and @deanchen

[GitHub] spark issue #14217: [SPARK-16562][SQL] Do not allow downcast in INT32 based ...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14217 Closing this due to HIVE-14294. This will cause SPARK-16632 for the normal Parquet reader as well.

[GitHub] spark pull request #14217: [SPARK-16562][SQL] Do not allow downcast in INT32...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/14217

[GitHub] spark issue #14279: [SPARK-16216][SQL] Write Timestamp and Date in ISO 8601 ...

2016-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14279 It seems to be due to https://issues.apache.org/jira/browse/LANG-982 I will look into this more deeply

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Could you take a look please @marmbrus ?

[GitHub] spark issue #14117: [SPARK-16461][SQL] Support partition batch pruning with ...

2016-07-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14117 Gentle ping @davies @liancheng

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14102 Sorry for pinging here and there.. @yhuai @liancheng

[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

2016-07-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14035 Gentle ping @mengxr and @yanboliang

[GitHub] spark issue #14028: [SPARK-16351][SQL] Avoid per-record type dispatch in JSO...

2016-07-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14028 @yhuai Could you take another look?

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71120468 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -195,18 +202,40 @@ private[sql] class

[GitHub] spark issue #14236: [SPARK-16588][SQL] Missed API fix for a function name mi...

2016-07-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14236 @rxin Sure, Thank you very much.

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71130049 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71131209 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71124928 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -195,18 +202,40 @@ private[sql] class

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71124645 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -195,18 +202,40 @@ private[sql] class

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71127773 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71124186 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71124234 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #14215: [SPARK-16544][SQL][WIP] Support for conversion fr...

2016-07-14 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14215 [SPARK-16544][SQL][WIP] Support for conversion from compatible schema for Parquet data source when data types are not matched ## What changes were proposed in this pull request

[GitHub] spark pull request #14217: [SPARK-16562][SQL] Do not allow downcast in INT32...

2016-07-14 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14217 [SPARK-16562][SQL] Do not allow downcast in INT32 based types for normal Parquet reader ## What changes were proposed in this pull request? Currently, INT32 based types, (`ShortType

[GitHub] spark issue #14035: [SPARK-16356][ML] Add testImplicits for ML unit tests an...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14035 ping @mengxr and @yanboliang

[GitHub] spark issue #14215: [SPARK-16544][SQL][WIP] Support for conversion from comp...

2016-07-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14215 I see, yes I will think of a better way to fix the message. Yea it is still happening across other data sources and this implementation is very specific to Parquet. I just wonder we

[GitHub] spark issue #14215: [SPARK-16544][SQL][WIP] Support for conversion from comp...

2016-07-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14215 For handling messages, I will open a separate PR soon!

[GitHub] spark issue #14215: [SPARK-16544][SQL][WIP] Support for conversion from comp...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14215 Hi @gatorsmile @dongjoon-hyun @liancheng , currently this deals with only `NumericType` except `DecimalType` for upcasting only for non-vectorized reader. Before proceeding further, I

[GitHub] spark issue #14217: [SPARK-16562][SQL] Do not allow downcast in INT32 based ...

2016-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14217 cc @liancheng

[GitHub] spark issue #12972: [SPARK-15198][SQL] Support for pushing down filters for ...

2016-06-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/12972 Hi @liancheng! Could you take a quick look, please?

[GitHub] spark pull request #13942: [SPARK-16250][PYSPARK] Can't use escapeQuotes opt...

2016-06-28 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/13942

[GitHub] spark issue #13942: [SPARK-16250][PYSPARK] Can't use escapeQuotes option in ...

2016-06-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13942 Fixed in https://github.com/apache/spark/commit/1aad8c6e59c1e8b18a3eaa8ded93ff6ad05d83df

[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...

2016-06-28 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13517#discussion_r68871740 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -45,11 +45,11 @@ statement | ALTER DATABASE

[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...

2016-06-28 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13517#discussion_r68869270 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -45,11 +45,11 @@ statement | ALTER DATABASE

[GitHub] spark pull request #13963: [TRIVIAL][PYSPARK] Clean up orc compression optio...

2016-06-28 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/13963 [TRIVIAL][PYSPARK] Clean up orc compression option as well ## What changes were proposed in this pull request? This PR corrects ORC compression option for PySpark as well. I think

[GitHub] spark issue #13963: [TRIVIAL][PYSPARK] Clean up orc compression option as we...

2016-06-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13963 cc @davies

[GitHub] spark issue #12972: [SPARK-15198][SQL] Support for pushing down filters for ...

2016-07-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/12972 No problem! Thank you!

[GitHub] spark issue #14067: [SPARK-16371][SQL] Do not push down filters incorrectly ...

2016-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14067 Hi, @rxin @liancheng, I hope this is not missed for 2.0.

[GitHub] spark issue #14067: [SPARK-16371][SQL] Do not push down filters incorrectly ...

2016-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14067 Oh, @liancheng, I just corrected some more. Please take another look.

[GitHub] spark pull request #14067: [SPARK-16371][SQL] Do not push down filters incor...

2016-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14067#discussion_r69707869 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala --- @@ -188,7 +188,7 @@ private[sql] object

[GitHub] spark issue #14067: [SPARK-16371][SQL] Do not push down filters incorrectly ...

2016-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14067 Hm... I thought Spark does not support filter pushdown for nested fields.

[GitHub] spark issue #14067: [SPARK-16371][SQL] Do not push down filters incorrectly ...

2016-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14067 cc @viirya as well.

[GitHub] spark issue #14067: [SPARK-16371][SQL] Do not push down filters incorrectly ...

2016-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14067 Yes, it is. I just found them and will correct them all. Thank you!

[GitHub] spark pull request #14067: [SPARK-16371][SQL] Do not push down filters incor...

2016-07-06 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14067 [SPARK-16371][SQL] Do not push down filters incorrectly when inner name and outer name are the same in Parquet ## What changes were proposed in this pull request? Currently

[GitHub] spark issue #14028: [SPARK-16351][SQL] Avoid record-per type dispatch in JSO...

2016-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14028 Actually, I modeled this on your code here, @liancheng. Would this change be sensible?

[GitHub] spark pull request #14063: [MINOR][PySpark][DOC] Fix SparkSession and Builde...

2016-07-05 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14063 [MINOR][PySpark][DOC] Fix SparkSession and Builder API docum ## What changes were proposed in this pull request? This PR fixes wrongly formatted examples in PySpark documentation

[GitHub] spark issue #14063: [MINOR][PySpark][DOC] Fix code examples of SparkSession ...

2016-07-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14063 Oh, I will fix them up soon.
