[GitHub] spark pull request: [SPARK-15476][SQL] Support for reading text da...

2016-05-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13254#issuecomment-221199393 Oh, I was doing this for a `test`. Sorry, I will update this. --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] spark pull request: [SPARK-15476][SQL] Support for reading text da...

2016-05-24 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/13254

[GitHub] spark pull request: [SPARK-15476][SQL] Support for reading text da...

2016-05-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13254#issuecomment-221201817 I was totally stupid. I was testing `test` instead of `text`. I am closing this; sorry for the cc.

[GitHub] spark pull request: [SPARK-15475][SQL] Add tests for writing and r...

2016-05-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13253#issuecomment-221187678 @rxin No, it has not been fixed. So I wanted to add some tests first to check writing and reading empty data, to make sure this is working first

[GitHub] spark pull request: [SPARK-15475][SQL] Add tests for writing and r...

2016-05-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13253#issuecomment-221171371 Hi @rxin and @marmbrus, As you already know a "critical" issue was found here, [SPARK-15393](https://issues.apache.org/jira/browse/SPARK-15393). S

[GitHub] spark pull request: [SPARK-15475][SQL] Add tests for writing and r...

2016-05-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13253#issuecomment-221223288 I don't mind closing this. I will close if you think so. I can do this together later with [SPARK-10216](https://issues.apache.org/jira/browse/SPARK-10216

[GitHub] spark pull request: [SPARK-14800][SQL] Dealing with null as a valu...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12629#issuecomment-218415457 I am closing this because there was no answer, which I guess means it is either undecided or not worth doing.

[GitHub] spark pull request: [SPARK-15267][SQL] Refactor and add some class...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13048#discussion_r62825064 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcOptions.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Update LocalKMeans.scala

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13047#discussion_r62822648 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LocalKMeans.scala --- @@ -46,17 +46,21 @@ private[mllib] object LocalKMeans extends

[GitHub] spark pull request: [SPARK-15267][SQL] Refactor and add some class...

2016-05-11 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/13048 [SPARK-15267][SQL] Refactor and add some classes for options in datasources ## What changes were proposed in this pull request? Currently, Parquet, JSON and CSV data sources have

[GitHub] spark pull request: Update LocalKMeans.scala

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13047#issuecomment-218418565 I am hesitant to say this, but personally I would suggest closing this first and then following the guides from scratch.

[GitHub] spark pull request: [SPARK-14800][SQL] Dealing with null as a valu...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/12629

[GitHub] spark pull request: Update LocalKMeans.scala

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13047#issuecomment-218417501 FYI, it would be great if you run `./dev/run_tests` first and then follow https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark, https

[GitHub] spark pull request: Update LocalKMeans.scala

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13047#issuecomment-218421104 @mouendless Thank you for bearing with me.

[GitHub] spark pull request: [SPARK-15267][SQL] Refactor and add some class...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13048#issuecomment-218440525 cc @rxin

[GitHub] spark pull request: [BUILD] Test closing stale PRs

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13052#issuecomment-218484609 +1

[GitHub] spark pull request: [BUILD] Test closing stale PRs

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13052#issuecomment-218487757 Do you mind adding #13003 maybe?

[GitHub] spark pull request: [SPARK-15031][SPARK-15134][EXAMPLE][DOC] Use S...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13050#discussion_r62861788 --- Diff: examples/src/main/python/ml/simple_params_example.py --- @@ -18,36 +18,30 @@ from __future__ import print_function import

[GitHub] spark pull request: [SPARK-15250][SQL] Remove deprecated json API ...

2016-05-10 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/13040 [SPARK-15250][SQL] Remove deprecated json API in DataFrameReader ## What changes were proposed in this pull request? This PR removes the old `json(path: String)` API which is covered

[GitHub] spark pull request: [SPARK-15250][SQL] Remove deprecated json API ...

2016-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13040#issuecomment-218350886 (Sorry, I left a wrong comment referring to unrelated PRs here, and I have removed it.)

[GitHub] spark pull request: [SPARK-15264][SQL] CSV Reader Error on Blank C...

2016-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218345563 (I think it would be nicer if the PR description were filled in.)

[GitHub] spark pull request: [SPARK-15264][SQL] CSV Reader Error on Blank C...

2016-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13041#discussion_r62792044 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -61,7 +61,9 @@ class DefaultSource extends

[GitHub] spark pull request: [SPARK-15264][SQL] CSV Reader Error on Blank C...

2016-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218349299 Related to https://github.com/apache/spark/pull/12904 and https://github.com/apache/spark/pull/12921. @andrewor14 Do you mind if I review

[GitHub] spark pull request: [SPARK-15250][SQL] Remove deprecated json API ...

2016-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13040#issuecomment-218349218 Related to https://github.com/apache/spark/pull/12904 and https://github.com/apache/spark/pull/12921. @andrewor14 Do you mind if I review

[GitHub] spark pull request: [SPARK-15264][SQL] CSV Reader Error on Blank C...

2016-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218363294 @anabranch First of all, I think currently (at least for me) it is really confusing to deal with `null`, `""` and empty string in CSV. I am tryin

[GitHub] spark pull request: [SPARK-15250][SQL] Remove deprecated json API ...

2016-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13040#issuecomment-218363946 cc @rxin (I see the doc tests and tests in Python are still using `SqlContext`. Do you mind if I correct this in another PR?)

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218604088 @andrewor14 Also, I am cautious about this because the header might intentionally be an empty string, meaning the field name is literally an empty string. For example

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13041#discussion_r62939104 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -61,9 +61,11 @@ class DefaultSource extends

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218610605 (I remember, for example, that setting `nullValue` to `"abc"` will affect which values Univocity thinks are `null`, ending up with duplicated field names

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218602160 @andrewor14 they are related because the first row is already parsed by Univocity parser in which `nullValue` is set, [here](https://github.com/apache/spark/blob

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13041#discussion_r62936110 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -61,9 +61,11 @@ class DefaultSource extends

[GitHub] spark pull request: [SPARK-4131] [SQL] Support INSERT OVERWRITE [L...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13067#issuecomment-218761425 (It seems the last part of the title is truncated.)

[GitHub] spark pull request: [SPARK-15125][SQL] Changing CSV data source ma...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12904#issuecomment-218635820 After the discussion here, https://github.com/apache/spark/pull/12904, then I think ```scala StructField("", StringType, nulla

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218644105 @marmbrus I am sorry that I said that without knowing the background enough. I wanted to say that might break the consistency of dealing with fields in Spark SQL

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218630690 @andrewor14 Then, should this be supported for the JSON data source as well?

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218623499 Could I maybe ask your thoughts on the comment above https://github.com/apache/spark/pull/13041#issuecomment-218604088?

[GitHub] spark pull request: [SPARK-15264][SPARK-15274][SQL] CSV Reader Err...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13041#issuecomment-218633927 I am not a committer and I might not have the right to say this, but from my point of view, this change should not be included in Spark 2.0 but in 2.1. I think

[GitHub] spark pull request: [SPARK-15125][SQL] Changing CSV data source ma...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12904#issuecomment-218633000 @andrewor14 Here it seems it is concluded that `""` is a string and an empty string is `null`. Because `""` is a legitimate string, this can be a

[GitHub] spark pull request: [SPARK-15267][SQL] Refactor and add some class...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13048#issuecomment-218695548 @rxin Sure. How about including `ORCOptions`? This will be almost identical to `ParquetOptions`.

[GitHub] spark pull request: [SPARK-15125][SQL] Changing CSV data source ma...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12904#issuecomment-218680978 After the discussion https://github.com/apache/spark/pull/13041, I think the field below: ```scala StructField("", StringType, null

[GitHub] spark pull request: [SPARK-4131] [SQL] Support INSERT OVERWRITE [L...

2016-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13067#discussion_r62963928 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4131] [SQL] Support INSERT OVERWRITE [L...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13067#issuecomment-218871601 @Parth-Brahmbhatt I think the title of this PR, `[SPARK-4131] [SQL] Support INSERT OVERWRITE [LOCAL] DIRECTORY '/path/…` is incomplete because it is ending

[GitHub] spark pull request: [DOC][MINOR] ml.feature Scala and Python API s...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13159#discussion_r63637481 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorIndexer.scala --- @@ -240,7 +240,8 @@ object VectorIndexer extends

[GitHub] spark pull request: [DOC][MINOR] ml.feature Scala and Python API s...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13159#discussion_r63637490 --- Diff: python/pyspark/ml/feature.py --- @@ -2401,7 +2413,7 @@ class PCAModel(JavaModel, JavaMLReadable, JavaMLWritable

[GitHub] spark pull request: [DOC][MINOR] ml.feature Scala and Python API s...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13159#discussion_r63637479 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala --- @@ -106,7 +107,7 @@ object PCA extends DefaultParamsReadable[PCA

[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13135#issuecomment-219876620 @MLnick Yes. It seems `threshold` and `thresholds` are set exclusively via `set..` methods, but in Python both can be set via the `fit` method.

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r63626483 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -293,4 +293,65 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r63626503 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -293,4 +293,65 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request: [SPARK-15365] [SQL]: When table size statistic...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13150#discussion_r63626710 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala --- @@ -114,17 +115,27 @@ private[hive] case class MetastoreRelation

[GitHub] spark pull request: [SPARK-15365] [SQL]: When table size statistic...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13150#discussion_r63626678 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala --- @@ -114,17 +115,27 @@ private[hive] case class MetastoreRelation

[GitHub] spark pull request: [SPARK-15365] [SQL]: When table size statistic...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13150#discussion_r63628099 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala --- @@ -114,17 +115,27 @@ private[hive] case class MetastoreRelation

[GitHub] spark pull request: [SPARK-4131] [SQL] Support INSERT OVERWRITE [L...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13067#issuecomment-219894120 Please let me cc @hvanhovell because it seems most of the code was recently updated by him.

[GitHub] spark pull request: [SPARK-8603] [sparkR] In windows, Incorrect fi...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/7025#discussion_r63630461 --- Diff: R/pkg/R/client.R --- @@ -42,6 +42,19 @@ determineSparkSubmitBin <- function() { } sparkSubmitBinName } +# R supports b

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636132 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala --- @@ -103,8 +121,11 @@ private[spark] object BlockManagerId

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636200 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala --- @@ -124,6 +138,13 @@ class BlockManagerMasterEndpoint

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636245 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockReplicationPrioritization.scala --- @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636272 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockReplicationPrioritization.scala --- @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636294 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockReplicationPrioritization.scala --- @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636336 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1079,109 +1103,97 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636461 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1079,109 +1103,97 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636491 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1079,109 +1103,97 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636534 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1079,109 +1103,97 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63636922 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1079,109 +1103,97 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-14851] Support radix sort with nullable...

2016-05-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13161#discussion_r63637193 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala --- @@ -64,49 +64,57 @@ case class SortOrder(child

[GitHub] spark pull request: [SPARK-15267][SQL] Refactor options for JDBC a...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13048#discussion_r63135350 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcOptions.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [spark-15212][SQL]CSV file reader when read fi...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12987#issuecomment-218958088 @WeichenXu123 I tried that with `ignoreLeadingWhiteSpace` and `ignoreTrailingWhiteSpace` and it seems to work fine. I am hesitant to say this because I am

[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13095#discussion_r63151860 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -220,11 +220,18 @@ class AstBuilder extends

[GitHub] spark pull request: [SPARK-15313][SQL] EmbedSerializerInFilter rul...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13096#discussion_r63152997 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1560,7 +1561,15 @@ object EmbedSerializerInFilter

[GitHub] spark pull request: [SPARK-15125][SQL] Changing CSV data source ma...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12904#issuecomment-218976260 @sureshthalamati Oh, the comments are not related to this PR, but moving the discussion here was suggested, so I did. Sorry if it was confusing

[GitHub] spark pull request: [SPARK-15313][SQL] EmbedSerializerInFilter rul...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13096#discussion_r63153328 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1560,7 +1561,15 @@ object EmbedSerializerInFilter

[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13105#issuecomment-219588474 Just for a note, the JSON data source is doing this via [`JacksonGenerator.apply()`](https://github.com/apache/spark/blob/d6dc12ef0146ae409834c78737c116050961f350

[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13105#discussion_r63450323 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -172,4 +173,15 @@ class DefaultSource

[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13135#issuecomment-219569749 @yanboliang Thank you so much for taking a close look and a detailed explanation!

[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13105#discussion_r63447722 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -172,4 +173,15 @@ class DefaultSource

[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13135#issuecomment-219419054 (Please let me leave a link I usually refer to: https://github.com/databricks/scala-style-guide)

[GitHub] spark pull request: [SPARK-6706] [MLlib] Reduce duplicate computat...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13133#issuecomment-219419219 (Please let me leave a link I usually refer to: https://github.com/databricks/scala-style-guide)

[GitHub] spark pull request: [SPARK-10216][SQL] Avoid creating empty files ...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12855#discussion_r63466408 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -216,6 +215,33 @@ class InsertIntoHiveTableSuite

[GitHub] spark pull request: update from orign

2016-05-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13118#issuecomment-219223860 @zhaorongsheng It seems this was opened by mistake. I guess it might have to be closed.

[GitHub] spark pull request: [SPARK-15325][SQL] Replace the usage of deprec...

2016-05-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13113#issuecomment-219333062 @srowen Could I ask whether this would be a proper change? If so, I would like to go ahead for Python and others.

[GitHub] spark pull request: [SPARK-15325][SQL] Replace the usage of deprec...

2016-05-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13113#issuecomment-219341324 It seems `unionAll()` is all cleaned up. I am closing this.

[GitHub] spark pull request: [SPARK-15325][SQL] Replace the usage of deprec...

2016-05-15 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/13113

[GitHub] spark pull request: [SPARK-15325][SQL] Replace the usage of deprec...

2016-05-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13113#issuecomment-219340807 @gatorsmile Oh, it seems this PR has duplicated changes. Sorry, I should have looked through JIRAs and PRs more carefully. Would that make sense if I try

[GitHub] spark pull request: [SPARK-15340][SQL]Limit the size of the map us...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13130#discussion_r63334846 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -363,7 +363,7 @@ private[spark] object HadoopRDD extends Logging

[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-16 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/13135 [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python param example working with SparkSession ## What changes were proposed in this pull request? It seems most of Python examples were

[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13135#issuecomment-219405542 Could you please take a look? @MLnick

[GitHub] spark pull request: [SPARK-15267][SQL] Refactor options for JDBC a...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13048#discussion_r63134568 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcOptions.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-15143][SPARK-15144][SQL] Add CSV tests ...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12921#discussion_r63140449 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVTypeCastSuite.scala --- @@ -73,10 +73,10 @@ class CSVTypeCastSuite

[GitHub] spark pull request: [SPARK-15198][SQL] Support for pushing down fi...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12972#issuecomment-218946661 Please excuse my ping, @liancheng

[GitHub] spark pull request: [SPARK-13866][SQL] Handle decimal type in CSV ...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11724#issuecomment-218946776 @rxin Do you mind if I ask for a quick look again?

[GitHub] spark pull request: [SPARK-10216][SQL] Avoid creating empty files ...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12855#issuecomment-218946730 ping @marmbrus

[GitHub] spark pull request: [SPARK-15267][SQL] Refactor options for JDBC a...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13048#discussion_r63133772 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcOptions.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-15267][SQL] Refactor options for JDBC a...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13048#discussion_r63134045 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcOptions.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-15143][SPARK-15144][SQL] Add CSV tests ...

2016-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12921#issuecomment-218945091 Please excuse my ping @rxin @falaki

[GitHub] spark pull request: Fix reading of partitioned format=text dataset...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13104#issuecomment-219199027 (I think it might need a JIRA because it seems to change existing behaviour)

[GitHub] spark pull request: Problem select empty ORC table

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13103#issuecomment-219193053 In https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-ContributingBugReports , it does not seem to suggest creating a PR

[GitHub] spark pull request: Problem select empty ORC table

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13103#issuecomment-219193377 For me, I would say this PR might be better closed if it does not propose code changes. I think it would be nicer if the bug report were in the JIRA

[GitHub] spark pull request: [SPARK-13850] Force the sorter to Spill when n...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13107#discussion_r6329 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java --- @@ -24,6 +24,8 @@ import java.util.Queue

[GitHub] spark pull request: Problem select empty ORC table

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13103#issuecomment-219188872 Because I guess this is a contribution to Spark and there is a guide for this. It seems there are a lot of things wrong with this PR (e.g. no JIRA).

[GitHub] spark pull request: [SPARK-13850] Force the sorter to Spill when n...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13107#discussion_r63267469 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java --- @@ -143,6 +151,8 @@ private

[GitHub] spark pull request: Fix MapObjects.itemAccessorMethod to handle Ti...

2016-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13108#issuecomment-219191307 (I think it might need a JIRA because it seems to change existing behaviour)
