[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12850#discussion_r65272379 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -742,6 +742,23 @@ object

[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12850 Hi, @cloud-fan . Now, I made a new rule `ReorderAssociativeOperator` as you recommended. Jira issue and PR description are updated together, too. --- If your project is set up

[GitHub] spark pull request: [SPARK-15612][SQL] Raise exception if decimal ...

2016-05-27 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/13358

[GitHub] spark pull request: [SPARK-15603][MLLIB] Replace SQLContext with S...

2016-05-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13352#issuecomment-222115643 This removes lots of deprecation warning messages like the following. ``` /home/jenkins/workspace/SparkPullRequestBuilder/mllib/src/main/scala/org/apache

[GitHub] spark pull request: [SPARK-15612][SQL] Raise exception if decimal ...

2016-05-27 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13358 [SPARK-15612][SQL] Raise exception if decimal `scale` >= `precision` ## What changes were proposed in this pull request? Currently, Spark raises exceptions only when decimal `sc

[GitHub] spark pull request: [SPARK-15603][MLLIB] Replace SQLContext with S...

2016-05-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13352#issuecomment-06324 Hi, @andrewor14 . This is about deprecation warnings about `SQLContext`s.

[GitHub] spark pull request: [SPARK-15603][MLLIB] Replace SQLContext with S...

2016-05-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13352#issuecomment-16523 Oh, thank you, @andrewor14 . I see. I will make another PR for using the `builder.sparkContext(sc)` pattern.

[GitHub] spark pull request: [SPARK-15584][SQL] Abstract duplicate code: `s...

2016-05-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13349#issuecomment-16623 Thank you again, @andrewor14 !

[GitHub] spark pull request: [SPARK-15618][SQL][MLLIB] Use SparkSession.bui...

2016-05-27 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13365 [SPARK-15618][SQL][MLLIB] Use SparkSession.builder.sparkContext if applicable. ## What changes were proposed in this pull request? This PR changes function

[GitHub] spark pull request: [SPARK-15618][SQL][MLLIB] Use SparkSession.bui...

2016-05-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13365#issuecomment-75399 Hi, @andrewor14 . Could you review this?

[GitHub] spark pull request: [SPARK-15618][SQL][MLLIB] Use SparkSession.builder.spark...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13365 Thank you, @andrewor14 .

[GitHub] spark pull request: [MINOR][SQL][DOCS] Fix docs of Dataset.scala and SQLImpl...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13420 Thank you, @srowen and @andrewor14 !

[GitHub] spark pull request: [MINOR][SQL][DOCS] Fix docs of Dataset.scala and SQLImpl...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13420#discussion_r65286389 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -93,7 +93,7 @@ private[sql] object Dataset { * to some files

[GitHub] spark issue #13449: [SPARK-15709][SQL] Prevent `freqItems` from raising `Uns...

2016-06-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13449 Thank you, @srowen .

[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-06-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/12850 Oh, thank you! @cloud-fan .

[GitHub] spark pull request #13464: [Minor] Fix Java Lint errors introduced by #13286...

2016-06-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13464#discussion_r65584402 --- Diff: dev/checkstyle.xml --- @@ -157,7 +157,8

[GitHub] spark pull request: [SPARK-15584][SQL] Abstract duplicate code: `s...

2016-05-26 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13349#issuecomment-222054344 Hi, @andrewor14 . This is the PR for SPARK-15583.

[GitHub] spark pull request: [SPARK-15584][SQL] Abstract duplicate code: `s...

2016-05-26 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13349 [SPARK-15584][SQL] Abstract duplicate code: `spark.sql.sources.` properties ## What changes were proposed in this pull request? This PR replaces `spark.sql.sources.` strings

[GitHub] spark pull request: [SPARK-15584][SQL] Abstract duplicate code: `s...

2016-05-26 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13349#issuecomment-222059440 Thank YOU for pinging me, @andrewor14 . For me, finding the real issue is the most difficult part. :)

[GitHub] spark pull request: [SPARK-15583][SQL] Disallow altering datasourc...

2016-05-26 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13341#discussion_r64847876 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -255,6 +255,23 @@ case class

[GitHub] spark pull request: [SPARK-15603][MLLIB] Replace SQLContext with S...

2016-05-27 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13352 [SPARK-15603][MLLIB] Replace SQLContext with SparkSession in ML/MLLib ## What changes were proposed in this pull request? This PR replaces all deprecated `SQLContext` occurrences

[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/12850 Hi, @cloud-fan . It's ready for review again. Could you review this when you have some time? Thank you always!

[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12850#discussion_r65464887 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReorderAssociativeOperatorSuite.scala --- @@ -0,0 +1,59

[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12850#discussion_r65465190 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -738,6 +739,49 @@ object

[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/12850 @cloud-fan . According to your advice, I refactored the code and added mixed (addition + multiplication) test cases. Also, the PR description is updated. Thank you so much again

[GitHub] spark pull request #13449: [SPARK-15709][SQL] Prevent `freqItems` from raisi...

2016-06-01 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13449 [SPARK-15709][SQL] Prevent `freqItems` from raising `UnsupportedOperationException: empty.min` ## What changes were proposed in this pull request? Currently, `freqItems` raises

[GitHub] spark issue #13449: [SPARK-15709][SQL] Prevent `freqItems` from raising `Uns...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13449 Thank you for the review, @srowen . :)

[GitHub] spark issue #13436: [SPARK-15696][SQL] Improve `crosstab` to have a consiste...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13436 @srowen . Could you review this PR, too?

[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/12850 Thank you for the feedback. I really appreciate your attention! For the non-deterministic part, we can add a single condition in `isAssociativelyFoldable`. If some of operand

[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/12850 Thank you for the deep discussion on this. Here is my view. For 1), there are **machine-generated** queries from BI tools. This is an important category of queries. In many cases, BIs

[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/12850 Hi, @cloud-fan and @davies . What do you think about the above?

[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/12850 I added the missing part, the `e.deterministic` check in `isAssociativelyFoldable`.
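
The idea behind reordering an associative operator, with a determinism guard like the one mentioned above, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Catalyst rule itself: the classes `Lit`, `Col`, `Add` and the function `reorder_add` are hypothetical names, only integer addition is handled (multiplication would be analogous), and the `deterministic` flag merely stands in for the `e.deterministic` check.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Lit:
    """A foldable integer constant (hypothetical stand-in for a literal)."""
    value: int


@dataclass(frozen=True)
class Col:
    """A non-foldable operand; `deterministic` mimics an expression flag."""
    name: str
    deterministic: bool = True


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Col, Add]


def flatten_add(e):
    """Collect the operands of a nested chain of additions, left to right."""
    if isinstance(e, Add):
        return flatten_add(e.left) + flatten_add(e.right)
    return [e]


def is_deterministic(e):
    """An addition is deterministic only if all of its operands are."""
    if isinstance(e, Add):
        return is_deterministic(e.left) and is_deterministic(e.right)
    if isinstance(e, Col):
        return e.deterministic
    return True


def reorder_add(e):
    """Move all constants of an addition chain together and fold them.

    Non-deterministic expressions are returned unchanged, because moving
    their operands around could change the observable result.
    """
    if not isinstance(e, Add) or not is_deterministic(e):
        return e
    operands = flatten_add(e)
    literals = [o for o in operands if isinstance(o, Lit)]
    others = [o for o in operands if not isinstance(o, Lit)]
    if len(literals) < 2 or not others:
        # Nothing to gain, or a pure-constant tree left to ordinary folding.
        return e
    folded = Lit(sum(lit.value for lit in literals))
    combined = others[0]
    for o in others[1:]:
        combined = Add(combined, o)
    return Add(combined, folded)


# (a + 1) + (b + 2) becomes (a + b) + 3
expr = Add(Add(Col("a"), Lit(1)), Add(Col("b"), Lit(2)))
reordered = reorder_add(expr)

# A chain containing a non-deterministic operand is left untouched.
nondet = Add(Add(Col("r", deterministic=False), Lit(1)), Lit(2))
untouched = reorder_add(nondet)
```

The sketch sticks to integers on purpose: floating-point addition is not strictly associative, so a real rule has to decide separately whether reordering is acceptable there.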

[GitHub] spark issue #13403: [SPARK-15660][CORE] RDD and Dataset should show the cons...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Rebased.

[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/12850 Thank you for reconsidering this PR positively. I'll update soon according to your advice.

[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12850#discussion_r65465517 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -738,6 +739,49 @@ object

[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

2016-06-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12850#discussion_r65485176 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -738,6 +739,42 @@ object

[GitHub] spark pull request: [SPARK-15696][SQL] Improve `crosstab` to have a consiste...

2016-06-01 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13436 [SPARK-15696][SQL] Improve `crosstab` to have a consistent column order ## What changes were proposed in this pull request? Currently, `crosstab` returns a Dataframe having **random
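
One way to get a stable column order in a cross-tabulation is to sort the distinct values before building the header. The sketch below shows that approach in plain Python. It is an assumption made for illustration: the truncated message does not show how the PR itself achieves a consistent order, and the function `crosstab` here is a hypothetical helper, not the Spark API.

```python
from collections import Counter


def crosstab(pairs):
    """Build a contingency table from (row_value, column_value) pairs.

    Sorting the distinct row and column values makes the layout depend
    only on the data, not on the order in which values were encountered.
    """
    counts = Counter(pairs)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    header = ["row"] + cols
    table = [[r] + [counts[(r, c)] for c in cols] for r in rows]
    return header, table


pairs = [("a", "y"), ("a", "x"), ("b", "y"), ("a", "y")]
header, table = crosstab(pairs)

# Feeding the same pairs in a different order yields the same layout.
header_rev, table_rev = crosstab(list(reversed(pairs)))
```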

[GitHub] spark pull request: [SPARK-15660][CORE] RDD and Dataset should sho...

2016-05-30 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13403 [SPARK-15660][CORE] RDD and Dataset should show the consistent values for variance/stdev. ## What changes were proposed in this pull request? In Spark-11490, `variance/stdev

[GitHub] spark pull request: [SPARK-15660][CORE] RDD and Dataset should show the cons...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13403 Thank you for the review, @srowen .

[GitHub] spark pull request: [SPARK-15660][CORE] RDD and Dataset should sho...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13403#issuecomment-222648543 Hi, @rxin . I made the example more practical by using **SparkSession.createDataset().rdd.stdev**. If we must preserve the current behavior
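
The inconsistency discussed in this PR is presumably the usual one between population estimators (divide by N) and sample estimators (divide by N - 1). That reading is an assumption, since the message is truncated. The plain-Python sketch below shows how far the two can differ on a small, made-up data set. The function names and the data are hypothetical and do not come from the Spark code.

```python
import math


def population_variance(xs):
    """Variance with divisor N."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def sample_variance(xs):
    """Unbiased variance with divisor N - 1 (needs at least two values)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


data = [1.0, 2.0, 3.0, 4.0]  # made-up values for illustration

pop_var = population_variance(data)   # 5.0 / 4
samp_var = sample_variance(data)      # 5.0 / 3
pop_std = math.sqrt(pop_var)
samp_std = math.sqrt(samp_var)
```

With only four values the two variances already differ by a third, so an API that returns one while its documentation suggests the other is easy to misread.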

[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12850 Hi, @cloud-fan . Could you review again? Now, this PR provides a more generalized way to handle all foldable constants.

[GitHub] spark pull request: [SPARK-15618][SQL][MLLIB] Use SparkSession.bui...

2016-05-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13365#issuecomment-72318 This time, the Scala 2.10 build is also tested locally.

[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Thank you, @srowen .

[GitHub] spark pull request #13832: [SPARK-16123] Avoid NegativeArraySizeException wh...

2016-06-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13832#discussion_r68013403 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java --- @@ -424,7 +424,9 @@ public void loadBytes

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r68007866 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -74,6 +74,20 @@ class DoubleRDDFunctions(self: RDD[Double]) extends

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r68008205 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -74,6 +74,20 @@ class DoubleRDDFunctions(self: RDD[Double]) extends

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13684 I updated those two things. Thank you, @shivaram . Yep. It's @sun-rui 's. It would be great. Hi, @sun-rui . Could you review this PR?

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67213090 --- Diff: R/pkg/R/DataFrame.R --- @@ -1869,6 +1869,7 @@ setMethod("where", #' path <- "path/to/file.json"

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13684 Now, it's ready for review again, @shivaram .

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13684 Oh, thank you, @shivaram ! I merged those functions into one according to your advice. In my opinion, there are more functions we can simplify like this.

[GitHub] spark pull request #13674: [MINOR][DOCS][SQL] Fix some comments about types(...

2016-06-14 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13674 [MINOR][DOCS][SQL] Fix some comments about types(TypeCoercion,Partition) and exceptions. ## What changes were proposed in this pull request? This PR contains a few changes on code

[GitHub] spark issue #13674: [MINOR][DOCS][SQL] Fix some comments about types(TypeCoe...

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13674 Thank you for reviewing and merging, @andrewor14 .

[GitHub] spark issue #13643: [SPARK-15922][MLLIB] `toIndexedRowMatrix` should conside...

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13643 Thank you so much, @srowen !

[GitHub] spark pull request #13714: [SPARK-15996][R] Fix R examples by removing depre...

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13714#discussion_r67440657 --- Diff: examples/src/main/r/data-manipulation.R --- @@ -75,8 +75,8 @@ destDF <- select(flightsDF, "dest", "cancelled")

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67444157 --- Diff: R/pkg/R/DataFrame.R --- @@ -2884,3 +2884,38 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67444126 --- Diff: R/pkg/R/DataFrame.R --- @@ -2884,3 +2884,38 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark issue #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13721 Now, it has passed all tests and is ready for review again. Could you review this PR, @shivaram ?

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13684 If there is anything more to do, please let me know.

[GitHub] spark issue #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13721 Thank you, @felixcheung . For `SPARK-14995`, I'll do that tonight. It looks like a good exercise for me. Thank you for letting me know.

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13721 [SPARK-16005][R] Add `randomSplit` to SparkR ## What changes were proposed in this pull request? This PR adds `randomSplit` to SparkR for API parity. ## How was this patch

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67443648 --- Diff: R/pkg/R/DataFrame.R --- @@ -2884,3 +2884,38 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13684 Thank you, @sun-rui !

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67443758 --- Diff: R/pkg/R/DataFrame.R --- @@ -2884,3 +2884,38 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67443819 --- Diff: R/pkg/R/DataFrame.R --- @@ -2884,3 +2884,38 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67443885 --- Diff: R/pkg/R/DataFrame.R --- @@ -2884,3 +2884,38 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67445221 --- Diff: R/pkg/R/DataFrame.R --- @@ -2884,3 +2884,38 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark issue #13714: [SPARK-15996][R] Fix R examples by removing deprecated f...

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13714 Thank you, @shivaram !

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67412381 --- Diff: R/pkg/R/DataFrame.R --- @@ -1949,14 +1950,24 @@ setMethod("where", #' path <- "path/to/file.json"

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67412795 --- Diff: R/pkg/R/DataFrame.R --- @@ -1949,14 +1950,24 @@ setMethod("where", #' path <- "path/to/file.json"

[GitHub] spark pull request #13730: [SPARK-16006][SQL] Attemping to write empty DataF...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13730#discussion_r67481562 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -351,8 +354,10 @@ private[sql] object

[GitHub] spark pull request #13734: [SPARK-14995][R] Add `since` tag in Roxygen docum...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13734#discussion_r67556931 --- Diff: R/pkg/R/SQLContext.R --- @@ -213,7 +213,7 @@ createDataFrame <- function(x, ...) { #' @aliases createDataFrame #' @exp

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13734 Hi, All. For review, I uploaded the generated R doc here. https://home.apache.org/~dongjoon/spark-2.0.0-docs/api/R/ The remaining issue is the **multiple** notes like

[GitHub] spark pull request #13734: [SPARK-14995][R] Add `since` tag in Roxygen docum...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13734#discussion_r67557255 --- Diff: R/pkg/R/generics.R --- @@ -20,157 +20,196 @@ # @rdname aggregateRDD # @seealso reduce # @export +# @note since 1.5.0

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13734 Thank you for the review, @felixcheung !

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r67567009 --- Diff: core/src/main/scala/org/apache/spark/util/StatCounter.scala --- @@ -104,8 +104,11 @@ class StatCounter(values: TraversableOnce[Double

[GitHub] spark issue #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13721 Hi, @shivaram . Although it seems too late for this to become part of Spark 2.0.0, could you review it again?

[GitHub] spark pull request #13774: [SPARK-16059][R] Add `monotonically_increasing_id...

2016-06-20 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13774#discussion_r67672158 --- Diff: R/pkg/R/functions.R --- @@ -911,6 +911,33 @@ setMethod("minute",

[GitHub] spark issue #13768: [SPARK-16053][R] Add `spark_partition_id` in SparkR

2016-06-20 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13768 Thank you for the review, @davies !

[GitHub] spark pull request #13774: [SPARK-16059][R] Add `monotonically_increasing_id...

2016-06-20 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13774#discussion_r67721810 --- Diff: R/pkg/R/functions.R --- @@ -911,6 +911,33 @@ setMethod("minute",

[GitHub] spark issue #13765: [SPARK-16052][SQL] Add CollapseRepartitionBy optimizer

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Thank you for the comments! I see. No problem! :)

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r67903762 --- Diff: core/src/test/java/org/apache/spark/JavaAPISuite.java --- @@ -733,8 +733,10 @@ public Boolean call(Double x) { assertEquals(20/6.0

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r67904001 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -47,12 +47,12 @@ class DoubleRDDFunctions(self: RDD[Double

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r67922737 --- Diff: core/src/test/java/org/apache/spark/JavaAPISuite.java --- @@ -733,8 +733,10 @@ public Boolean call(Double x) { assertEquals(20/6.0

[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Thank you, @srowen . I updated the PR according to one of your suggestions. For the other suggestion, I tried the following. It looks good, but a little bit inconsistent

[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Oh, thank you, @mengxr ! I'll update again.

[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Thank you for reviewing this PR, @mengxr and @srowen !

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r67931118 --- Diff: core/src/test/scala/org/apache/spark/PartitioningSuite.scala --- @@ -244,6 +244,10 @@ class PartitioningSuite extends SparkFunSuite

[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Hi, @mengxr . I updated them to use accurate values and small tolerances, too. ``` -assert(abs(2.0 - rdd.sampleVariance) < 0.01) -assert(abs(1.41 - rdd.sampleSt

[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Hi, @srowen . Now, I fixed them all. Sorry for missing those.

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r67931444 --- Diff: core/src/test/scala/org/apache/spark/PartitioningSuite.scala --- @@ -244,6 +244,10 @@ class PartitioningSuite extends SparkFunSuite

[GitHub] spark issue #13763: [SPARK-16051][R] Add `read.orc/write.orc` to SparkR

2016-06-19 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13763 Thank you for the review, @sun-rui . I fixed all occurrences, replacing `a ORC` with `an ORC`.

[GitHub] spark issue #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13721 Thank you for merging, @shivaram .

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13734 Thank you for the opinions! I'll revise and update the HTML doc.

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13486#discussion_r67581319 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataFrameReaderWriterSuite.scala --- @@ -572,4 +572,16 @@ class

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67583977 --- Diff: R/pkg/R/DataFrame.R --- @@ -2908,3 +2908,39 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67583918 --- Diff: R/pkg/R/DataFrame.R --- @@ -2908,3 +2908,39 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13486#discussion_r67578679 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataFrameReaderWriterSuite.scala --- @@ -572,4 +572,16 @@ class

[GitHub] spark issue #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13721 @shivaram . I added the description. Thank you for the review!

[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Hi, @rxin and @srowen . I have now updated this PR as follows. 1. Clarify the documentation of the legacy Scala/Java API 2. Add `popVariance/popStdev` functions
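
For readers following the `variance/stdev` thread, the distinction that `popVariance/popStdev` names point at can be illustrated with a minimal sketch. It uses plain Python and the standard `statistics` module, not the Spark API, and assumes that the `pop` prefix refers to population statistics (dividing by N) as opposed to sample statistics (dividing by N - 1). The input values are hypothetical and chosen only for illustration.

```python
import math
import statistics

# Hypothetical values for illustration only; not taken from the PR or its tests.
values = [1.0, 2.0, 3.0, 4.0]

# Population statistics: the sum of squared deviations is divided by N.
pop_variance = statistics.pvariance(values)
pop_stdev = statistics.pstdev(values)

# Sample statistics: the sum of squared deviations is divided by N - 1.
sample_variance = statistics.variance(values)
sample_stdev = statistics.stdev(values)

# Floating-point results are compared with a small tolerance, not with ==.
TOLERANCE = 1e-9
assert abs(pop_variance - 1.25) < TOLERANCE
assert abs(sample_variance - 5.0 / 3.0) < TOLERANCE
assert abs(pop_stdev - math.sqrt(1.25)) < TOLERANCE
assert abs(sample_stdev - math.sqrt(5.0 / 3.0)) < TOLERANCE
```

The two variances differ by a factor of N / (N - 1), so the gap is noticeable for small inputs and shrinks as N grows. That is why documentation should state explicitly which of the two a function returns.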

[GitHub] spark issue #13763: [SPARK-16051][R] Add `read.orc/write.orc` to SparkR

2016-06-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13763 Hi, @shivaram , @felixcheung , @sun-rui . Could you review this PR when you have some time?
