[GitHub] spark issue #13487: [SPARK-15744][SQL] Rename two TungstenAggregation*Suites...

2016-06-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13487 Thank you, @rxin . --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark pull request #13487: [MINOR][SQL] Update testsuites/comments/error mes...

2016-06-02 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13487 [MINOR][SQL] Update testsuites/comments/error messages about Tungsten/SortBasedAggregate. ## What changes were proposed in this pull request? For consistency, this PR updates some

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-02 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13486 [SPARK-15743][SQL] Prevent saving with all-column partitioning ## What changes were proposed in this pull request? When saving datasets on storage, `partitionBy` provides an easy way
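The kind of check the PR title describes can be sketched in plain Scala, without Spark. Partition column values are encoded in directory names rather than in the data files, so partitioning by every column would leave no data columns to write. The object `PartitionCheckSketch`, the function `validatePartitionColumns`, and the error messages are hypothetical names chosen for illustration; this is not the code from the PR.

```scala
object PartitionCheckSketch {
  /**
   * Hypothetical validation: every partition column must exist in the schema,
   * and at least one non-partition column must remain to be written as data.
   */
  def validatePartitionColumns(
      schemaColumns: Seq[String],
      partitionColumns: Seq[String]): Unit = {
    // Reject partition columns that are not part of the schema.
    val unknown = partitionColumns.filterNot(schemaColumns.contains)
    require(unknown.isEmpty, s"Unknown partition columns: ${unknown.mkString(", ")}")

    // Reject the case where the partition columns cover the whole schema.
    if (partitionColumns.nonEmpty &&
        partitionColumns.distinct.size == schemaColumns.distinct.size) {
      throw new IllegalArgumentException("Cannot use all columns for partition columns")
    }
  }
}
```

Under these assumptions, `validatePartitionColumns(Seq("a", "b"), Seq("a"))` returns normally, while passing `Seq("a", "b")` as the partition columns throws an `IllegalArgumentException`.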

[GitHub] spark issue #13403: [SPARK-15660][CORE] RDD and Dataset should show the cons...

2016-06-09 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 What about just adding an explicit note on old `StatCounter.stdev`? http://spark.apache.org/docs/2.0.0-preview/api/scala/index.html#org.apache.spark.util.StatCounter MLLIB

[GitHub] spark issue #13403: [SPARK-15660][CORE] RDD and Dataset should show the cons...

2016-06-09 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13403 Although we cannot change the old API, I think it's a good idea to add `popVariance` and `popStdev` explicitly. If everything in this PR is not allowed, what about just adding an explicit
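The distinction behind the proposed names can be shown with a small plain-Scala sketch. The population variants divide by n, the sample variants by n - 1. The object `StdevSketch` and its helpers are illustrative assumptions that only borrow the names `popVariance` and `popStdev` from the comment; this is not Spark's `StatCounter` implementation.

```scala
object StdevSketch {
  /** Arithmetic mean of a non-empty sequence. */
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  /** Population variance: sum of squared deviations divided by n. */
  def popVariance(xs: Seq[Double]): Double = {
    require(xs.nonEmpty, "need at least one value")
    val m = mean(xs)
    xs.map(x => (x - m) * (x - m)).sum / xs.size
  }

  /** Sample variance: sum of squared deviations divided by n - 1. */
  def sampleVariance(xs: Seq[Double]): Double = {
    require(xs.size > 1, "need at least two values")
    val m = mean(xs)
    xs.map(x => (x - m) * (x - m)).sum / (xs.size - 1)
  }

  /** Population standard deviation. */
  def popStdev(xs: Seq[Double]): Double = math.sqrt(popVariance(xs))

  /** Sample standard deviation. */
  def sampleStdev(xs: Seq[Double]): Double = math.sqrt(sampleVariance(xs))
}
```

For the same input the sample value is always the larger of the two, which is why an unqualified name such as `stdev` can be read either way.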

[GitHub] spark issue #13545: [SPARK-15807][SQL] Support varargs for dropDuplicates in...

2016-06-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13545 Hi, @rxin . I updated this PR and JIRA by removing `distinct`-related changes.

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13486#discussion_r66349438 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/PartitioningUtilsSuite.scala --- @@ -0,0 +1,36

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13486#discussion_r66349371 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -339,7 +339,7 @@ private[sql] object

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 Hi, @marmbrus . The PR is now updated according to your advice and has passed Jenkins again.

[GitHub] spark issue #13520: [SPARK-15773][CORE][EXAMPLE] Avoid creating local variab...

2016-06-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13520 Since this is about examples, I think shorter is better. Users can simply think of `parallelize` or `broadcast` as ordinary functions without knowing about `SparkContext`.

[GitHub] spark issue #13520: [SPARK-15773][CORE][EXAMPLE] Avoid creating local variab...

2016-06-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13520 Initially, I thought the printed message was wrong in the statement `println("Creating SparkContext")` because `spark.sparkContext` just returns the already existing one.

[GitHub] spark issue #13520: [SPARK-15773][CORE][EXAMPLE] Avoid creating local variab...

2016-06-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13520 Thank you, @srowen !

[GitHub] spark issue #13520: [SPARK-15773][CORE][EXAMPLE] Avoid creating local variab...

2016-06-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13520 Thank you for the review, @rxin and @srowen . The main rationale of this PR is to make `SparkSession` the explicit starting point for the operations in these examples. (Instead

[GitHub] spark pull request #13545: [SPARK-15807][SQL] Support varargs for distinct/d...

2016-06-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13545#discussion_r66152341 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2262,6 +2275,19 @@ class Dataset[T] private[sql]( def distinct

[GitHub] spark issue #13545: [SPARK-15807][SQL] Support varargs for distinct/dropDupl...

2016-06-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13545 What do you think about `dropDuplicates`? 1. ds.select("_1", "_2", "_3").dropDuplicates(Seq("_1", "_2")).orderBy("_1", "

[GitHub] spark pull request #13545: [SPARK-15807][SQL] Support varargs for distinct/d...

2016-06-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13545#discussion_r66156310 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2262,6 +2275,19 @@ class Dataset[T] private[sql]( def distinct

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 Hi, @marmbrus . Could you review this PR?

[GitHub] spark pull request #13634: [SPARK-15913][CORE] Dispatcher.stopped should be ...

2016-06-12 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13634 [SPARK-15913][CORE] Dispatcher.stopped should be enclosed by synchronized block. ## What changes were proposed in this pull request? `Dispatcher.stopped` is guarded
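The general pattern named in the PR title, reading and writing a `stopped` flag only inside a `synchronized` block, can be sketched in plain Scala. The object `GuardedFlagSketch`, the class `Service`, and its methods are hypothetical stand-ins for illustration and do not reproduce the actual `Dispatcher` code.

```scala
object GuardedFlagSketch {
  /** Hypothetical component whose mutable state is guarded by its own monitor. */
  final class Service {
    private var stopped = false // guarded by `this`
    private var accepted = 0    // guarded by `this`

    /** Returns true if the message was accepted, false if already stopped. */
    def post(message: String): Boolean = synchronized {
      if (stopped) {
        false
      } else {
        accepted += 1
        true
      }
    }

    /** Idempotent stop: returns true only for the call that performed the stop. */
    def stop(): Boolean = synchronized {
      if (stopped) {
        false
      } else {
        stopped = true
        true
      }
    }

    /** Number of accepted messages, read under the same lock. */
    def acceptedCount: Int = synchronized { accepted }
  }
}
```

Because every access to `stopped` goes through the same lock, a `post` call that races with `stop` sees either the state before the stop or the state after it, never a partial update.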

[GitHub] spark issue #13634: [SPARK-15913][CORE] Dispatcher.stopped should be enclose...

2016-06-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13634 Hi, @vanzin . Could you review this when you have some time?

[GitHub] spark issue #13436: [SPARK-15696][SQL] Improve `crosstab` to have a consiste...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13436 Thank you, @rxin .

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 Hi, @marmbrus . Could you review this PR again?

[GitHub] spark issue #13608: [SPARK-15883][MLLIB][DOCS] Fix broken links in mllib doc...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13608 Actually, this time I manually clicked all the links in the MLlib documentation. Maybe later we can write a simple crawler to check for this kind of error.

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 Thank you, @marmbrus !

[GitHub] spark pull request #13608: [SPARK-15883][MLLIB][DOCS] Fix broken links in ml...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13608#discussion_r66682287 --- Diff: docs/mllib-data-types.md --- @@ -535,12 +537,6 @@ rowsRDD = mat.rows # Convert to a RowMatrix by dropping the row indices

[GitHub] spark pull request #13608: [SPARK-15883][MLLIB][DOCS] Fix broken links in ml...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13608#discussion_r66682877 --- Diff: docs/mllib-linear-methods.md --- @@ -185,10 +185,10 @@ algorithm for 200 iterations. import

[GitHub] spark issue #13608: [SPARK-15883][MLLIB][DOCS] Fix broken links in mllib doc...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13608 Yep. I already built this with Jekyll locally and checked the result manually, too.

[GitHub] spark issue #13608: [SPARK-15883][MLLIB][DOCS] Fix broken links in mllib doc...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13608 Thank you for the fast review, @srowen .

[GitHub] spark pull request #13608: [SPARK-15883][MLLIB][DOCS] Fix broken links in ml...

2016-06-10 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13608 [SPARK-15883][MLLIB][DOCS] Fix broken links in mllib documents ## What changes were proposed in this pull request? This issue fixes all broken links on Spark 2.0 preview MLLib

[GitHub] spark issue #13520: [SPARK-15773][CORE][EXAMPLE] Avoid creating local variab...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13520 Thank you, @rxin !

[GitHub] spark issue #13545: [SPARK-15807][SQL] Support varargs for dropDuplicates in...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13545 Thank you again, @rxin .

[GitHub] spark pull request #13608: [SPARK-15883][MLLIB][DOCS] Fix broken links in ml...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13608#discussion_r66687221 --- Diff: docs/mllib-linear-methods.md --- @@ -395,7 +395,7 @@ section of the Spark quick-start guide. Be sure to also include *spark-mllib

[GitHub] spark issue #13545: [SPARK-15807][SQL] Support varargs for dropDuplicates in...

2016-06-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13545 Hi, @rxin . For `dropDuplicates`, this PR definitely adds a new signature. However, I think this is the right direction to improve user experience because they expect the same usage
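A new varargs signature that sits next to an existing `Seq`-based one can be sketched in plain Scala as an overload that delegates to the original. The object `VarargsSketch` and the miniature `Table` type are hypothetical and only imitate the shape of the API under discussion; they are not the `Dataset` code from the PR.

```scala
object VarargsSketch {
  /** Hypothetical miniature table: each row maps a column name to a value. */
  final case class Table(rows: Seq[Map[String, Any]]) {

    /** Seq-based form: keep the first row seen for each key over `cols`. */
    def dropDuplicates(cols: Seq[String]): Table = {
      // LinkedHashMap preserves insertion order, so the first occurrence wins.
      val seen =
        scala.collection.mutable.LinkedHashMap.empty[Seq[Option[Any]], Map[String, Any]]
      rows.foreach { row =>
        val key = cols.map(row.get)
        if (!seen.contains(key)) seen(key) = row
      }
      Table(seen.values.toSeq)
    }

    /** Varargs form: same behavior, delegating to the Seq-based form. */
    def dropDuplicates(col1: String, cols: String*): Table =
      dropDuplicates(col1 +: cols)
  }
}
```

With this shape, `t.dropDuplicates("a", "b")` and `t.dropDuplicates(Seq("a", "b"))` return equal results. Requiring a first `String` parameter keeps the two overloads unambiguous.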

[GitHub] spark pull request #13608: [SPARK-15883][MLLIB][DOCS] Fix broken links in ml...

2016-06-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13608#discussion_r66703204 --- Diff: docs/mllib-linear-methods.md --- @@ -185,10 +185,10 @@ algorithm for 200 iterations. import

[GitHub] spark issue #13436: [SPARK-15696][SQL] Improve `crosstab` to have a consiste...

2016-06-09 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13436 Hi, @rxin . Could you review this PR and give your opinion when you have some time?

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13486#discussion_r65799702 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -350,6 +350,10 @@ private[sql] object

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59987/ Test PASSed

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13486#discussion_r65799585 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -350,6 +350,10 @@ private[sql] object

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 **[Test build #59986 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59986/consoleFull)** for PR 13486 at commit [`9c5f13d`](https://github.com/apache/spark

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59986/ Test PASSed

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 Merged build finished. Test PASSed.

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 **[Test build #59987 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59987/consoleFull)** for PR 13486 at commit [`6a9006d`](https://github.com/apache/spark

[GitHub] spark issue #13486: [SPARK-15743][SQL] Prevent saving with all-column partit...

2016-06-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13486 Merged build finished. Test PASSed.

[GitHub] spark pull request: [MINOR][CORE] Fix a HadoopRDD log message and ...

2016-05-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13294#issuecomment-221656682 Thank you, @andrewor14 and @srowen !

[GitHub] spark pull request: [SPARK-15512][CORE] repartition(0) should rais...

2016-05-24 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13282#issuecomment-221426314 Yes. They need this. I'll add that.

[GitHub] spark issue #13545: [SPARK-15807][SQL] Support varargs for dropDuplicates in...

2016-06-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13545 Thank you for merging, @rxin !

[GitHub] spark issue #13634: [SPARK-15913][CORE] Dispatcher.stopped should be enclose...

2016-06-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13634 Thank you for the review, @srowen . Oh, right. That sounds much better to me. I'll update this PR like that.

[GitHub] spark pull request #13634: [SPARK-15913][CORE] Dispatcher.stopped should be ...

2016-06-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13634#discussion_r66761760 --- Diff: core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala --- @@ -144,24 +144,21 @@ private[netty] class Dispatcher(nettyEnv

[GitHub] spark issue #13634: [SPARK-15913][CORE] Dispatcher.stopped should be enclose...

2016-06-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13634 Thank you always, @srowen .

[GitHub] spark issue #13634: [SPARK-15913][CORE] Dispatcher.stopped should be enclose...

2016-06-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13634 Thank you, @vanzin !

[GitHub] spark pull request #13643: [SPARK-15922][MLLIB] `toIndexedRowMatrix` should ...

2016-06-13 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13643 [SPARK-15922][MLLIB] `toIndexedRowMatrix` should consider the case `cols < colsPerBlock` ## What changes were proposed in this pull request? SPARK-15922 reports the follow

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13684 [SPARK-15908][R] Add varargs-type dropDuplicates() function in SparkR ## What changes were proposed in this pull request? This PR adds varargs-type `dropDuplicates` function

[GitHub] spark issue #13636: [SPARK-15637][SPARK-15931][SPARKR] Fix R masked function...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13636 This passes for me, too. Thank you, @felixcheung .

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67202830 --- Diff: docs/sql-programming-guide.md --- @@ -889,7 +887,7 @@ df.select("name", "favorite_color").write.save("

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67202934 --- Diff: docs/sql-programming-guide.md --- @@ -939,7 +937,7 @@ df.select("name", "age").write.save("namesAndAges.

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67203021 --- Diff: docs/sql-programming-guide.md --- @@ -956,30 +954,30 @@ file directly with SQL. {% highlight scala %} -val df

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67203777 --- Diff: docs/sql-programming-guide.md --- @@ -1142,11 +1141,11 @@ write.parquet(schemaPeople, "people.parquet") # Read in the Pa

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67204146 --- Diff: docs/sql-programming-guide.md --- @@ -1142,11 +1141,11 @@ write.parquet(schemaPeople, "people.parquet") # Read in the Pa

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67202307 --- Diff: docs/sql-programming-guide.md --- @@ -171,9 +171,9 @@ df.show() {% highlight r %} -sqlContext <- SQLContext

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67202506 --- Diff: docs/sql-programming-guide.md --- @@ -363,10 +363,10 @@ In addition to simple column references and expressions, DataFrames also have

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67202611 --- Diff: docs/sql-programming-guide.md --- @@ -419,35 +419,35 @@ In addition to simple column references and expressions, DataFrames also have

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67203431 --- Diff: docs/sql-programming-guide.md --- @@ -1142,11 +1141,11 @@ write.parquet(schemaPeople, "people.parquet") # Read in the Pa

[GitHub] spark pull request #13592: [SPARK-15863][SQL][DOC] Initial SQL programming g...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13592#discussion_r67204480 --- Diff: docs/sql-programming-guide.md --- @@ -1326,7 +1325,7 @@ write.df(df1, "data/test_table/key=1", "parquet", "over

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13684 Hi, @shivaram . Could you review this PR?

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67206540 --- Diff: R/pkg/R/DataFrame.R --- @@ -1859,7 +1859,7 @@ setMethod("where", #' @param colnames A character vector of column names. --

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67209107 --- Diff: R/pkg/R/DataFrame.R --- @@ -1859,7 +1859,7 @@ setMethod("where", #' @param colnames A character vector of column names. --

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67210755 --- Diff: R/pkg/R/DataFrame.R --- @@ -1869,6 +1869,7 @@ setMethod("where", #' path <- "path/to/file.json"

[GitHub] spark issue #13643: [SPARK-15922][MLLIB] `toIndexedRowMatrix` should conside...

2016-06-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13643 Thank you again, @srowen .

[GitHub] spark issue #13643: [SPARK-15922][MLLIB] `toIndexedRowMatrix` should conside...

2016-06-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13643 Hi, @Fokko and @mengxr . Could you review this PR when you have some time?

[GitHub] spark pull request #13643: [SPARK-15922][MLLIB] `toIndexedRowMatrix` should ...

2016-06-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13643#discussion_r66849603 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala --- @@ -288,7 +288,7 @@ class BlockMatrix @Since("

[GitHub] spark pull request #13520: [SPARK-15773][CORE][EXAMPLE] Avoid creating local...

2016-06-05 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13520 [SPARK-15773][CORE][EXAMPLE] Avoid creating local variable `sc` in examples if possible ## What changes were proposed in this pull request? Instead of using local variable `sc` like

[GitHub] spark pull request #13545: [SPARK-15807][SQL] Support varargs for distinct/d...

2016-06-07 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13545 [SPARK-15807][SQL] Support varargs for distinct/dropDuplicates in Dataset/DataFrame ## What changes were proposed in this pull request? This PR adds `varargs`-types `distinct

[GitHub] spark pull request: [SPARK-15644] [MLlib] [SQL] Replace SQLContext...

2016-05-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13380#issuecomment-222337348 Hi, @gatorsmile . Personally, I love this PR. :) I just hesitated to change the function signatures of MLLIB in #13352 .

[GitHub] spark pull request: [SPARK-15644] [MLlib] [SQL] Replace SQLContext...

2016-05-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13380#discussion_r64996971 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala --- @@ -48,7 +48,7 @@ class BroadcastJoinSuite

[GitHub] spark pull request: [SPARK-15644] [MLlib] [SQL] Replace SQLContext...

2016-05-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13380#discussion_r64996960 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala --- @@ -48,7 +48,7 @@ class BroadcastJoinSuite

[GitHub] spark pull request: [SPARK-15644] [MLlib] [SQL] Replace SQLContext...

2016-05-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13380#discussion_r64997097 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala --- @@ -48,7 +48,7 @@ class BroadcastJoinSuite

[GitHub] spark pull request: [SPARK-15644] [MLlib] [SQL] Replace SQLContext...

2016-05-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13380#discussion_r64997103 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala --- @@ -48,7 +48,7 @@ class BroadcastJoinSuite

[GitHub] spark pull request: [SPARK-15647] [SQL] Fix Boundary Cases in Opti...

2016-05-29 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13392#issuecomment-222346595 Thank you for bringing me up to date, @gatorsmile ! By the way, there is one correction. My PR is about **parameterizing** the following previous code

[GitHub] spark pull request: [SPARK-15557][SQL] expression ((cast(99 as de...

2016-05-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13368#discussion_r64979724 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -290,11 +290,6 @@ object TypeCoercion

[GitHub] spark pull request: [MINOR][CORE][DOCS] Fix description of FilterF...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13404#issuecomment-222611700 Thank you, @rxin ! Then, I'll close this PR now.

[GitHub] spark pull request: [MINOR][CORE][DOCS] Fix description of FilterF...

2016-05-30 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13404#discussion_r65125923 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,5 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [MINOR][CORE][DOCS] Fix description of FilterF...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/13404

[GitHub] spark pull request: [MINOR][CORE][DOC] Fix description of FilterFun...

2016-05-30 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13404 [MINOR][CORE][DOC] Fix description of FilterFunction ## What changes were proposed in this pull request? This PR fixes the wrong description of `FilterFunction

[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12850#discussion_r65127553 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -751,6 +751,16 @@ object ConstantFolding extends

[GitHub] spark pull request: [SPARK-15660][CORE] RDD and Dataset should sho...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13403#issuecomment-222614982 Thank you for reviewing again, @rxin. Actually, I fully understand and expected your decision. The reason I raised this issue is that I think we need

[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12850#discussion_r65237290 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -742,6 +742,23 @@ object

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for classes in s...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65240101 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for cl...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65136736 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for cl...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65136837 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [MINOR][SQL][DOCS] Fix docs of Dataset.scala and SQLImpl...

2016-05-31 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13420 [MINOR][SQL][DOCS] Fix docs of Dataset.scala and SQLImplicits.scala. ## What changes were proposed in this pull request? This PR fixes a sample code, a description, and indentations

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for classes in s...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65249012 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [SPARK-15618][SQL][MLLIB] Use SparkSession.builder.spark...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13365 Hi, @andrewor14 . Could you review this PR?

[GitHub] spark pull request: [SPARK-15678][SQL] Drop cache on appends and overwrites

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13419#discussion_r65251560 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala --- @@ -67,6 +67,28 @@ class

[GitHub] spark pull request: [SPARK-15678][SQL] Drop cache on appends and overwrites

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13419#discussion_r65251574 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala --- @@ -67,6 +67,28 @@ class

[GitHub] spark pull request: [SPARK-15678][SQL] Drop cache on appends and overwrites

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13419 Hi, @sameeragarwal . Is there any reason to use `SQLContext` instead of `SparkSession` in this PR?

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for classes in s...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65262391 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for classes in s...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65263287 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for classes in s...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65264316 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for classes in s...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65264904 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for classes in s...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65265296 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java

[GitHub] spark pull request: [SPARK-15662][SQL] Add since annotation for classes in s...

2016-05-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13406#discussion_r65269018 --- Diff: core/src/main/java/org/apache/spark/api/java/function/package.scala --- @@ -22,4 +22,4 @@ package org.apache.spark.api.java
