Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22365
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22365#discussion_r219034294
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala ---
@@ -370,29 +370,76 @@ final class DataFrameStatFunctions
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22365#discussion_r217257137
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala ---
@@ -370,29 +370,76 @@ final class DataFrameStatFunctions
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22365#discussion_r217256279
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala ---
@@ -370,29 +370,76 @@ final class DataFrameStatFunctions
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22365#discussion_r217252035
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala ---
@@ -370,29 +370,76 @@ final class DataFrameStatFunctions
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22365#discussion_r216482340
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -880,18 +880,23 @@ def sampleBy(self, col, fractions, seed=None):
| 0|5|
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22365#discussion_r216233575
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -880,18 +880,23 @@ def sampleBy(self, col, fractions, seed=None):
| 0|5|
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22365#discussion_r216233066
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -880,18 +880,23 @@ def sampleBy(self, col, fractions, seed=None):
| 0|5|
GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/22365
[SPARK-25381][SQL] Stratified sampling by Column argument
## What changes were proposed in this pull request?
In the PR, I propose to add an overloaded method for `sampleBy` which
accepts