[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20367

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164275764 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,48 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164275714 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164275722 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164275721 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164275712 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164275697 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164275706 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164273743 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164273656 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164267076 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164267079 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164267103 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164267128 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164267057 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164267118 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -155,24 +182,47 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-26 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r164260027 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -160,6 +187,11 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-25 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r163911373 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -160,6 +187,11 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-24 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r163640976 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -113,7 +132,11 @@ private[feature] trait CountVectorizerParams

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-23 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r163465302 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala --- @@ -119,6 +119,41 @@ class CountVectorizerSuite extends

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r163362834 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -113,7 +132,11 @@ private[feature] trait CountVectorizerParams

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-23 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r163359719 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -113,7 +132,11 @@ private[feature] trait CountVectorizerParams

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-23 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r163358962 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala --- @@ -119,6 +119,41 @@ class CountVectorizerSuite extends

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-23 Thread ymazari
Github user ymazari commented on a diff in the pull request: https://github.com/apache/spark/pull/20367#discussion_r163358747 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -169,7 +201,7 @@ class CountVectorizer @Since("1.5.0")

[GitHub] spark pull request #20367: [SPARK-23166][ML] Add maxDF Parameter to CountVec...

2018-01-23 Thread ymazari
GitHub user ymazari opened a pull request: https://github.com/apache/spark/pull/20367

[SPARK-23166][ML] Add maxDF Parameter to CountVectorizer

## What changes were proposed in this pull request?

Currently, the CountVectorizer has a minDF parameter. It might be useful to