Repository: spark
Updated Branches:
refs/heads/master de6ad3dfa -> f067acefa
[SPARK-19155][ML] Make family case insensitive in GLM
## What changes were proposed in this pull request?
This is a supplement to PR #16516 which did not make the value from `getFamily`
case insensitive. Current
Repository: spark
Updated Branches:
refs/heads/master 3dcad9fab -> 0c589e371
[SPARK-19291][SPARKR][ML] spark.gaussianMixture supports output log-likelihood.
## What changes were proposed in this pull request?
```spark.gaussianMixture``` supports output total log-likelihood for the model
like
Repository: spark
Updated Branches:
refs/heads/branch-2.0 4c2065d0a -> 886f73737
[SPARK-19155][ML] MLlib GeneralizedLinearRegression family and link should be
case insensitive
## What changes were proposed in this pull request?
MLlib ```GeneralizedLinearRegression``` ```family``` and ```link```
Repository: spark
Updated Branches:
refs/heads/branch-2.1 6f0ad575d -> 8daf10e3f
[SPARK-19155][ML] MLlib GeneralizedLinearRegression family and link should be
case insensitive
## What changes were proposed in this pull request?
MLlib ```GeneralizedLinearRegression``` ```family``` and ```link```
Repository: spark
Updated Branches:
refs/heads/master aa014eb74 -> 3dcad9fab
[SPARK-19155][ML] MLlib GeneralizedLinearRegression family and link should be
case insensitive
## What changes were proposed in this pull request?
MLlib ```GeneralizedLinearRegression``` ```family``` and ```link```
Repository: spark
Updated Branches:
refs/heads/master 2e6256002 -> 8ccca9170
[SPARK-14272][ML] Add Loglikelihood in GaussianMixtureSummary
## What changes were proposed in this pull request?
add loglikelihood in GMM.summary
## How was this patch tested?
added tests
Author: Zheng RuiFeng
Repository: spark
Updated Branches:
refs/heads/master 18ee55dd5 -> 84f0b645b
[MINOR][YARN] Move YarnSchedulerBackendSuite to resource-managers/yarn
directory.
## What changes were proposed in this pull request?
#16092 moves YARN resource manager related code to resource-managers/yarn
Repository: spark
Updated Branches:
refs/heads/master e635cbb6e -> 12c8c2160
[SPARK-19066][SPARKR] SparkR LDA doesn't set optimizer correctly
## What changes were proposed in this pull request?
spark.lda passes the optimizer "em" or "online" as a string to the backend.
However, LDAWrapper
Repository: spark
Updated Branches:
refs/heads/master 3356b8b6a -> 7f24a0b6c
[SPARK-19142][SPARKR] spark.kmeans should take seed, initSteps, and tol as
parameters
## What changes were proposed in this pull request?
spark.kmeans doesn't have interface to set initSteps, seed and tol. As Spark
Repository: spark
Updated Branches:
refs/heads/branch-2.1 82fcc1330 -> 0b07634b5
[SPARK-19158][SPARKR][EXAMPLES] Fix ml.R example fails due to lack of e1071
package.
## What changes were proposed in this pull request?
```ml.R``` example depends on ```e1071``` package, if it's not available
Repository: spark
Updated Branches:
refs/heads/master 24100f162 -> 2c586f506
[SPARK-19158][SPARKR][EXAMPLES] Fix ml.R example fails due to lack of e1071
package.
## What changes were proposed in this pull request?
```ml.R``` example depends on ```e1071``` package, if it's not available in
Repository: spark
Updated Branches:
refs/heads/master 3ef6d98a8 -> b0e5840d4
[SPARK-19134][EXAMPLE] Fix several sql, mllib and status api examples not
working
## What changes were proposed in this pull request?
**binary_classification_metrics_example.py**
LibSVM datasource loads
Repository: spark
Updated Branches:
refs/heads/master faabe69cc -> 3ef6d98a8
[SPARK-17847][ML] Reduce shuffled data size of GaussianMixture & copy the
implementation from mllib to ml
## What changes were proposed in this pull request?
Copy `GaussianMixture` implementation from mllib to ml,
http://git-wip-us.apache.org/repos/asf/spark/blob/6b6b555a/R/pkg/R/mllib.R
--
diff --git a/R/pkg/R/mllib.R b/R/pkg/R/mllib.R
deleted file mode 100644
index d736bbb..000
--- a/R/pkg/R/mllib.R
+++ /dev/null
@@ -1,2114 +0,0 @@
-#
[SPARK-18862][SPARKR][ML] Split SparkR mllib.R into multiple files
## What changes were proposed in this pull request?
SparkR ```mllib.R``` is getting bigger as we add more ML wrappers, I'd like to
split it into multiple files to make us easy to maintain:
* mllib_classification.R
*
Repository: spark
Updated Branches:
refs/heads/master 923e59484 -> 6b6b555a1
http://git-wip-us.apache.org/repos/asf/spark/blob/6b6b555a/R/pkg/inst/tests/testthat/test_mllib_classification.R
--
diff --git
http://git-wip-us.apache.org/repos/asf/spark/blob/6b6b555a/R/pkg/R/mllib_classification.R
--
diff --git a/R/pkg/R/mllib_classification.R b/R/pkg/R/mllib_classification.R
new file mode 100644
index 000..8da8449
--- /dev/null
http://git-wip-us.apache.org/repos/asf/spark/blob/6b6b555a/R/pkg/R/mllib_tree.R
--
diff --git a/R/pkg/R/mllib_tree.R b/R/pkg/R/mllib_tree.R
new file mode 100644
index 000..0d53fad
--- /dev/null
+++ b/R/pkg/R/mllib_tree.R
@@
Repository: spark
Updated Branches:
refs/heads/master cca945b6a -> dfc4c935b
[MINOR] Correct LogisticRegression test case for probability2prediction.
## What changes were proposed in this pull request?
Set correct column names for ```force to use probability2prediction``` in
Repository: spark
Updated Branches:
refs/heads/master d7bce3bd3 -> 6a475ae46
[SPARK-17772][ML][TEST] Add test functions for ML sample weights
## What changes were proposed in this pull request?
More and more ML algos are accepting sample weights, and they have been tested
rather
Repository: spark
Updated Branches:
refs/heads/master 79ff85363 -> 9cff67f34
[MINOR][ML] Correct test cases of LoR raw2prediction & probability2prediction.
## What changes were proposed in this pull request?
Correct test cases of ```LogisticRegression``` raw2prediction &
Repository: spark
Updated Branches:
refs/heads/master 2af8b5cff -> 79ff85363
[SPARK-17645][MLLIB][ML] add feature selector method based on: False Discovery
Rate (FDR) and Family wise error rate (FWE)
## What changes were proposed in this pull request?
Univariate feature selection works by
Repository: spark
Updated Branches:
refs/heads/master e104e55c1 -> f2ddabfa0
[MINOR][SPARKR] fix kstest example error and add unit test
## What changes were proposed in this pull request?
While adding vignettes for kstest, I found some errors in the example:
1. There is a typo of kstest;
2.
Repository: spark
Updated Branches:
refs/heads/branch-2.1 019d1fa3d -> 8ef005931
[MINOR][SPARKR] fix kstest example error and add unit test
## What changes were proposed in this pull request?
While adding vignettes for kstest, I found some errors in the example:
1. There is a typo of kstest;
Repository: spark
Updated Branches:
refs/heads/branch-2.1 48aa6775d -> 9095c152e
[SPARK-18325][SPARKR][ML] SparkR ML wrappers example code and user guide
## What changes were proposed in this pull request?
* Add all R examples for ML wrappers which were added during 2.1 release cycle.
* Split
Repository: spark
Updated Branches:
refs/heads/master b47b892e4 -> 9bf8f3cd4
[SPARK-18325][SPARKR][ML] SparkR ML wrappers example code and user guide
## What changes were proposed in this pull request?
* Add all R examples for ML wrappers which were added during 2.1 release cycle.
* Split the
Repository: spark
Updated Branches:
refs/heads/master 82253617f -> 97255497d
[SPARK-18326][SPARKR][ML] Review SparkR ML wrappers API for 2.1
## What changes were proposed in this pull request?
Reviewing SparkR ML wrappers API for 2.1 release, mainly two issues:
* Remove ```probabilityCol```
Repository: spark
Updated Branches:
refs/heads/branch-2.1 ab865cfd9 -> 1c3f1da82
[SPARK-18326][SPARKR][ML] Review SparkR ML wrappers API for 2.1
## What changes were proposed in this pull request?
Reviewing SparkR ML wrappers API for 2.1 release, mainly two issues:
* Remove
Repository: spark
Updated Branches:
refs/heads/branch-2.1 617ce3ba7 -> ab865cfd9
[SPARK-18705][ML][DOC] Update user guide to reflect one pass solver for L1 and
elastic-net
## What changes were proposed in this pull request?
WeightedLeastSquares now supports L1 and elastic net penalties and
Repository: spark
Updated Branches:
refs/heads/master 9ab725eab -> 82253617f
[SPARK-18705][ML][DOC] Update user guide to reflect one pass solver for L1 and
elastic-net
## What changes were proposed in this pull request?
WeightedLeastSquares now supports L1 and elastic net penalties and has
Repository: spark
Updated Branches:
refs/heads/branch-2.1 3750c6e9b -> 340e9aea4
[SPARK-18686][SPARKR][ML] Several cleanup and improvements for spark.logit.
## What changes were proposed in this pull request?
Several cleanup and improvements for ```spark.logit```:
* ```summary``` should
Repository: spark
Updated Branches:
refs/heads/master 5c6bcdbda -> 90b59d1bf
[SPARK-18686][SPARKR][ML] Several cleanup and improvements for spark.logit.
## What changes were proposed in this pull request?
Several cleanup and improvements for ```spark.logit```:
* ```summary``` should return
Repository: spark
Updated Branches:
refs/heads/branch-2.1 88e07efe8 -> 1821cbead
[SPARK-18279][DOC][ML][SPARKR] Add R examples to ML programming guide.
## What changes were proposed in this pull request?
Add R examples to ML programming guide for the following algorithms as POC:
* spark.glm
*
Repository: spark
Updated Branches:
refs/heads/master bdfe7f674 -> eb8dd6813
[SPARK-18279][DOC][ML][SPARKR] Add R examples to ML programming guide.
## What changes were proposed in this pull request?
Add R examples to ML programming guide for the following algorithms as POC:
* spark.glm
*
Repository: spark
Updated Branches:
refs/heads/master e9730b707 -> bdfe7f674
[SPARK-18625][ML] OneVsRestModel should support setFeaturesCol and
setPredictionCol
## What changes were proposed in this pull request?
add `setFeaturesCol` and `setPredictionCol` for `OneVsRestModel`
## How was
Repository: spark
Updated Branches:
refs/heads/branch-2.1 c13c2939f -> 88e07efe8
[SPARK-18625][ML] OneVsRestModel should support setFeaturesCol and
setPredictionCol
## What changes were proposed in this pull request?
add `setFeaturesCol` and `setPredictionCol` for `OneVsRestModel`
## How
Repository: spark
Updated Branches:
refs/heads/branch-2.1 7d4596734 -> e8d8e3509
[SPARK-18476][SPARKR][ML] SparkR Logistic Regression should support
output original label.
## What changes were proposed in this pull request?
Similar to SPARK-18401, as a classification algorithm,
Repository: spark
Updated Branches:
refs/heads/master 0a811210f -> 2eb6764fb
[SPARK-18476][SPARKR][ML] SparkR Logistic Regression should support
output original label.
## What changes were proposed in this pull request?
Similar to SPARK-18401, as a classification algorithm, logistic
Repository: spark
Updated Branches:
refs/heads/branch-2.1 55b1142bd -> b95aad7ca
[SPARK-15819][PYSPARK][ML] Add KMeanSummary in KMeans of PySpark
## What changes were proposed in this pull request?
Add python api for KMeansSummary
## How was this patch tested?
unit test added
Author: Jeff
Repository: spark
Updated Branches:
refs/heads/branch-2.1 27d81d000 -> 04ec74f12
[SPARK-18520][ML] Add missing setXXXCol methods for BisectingKMeansModel and
GaussianMixtureModel
## What changes were proposed in this pull request?
add `setFeaturesCol` and `setPredictionCol` for BiKModel and
Repository: spark
Updated Branches:
refs/heads/master 223fa218e -> 2dfabec38
[SPARK-18520][ML] Add missing setXXXCol methods for BisectingKMeansModel and
GaussianMixtureModel
## What changes were proposed in this pull request?
add `setFeaturesCol` and `setPredictionCol` for BiKModel and
Repository: spark
Updated Branches:
refs/heads/branch-2.1 3be2d1e0b -> fc5fee83e
[SPARK-18501][ML][SPARKR] Fix spark.glm errors when fitting on collinear data
## What changes were proposed in this pull request?
* Fix SparkR ```spark.glm``` errors when fitting on collinear data, since
Repository: spark
Updated Branches:
refs/heads/master d0212eb0f -> 982b82e32
[SPARK-18501][ML][SPARKR] Fix spark.glm errors when fitting on collinear data
## What changes were proposed in this pull request?
* Fix SparkR ```spark.glm``` errors when fitting on collinear data, since
```standard
Repository: spark
Updated Branches:
refs/heads/branch-2.0 9dad3a7b0 -> a37238b06
[SPARK-18444][SPARKR] SparkR running in yarn-cluster mode should not download
Spark package.
## What changes were proposed in this pull request?
When running SparkR job in yarn-cluster mode, it will download
Repository: spark
Updated Branches:
refs/heads/branch-2.1 aaa2a173a -> c70214075
[SPARK-18444][SPARKR] SparkR running in yarn-cluster mode should not download
Spark package.
## What changes were proposed in this pull request?
When running SparkR job in yarn-cluster mode, it will download
Repository: spark
Updated Branches:
refs/heads/master ebeb0830a -> acb971577
[SPARK-18444][SPARKR] SparkR running in yarn-cluster mode should not download
Spark package.
## What changes were proposed in this pull request?
When running SparkR job in yarn-cluster mode, it will download Spark
Repository: spark
Updated Branches:
refs/heads/branch-2.1 fb4e6359d -> 31002e4a7
[SPARK-18282][ML][PYSPARK] Add python clustering summaries for GMM and BKM
## What changes were proposed in this pull request?
Add model summary APIs for `GaussianMixtureModel` and `BisectingKMeansModel` in
Repository: spark
Updated Branches:
refs/heads/master 658547974 -> e811fbf9e
[SPARK-18282][ML][PYSPARK] Add python clustering summaries for GMM and BKM
## What changes were proposed in this pull request?
Add model summary APIs for `GaussianMixtureModel` and `BisectingKMeansModel` in
Repository: spark
Updated Branches:
refs/heads/branch-2.1 820847008 -> 6b6eb4e52
[SPARK-18434][ML] Add missing ParamValidations for ML algos
## What changes were proposed in this pull request?
Add missing ParamValidations for ML algos
## How was this patch tested?
existing tests
Author:
Repository: spark
Updated Branches:
refs/heads/master 241e04bc0 -> c68f1a38a
[SPARK-18434][ML] Add missing ParamValidations for ML algos
## What changes were proposed in this pull request?
Add missing ParamValidations for ML algos
## How was this patch tested?
existing tests
Author: Zheng
Repository: spark
Updated Branches:
refs/heads/branch-2.1 436ae201f -> 7b57e480d
[SPARK-18438][SPARKR][ML] spark.mlp should support RFormula.
## What changes were proposed in this pull request?
```spark.mlp``` should support ```RFormula``` like other ML algorithm wrappers.
BTW, I did some
Repository: spark
Updated Branches:
refs/heads/master 4ac9759f8 -> 95eb06bd7
[SPARK-18438][SPARKR][ML] spark.mlp should support RFormula.
## What changes were proposed in this pull request?
```spark.mlp``` should support ```RFormula``` like other ML algorithm wrappers.
BTW, I did some cleanup
Repository: spark
Updated Branches:
refs/heads/branch-2.1 0c69224ed -> 8fc6455c0
[SPARK-18412][SPARKR][ML] Fix exception for some SparkR ML algorithms training
on libsvm data
## What changes were proposed in this pull request?
* Fix the following exceptions which throws when
Repository: spark
Updated Branches:
refs/heads/master b91a51bb2 -> 07be232ea
[SPARK-18412][SPARKR][ML] Fix exception for some SparkR ML algorithms training
on libsvm data
## What changes were proposed in this pull request?
* Fix the following exceptions which throws when
Repository: spark
Updated Branches:
refs/heads/branch-2.1 893355143 -> b2ba83d10
[SPARK-14077][ML][FOLLOW-UP] Minor refactor and cleanup for NaiveBayes
## What changes were proposed in this pull request?
* Refactor out ```trainWithLabelCheck``` and make ```mllib.NaiveBayes``` call
into it.
*
Repository: spark
Updated Branches:
refs/heads/master bc41d997e -> 22cb3a060
[SPARK-14077][ML][FOLLOW-UP] Minor refactor and cleanup for NaiveBayes
## What changes were proposed in this pull request?
* Refactor out ```trainWithLabelCheck``` and make ```mllib.NaiveBayes``` call
into it.
*
Repository: spark
Updated Branches:
refs/heads/branch-2.1 064d4315f -> 51dca6143
[SPARK-18401][SPARKR][ML] SparkR random forest should support output original
label.
## What changes were proposed in this pull request?
SparkR ```spark.randomForest``` classification prediction should output
Repository: spark
Updated Branches:
refs/heads/master a3356343c -> 5ddf69470
[SPARK-18401][SPARKR][ML] SparkR random forest should support output original
label.
## What changes were proposed in this pull request?
SparkR ```spark.randomForest``` classification prediction should output
Repository: spark
Updated Branches:
refs/heads/branch-2.1 df40ee2b4 -> 6b332909f
[SPARK-18291][SPARKR][ML] SparkR glm predict should output original label when
family = binomial.
## What changes were proposed in this pull request?
SparkR ```spark.glm``` predict should output original label
Repository: spark
Updated Branches:
refs/heads/master a814eeac6 -> daa975f4b
[SPARK-18291][SPARKR][ML] SparkR glm predict should output original label when
family = binomial.
## What changes were proposed in this pull request?
SparkR ```spark.glm``` predict should output original label when
Repository: spark
Updated Branches:
refs/heads/master 340f09d10 -> b89d0556d
[SPARK-18210][ML] Pipeline.copy does not create an instance with the same UID
## What changes were proposed in this pull request?
Motivation:
`org.apache.spark.ml.Pipeline.copy(extra: ParamMap)` does not create an
Repository: spark
Updated Branches:
refs/heads/branch-2.1 dcbf3fd4b -> d2f2cf68a
[SPARK-18210][ML] Pipeline.copy does not create an instance with the same UID
## What changes were proposed in this pull request?
Motivation:
`org.apache.spark.ml.Pipeline.copy(extra: ParamMap)` does not create
Repository: spark
Updated Branches:
refs/heads/branch-2.1 e9f1d4aaa -> c42301f1e
[SPARK-18276][ML] ML models should copy the training summary and set parent
## What changes were proposed in this pull request?
Only some of the models which contain a training summary currently set the
Repository: spark
Updated Branches:
refs/heads/master 15d392688 -> 23ce0d1e9
[SPARK-18276][ML] ML models should copy the training summary and set parent
## What changes were proposed in this pull request?
Only some of the models which contain a training summary currently set the
summaries
Repository: spark
Updated Branches:
refs/heads/branch-2.1 71104c9c9 -> 99891e56e
[SPARK-18177][ML][PYSPARK] Add missing 'subsamplingRate' of pyspark
GBTClassifier
## What changes were proposed in this pull request?
Add missing 'subsamplingRate' of pyspark GBTClassifier
## How was this patch
Repository: spark
Updated Branches:
refs/heads/master 0ea5d5b24 -> 9dc9f9a5d
[SPARK-18177][ML][PYSPARK] Add missing 'subsamplingRate' of pyspark
GBTClassifier
## What changes were proposed in this pull request?
Add missing 'subsamplingRate' of pyspark GBTClassifier
## How was this patch
Repository: spark
Updated Branches:
refs/heads/master 569788a55 -> e9746f87d
[SPARK-18133][EXAMPLES][ML] Python ML Pipeline Example has syntax e…
## What changes were proposed in this pull request?
In Python 3, there is only one integer type (i.e., int), which mostly behaves
like the long
Repository: spark
Updated Branches:
refs/heads/master ab5f938bc -> 569788a55
[SPARK-18109][ML] Add instrumentation to GMM
## What changes were proposed in this pull request?
Add instrumentation to GMM
## How was this patch tested?
Test in spark-shell
Author: Zheng RuiFeng
Repository: spark
Updated Branches:
refs/heads/master 4bee95407 -> 312ea3f7f
[SPARK-17748][FOLLOW-UP][ML] Reorg variables of WeightedLeastSquares.
## What changes were proposed in this pull request?
This is follow-up work of #15394.
Reorg some variables of ```WeightedLeastSquares``` and fix
Repository: spark
Updated Branches:
refs/heads/master 38cdd6ccd -> ac8ff920f
[SPARK-17748][FOLLOW-UP][ML] Fix build error for Scala 2.10.
## What changes were proposed in this pull request?
#15394 introduced build error for Scala 2.10, this PR fix it.
## How was this patch tested?
Existing
Repository: spark
Updated Branches:
refs/heads/master 6f31833db -> 38cdd6ccd
[SPARK-14634][ML][FOLLOWUP] Delete superfluous line in BisectingKMeans
## What changes were proposed in this pull request?
As commented by jkbradley in https://github.com/apache/spark/pull/12394,
Repository: spark
Updated Branches:
refs/heads/master 483c37c58 -> 78d740a08
[SPARK-17748][ML] One pass solver for Weighted Least Squares with ElasticNet
## What changes were proposed in this pull request?
1. Make a pluggable solver interface for `WeightedLeastSquares`
2. Add a `QuasiNewton`
Repository: spark
Updated Branches:
refs/heads/branch-2.0 a0c03c925 -> b959dab32
[SPARK-17986][ML] SQLTransformer should remove temporary tables
## What changes were proposed in this pull request?
A call to the method `SQLTransformer.transform` previously would create a
temporary table and
Repository: spark
Updated Branches:
refs/heads/master 01b26a064 -> ab3363e9f
[SPARK-17986][ML] SQLTransformer should remove temporary tables
## What changes were proposed in this pull request?
A call to the method `SQLTransformer.transform` previously would create a
temporary table and
Repository: spark
Updated Branches:
refs/heads/master 1db8feab8 -> a1b136d05
[SPARK-14634][ML] Add BisectingKMeansSummary
## What changes were proposed in this pull request?
Add BisectingKMeansSummary
## How was this patch tested?
unit test
Author: Zheng RuiFeng
Repository: spark
Updated Branches:
refs/heads/master 2fb12b0a3 -> 1db8feab8
[SPARK-15402][ML][PYSPARK] PySpark ml.evaluation should support save/load
## What changes were proposed in this pull request?
Since ```ml.evaluation``` has supported save/load at Scala side, supporting it
at Python
Repository: spark
Updated Branches:
refs/heads/master 9dc0ca060 -> 44cbb61b3
[SPARK-15957][FOLLOW-UP][ML][PYSPARK] Add Python API for RFormula
forceIndexLabel.
## What changes were proposed in this pull request?
Follow-up work of #13675, add Python API for ```RFormula forceIndexLabel```.
##
Repository: spark
Updated Branches:
refs/heads/master 0d4a69527 -> 21cb59f1c
[SPARK-17835][ML][MLLIB] Optimize NaiveBayes mllib wrapper to eliminate extra
pass on data
## What changes were proposed in this pull request?
[SPARK-14077](https://issues.apache.org/jira/browse/SPARK-14077) copied
Repository: spark
Updated Branches:
refs/heads/master 6f20a92ca -> 0d4a69527
[SPARK-17745][ML][PYSPARK] update NB python api - add weight col parameter
## What changes were proposed in this pull request?
update python api for NaiveBayes: add weight col parameter.
## How was this patch
Repository: spark
Updated Branches:
refs/heads/master b515768f2 -> 19401a203
[SPARK-15957][ML] RFormula supports forcing to index label
## What changes were proposed in this pull request?
```RFormula``` will index label only when it is string type currently. If the
label is numeric type and
Repository: spark
Updated Branches:
refs/heads/branch-2.0 b1a9c41e8 -> 594a2cf6f
[SPARK-17792][ML] L-BFGS solver for linear regression does not accept general
numeric label column types
## What changes were proposed in this pull request?
Before, we computed `instances` in LinearRegression
Repository: spark
Updated Branches:
refs/heads/master 49d11d499 -> 3713bb199
[SPARK-17792][ML] L-BFGS solver for linear regression does not accept general
numeric label column types
## What changes were proposed in this pull request?
Before, we computed `instances` in LinearRegression in
Repository: spark
Updated Branches:
refs/heads/master b678e465a -> 7aeb20be7
[MINOR][ML] Avoid 2D array flatten in NB training.
## What changes were proposed in this pull request?
Avoid 2D array flatten in ```NaiveBayes``` training, since flatten method might
be expensive (It will create
Repository: spark
Updated Branches:
refs/heads/master 7d5160883 -> c17f97183
[SPARK-17744][ML] Parity check between the ml and mllib test suites for NB
## What changes were proposed in this pull request?
1,parity check and add missing test suites for ml's NB
2,remove some unused imports
##
Repository: spark
Updated Branches:
refs/heads/master 1fad55968 -> 8e491af52
[SPARK-14077][ML][FOLLOW-UP] Revert change for NB Model's Load to maintain
compatibility with the model stored before 2.0
## What changes were proposed in this pull request?
Revert change for NB Model's Load to
Repository: spark
Updated Branches:
refs/heads/master 74ac1c438 -> 1fad55968
[SPARK-14077][ML] Refactor NaiveBayes to support weighted instances
## What changes were proposed in this pull request?
1,support weighted data
2,use dataset/dataframe instead of rdd
3,make mllib as a wrapper to call
Repository: spark
Updated Branches:
refs/heads/master a19a1bb59 -> f7082ac12
[SPARK-17704][ML][MLLIB] ChiSqSelector performance improvement.
## What changes were proposed in this pull request?
Several performance improvement for ```ChiSqSelector```:
1, Keep ```selectedFeatures``` ordered
Repository: spark
Updated Branches:
refs/heads/master 37eb9184f -> a19a1bb59
[SPARK-16356][FOLLOW-UP][ML] Enforce ML test of exception for local/distributed
Dataset.
## What changes were proposed in this pull request?
#14035 added ```testImplicits``` to ML unit tests and promoted
Repository: spark
Updated Branches:
refs/heads/master 85b0a1575 -> 7f16affa2
[SPARK-17138][ML][MLIB] Add Python API for multinomial logistic regression
## What changes were proposed in this pull request?
Add Python API for multinomial logistic regression.
- add `family` param in python api.
Repository: spark
Updated Branches:
refs/heads/master 00be16df6 -> 93c743f1a
[SPARK-17577][FOLLOW-UP][SPARKR] SparkR spark.addFile supports adding directory
recursively
## What changes were proposed in this pull request?
#15140 exposed ```JavaSparkContext.addFile(path: String, recursive:
Repository: spark
Updated Branches:
refs/heads/master 50b89d05b -> f234b7cd7
http://git-wip-us.apache.org/repos/asf/spark/blob/f234b7cd/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
--
diff --git
[SPARK-16356][ML] Add testImplicits for ML unit tests and promote toDF()
## What changes were proposed in this pull request?
This was suggested in
https://github.com/apache/spark/commit/101663f1ae222a919fc40510aa4f2bad22d1be6f#commitcomment-17114968.
This PR adds `testImplicits` to
Repository: spark
Updated Branches:
refs/heads/master 248916f55 -> 7945daed1
[MINOR][SPARKR] Add sparkr-vignettes.html to gitignore.
## What changes were proposed in this pull request?
Add ```sparkr-vignettes.html``` to ```.gitignore```.
## How was this patch tested?
No need test.
Author:
Repository: spark
Updated Branches:
refs/heads/master 646f38346 -> 72d9fba26
[SPARK-17281][ML][MLLIB] Add treeAggregateDepth parameter for
AFTSurvivalRegression
## What changes were proposed in this pull request?
Add treeAggregateDepth parameter for AFTSurvivalRegression to keep consistent
Repository: spark
Updated Branches:
refs/heads/master c133907c5 -> 6902edab7
[SPARK-17315][FOLLOW-UP][SPARKR][ML] Fix print of Kolmogorov-Smirnov test
summary
## What changes were proposed in this pull request?
#14881 added Kolmogorov-Smirnov Test wrapper to SparkR. I found that
Repository: spark
Updated Branches:
refs/heads/master 61876a427 -> d3b886976
[SPARK-17585][PYSPARK][CORE] PySpark SparkContext.addFile supports adding files
recursively
## What changes were proposed in this pull request?
Users would like to add a directory as dependency in some cases, they
Repository: spark
Updated Branches:
refs/heads/master 1fec3ce4e -> bcdd259c3
[SPARK-15509][FOLLOW-UP][ML][SPARKR] R MLlib algorithms should support input
columns "features" and "label"
## What changes were proposed in this pull request?
#13584 resolved the issue of features and label columns
Repository: spark
Updated Branches:
refs/heads/master 65b814bf5 -> 2ed601217
[SPARK-17464][SPARKR][ML] SparkR spark.als argument reg should be 0.1 by
default.
## What changes were proposed in this pull request?
SparkR ```spark.als``` arguments ```reg``` should be 0.1 by default, which need
Repository: spark
Updated Branches:
refs/heads/master 92ce8d484 -> 65b814bf5
[SPARK-17456][CORE] Utility for parsing Spark versions
## What changes were proposed in this pull request?
This patch adds methods for extracting major and minor versions as Int types in
Scala from a Spark version
Repository: spark
Updated Branches:
refs/heads/master 6f13aa7df -> 39d538ddd
[MINOR][ML] Correct weights doc of MultilayerPerceptronClassificationModel.
## What changes were proposed in this pull request?
```weights``` of ```MultilayerPerceptronClassificationModel``` should be the
output