GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14629
[SPARK-17046][SQL] prevent user from calling df.select with an empty param list
## What changes were proposed in this pull request?
We can see the DataFrame API:
`def select(col: String, cols
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14628
[SPARK-17033][Follow-up][ML][MLLib] Improve k-means aggregate to treeAggregate
## What changes were proposed in this pull request?
The k-means implementation uses `aggregate` to compute point costs
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14629
@srowen
What do you think about this problem?
I found that adding two methods like
`def select(cols: Column*)`
`def select(col: Column, cols: Column*)`
causes ambiguity,
I
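The restriction under discussion can be sketched in Python (a minimal, hypothetical `select` helper, not Spark's actual implementation): rejecting an empty parameter list up front prevents the silent empty projection the overloads were meant to guard against.

```python
def select(*cols):
    """Hypothetical select-like helper illustrating the guard discussed
    above: fail fast on an empty parameter list instead of silently
    returning an empty projection."""
    if not cols:
        raise ValueError("select() requires at least one column")
    return list(cols)
```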
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14628
@lins05 OK, give me some time to check whether the one in LDAModel should also
use treeAggregate
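For context, `treeAggregate` differs from `aggregate` in that partial results are merged in rounds rather than all at once on the driver. A rough stand-alone Python sketch of the idea (hypothetical helper; plain lists stand in for RDD partitions):

```python
from functools import reduce

def tree_aggregate(partitions, zero, seq_op, comb_op, depth=2):
    """Sketch of treeAggregate: fold each partition locally, then merge
    partial results in `depth` rounds of pairwise combining before the
    final reduction, instead of combining everything in one place."""
    partials = [reduce(seq_op, part, zero) for part in partitions]
    for _ in range(depth):
        if len(partials) == 1:
            break
        merged = []
        for i in range(0, len(partials), 2):
            pair = partials[i:i + 2]
            merged.append(comb_op(*pair) if len(pair) == 2 else pair[0])
        partials = merged
    return reduce(comb_op, partials)
```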
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14265
cc @rxin Thanks!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14293
[GIT] add PyDev & RStudio project files to gitignore list
## What changes were proposed in this pull request?
Add PyDev & RStudio project files to the .gitignore list. I think the
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/13275
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14293
I use the PyDev IDE to edit Python code and it generates `.pydevproject`, and I
use the RStudio IDE to edit R code and it generates `*.Rproj`; these are only
project settings files used by the IDEs, like `.idea
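The IDE files mentioned here would typically be ignored with entries like the following (a sketch; the exact patterns the project adopted may differ):

```gitignore
# PyDev project settings
.pydevproject
# RStudio project settings
*.Rproj
```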
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14265
@srowen
I checked the ml.python.MLSerde and it supports the SparseMatrix pickler, and on
the Python side the SparseMatrix constructor also matches the pickler. So I think the
`_picklable_classes
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen several minor modifications done.
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14301
[SPARK-16662][PySpark][SQL] update HiveContext warning
## What changes were proposed in this pull request?
move the `HiveContext` deprecation warning printing statement
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14265
@srowen I guess the `_picklable_classes` list in `ml.linalg.common` was
copied from `mllib.linalg.common`, so it is missing `SparseMatrix`, which
was added later.
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14238
[MINOR][TYPO] fix fininsh typo
## What changes were proposed in this pull request?
fininsh => finish
## How was this patch tested?
(Please explain how this pa
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14122#discussion_r71083700
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -327,6 +327,11 @@ class LinearRegression @Since("
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14220
[SPARK-16568][SQL][Documentation] update sql programming guide refreshTable
API in python code
## What changes were proposed in this pull request?
update `refreshTable` API in python
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen OK, var names updated.
And does the numNonzero 'fix' you mentioned mean the number of input
vectors whose weight > 0?
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen OK. I'll fix the var names first:
nnz => weightSum
weightSum => totalWeightSum
cnnz => nnz
Is that right?
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14265
[PySpark] add picklable SparseMatrix
## What changes were proposed in this pull request?
add a `SparseMatrix` class which supports the pickler.
## How was this patch tested
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14276
cc @srowen Thanks!
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14276
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14220
cc @rxin Thanks!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14276
[SPARK-16638][ML][Optimizer] fix L2 reg computation in linearRegression
when standardization is false
## What changes were proposed in this pull request?
when `standardization
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen Now I've added test cases; I test 3 cases, the same as the
example cases I wrote in [SPARK-16561], thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14276
@srowen
I re-thought the code and maybe my previous idea is wrong. The
author's intention may be to use w[i] / featuresStd[i] to reduce the penalty on
large-scale dimensions (because
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14286
[SPARK-16653][ML][Optimizer] update ANN convergence tolerance param default
to 1e-6
## What changes were proposed in this pull request?
replace ANN convergence tolerance param
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14156
[SPARK-16499][ML][MLLib] improve ApplyInPlace function in ANN code
## What changes were proposed in this pull request?
I re-coded the following function using Breeze's matrix operations
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14157
[SPARK-16500][ML][MLLib][Optimizer] add LBFGS convergence warning for all
places used in MLLib
## What changes were proposed in this pull request?
Add a warning for the following case
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14246
[SPARK-16600][MLLib] fix some LaTeX formula syntax errors
## What changes were proposed in this pull request?
`\partial\x` ==> `\partial x`
`har{x_i}` ==> `h
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14220
cc @liancheng Thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14203#discussion_r70913944
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1416,13 +1416,25 @@ def drop(self, col):
>>> df.join(df2, df.name ==
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14216
[SPARK-16561][MLLib] fix multivarOnlineSummary min/max bug
## What changes were proposed in this pull request?
add a member vector `cnnz` to count each dimension's non-zero values
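The fix being described can be sketched as follows (a hypothetical Python summarizer, not the Scala class itself): with a per-dimension non-zero count, min/max can account for the implicit zeros in sparse input.

```python
class MinMaxSummarizer:
    """Sketch of the fix discussed above: track a per-dimension
    non-zero count (cnnz) so that, if a dimension had any implicit
    zeros, 0 is folded into its min/max at query time."""

    def __init__(self, dim):
        self.count = 0
        self.cnnz = [0] * dim                 # non-zero values seen per dimension
        self.cur_min = [float("inf")] * dim
        self.cur_max = [float("-inf")] * dim

    def add(self, vector):
        self.count += 1
        for i, v in enumerate(vector):
            if v != 0.0:
                self.cnnz[i] += 1
                self.cur_min[i] = min(self.cur_min[i], v)
                self.cur_max[i] = max(self.cur_max[i], v)

    def min(self):
        # a dimension with implicit zeros must include 0 in its range
        return [min(m, 0.0) if c < self.count else m
                for m, c in zip(self.cur_min, self.cnnz)]

    def max(self):
        return [max(m, 0.0) if c < self.count else m
                for m, c in zip(self.cur_max, self.cnnz)]
```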
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/13946
[MINOR][SparkR] update sparkR DataFrame.R comment
## What changes were proposed in this pull request?
update sparkR DataFrame.R comment
SQLContext ==> SparkSess
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/13558
cc @liancheng Thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14025#discussion_r69720553
--- Diff: docs/streaming-programming-guide.md ---
@@ -1546,9 +1546,9 @@ val words: DStream[String] = ...
words.foreachRDD { rdd
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14121
[MINOR][ML] update a comment that is inconsistent with the code in
ml.regression.LinearRegression
## What changes were proposed in this pull request?
In `train` method
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14122
[SPARK-16470][ML][Optimizer] Check whether linear regression training
actually reaches convergence and add a warning if not
## What changes were proposed in this pull request
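The proposed check can be sketched like this (a hypothetical Python helper using plain gradient descent as a stand-in for L-BFGS): if the iteration budget runs out before the tolerance is met, emit a warning rather than silently returning a possibly unconverged result.

```python
import logging

def minimize_with_warning(grad, x0, lr=0.1, max_iter=100, tol=1e-6):
    """Hypothetical stand-in for an L-BFGS run: iterate until the step
    size drops below `tol`; if max_iter is exhausted first, log a
    warning so the caller knows the result may be inaccurate."""
    x = x0
    for _ in range(max_iter):
        step = lr * grad(x)
        x = x - step
        if abs(step) < tol:
            return x, True
    logging.warning("optimizer reached max_iter=%d without converging "
                    "(tol=%g); result may be inaccurate", max_iter, tol)
    return x, False
```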
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14122#discussion_r70181416
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -327,6 +327,11 @@ class LinearRegression @Since("
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14156
@srowen OK, I'll close the PR for now; if I find a better way to optimize it I
will reopen it, thanks!
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14156
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14519#discussion_r73787877
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala
---
@@ -583,19 +591,22 @@ private class AFTAggregator
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14520
cc @sethah @yanboliang
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14520
Oh, it's another algorithm and there are several different details, so in
order to make it clear I created a separate PR to discuss it, thanks!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14520
[SPARK-16934][ML][MLLib] Improve LogisticCostFun to avoid redundant
serialization
## What changes were proposed in this pull request?
Improve LogisticCostFun, replace closure var
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14520
@MLnick
The main improvement here is about `localFeaturesStd`:
in the previous code, each call to `CostFun.calculate` would
serialize and broadcast the vector.
mark
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14520
@sethah
Thanks for your careful review!
The PR here already passes bcFeaturesStd and bcCoeffs as constructor
args to the `LogisticAggregator`, like your PR #14109
You
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14519#discussion_r74011434
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala
---
@@ -478,21 +482,23 @@ object AFTSurvivalRegressionModel
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14440
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14440
sounds reasonable...
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14015
[SPARK-16345][Documentation][Examples][GraphX] Extract graphx programming
guide example snippets from source files instead of hard-coding them
## What changes were proposed in this pull request
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14010
Jenkins retest this please.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14010
Jenkins, test this please.
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14010
[GRAPHX][EXAMPLES] move graphx test data directory and update graphx
document
## What changes were proposed in this pull request?
There are two test data files for the graphx examples, which
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/13136
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14015
@srowen Yes, the example code is exactly the same as in the GraphX doc,
and I tested them all; they run normally.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14010
@srowen Done.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14015
Merge conflicts have been resolved.
cc @srowen
Thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14025
cc @srowen
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14025
cc @liancheng
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14025
@liancheng Yes. Thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14025#discussion_r69417590
--- Diff: docs/configuration.md ---
@@ -1564,8 +1564,8 @@ spark.sql("SET -v").show(n=200, truncate=False)
{% h
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14025
[WIP][DOC] update out-of-date code snippets using SQLContext in all
documents.
## What changes were proposed in this pull request?
I searched the whole documents directory using
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14628
@holdenk
I think depth 2 is enough to handle large RDDs, and a bigger depth may add
cost. I'll append test results later. Thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah
If I merge the MulticlassLogisticRegressionSummary into
LogisticRegressionSummary,
then, according to the currently designed hierarchy, it becomes:
class
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah
About your new design,
```
Summary
PredictionSummary extends Summary
ClassificationSummary extends PredictionSummary
ProbabilisticClassificationSummary
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
I read jkbradley's thoughts here, so I will modify this as follows:
first we need 4 traits, using the following hierarchy:
LogisticRegressionSummary
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah
About this issue:
Why is there a one-to-one overlap between MulticlassClassificationSummary
and LogisticRegressionSummary, and MulticlassLogisticRegressionSummary inherits
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15730
@brkyvz Also thanks for your careful code review! ^_^
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
cc @sethah @jkbradley Thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16576#discussion_r96573963
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1539,6 +1539,9 @@ abstract class RDD[T: ClassTag](
// NOTE: we use
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14629
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14156
cc @srowen thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77352565
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -295,6 +295,13 @@ class LogisticRegression @Since
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77354917
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala ---
@@ -405,5 +405,9 @@ private[ml] trait HasAggregationDepth
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14923#discussion_r77355534
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala
---
@@ -231,9 +231,9 @@ class
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77352634
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -295,6 +295,13 @@ class LogisticRegression @Since
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77358835
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -295,6 +295,13 @@ class LogisticRegression @Since
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14923#discussion_r77352136
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala
---
@@ -233,7 +233,7 @@ class
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77362198
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala ---
@@ -405,5 +405,9 @@ private[ml] trait HasAggregationDepth
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77356253
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -295,6 +295,13 @@ class LogisticRegression @Since
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14923#discussion_r77361651
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala
---
@@ -231,9 +231,9 @@ class
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14950
@srowen not only CPU cost; if the data dimension is big, the serialization
cost will be big, as in https://github.com/apache/spark/pull/14109,
and computing all targets seems improper if we may add
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14950
[SPARK-17390][ML][MLLib] Optimize MultivariateOnlineSummarizer by making
the summarized target configurable
## What changes were proposed in this pull request?
add a mask parameter
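The configurable-target idea can be sketched as follows (a hypothetical helper; the `metrics` argument stands in for the proposed mask parameter): compute only the requested statistics instead of always maintaining all of them.

```python
def summarize(vectors, metrics=("mean",)):
    """Hypothetical configurable summarizer: only the statistics named
    in `metrics` (the stand-in for the proposed mask) are computed."""
    n = len(vectors)
    out = {}
    if "mean" in metrics:
        out["mean"] = [sum(col) / n for col in zip(*vectors)]
    if "max" in metrics:
        out["max"] = [max(col) for col in zip(*vectors)]
    if "min" in metrics:
        out["min"] = [min(col) for col in zip(*vectors)]
    return out
```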
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14922
[WIP][SPARK-17175] Add an expert formula for aggregationDepth to SharedParams
## What changes were proposed in this pull request?
Add an expert formula for aggregationDepth to SharedParams
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14923
[SPARK-17363][ML][MLLib] fix MultivariateOnlineSummarizer.numNonZeros
## What changes were proposed in this pull request?
fix the `MultivariateOnlineSummarizer.numNonZeros` method
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14628
Because the KMeans algo is being optimized by another task, I'll close this PR
for now; when that one is merged I'll check whether this needs to be optimized.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14898
cc @srowen thanks!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14898
[SPARK-16499][ML][MLLib] optimize the ANN algorithm where the ApplyInPlace
function is used
## What changes were proposed in this pull request?
replace
`ApplyInPlace(output, target, delta
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14628
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15045
jenkins test please
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78309230
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#' @note s
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78307552
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#' @note s
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78308116
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#' @note s
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/15051
[SPARK-17499][ML][MLLib] make the default params in sparkR spark.mlp
consistent with MultilayerPerceptronClassifier
## What changes were proposed in this pull request?
update several
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15045
Jenkins, test this please.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78315909
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#' @note s
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78315763
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#' @note s
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15060#discussion_r78749771
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala
---
@@ -235,6 +235,7 @@ class
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/15097
[SPARK-17540][SparkR][Spark Core] fix SparkR array serde type problem when
length == 0
## What changes were proposed in this pull request?
fix SparkR array serde type problem when
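The underlying problem is that a zero-length array carries no elements from which to infer a type, so the serde must write the element type explicitly. A hedged Python sketch of that idea (hypothetical tags and format, not SparkR's actual wire protocol):

```python
import struct

# Hypothetical one-byte type tags mirroring the idea of the fix: the
# writer always records the element type, so the reader never has to
# guess the type of a zero-length array.
_TAGS = {"int": b"i", "double": b"d", "string": b"s"}

def write_array(elem_type, values):
    # header = type tag + big-endian 4-byte length
    header = _TAGS[elem_type] + struct.pack(">i", len(values))
    return header, values

def read_array(header, values):
    tag = header[:1]
    length = struct.unpack(">i", header[1:5])[0]
    elem_type = {v: k for k, v in _TAGS.items()}[tag]
    return elem_type, values[:length]
```

With this framing, an empty double array round-trips with its type intact instead of degenerating to an untyped empty list.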
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15045
Oh, there are 5 similar messages.
I checked the others; they may be using the default partitioner,
so I updated their message to "Specified or default partitioner..."
b
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/15045
[Spark Core][MINOR] fix partitionBy error message
## What changes were proposed in this pull request?
In order to avoid confusing users,
it is better to change