Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77352634
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -295,6 +295,13 @@ class LogisticRegression @Since
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77354917
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala ---
@@ -405,5 +405,9 @@ private[ml] trait HasAggregationDepth
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14923#discussion_r77355534
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala
---
@@ -231,9 +231,9 @@ class
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77356253
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -295,6 +295,13 @@ class LogisticRegression @Since
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77358835
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -295,6 +295,13 @@ class LogisticRegression @Since
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14923#discussion_r77361651
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala
---
@@ -231,9 +231,9 @@ class
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14922#discussion_r77362198
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala ---
@@ -405,5 +405,9 @@ private[ml] trait HasAggregationDepth
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14950
[SPARK-17390][ML][MLLib] Optimize MultivariateOnlineSummarizer by making
the summarized target configurable
## What changes were proposed in this pull request?
add a mask parameter
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14950
@srowen It is not only CPU cost: if the data dimension is big, the serialization
cost will also be big, as in https://github.com/apache/spark/pull/14109,
and computing all targets does not seem proper if we may add
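A minimal sketch of the mask idea, assuming the mask is a set of requested
metric names; the class shape and names below are illustrative, not the PR's
actual API:

    import org.apache.spark.ml.linalg.Vector

    // Sketch only: skip the buffers (and their serialization cost) for
    // metrics the caller did not request.
    class MaskedSummarizer(requested: Set[String]) extends Serializable {
      private var n = 0L
      private var sum: Array[Double] = _  // allocated only if "mean" is requested

      def add(v: Vector): this.type = {
        if (requested("mean")) {
          if (sum == null) sum = new Array[Double](v.size)
          v.foreachActive((i, x) => sum(i) += x)
        }
        n += 1
        this
      }

      def mean: Array[Double] = {
        require(requested("mean"), "mean was not requested")
        sum.map(_ / n)
      }
    }

A merge step would only need to serialize the buffers that were actually
allocated, which is where the savings on high-dimensional data come from.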
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/15045
[Spark Core][MINOR] fix partitionBy error message
## What changes were proposed in this pull request?
In order to avoid confusing users,
it is better to change
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15045
Oh, there are 5 similar messages.
I checked the others; they may use the default partitioner,
so I updated their messages to "Specified or default partitioner..."
but
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14898
@srowen
but here, for `delta -= target`, the Breeze lib will call BLAS, which is usually
10x faster than a normal loop because it uses SIMD instructions. Here is some
performance information
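For context, the comparison being discussed looks roughly like this (a sketch;
the 10x figure is the commenter's claim, not verified here):

    import breeze.linalg.DenseVector

    val delta  = DenseVector.rand(1000000)
    val target = DenseVector.rand(1000000)

    delta -= target    // Breeze in-place update, dispatched to optimized kernels

    var i = 0          // the equivalent hand-written loop
    while (i < delta.length) {
      delta(i) -= target(i)
      i += 1
    }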
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/15051
[SPARK-17499][ML][MLLib] make the default params in SparkR spark.mlp
consistent with MultilayerPerceptronClassifier
## What changes were proposed in this pull request?
update several
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15045
jenkins test please
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78307552
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78308116
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78309230
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15045
Jenkins, test this please.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78315763
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r78315909
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/15060
[SPARK-17507][ML][MLLib] check weight vector size in ANN
## What changes were proposed in this pull request?
As the TODO describes,
check the weight vector size and, if it is wrong, throw
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15059
but relTol is defined in mllib and sql does not reference it; it seems better to
move it to the spark-core project?
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14922
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14950
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14950
When the benchmark is done I will reopen it.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14922
All right. When the refining & benchmarking are done I will reopen it.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15060
@srowen By default the `weight` will be randomly generated and will
automatically match the size; only when it is specified by the user does it need
this check... now the modification here seems
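The check under discussion, sketched in isolation (the method name is
illustrative, not the PR's actual code):

    import org.apache.spark.ml.linalg.Vector

    // Fail fast when user-supplied initial weights do not match the size
    // required by the layer topology.
    def validateInitialWeights(weights: Vector, expectedSize: Int): Unit = {
      require(weights.size == expectedSize,
        s"Initial weights vector has size ${weights.size}, but the layer " +
        s"topology requires $expectedSize weights.")
    }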
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15060#discussion_r78749771
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala
---
@@ -235,6 +235,7 @@ class
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/15097
[SPARK-17540][SparkR][Spark Core] fix SparkR array serde type problem when
length == 0
## What changes were proposed in this pull request?
fix SparkR array serde type problem when
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15097
@shivaram
Oh... I found this way still has problems: in Scala, after compiling with
type erasure, Array[Nothing] at last turns into type `Ljava.lang.Object`,
but a primitive-type Array
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14851
cc @srowen thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15051
@felixcheung Now I have added some tests using the default parameters and
compared the output predictions with the results generated by the Scala-side code.
Thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r79295910
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,14 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15051
@felixcheung
yeah, in fact 0x7FFF is not ideal because it is itself also a valid seed.
And there is another problem: in Scala, the seed is a `long` type,
but on the R side, it seems there is no
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r79297392
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,14 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r79299954
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,14 @@ setMethod("predict", signature(object = "KMeansModel"),
#' }
#
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15051
@felixcheung
Now I have updated the Scala-side wrapper arg types as follows:
layers: Array[Int],
seed: String
and for the seed default value I currently use "", not NUL
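A sketch of the idea behind the String-typed seed, assuming the parse happens
on the Scala side (the helper name is hypothetical): R has no 64-bit integer
type, so the seed crosses the R/JVM boundary as a String.

    // "" (the default) means "no seed supplied"; anything else is parsed
    // into the Long that the Scala estimator expects.
    def parseSeed(seed: String): Option[Long] =
      if (seed == null || seed.isEmpty) None else Some(seed.toLong)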
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15051
@felixcheung negative test added, thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r79361373
--- Diff: R/pkg/R/mllib.R ---
@@ -695,17 +695,15 @@ setMethod("predict", signature(object =
"KMeansModel"),
#' @n
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15051#discussion_r79362082
--- Diff: R/pkg/R/mllib.R ---
@@ -695,17 +695,15 @@ setMethod("predict", signature(object =
"KMeansModel"),
#' @n
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14852
@sethah OK. I will study the unified Scala API for LOR and update the
Python-side API PR ASAP. Thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14852
cc @sethah @yanboliang thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14852
Done. thanks! @yanboliang
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14852
Done. thanks for careful review :) @sethah
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15097
@felixcheung
I thought of two ways to solve this problem; see the PR description.
Which is better in your opinion?
Or is there a better solution?
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14203
update python dataframe.drop
## What changes were proposed in this pull request?
Make the `dataframe.drop` API in Python support multi-column parameters,
so that it is the same as
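For reference, the Scala API the change brings Python into line with: `drop`
on a DataFrame accepts multiple column names as varargs.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local").getOrCreate()
    val df = spark.range(3).selectExpr("id", "id * 2 AS a", "id * 3 AS b")
    df.drop("a", "b").show()   // one call drops both columns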
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14203#discussion_r70913944
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1416,13 +1416,25 @@ def drop(self, col):
>>> df.join(df2, df.name ==
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14216
[SPARK-16561][MLLib] fix multivarOnlineSummary min/max bug
## What changes were proposed in this pull request?
add a member vector `cnnz` to count each dimension's non-zero values
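A simplified, unweighted sketch of the fix (the field name `cnnz` follows the
PR description; the class shape is illustrative): sparse input never presents
its zeros, so a dimension whose non-zero count is below the total row count
must fold 0.0 into its min/max.

    import org.apache.spark.ml.linalg.Vector

    class MinMaxSummarizer(numDims: Int) {
      private var total = 0L
      private val cnnz = new Array[Long](numDims)  // non-zeros seen per dimension
      private val mins = Array.fill(numDims)(Double.MaxValue)
      private val maxs = Array.fill(numDims)(Double.MinValue)

      def add(v: Vector): Unit = {
        total += 1
        v.foreachActive { (i, x) =>
          if (x != 0.0) {
            cnnz(i) += 1
            if (x < mins(i)) mins(i) = x
            if (x > maxs(i)) maxs(i) = x
          }
        }
      }

      // If some rows had an (implicit or explicit) zero in dimension i,
      // 0.0 is a candidate for both extremes.
      def max(i: Int): Double = if (cnnz(i) < total) math.max(maxs(i), 0.0) else maxs(i)
      def min(i: Int): Double = if (cnnz(i) < total) math.min(mins(i), 0.0) else mins(i)
    }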
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen OK. I'll fix the var names first.
nnz => weightSum
weightSum => totalWeightSum
cnnz => nnz
Is that right?
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14220
[SPARK-16568][SQL][Documentation] update sql programming guide refreshTable
API in python code
## What changes were proposed in this pull request?
update `refreshTable` API in python
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen OK, var names updated.
And does the 'fixing' of numNonzero you mentioned mean the number of input
vectors whose weight > 0?
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14238
[MINOR][TYPO] fix fininsh typo
## What changes were proposed in this pull request?
fininsh => finish
## How was this patch tested?
(Please explain how this patch
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14122#discussion_r71083700
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -327,6 +327,11 @@ class LinearRegression @Since("
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14246
[SPARK-16600][MLLib] fix some latex formula syntax error
## What changes were proposed in this pull request?
`\partial\x` ==> `\partial x`
`har{x_i}` ==> `h
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14220
cc @liancheng Thanks!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14265
[PySpark] add picklable SparseMatrix
## What changes were proposed in this pull request?
add a `SparseMatrix` class which supports the pickler.
## How was this patch tested
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14220
cc @rxin Thanks!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14276
[SPARK-16638][ML][Optimizer] fix L2 reg computation in LinearRegression
when standardization is false
## What changes were proposed in this pull request?
when `standardization
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14276
cc @srowen Thanks!
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14276
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14276
@srowen
I re-thought the code again and maybe my previous idea is wrong. The
intention of the author may be to use w[i] / featuresStd[i] to reduce the penalty
on large-scale dimensions (because these
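Written out, the intuition described above amounts to roughly the following
penalty (a hedged sketch of the idea, not necessarily the exact implementation):

    R(w) = \frac{\lambda}{2} \sum_{j=1}^{p} \left( \frac{w_j}{\hat\sigma_j} \right)^2

where \hat\sigma_j is the standard deviation of feature j, so coefficients on
large-scale features are penalized less.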
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen Now I have added test cases; I test 3 cases, the same as the
example cases I wrote in [SPARK-16561], thanks!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14286
[SPARK-16653][ML][Optimizer] update ANN convergence tolerance param default
to 1e-6
## What changes were proposed in this pull request?
replace ANN convergence tolerance param
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14293
[GIT] add PyDev & RStudio project files to gitignore list
## What changes were proposed in this pull request?
Add PyDev & RStudio project files to the gitignore list. I think the
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/13275
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14265
@srowen
I checked the ml.python.MLSerde and it supports the SparseMatrix pickler, and
on the Python side the SparseMatrix constructor also matches the pickler. So I
think the `_picklable_classes
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14293
I use the PyDev IDE to edit Python code and it generates `.pydevproject`, and
the RStudio IDE to edit R code and it generates *.Rproj; these are only project
settings files used by the IDEs, like `.idea
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen several minor modifications done.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14265
@srowen I guess the `_picklable_classes` list in `ml.linalg.common` was
copied from `mllib.linalg.common`, so it is missing `SparseMatrix`, which
was added later.
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14301
[SPARK-16662][PySpark][SQL] update HiveContext warning
## What changes were proposed in this pull request?
move the `HiveContext` deprecation warning printing statement into
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14265
cc @rxin Thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen Yeah, I have pushed "some minor update":
https://github.com/apache/spark/pull/14216/commits/362074187d8845eeb40452eceec10f7e8ad805df
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14265
cc @jkbradley Thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14216
@srowen Oh, I missed your comment about the loop brace; it is added now, thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14326
@yanboliang I went through the code and there are several problems to
solve:
The robust regression has a parameter `sigma` which must be > 0, so it is
a bound optimization prob
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14265
@srowen We can check the `VectorTests.test_serialize` function in
python/ml/tests.py; it contains a test for `SparseMatrix` serialization and
deserialization, so we can confirm that this works
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14333
[SPARK-16696][ML][MLLib] add destroy calls for unused broadcast variables to
release memory in time
## What changes were proposed in this pull request?
update unused broadcast in KMeans
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14335
[SPARK-16697][ML][MLLib] improve LDA submitMiniBatch method to avoid
redundant RDD computation
## What changes were proposed in this pull request?
In `LDAOptimizer.submitMiniBatch
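A simplified sketch of the pattern such a change applies (the names are
illustrative, not the optimizer's actual code): persist a mini-batch RDD that
feeds more than one pass, then release it.

    import org.apache.spark.ml.linalg.Vector
    import org.apache.spark.rdd.RDD
    import org.apache.spark.storage.StorageLevel

    def processBatch(docs: RDD[(Long, Vector)], fraction: Double): Unit = {
      val batch = docs.sample(withReplacement = false, fraction)
      batch.persist(StorageLevel.MEMORY_AND_DISK)
      val batchSize   = batch.count()                     // first pass
      val totalActive = batch.map(_._2.numActives).sum()  // second pass, no recompute
      batch.unpersist()
    }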
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14333
@srowen
The `bcNewCenters` in `KMeans` has a problem.
Checking the code logic in detail, we can find that in each loop it should
destroy the broadcast var `bcNewCenters` generated in
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14335#discussion_r72003428
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -472,12 +473,13 @@ final class OnlineLDAOptimizer extends
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14335#discussion_r72003530
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -472,12 +473,13 @@ final class OnlineLDAOptimizer extends
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14335#discussion_r72003627
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -472,12 +473,13 @@ final class OnlineLDAOptimizer extends
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14333
@srowen
I checked the code around the KMeans `bcNewCenters` again; if we want to make
sure the recovery of the RDD will succeed in any unexpected case, we have to
keep all the `bcNewCenters
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14335#discussion_r72013619
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -472,12 +473,13 @@ final class OnlineLDAOptimizer extends
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14335#discussion_r72014278
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -472,12 +473,13 @@ final class OnlineLDAOptimizer extends
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14333
@srowen KMeans.initKMeansParallel already implements the pattern
"persist the current step's RDD, and unpersist the previous one", but I think
a persisted RDD can also break down becau
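The pattern named above, sketched generically (not KMeans' actual code):

    import org.apache.spark.rdd.RDD

    // Persist the RDD produced by the current step, materialize it, then
    // drop the previous step's cached copy.
    def iterate[T](initial: RDD[T], steps: Int)(step: RDD[T] => RDD[T]): RDD[T] = {
      var current: RDD[T] = initial.cache()
      for (_ <- 0 until steps) {
        val next = step(current).cache()
        next.count()              // materialize before releasing the parent
        current.unpersist()
        current = next
      }
      current
    }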
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14333
yeah, but the `bcSyn0Global` in Word2Vec is a different case; it looks
safe to destroy there,
because in each loop iteration, the RDD transform which uses `bcSyn0Global`
ends with a
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14333
@srowen
The SparkContext, by default, runs a cleaner in the background to release
unreferenced RDDs/broadcasts. But, I think, we'd better release
them ourselves becaus
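A sketch of the explicit release being argued for, as opposed to waiting for
the background ContextCleaner (the computation itself is illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    def sumSquaredDistances(sc: SparkContext,
                            data: RDD[Array[Double]],
                            center: Array[Double]): Double = {
      val bc = sc.broadcast(center)
      val result = data.map { v =>
        v.zip(bc.value).map { case (x, c) => (x - c) * (x - c) }.sum
      }.sum()
      bc.destroy()   // release driver and executor copies as soon as the job is done
      result
    }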
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14335
@srowen
`stats.unpersist(false)` ==> `stats.unpersist()` updated.
Is there anything else that needs updating?
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14333
@srowen
I checked the places where `RDD.persist` is referenced:
AFTSurvivalRegression, LinearRegression, and LogisticRegression will persist
the input training RDD and unpersist it when `train` return
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14333
@srowen yeah, the code logic here seems confusing, but I think it is right.
Now I can explain it in a clear way;
in essence, the logic can be expressed as follows:
A0->I1->
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14333
@srowen
I checked the code again; the problem I mentioned above:
`But now I found another problem in BisectingKMeans:
in line 191 there is an iteration which also needs this pattern "persist
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14440
[SPARK-16835][ML] add training data `unpersist` handling when an exception
is thrown
## What changes were
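A sketch of the handling the title describes (the helper is hypothetical):
cache the training data only if the caller has not already done so, and
release it in a finally block so an exception during training does not leak
the cache.

    import org.apache.spark.rdd.RDD
    import org.apache.spark.storage.StorageLevel

    def withCachedInput[T, R](data: RDD[T])(train: RDD[T] => R): R = {
      val handlePersistence = data.getStorageLevel == StorageLevel.NONE
      if (handlePersistence) data.persist(StorageLevel.MEMORY_AND_DISK)
      try {
        train(data)
      } finally {
        if (handlePersistence) data.unpersist()
      }
    }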
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/14440
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14440
sounds reasonable...
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/14483
[SPARK-16880][ML][MLLib] make ANN training data persisted if needed
## What changes were proposed in this pull request?
Make sure the ANN layer input training data is persisted
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14483
@srowen yeah, the other algorithms using LBFGS all have this pattern; only
ANN is missing it.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14156
cc @srowen thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14156
@srowen
The := operator on a BDM simply copies one BDM to another, and it is widely
used in the Breeze source; e.g., we can check the DenseMatrix.copy function
in Breeze: it first use
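The operator under discussion, in isolation (a minimal sketch):

    import breeze.linalg.DenseMatrix

    val a = DenseMatrix.zeros[Double](3, 3)
    val b = DenseMatrix.rand(3, 3)
    a := b    // in-place copy of b's contents into a; no new matrix is allocated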
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/14156
yeah, currently it seems to add a little overhead (doing a copy), but I think
it will take advantage of Breeze optimizations in the future, e.g., SIMD
instructions or something?