[GitHub] spark pull request: [SPARK-5565] [ML] LDA wrapper for Pipelines AP...

2015-11-06 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9513#discussion_r44202038 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala --- @@ -0,0 +1,740 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/9229#issuecomment-150337626 Made a pass --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42797041 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -532,8 +427,13 @@ private[ml] object FeedForwardTopology { for(i <

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42796294 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -466,6 +352,15 @@ private[ann] trait Topology extends Serializable

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42796199 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -226,44 +236,21 @@ private[ann] trait ActivationFunction extends Serializable

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42795707 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -546,65 +446,83 @@ private[ml] object FeedForwardTopology { * Model of

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42795626 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -546,65 +446,83 @@ private[ml] object FeedForwardTopology { * Model of

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42795580 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -546,65 +446,83 @@ private[ml] object FeedForwardTopology { * Model of

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42794818 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/LossFunction.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42794479 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -208,14 +214,18 @@ private[ann] object AffineLayerModel { * Generate

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42794044 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -111,32 +126,26 @@ private[ann] class AffineLayer(val numIn: Int, val numOut

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42793556 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -32,20 +32,33 @@ import org.apache.spark.util.random.XORShiftRandom

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42793154 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/LossFunction.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-11262][ML] Unit test for gradient, loss...

2015-10-22 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9229#discussion_r42792858 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -151,42 +160,39 @@ private[ann] object AffineLayerModel { * Creates a

[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-10-17 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8648#issuecomment-148893887 @mengxr added benchmarks, can you make another pass when you have a chance

[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-10-10 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8648#discussion_r41700924 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -260,127 +263,126 @@ private[ann] trait ActivationFunction extends

[GitHub] spark pull request: [SPARK-6723] [MLLIB] Model import/export for C...

2015-10-07 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/6785#issuecomment-146163905 LGTM overall, some nits, ping @mengxr to trigger tests

[GitHub] spark pull request: [SPARK-6723] [MLLIB] Model import/export for C...

2015-10-07 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/6785#discussion_r41379676 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -102,6 +110,66 @@ class ChiSqSelectorModel @Since("

[GitHub] spark pull request: [SPARK-6723] [MLLIB] Model import/export for C...

2015-10-07 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/6785#discussion_r41379557 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -102,6 +110,66 @@ class ChiSqSelectorModel @Since("

[GitHub] spark pull request: [SPARK-6723] [MLLIB] Model import/export for C...

2015-10-07 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/6785#discussion_r41379495 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -102,6 +110,66 @@ class ChiSqSelectorModel @Since("

[GitHub] spark pull request: [SPARK-6723] [MLLIB] Model import/export for C...

2015-10-07 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/6785#discussion_r41379475 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -102,6 +110,66 @@ class ChiSqSelectorModel @Since("

[GitHub] spark pull request: [SPARK-6723] [MLLIB] Model import/export for C...

2015-10-07 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/6785#discussion_r41379404 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -19,11 +19,19 @@ package org.apache.spark.mllib.feature

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897444 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTestMethod.scala --- @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897481 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/TestResult.scala --- @@ -115,3 +115,25 @@ class KolmogorovSmirnovTestResult private[stat

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897325 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTestMethod.scala --- @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897318 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTestMethod.scala --- @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897297 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTestMethod.scala --- @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897117 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897178 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTestMethod.scala --- @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897181 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTestMethod.scala --- @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39897100 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896998 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896849 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896729 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896683 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896449 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896444 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896437 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896288 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896192 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896160 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3147][MLLib][Streaming] Streaming 2-sam...

2015-09-18 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/4716#discussion_r39896148 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-15 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8752#discussion_r39559676 --- Diff: docs/mllib-feature-extraction.md --- @@ -506,7 +523,7 @@ v_N [`ElementwiseProduct`](api/scala/index.html

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-15 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8752#issuecomment-140526380 LGTM after changes

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-15 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8752#discussion_r39559560 --- Diff: docs/ml-features.md --- @@ -123,12 +123,21 @@ for features_label in rescaledData.select("features", "label").take(3):

[GitHub] spark pull request: [SPARK-9715][ML] Store numFeatures in all ML P...

2015-09-15 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8675#issuecomment-140493169 That's interesting, I don't know why `GBTRegressionModel` has a public constructor (IMO it shouldn't since users should not be directly instantia

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-15 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8534#issuecomment-140449400 Hmm I actually don't know the answer to that, maybe @mengxr can help

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8752#discussion_r39461788 --- Diff: docs/mllib-feature-extraction.md --- @@ -486,7 +492,8 @@ sc.stop(); ## ElementwiseProduct -ElementwiseProduct multiplies

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8752#discussion_r39461756 --- Diff: docs/mllib-feature-extraction.md --- @@ -486,7 +492,8 @@ sc.stop(); ## ElementwiseProduct -ElementwiseProduct multiplies

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8752#discussion_r39461748 --- Diff: docs/mllib-feature-extraction.md --- @@ -380,35 +380,37 @@ data2 = labels.zip(normalizer2.transform(features

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8752#discussion_r39461708 --- Diff: docs/mllib-feature-extraction.md --- @@ -380,35 +380,37 @@ data2 = labels.zip(normalizer2.transform(features

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8752#discussion_r39461620 --- Diff: docs/mllib-clustering.md --- @@ -507,6 +507,10 @@ must also be $> 1.0$. Providing `Vector(-1)` results in default behavior $&g

[GitHub] spark pull request: [SPARK-10595] [ML] [MLLIB] [DOCS] Various ML g...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8752#discussion_r39461489 --- Diff: docs/ml-guide.md --- @@ -32,7 +32,18 @@ See the [algorithm guides](#algorithm-guides) section below for guides on sub-pa * This will

[GitHub] spark pull request: [SPARK-9715][ML] Store numFeatures in all ML P...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8675#issuecomment-140235205 This introduces breaking changes by modifying public constructor APIs; is there any way we can refactor this to use a trait mixin to avoid the duplication and

[GitHub] spark pull request: [SPARK-10393] use ML pipeline in LDA example

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8551#discussion_r39460900 --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala --- @@ -186,121 +186,52 @@ object LDAExample { * Load

[GitHub] spark pull request: [SPARK-10393] use ML pipeline in LDA example

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8551#discussion_r39460837 --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala --- @@ -186,121 +186,52 @@ object LDAExample { * Load

[GitHub] spark pull request: [SPARK-10393] use ML pipeline in LDA example

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8551#discussion_r39460880 --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala --- @@ -186,121 +186,52 @@ object LDAExample { * Load

[GitHub] spark pull request: [SPARK-10393] use ML pipeline in LDA example

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8551#discussion_r39460759 --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala --- @@ -186,121 +186,52 @@ object LDAExample { * Load

[GitHub] spark pull request: [SPARK-10393] use ML pipeline in LDA example

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8551#discussion_r39460809 --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala --- @@ -186,121 +186,52 @@ object LDAExample { * Load

[GitHub] spark pull request: [SPARK-9962][ML] Decision Tree training: prevN...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8541#issuecomment-140234135 Changes LGTM (renaming public method but the class is package private), but why do we need this (CC @jkbradley)? `prevNodeIdsForInstances` is never persisted in

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-14 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8534#issuecomment-140182675 I believe the `copy` overrides were added by [this commit](https://github.com/taishi-oss/spark/commit/43c7ec6384e51105dedf3a53354b6a3732cc27b2) which went in to

[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-11 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8648#issuecomment-139633496 @avulanov The benchmarking code is written against a WIP implementation; I sent you a PR for bringing it up to date. LBFGS is taking significantly long

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-10 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8534#issuecomment-139311013 Not all methods in a file were introduced in the same version (e.g. see my comment about DecisionTreeClassifier.copy). Can you make sure that the annotation

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-10 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8534#discussion_r39184933 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -90,6 +92,9 @@ final class DecisionTreeClassifier

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-10 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8534#issuecomment-139308325 @taishi-oss no worries, thank you for your help! reviewing now

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-10 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8534#discussion_r39184277 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -87,10 +98,12 @@ final class

[GitHub] spark pull request: [SPARK-9834][MLLIB] implement weighted least s...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8588#issuecomment-138673319 jenkins retest this please

[GitHub] spark pull request: [SPARK-9834][MLLIB] implement weighted least s...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8588#issuecomment-138673301 LGTM, did not check low level implementation

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8534#issuecomment-138616552 I only pointed out the issues in the first few files, do you mind fixing them in all the files?

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8534#discussion_r38945344 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -35,6 +35,8 @@ import

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8534#discussion_r38945223 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -90,6 +92,9 @@ final class DecisionTreeClassifier

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8534#discussion_r38945200 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -90,6 +92,9 @@ final class DecisionTreeClassifier

[GitHub] spark pull request: [SPARK-10259] [ML] Add @since annotation to ml...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8534#discussion_r38944990 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -35,6 +35,8 @@ import

[GitHub] spark pull request: [SPARK-10480] [ML] Fix ML.LinearRegressionMode...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8641#issuecomment-138613591 LGTM

[GitHub] spark pull request: [SPARK-10480] [ML] Fix ML.LinearRegressionMode...

2015-09-08 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8641#issuecomment-138613226 Please add [SPARK-10479] to PR title

[GitHub] spark pull request: [Minor] [ML] Fix ML.LinearRegressionModel.copy...

2015-09-07 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8641#issuecomment-138383247 Yep LGTM, just noticed that LogisticRegression doesn't copy the summary object, SPARK-10479 will track that work

[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread feynmanliang
GitHub user feynmanliang opened a pull request: https://github.com/apache/spark/pull/8648 [SPARK-10478][ML] Performance, organization, and style improvements for multi-layer perceptron * Changes manual iterations into higher-performance `UFunc`s, vectorized operations, and

[GitHub] spark pull request: [SPARK-7128][ML] Bagging (bootstrap aggregatin...

2015-09-05 Thread feynmanliang
GitHub user feynmanliang opened a pull request: https://github.com/apache/spark/pull/8618 [SPARK-7128][ML] Bagging (bootstrap aggregating) ensemble method * Categorical is ~7x slower, suggestions on speeding up the SQL are appreciated You can merge this pull request into a Git

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8022#issuecomment-137896891 LGTM after these changes and pending tests CC @mengxr @freeman-lab

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38806675 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionWithSGD.scala --- @@ -107,4 +113,16 @@ class

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38806669 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala --- @@ -59,11 +76,14 @@ import

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38806663 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38806665 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38806661 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38806648 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38806603 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionWithSGD.scala --- @@ -101,4 +107,14 @@ class

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8022#issuecomment-137896575 Streaming KMeans uses `decayFactor` and I think it's important we maintain consistency

[GitHub] spark pull request: [SPARK-9834][MLLIB] implement weighted least s...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8588#discussion_r38775271 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -0,0 +1,295 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-9834][MLLIB] implement weighted least s...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8588#discussion_r38774976 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -0,0 +1,295 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-9834][MLLIB] implement weighted least s...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8588#discussion_r38774759 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -0,0 +1,295 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-9834][MLLIB] implement weighted least s...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8588#discussion_r38773970 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -0,0 +1,295 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-9834][MLLIB] implement weighted least s...

2015-09-04 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8588#discussion_r38773848 --- Diff: mllib/src/test/scala/org/apache/spark/ml/optim/WeightedLeastSquaresSuite.scala --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-03 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/8022#issuecomment-137534276 Made another pass

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-03 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38678985 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala --- @@ -194,4 +196,204 @@ class

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-03 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38676398 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala --- @@ -17,6 +17,8 @@ package

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-03 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38676377 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionWithSGD.scala --- @@ -107,4 +113,16 @@ class

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-03 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38676215 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala --- @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-03 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38675979 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala --- @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-03 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38675249 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala --- @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-09-03 Thread feynmanliang
Github user feynmanliang commented on a diff in the pull request: https://github.com/apache/spark/pull/8022#discussion_r38674682 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala --- @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache
