[GitHub] spark pull request: [SPARK-7780][MLLIB] intercept in logisticregre...

2016-01-26 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/10788#issuecomment-175341628 Thanks. Merged into master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark pull request: [SPARK-10780] [ML] Set initialModel in KMeans ...

2016-01-26 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/8972#issuecomment-175342081 @jayantshekhar Still working on it?

[GitHub] spark pull request: [SPARK-10780] [ML] Set initialModel in KMeans ...

2016-01-26 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/8972#issuecomment-175342352 We add a private API in LOR to do the same thing, and would like to generalize it for other algorithms. +cc @holdenk

[GitHub] spark pull request: [SPARK-13372] [ML] Fix LogisticRegression when...

2016-02-18 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11247#issuecomment-185826603 @yanboliang In #7080, it was intentionally made so that `standardization = false` will run the same route as `standardization = true` without regularization

[GitHub] spark pull request: [SPARK-13379] [MLlib] Fix MLlib LogisticRegres...

2016-02-18 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11258#issuecomment-186088729 +1 on copying the tests from ML LOR tests.

[GitHub] spark pull request: [SPARK-13379] [MLlib] Fix MLlib LogisticRegres...

2016-02-21 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11258#issuecomment-186994891 The default value in R's GLMNET is `1E-7`, and the default value in the original LBFGS implementation is `1E-8`. In order to provide better and more consistent results, let's

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-15 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11242#issuecomment-197052153 Jenkins, test this please.

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-18 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56432616 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,21 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11610#discussion_r55964768 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11610#discussion_r55964808 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r55966718 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -30,11 +33,13 @@ class KMeansSuite extends SparkFunSuite

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11610#discussion_r55963818 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11610#discussion_r55964546 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11610#discussion_r55963496 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11610#discussion_r55964211 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11610#discussion_r55964826 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11610#discussion_r55964451 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r55965910 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -118,6 +138,11 @@ object KMeansSuite { sql.createDataFrame

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r55965857 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -118,6 +138,11 @@ object KMeansSuite { sql.createDataFrame

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-03-14 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r55967102 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -108,6 +113,21 @@ class KMeansSuite extends SparkFunSuite

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-19 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56747153 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-19 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56747209 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-16 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11610#issuecomment-197213572 I'm not an expert in this area, but after thinking about it more, I don't think we can use `DGELSD`, which minimizes `||b - A*x||` using the singular value decomposition (SVD

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-18 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11242#issuecomment-198535147 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-13777] [ML] Remove constant features fr...

2016-03-14 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11610#issuecomment-196662210 I will vote for approach 1. SVD will be the most stable algorithm, but slowest O(mn^2 + n^3) compared with Cholesky O(mn^2) or QR O(mn^2 - n^3/3) decomposition

[GitHub] spark pull request: [SPARK-13944][ML][MLLIB] add the mllib-local b...

2016-04-07 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12241#issuecomment-207122586 @JoshRosen Thanks. It's working now. @holdenk I thought each jar needs to have its own `package-info.java` to generate the Java doc and Scala doc. I'm now

[GitHub] spark pull request: [SPARK-13944][ML][MLLIB] add the mllib-local b...

2016-04-07 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12241#issuecomment-207075404 +cc @JoshRosen who may be able to give me insight on the MiMa failure caused by adding a new jar.

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-07 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11242#issuecomment-207063759 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-13944][ML][MLLIB] add the mllib-local b...

2016-04-07 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12241#issuecomment-207069702 +cc @mengxr @jkbradley @srowen @holdenk @MLnick

[GitHub] spark pull request: [SPARK-13944][ML][MLLIB] add the mllib-local b...

2016-04-07 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/12241 [SPARK-13944][ML][MLLIB] add the mllib-local build to maven pom ## What changes were proposed in this pull request? In order to separate the linear algebra, and vector matrix classes

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-07 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11242#issuecomment-207140497 add to whitelist

[GitHub] spark pull request: [SPARK-14427][SQL] Support persisting partitio...

2016-04-06 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12204#issuecomment-206689374 +@rdblue

[GitHub] spark pull request: [SPARK-14390][GraphX] Make initialization step...

2016-04-08 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12159#issuecomment-207632470 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-14498][ML][PYTHON][SQL] Many cleanups t...

2016-04-08 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12266#issuecomment-207692381 Thanks. Merged into master.

[GitHub] spark pull request: [SPARK-14462][ML][MLLIB] add the mllib-local b...

2016-04-08 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12241#issuecomment-207632064 This is the minimal change for creating a new jar build. Let's wait for the result of Jenkins. We'll move the code in a separate PR once this is merged. Thanks

[GitHub] spark pull request: [SPARK-14462][ML][MLLIB] add the mllib-local b...

2016-04-08 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12241#issuecomment-207632164 test this please

[GitHub] spark pull request: [SPARK-14498][ML][PYTHON][SQL] Many cleanups t...

2016-04-08 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12266#issuecomment-207663956 Both look good to me. Thanks. I'll go ahead and merge it soon.

[GitHub] spark pull request: [SPARK-13944][ML][WIP] Separate out local line...

2016-04-08 Thread dbtsai
Github user dbtsai closed the pull request at: https://github.com/apache/spark/pull/12172

[GitHub] spark pull request: [SPARK-14462][ML][MLLIB] add the mllib-local b...

2016-04-08 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12241#issuecomment-207349784 test this please

[GitHub] spark pull request: [SPARK-14549][ML][WIP] Copy the Vector and Mat...

2016-04-11 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/12317 [SPARK-14549][ML][WIP] Copy the Vector and Matrix classes from mllib to ml in mllib-local ## What changes were proposed in this pull request? This task will copy the Vector and Matrix

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-11 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-208229364 + @mengxr @rxin In [SPARK-13944](https://issues.apache.org/jira/browse/SPARK-13944), the `matrix` and `vector` classes will be moved out to `spark-mllib-local

[GitHub] spark pull request: [SPARK-14462][ML][MLLIB] Add the mllib-local b...

2016-04-11 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/12298 [SPARK-14462][ML][MLLIB] Add the mllib-local build to maven pom ## What changes were proposed in this pull request? In order to separate the linear algebra, and vector matrix classes

[GitHub] spark pull request: [SPARK-14462] [HOTFIX] Let DummyTestingSuite i...

2016-04-11 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12276#issuecomment-208234927 @tedyu Thanks. I created a new PR to address this issue. https://github.com/apache/spark/pull/12298

[GitHub] spark pull request: [SPARK-14462][ML][MLLIB] add the mllib-local b...

2016-04-08 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12241#discussion_r59081171 --- Diff: core/pom.xml --- @@ -35,6 +35,11 @@ http://spark.apache.org/ + org.apache.spark --- End diff -- I

[GitHub] spark pull request: [SPARK-14462][ML][MLLIB] add the mllib-local b...

2016-04-08 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12241#discussion_r59081509 --- Diff: dev/sparktestsupport/modules.py --- @@ -256,9 +256,21 @@ def __hash__(self): ) +mllib_local = Module( +name="

[GitHub] spark pull request: [SPARK-14462][ML][MLLIB] add the mllib-local b...

2016-04-08 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12241#discussion_r59079729 --- Diff: dev/sparktestsupport/modules.py --- @@ -256,9 +256,21 @@ def __hash__(self): ) +mllib_local = Module( +name="

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-04-07 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11242#issuecomment-207184221 LGTM. This PR dramatically improves our S3 performance at Netflix. @andrewor14 @srowen @JoshRosen @davies @marmbrus @yhuai, any further feedback? Thanks

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-19 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56432618 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,21 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-19 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56719728 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: [SPARK-13927][MLLIB] add row/column iterator t...

2016-03-19 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11757#issuecomment-197553903 LGTM. Merged into master. Thanks.

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-19 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56432610 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,21 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: SPARK-9926: Parallelize partition logic in Uni...

2016-03-21 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11242#discussion_r56891305 --- Diff: core/src/main/scala/org/apache/spark/rdd/UnionRDD.scala --- @@ -62,7 +64,23 @@ class UnionRDD[T: ClassTag]( var rdds: Seq[RDD[T

[GitHub] spark pull request: [SPARK-12555][SQL] Result should not be corrup...

2016-03-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/11623#issuecomment-201383428 add to whitelist

[GitHub] spark pull request: [SPARK-13944][ML][WIP] Separate out local line...

2016-04-05 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/12172 [SPARK-13944][ML][WIP] Separate out local linear algebra as a standalone module without Spark dependency ## What changes were proposed in this pull request? Separate out linear algebra

[GitHub] spark pull request: [SPARK-14390][GraphX] Make initialization step...

2016-04-04 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12159#issuecomment-205538259 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-04-22 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-213625305 Waiting for https://github.com/apache/spark/pull/12259 to be merged.

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-04-22 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/12627 [SPARK-14615][ML][WIP] Use the new ML Vector and Matrix in the ML pipeline based algorithms ## What changes were proposed in this pull request? Once SPARK-14487 and SPARK-14549 are merged

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60811552 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/VectorUDTSuite.scala --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12593#discussion_r60996366 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/stat/distribution/MultivariateGaussian.scala --- @@ -0,0 +1,131 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12593#issuecomment-214543499 Do you want to add the test code for both `MultivariateGaussian.scala` and `Utils.scala`?

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12593#discussion_r60996174 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala --- @@ -17,17 +17,19 @@ package org.apache.spark.ml.clustering

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-214548556 @vanzin I guess using java doc `/* @since /*` style, it's harder to document the public variables in the constructor. Just my 0.02 cents.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61004014 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -415,13 +443,14 @@ object DenseMatrix

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61004682 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala --- @@ -561,10 +589,11 @@ object DenseVector { * @param indices index

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-214560261 @pravingadakh I think you need to update `dev/sparktestsupport/modules.py` for Jenkins build as well. Thanks.

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-214562802 I think `override def pyUDT: String = "pyspark.mllib.linalg.MatrixUDT"` has to be changed; otherwise, this will cause inconsistent results. Maybe this can

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61017723 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -198,6 +231,17 @@ object KMeansModel extends MLReadable[KMeansModel

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61017539 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -171,12 +192,23 @@ object KMeansModel extends MLReadable[KMeansModel

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61006720 --- Diff: project/SparkBuild.scala --- @@ -53,7 +53,7 @@ object BuildCommons { core, graphx, mllib, mllibLocal, repl, networkCommon, networkShuffle

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61003884 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -154,11 +172,12 @@ sealed trait Matrix extends Serializable

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61017075 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -137,6 +138,17 @@ class KMeansModel private[ml] ( @Since("

[GitHub] spark pull request: [SPARK-14734][ML][MLLIB] Added toNew, fromNew ...

2016-04-20 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12504#discussion_r60489912 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala --- @@ -158,6 +159,13 @@ sealed trait Matrix extends Serializable

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-22 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-213476037 Ping @pravingadakh for update. Thanks.

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60757277 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/MatrixUDT.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60776248 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/MatrixUDTSuite.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60775018 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/MatrixUDT.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60775523 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/MatrixUDT.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60775630 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/VectorUDT.scala --- @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60776001 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/MatrixUDTSuite.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-26 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12593#issuecomment-214923459 LGTM. Merged into master. Thanks.

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-26 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r61183805 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/MatrixUDTSuite.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-29 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215829721 @pravingadakh The changes in `modules.py` are just helping Jenkins to understand the dependencies, and I don't get why this will break the build

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-04-29 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-215808336 @mengxr working on this now. Thanks.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573656 Jenkins, please test it again.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573882 test this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573910 Jenkins, test this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573982 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215561453 You may use some fake data to demonstrate how this PR improves. Thanks.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215564655 Seems very promising. Since the 2.0 window will be closed soon, it's unlikely to get into 2.0. Let's target 2.1.

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-28 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r61473084 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/UDTRegistration.scala --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215587478 Thanks. Merged into master.

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12593#issuecomment-215588256 @jkbradley The `@Since` annotation was merged in https://github.com/apache/spark/pull/12416. Could you submit a follow-up PR? Thanks.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215539228 You need to manually add it to the MiMa excludes.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215539469 Make it build first, and then we can start to review the code. Thanks.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215545617 @pravingadakh Yes. Thanks.

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61018069 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -258,6 +290,27 @@ class KMeans @Since("1.5.0") ( @Si

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61018493 --- Diff: mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedGeneralTypeParams.scala --- @@ -0,0 +1,34 @@ +/* --- End diff -- I

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61018699 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -274,6 +339,12 @@ class KMeans @Since("

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-26 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61139825 --- Diff: dev/run-tests.py --- @@ -128,7 +128,7 @@ def determine_tags_to_exclude(changed_modules): tags = [] for m in modules.all_modules

[GitHub] spark pull request: [SPARK-14734][ML][MLLIB] Added asML, fromML me...

2016-04-21 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12504#issuecomment-213162793 Thanks. Merged into master.
