Re: news20-binary classification with LogisticRegressionWithSGD

2014-06-17 Thread Xiangrui Meng
Hi Makoto, Are you using Spark 1.0 or 0.9? Could you go to the executor tab of the web UI and check the driver's memory? treeAggregate is not part of 1.0. Best, Xiangrui On Tue, Jun 17, 2014 at 2:00 PM, Xiangrui Meng men...@gmail.com wrote: Hi DB, treeReduce (treeAggregate) is a feature I'm

Re: news20-binary classification with LogisticRegressionWithSGD

2014-06-17 Thread Xiangrui Meng
--- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai On Tue, Jun 17, 2014 at 2:00 PM, Xiangrui Meng men...@gmail.com wrote: Hi DB, treeReduce (treeAggregate) is a feature I'm testing now. It is a compromise between current reduce and butterfly

Re: news20-binary classification with LogisticRegressionWithSGD

2014-06-17 Thread Xiangrui Meng
Makoto, please use --driver-memory 8G when you launch spark-shell. -Xiangrui On Tue, Jun 17, 2014 at 4:49 PM, Xiangrui Meng men...@gmail.com wrote: DB, Yes, reduce and aggregate are linear. Makoto, dense vectors are used in aggregation. If you have 32 partitions and each one sending

Re: MLlib-Missing Regularization Parameter and Intercept for Logistic Regression

2014-06-16 Thread Xiangrui Meng
as in the example you mentioned, but the source code reveals that the intercept is also penalized if one is included, which is usually inappropriate. The developer should fix this problem. Best, Congrui -Original Message- From: Xiangrui Meng [mailto:men...@gmail.com] Sent: Friday, June 13, 2014

Re: MLlib-Missing Regularization Parameter and Intercept for Logistic Regression

2014-06-14 Thread Xiangrui Meng
1. examples/src/main/scala/org/apache/spark/examples/mllib/BinaryClassification.scala contains example code that shows how to set regParam. 2. A static method with more than 3 parameters becomes hard to remember and hard to maintain. Please use LogisticRegressionWithSGD's default constructor

Re: Convert text into tfidf vectors for Classification

2014-06-13 Thread Xiangrui Meng
You can create tf vectors and then use RowMatrix.computeColumnSummaryStatistics to get df (numNonzeros). For tokenizer and stemmer, you can use scalanlp/chalk. Yes, it is worth having a simple interface for it. -Xiangrui On Fri, Jun 13, 2014 at 1:21 AM, Stuti Awasthi stutiawas...@hcl.com wrote:
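A minimal sketch of the approach above, assuming `tf` is an RDD[Vector] of term-frequency vectors you have already built; the helper name and the smoothed idf formula are illustrative, not from the original message:

  import org.apache.spark.mllib.linalg.Vector
  import org.apache.spark.mllib.linalg.distributed.RowMatrix
  import org.apache.spark.rdd.RDD

  def idfWeights(tf: RDD[Vector]): Array[Double] = {
    val mat = new RowMatrix(tf)
    // numNonzeros counts, per column (term), how many documents contain it: the document frequency.
    val df = mat.computeColumnSummaryStatistics().numNonzeros.toArray
    val numDocs = mat.numRows().toDouble
    // A common smoothed idf; multiply each tf vector by these weights to obtain tf-idf.
    df.map(d => math.log((numDocs + 1.0) / (d + 1.0)))
  }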

Re: Not fully cached when there is enough memory

2014-06-11 Thread Xiangrui Meng
Could you try to click on that RDD and see the storage info per partition? I tried continuously caching RDDs, so new ones kick old ones out when there is not enough memory. I saw similar glitches but the storage info per partition is correct. If you find a way to reproduce this error, please

Re: How to process multiple classification with SVM in MLlib

2014-06-09 Thread Xiangrui Meng
For broadcast data, please read http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables . For one-vs-all, please read https://en.wikipedia.org/wiki/Multiclass_classification . -Xiangrui On Mon, Jun 9, 2014 at 7:24 AM, littlebird cxp...@163.com wrote: Thank you for your

Re: Classpath errors with Breeze

2014-06-08 Thread Xiangrui Meng
Hi dlaw, You are using breeze-0.8.1, but the spark assembly jar depends on breeze-0.7. If the spark assembly jar comes first on the classpath but the method from DenseMatrix is only available in breeze-0.8.1, you get a NoSuchMethodError. So, a) If you don't need the features in breeze-0.8.1, do not

Re: Classpath errors with Breeze

2014-06-08 Thread Xiangrui Meng
Hi Tobias, Which file system and which encryption are you using? Best, Xiangrui On Sun, Jun 8, 2014 at 10:16 PM, Xiangrui Meng men...@gmail.com wrote: Hi dlaw, You are using breeze-0.8.1, but the spark assembly jar depends on breeze-0.7. If the spark assembly jar comes first

Re: How to process multiple classification with SVM in MLlib

2014-06-07 Thread Xiangrui Meng
At this time, you need to do one-vs-all manually for multiclass training. For your second question, if the algorithm is implemented in Java/Scala/Python and designed for single machine, you can broadcast the dataset to each worker, train models on workers. If the algorithm is implemented in a
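A hedged sketch of the manual one-vs-all training described above, using SVMWithSGD; the helper name and the relabeling scheme are assumptions, and labels are assumed to be 0, 1, ..., numClasses - 1:

  import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
  import org.apache.spark.mllib.regression.LabeledPoint
  import org.apache.spark.rdd.RDD

  // Train one binary SVM per class by relabeling the current class as 1.0 and all others as 0.0.
  def trainOneVsAll(data: RDD[LabeledPoint], numClasses: Int, numIterations: Int): Array[SVMModel] =
    (0 until numClasses).map { c =>
      val binary = data.map(p => LabeledPoint(if (p.label == c.toDouble) 1.0 else 0.0, p.features)).cache()
      SVMWithSGD.train(binary, numIterations)
    }.toArray

At prediction time, score each point with every model and pick the class whose model gives the largest raw margin (clearThreshold() makes a model return the margin instead of the 0/1 label).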

Re: Native library can not be loaded when using Mllib PCA

2014-06-05 Thread Xiangrui Meng
For standalone and yarn mode, you need to install native libraries on all nodes. The best solution is installing them to /usr/lib/libblas.so.3 and /usr/lib/liblapack.so.3 . If your matrix is sparse, the native libraries cannot help because they are for dense linear algebra. You can create RDD

Re: IllegalArgumentException on calling KMeans.train()

2014-06-04 Thread Xiangrui Meng
Could you check whether the vectors have the same size? -Xiangrui On Wed, Jun 4, 2014 at 1:43 AM, bluejoe2008 bluejoe2...@gmail.com wrote: what does this exception mean? 14/06/04 16:35:15 ERROR executor.Executor: Exception in task ID 6 java.lang.IllegalArgumentException: requirement failed
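A quick sanity check along these lines, assuming `data` is the RDD[Vector] passed to KMeans.train (the name is illustrative):

  // All feature vectors must have the same dimension; expect exactly one distinct size here.
  val sizes = data.map(_.size).distinct().collect()
  println(sizes.mkString(", "))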

Re: Logistic Regression MLLib Slow

2014-06-04 Thread Xiangrui Meng
80M by 4 should be about 2.5GB uncompressed. 10 iterations shouldn't take that long, even on a single executor. Besides what Matei suggested, could you also verify the executor memory in http://localhost:4040 in the Executors tab. It is very likely the executors do not have enough memory. In that

Re: Logistic Regression MLLib Slow

2014-06-04 Thread Xiangrui Meng
Hi Krishna, Specifying executor memory in local mode has no effect, because all of the threads run inside the same JVM. You can either try --driver-memory 60g or start a standalone server. Best, Xiangrui On Wed, Jun 4, 2014 at 7:28 PM, Xiangrui Meng men...@gmail.com wrote: 80M by 4 should

Re: Using String Dataset for Logistic Regression

2014-06-03 Thread Xiangrui Meng
Yes. MLlib 1.0 supports sparse input data for linear methods. -Xiangrui On Mon, Jun 2, 2014 at 11:36 PM, praveshjain1991 praveshjain1...@gmail.com wrote: I am not sure. I have just been using some numerical datasets. -- View this message in context:
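A minimal sketch of sparse input to a linear method in MLlib 1.0, runnable in spark-shell where sc is predefined; the feature values are placeholders:

  import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
  import org.apache.spark.mllib.linalg.Vectors
  import org.apache.spark.mllib.regression.LabeledPoint

  // A labeled point with 10 features, of which only indices 1 and 4 are non-zero.
  val p = LabeledPoint(1.0, Vectors.sparse(10, Array(1, 4), Array(3.0, 0.5)))
  val training = sc.parallelize(Seq(p))
  val model = LogisticRegressionWithSGD.train(training, 20)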

Re: Using MLLib in Scala

2014-06-03 Thread Xiangrui Meng
Hi Suela, (Please subscribe to our user mailing list and send your questions there in the future.) For your case, each file contains a column of numbers. So you can use `sc.textFile` to read them first, zip them together, and then create labeled points: val xx = sc.textFile(/path/to/ex2x.dat).map(x
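A hedged sketch of the zip-then-label approach; the file paths follow the exercise files named in the thread, and the one-value-per-line layout is an assumption:

  import org.apache.spark.mllib.linalg.Vectors
  import org.apache.spark.mllib.regression.LabeledPoint

  val xx = sc.textFile("/path/to/ex2x.dat").map(_.trim.toDouble)   // features, one value per line
  val yy = sc.textFile("/path/to/ex2y.dat").map(_.trim.toDouble)   // labels, one value per line
  // zip assumes both RDDs have the same number of elements per partition, which holds
  // when the two files have the same number of lines and are read the same way.
  val points = xx.zip(yy).map { case (x, y) => LabeledPoint(y, Vectors.dense(x)) }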

Re: How to stop a running SparkContext in the proper way?

2014-06-03 Thread Xiangrui Meng
Did you try sc.stop()? On Tue, Jun 3, 2014 at 9:54 PM, MEETHU MATHEW meethu2...@yahoo.co.in wrote: Hi, I want to know how I can stop a running SparkContext in a proper way so that next time when I start a new SparkContext, the web UI can be launched on the same port 4040. Now when I quit the

Re: pyspark MLlib examples don't work with Spark 1.0.0

2014-05-31 Thread Xiangrui Meng
The documentation you looked at is not official, though it is from @pwendell's website. It was for the Spark SQL release. Please find the official documentation here: http://spark.apache.org/docs/latest/mllib-linear-methods.html#linear-support-vector-machine-svm It contains a working example

Re: Create/shutdown objects before/after RDD use (or: Non-serializable classes)

2014-05-31 Thread Xiangrui Meng
Hi Tobias, One hack you can try is: rdd.mapPartitions(iter => { val x = new X(); iter.map(row => x.doSomethingWith(row)) ++ { x.shutdown(); Iterator.empty } }) Best, Xiangrui On Thu, May 29, 2014 at 11:38 PM, Tobias Pfeiffer t...@preferred.jp wrote: Hi, I want to use an object x in my RDD
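The same hack, formatted and made self-contained for spark-shell; the class X below is only an illustrative stand-in for the non-serializable resource in the original question:

  // Stand-in resource: constructed on the worker, so it never needs to be serialized.
  class X {
    def doSomethingWith(row: String): String = row.toUpperCase
    def shutdown(): Unit = println("shutting down")
  }

  val rdd = sc.parallelize(Seq("a", "b", "c"), 2)
  val out = rdd.mapPartitions { iter =>
    val x = new X()                            // created once per partition
    iter.map(row => x.doSomethingWith(row)) ++ {
      x.shutdown()                             // Iterator.++ is lazy, so this runs after the last row is consumed
      Iterator.empty
    }
  }
  out.collect().foreach(println)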

Re: Error while launching ec2 spark cluster with HVM (r3.large)

2014-05-22 Thread Xiangrui Meng
Was the error message the same as you posted when you used `root` as the user id? Could you try this: 1) Do not specify user id. (Default would be `root`.) 2) If it fails in the middle, try `spark-ec2 --resume launch cluster` to continue launching the cluster. Best, Xiangrui On Thu, May

Re: Job Processing Large Data Set Got Stuck

2014-05-21 Thread Xiangrui Meng
Many OutOfMemoryErrors in the log. Is your data distributed evenly? -Xiangrui On Wed, May 21, 2014 at 11:23 AM, yxzhao yxz...@ualr.edu wrote: I run the pagerank example processing a large data set, 5GB in size, using 48 machines. The job got stuck at the time point: 14/05/20 21:32:17, as the

Re: Job Processing Large Data Set Got Stuck

2014-05-21 Thread Xiangrui Meng
If the RDD is cached, you can check its storage information in the Storage tab of the Web UI. On Wed, May 21, 2014 at 12:31 PM, yxzhao yxz...@ualr.edu wrote: Thanks Xiangrui, How to check and make sure the data is distributed evenly? Thanks again. On Wed, May 21, 2014 at 2:17 PM, Xiangrui Meng

Re: Inconsistent RDD Sample size

2014-05-21 Thread Xiangrui Meng
It doesn't guarantee the exact sample size. If you fix the random seed, it would return the same result every time. -Xiangrui On Wed, May 21, 2014 at 2:05 PM, glxc r.ryan.mcc...@gmail.com wrote: I have a graph and am trying to take a random sample of vertices without replacement, using the
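For comparison, a small sketch of both calls, assuming `vertices` is the RDD being sampled (fractions, counts and the seed are arbitrary):

  // sample() only approximates the requested fraction; a fixed seed makes the result
  // reproducible, but still not an exact count.
  val approx = vertices.sample(false, 0.1, 42)
  // takeSample() returns exactly `num` elements, collected to the driver as an Array.
  val exact = vertices.takeSample(false, 1000, 42)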

Re: reading large XML files

2014-05-20 Thread Xiangrui Meng
Try sc.wholeTextFiles(). It reads the entire file into a string record. -Xiangrui On Tue, May 20, 2014 at 8:25 AM, Nathan Kronenfeld nkronenf...@oculusinfo.com wrote: We are trying to read some large GraphML files to use in spark. Is there an easy way to read XML-based files like this that
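A small usage sketch; the directory path is a placeholder:

  // Each record is (file path, entire file contents), so a multi-line GraphML/XML
  // document stays together instead of being split line by line.
  val files = sc.wholeTextFiles("/path/to/graphml-dir")
  val sizes = files.map { case (path, content) => (path, content.length) }
  sizes.collect().foreach(println)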

Re: breeze DGEMM slow in spark

2014-05-17 Thread Xiangrui Meng
You need to include breeze-natives or netlib:all to load the native libraries. Check the log messages to ensure native libraries are used, especially on the worker nodes. The easiest way to use OpenBLAS is copying the shared library to /usr/lib/libblas.so.3 and /usr/lib/liblapack.so.3. -Xiangrui

Re: spark on yarn-standalone, throws StackOverflowError and fails somtimes and succeed for the rest

2014-05-16 Thread Xiangrui Meng
Could you try `println(result.toDebugString())` right after `val result = ...` and attach the result? -Xiangrui On Fri, May 9, 2014 at 8:20 AM, phoenix bai mingzhi...@gmail.com wrote: after a couple of tests, I find that, if I use: val result = model.predict(prdctpairs) result.map(x =

Re: How to run the SVM and LogisticRegression

2014-05-16 Thread Xiangrui Meng
If you check out the master branch, there are some examples that can be used as templates under examples/src/main/scala/org/apache/spark/examples/mllib Best, Xiangrui On Wed, May 14, 2014 at 1:36 PM, yxzhao yxz...@ualr.edu wrote: Hello, I found the classfication algorithms SVM and

Re: Reading from .bz2 files with Spark

2014-05-16 Thread Xiangrui Meng
Hi Andrew, Could you try varying the minPartitions parameter? For example: val r = sc.textFile(/user/aa/myfile.bz2, 4).count val r = sc.textFile(/user/aa/myfile.bz2, 8).count Best, Xiangrui On Tue, May 13, 2014 at 9:08 AM, Xiangrui Meng men...@gmail.com wrote: Which hadoop version did you use

Re: Reading from .bz2 files with Spark

2014-05-16 Thread Xiangrui Meng
On Thu, May 15, 2014 at 3:48 PM, Xiangrui Meng men...@gmail.com wrote: Hi Andrew, Could you try varying the minPartitions parameter? For example: val r = sc.textFile(/user/aa/myfile.bz2, 4).count val r = sc.textFile(/user/aa/myfile.bz2, 8).count Best, Xiangrui On Tue, May 13, 2014 at 9:08

Re: Reading from .bz2 files with Spark

2014-05-16 Thread Xiangrui Meng
Hi Andrew, This is the JIRA I created: https://issues.apache.org/jira/browse/MAPREDUCE-5893 . Hopefully someone wants to work on it. Best, Xiangrui On Fri, May 16, 2014 at 6:47 PM, Xiangrui Meng men...@gmail.com wrote: Hi Andrew, I could reproduce the bug with Hadoop 2.2.0. Some older version

Re: Reading from .bz2 files with Spark

2014-05-16 Thread Xiangrui Meng
, Xiangrui Meng men...@gmail.com wrote: Which hadoop version did you use? I'm not sure whether Hadoop v2 fixes the problem you described, but it does contain several fixes to bzip2 format. -Xiangrui On Wed, May 7, 2014 at 9:19 PM, Andrew Ash and...@andrewash.com wrote: Hi all, Is anyone

Re: Distribute jar dependencies via sc.AddJar(fileName)

2014-05-14 Thread Xiangrui Meng
I don't know whether this would fix the problem. In v0.9, you need `yarn-standalone` instead of `yarn-cluster`. See https://github.com/apache/spark/commit/328c73d037c17440c2a91a6c88b4258fbefa0c08 On Tue, May 13, 2014 at 11:36 PM, Xiangrui Meng men...@gmail.com wrote: Does v0.9 support yarn

Re: Accuracy in mllib BinaryClassificationMetrics

2014-05-13 Thread Xiangrui Meng
Hi Deb, feel free to add accuracy along with precision and recall. -Xiangrui On Mon, May 12, 2014 at 1:26 PM, Debasish Das debasish.da...@gmail.com wrote: Hi, I see precision and recall but no accuracy in mllib.evaluation.binary. Is it already under development or does it need to be added?
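A hedged sketch of computing accuracy directly from the same (prediction, label) pairs that BinaryClassificationMetrics consumes; this helper is illustrative, not an MLlib API:

  import org.apache.spark.rdd.RDD

  def accuracy(predictionAndLabels: RDD[(Double, Double)]): Double = {
    val correct = predictionAndLabels.filter { case (prediction, label) => prediction == label }.count()
    correct.toDouble / predictionAndLabels.count()
  }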

Re: Reading from .bz2 files with Spark

2014-05-13 Thread Xiangrui Meng
Which hadoop version did you use? I'm not sure whether Hadoop v2 fixes the problem you described, but it does contain several fixes to bzip2 format. -Xiangrui On Wed, May 7, 2014 at 9:19 PM, Andrew Ash and...@andrewash.com wrote: Hi all, Is anyone reading and writing to .bz2 files stored in

Re: Turn BLAS on MacOSX

2014-05-12 Thread Xiangrui Meng
Those are warning messages instead of errors. You need to add netlib-java:all to use native BLAS/LAPACK. But it won't work if you include netlib-java:all in an assembly jar. It has to be a separate jar when you submit your job. For SGD, we only use level-1 BLAS, so I don't think native code is

Re: running SparkALS

2014-04-28 Thread Xiangrui Meng
Hi Diana, SparkALS is an example implementation of ALS. It doesn't call the ALS algorithm implemented in MLlib. M, U, and F are used to generate synthetic data. I'm updating the examples. In the meantime, you can take a look at the updated MLlib guide:
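For contrast with the synthetic SparkALS example, a hedged sketch of calling the MLlib ALS implementation; the input path, comma-separated format and hyperparameters are placeholders:

  import org.apache.spark.mllib.recommendation.{ALS, Rating}

  val ratings = sc.textFile("/path/to/ratings.csv").map { line =>
    val Array(user, product, rating) = line.split(',')
    Rating(user.toInt, product.toInt, rating.toDouble)
  }
  val model = ALS.train(ratings, 10, 10, 0.01)   // rank = 10, 10 iterations, lambda = 0.01
  println(model.predict(1, 2))                   // predicted rating of product 2 by user 1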

Re: Running out of memory Naive Bayes

2014-04-26 Thread Xiangrui Meng
How many labels does your dataset have? -Xiangrui On Sat, Apr 26, 2014 at 6:03 PM, DB Tsai dbt...@stanford.edu wrote: Which version of mllib are you using? For Spark 1.0, mllib will support sparse feature vector which will improve performance a lot when computing the distance between points

Re: Spark mllib throwing error

2014-04-24 Thread Xiangrui Meng
Could you share the command you used and more of the error message? Also, is it an MLlib specific problem? -Xiangrui On Thu, Apr 24, 2014 at 11:49 AM, John King usedforprinting...@gmail.com wrote: ./spark-shell: line 153: 17654 Killed $FWDIR/bin/spark-class org.apache.spark.repl.Main $@ Any

Re: Trying to use pyspark mllib NaiveBayes

2014-04-24 Thread Xiangrui Meng
Is your Spark cluster running? Try to start with generating simple RDDs and counting. -Xiangrui On Thu, Apr 24, 2014 at 11:38 AM, John King usedforprinting...@gmail.com wrote: I receive this error: Traceback (most recent call last): File stdin, line 1, in module File

Re: Trying to use pyspark mllib NaiveBayes

2014-04-24 Thread Xiangrui Meng
RDD (~7 million lines) and mapping. Just received this error when trying to classify. On Thu, Apr 24, 2014 at 4:32 PM, Xiangrui Meng men...@gmail.com wrote: Is your Spark cluster running? Try to start with generating simple RDDs and counting. -Xiangrui On Thu, Apr 24, 2014 at 11:38 AM, John

Re: Spark mllib throwing error

2014-04-24 Thread Xiangrui Meng
at 4:27 PM, Xiangrui Meng men...@gmail.com wrote: Could you share the command you used and more of the error message? Also, is it an MLlib specific problem? -Xiangrui On Thu, Apr 24, 2014 at 11:49 AM, John King usedforprinting...@gmail.com wrote: ./spark-shell: line 153: 17654 Killed $FWDIR

Re: Spark mllib throwing error

2014-04-24 Thread Xiangrui Meng
= data.filter(isEmpty) val points = empty.map(parsePoint) points.cache() val model = new NaiveBayes().run(points) On Thu, Apr 24, 2014 at 6:57 PM, Xiangrui Meng men...@gmail.com wrote: Do you mind sharing more code and error messages? The information you provided is too little

Re: Spark mllib throwing error

2014-04-24 Thread Xiangrui Meng
the lines of code mentioned in the error have anything to do with it? On Thu, Apr 24, 2014 at 7:54 PM, Xiangrui Meng men...@gmail.com wrote: I don't see anything wrong with your code. Could you do points.count() to see how many training examples you have? Also, make sure you don't have negative

Re: skip lines in spark

2014-04-23 Thread Xiangrui Meng
If the first partition doesn't have enough records, then it may not drop enough lines. Try rddData.zipWithIndex().filter(_._2 >= 10L).map(_._1) It might trigger a job. Best, Xiangrui On Wed, Apr 23, 2014 at 9:46 AM, DB Tsai dbt...@stanford.edu wrote: Hi Chengi, If you just want to skip first

Re: Spark hangs when i call parallelize + count on a ArrayListbyte[] having 40k elements

2014-04-23 Thread Xiangrui Meng
How big is each entry, and how much memory do you have on each executor? You generated all data on the driver and sc.parallelize(bytesList) will send the entire dataset to a single executor. You may run into I/O or memory issues. If the entries are generated, you should create a simple RDD
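A minimal sketch of the suggestion in the last sentence: parallelize only indices and generate each entry on the executors; the sizes and fill logic are placeholders:

  val n = 40000
  val data = sc.parallelize(0 until n, 64).map { i =>
    // Built on the executor, so nothing large is shipped from, or held on, the driver.
    val entry = new Array[Byte](1024 * 1024)
    entry(0) = (i % 256).toByte
    entry
  }
  println(data.count())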

Re: skip lines in spark

2014-04-23 Thread Xiangrui Meng
? On Wed, Apr 23, 2014 at 9:51 AM, Xiangrui Meng men...@gmail.com wrote: If the first partition doesn't have enough records, then it may not drop enough lines. Try rddData.zipWithIndex().filter(_._2 >= 10L).map(_._1) It might trigger a job. Best, Xiangrui On Wed, Apr 23, 2014 at 9:46

Re: Hadoop-streaming

2014-04-23 Thread Xiangrui Meng
PipedRDD is an RDD[String]. If you know how to parse each result line into (key, value) pairs, then you can call reduce after. piped.map(x => (key, value)).reduceByKey((v1, v2) => v) -Xiangrui On Wed, Apr 23, 2014 at 2:09 AM, zhxfl 291221...@qq.com wrote: Hello, we know Hadoop-streaming is use
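A hedged sketch of the pipe-then-reduce pattern, assuming `rdd` is the input RDD being piped; the external command, the tab-separated output format and the sum as the reduce function are assumptions:

  val piped = rdd.pipe("./my_streaming_mapper.sh")   // RDD[String], one element per output line
  val reduced = piped.map { line =>
    val Array(k, v) = line.split('\t')
    (k, v.toDouble)
  }.reduceByKey(_ + _)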

Re: checkpointing without streaming?

2014-04-21 Thread Xiangrui Meng
Checkpoint clears dependencies. You might need checkpoint to cut a long lineage in iterative algorithms. -Xiangrui On Mon, Apr 21, 2014 at 11:34 AM, Diana Carroll dcarr...@cloudera.com wrote: I'm trying to understand when I would want to checkpoint an RDD rather than just persist to disk.
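A small sketch of checkpointing inside an iterative loop; the directory, iteration counts and update step are placeholders:

  sc.setCheckpointDir("hdfs:///tmp/checkpoints")
  var current = sc.parallelize(1 to 1000000).map(_.toDouble)
  for (i <- 1 to 100) {
    current = current.map(_ * 1.001)
    if (i % 10 == 0) {
      current.checkpoint()   // marks the RDD; its dependency chain is dropped once it is materialized
      current.count()        // forces materialization so the checkpoint is actually written
    }
  }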

Re: SVD under spark/mllib/linalg

2014-04-11 Thread Xiangrui Meng
It was moved to mllib.linalg.distributed.RowMatrix. With RowMatrix, you can compute column summary statistics, gram matrix, covariance, SVD, and PCA. We will provide multiplication for distributed matrices, but not in v1.0. -Xiangrui On Fri, Apr 11, 2014 at 9:12 PM, wxhsdp wxh...@gmail.com wrote:
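A minimal sketch of the RowMatrix entry points listed above; the input format (space-separated numbers, one row per line) is an assumption:

  import org.apache.spark.mllib.linalg.Vectors
  import org.apache.spark.mllib.linalg.distributed.RowMatrix

  val rows = sc.textFile("/path/to/matrix.txt")
    .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
  val mat = new RowMatrix(rows)

  val summary = mat.computeColumnSummaryStatistics()   // column means, variances, numNonzeros, ...
  val svd = mat.computeSVD(10, computeU = true)        // top-10 singular values and vectors
  val pc = mat.computePrincipalComponents(10)          // top-10 principal components, as a local matrix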

Re: Error when compiling spark in IDEA and best practice to use IDE?

2014-04-09 Thread Xiangrui Meng
After sbt/sbt gen-idea, do not import as an SBT project but choose open project and point it to the spark folder. -Xiangrui On Tue, Apr 8, 2014 at 10:45 PM, Sean Owen so...@cloudera.com wrote: I let IntelliJ read the Maven build directly and that works fine. -- Sean Owen | Director, Data

Re: ui broken in latest 1.0.0

2014-04-08 Thread Xiangrui Meng
Kuipers ko...@tresata.com wrote: got it thanks On Mon, Apr 7, 2014 at 4:08 PM, Xiangrui Meng men...@gmail.com wrote: This is fixed in https://github.com/apache/spark/pull/281. Please try again with the latest master. -Xiangrui On Mon, Apr 7, 2014 at 1:06 PM, Koert Kuipers ko

Re: ui broken in latest 1.0.0

2014-04-08 Thread Xiangrui Meng
...@gmail.com Closes #281 from andrewor14/ui-storage-fix and squashes the following commits: 408585a [Andrew Or] Fix storage UI bug On Mon, Apr 7, 2014 at 4:21 PM, Koert Kuipers ko...@tresata.com wrote: got it thanks On Mon, Apr 7, 2014 at 4:08 PM, Xiangrui Meng men...@gmail.com

Re: Issue with zip and partitions

2014-04-02 Thread Xiangrui Meng
From API docs: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc. Assumes that the two RDDs have the *same number of partitions* and the *same number of elements in each partition* (e.g. one was made through a map on the

Re: Kmeans example reduceByKey slow

2014-03-24 Thread Xiangrui Meng
Hi Tsai, Could you share more information about the machine you used and the training parameters (runs, k, and iterations)? It can help solve your issues. Thanks! Best, Xiangrui On Sun, Mar 23, 2014 at 3:15 AM, Tsai Li Ming mailingl...@ltsai.com wrote: Hi, At the reduceByKey stage, it takes

Re: Kmeans example reduceByKey slow

2014-03-24 Thread Xiangrui Meng
. K=50. Here's the code I use: http://pastebin.com/2yXL3y8i , which is a copy-and-paste of the example. Thanks! On 24 Mar, 2014, at 2:46 pm, Xiangrui Meng men...@gmail.com wrote: Hi Tsai, Could you share more information about the machine you used and the training parameters (runs, k

Re: Kmeans example reduceByKey slow

2014-03-24 Thread Xiangrui Meng
K. Does the size of the input data matter for the example? Currently I have 50M rows. What is a reasonable size to demonstrate the capability of Spark? On 24 Mar, 2014, at 3:38 pm, Xiangrui Meng men...@gmail.com wrote: K = 50 is certainly a large number for k-means

Re: Kmeans example reduceByKey slow

2014-03-24 Thread Xiangrui Meng
/driver/spark-shell? Thanks! On 25 Mar, 2014, at 1:03 am, Xiangrui Meng men...@gmail.com wrote: Number of rows doesn't matter much as long as you have enough workers to distribute the work. K-means has complexity O(n * d * k), where n is number of points, d is the dimension, and k is the number

Re: possible bug in Spark's ALS implementation...

2014-03-18 Thread Xiangrui Meng
Sorry, the link was wrong. Should be https://github.com/apache/spark/pull/131 -Xiangrui On Tue, Mar 18, 2014 at 10:20 AM, Michael Allman m...@allman.ms wrote: Hi Xiangrui, I don't see how https://github.com/apache/spark/pull/161 relates to ALS. Can you explain? Also, thanks for addressing

Re: Feed KMeans algorithm with a row major matrix

2014-03-18 Thread Xiangrui Meng
Hi Jaonary, With the current implementation, you need to call Array.slice to make each row an Array[Double] and cache the result RDD. There is a plan to support block-wise input data and I will keep you informed. Best, Xiangrui On Tue, Mar 18, 2014 at 2:46 AM, Jaonary Rabarisoa
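A hedged sketch of the Array.slice approach for a flat row-major matrix; dimensions, data and k-means parameters are placeholders:

  import org.apache.spark.mllib.clustering.KMeans
  import org.apache.spark.mllib.linalg.Vectors

  val numCols = 100
  val flat: Array[Double] = Array.fill(1000 * numCols)(math.random)   // row-major data
  val rows = (0 until flat.length / numCols).map { i =>
    Vectors.dense(flat.slice(i * numCols, (i + 1) * numCols))         // one row per vector
  }
  val data = sc.parallelize(rows).cache()                             // cache, as suggested above
  val model = KMeans.train(data, 5, 20)                               // k = 5, 20 iterations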

Re: possible bug in Spark's ALS implementation...

2014-03-18 Thread Xiangrui Meng
Glad to hear the speed-up. Wish we can improve the implementation further in the future. -Xiangrui On Tue, Mar 18, 2014 at 1:55 PM, Michael Allman m...@allman.ms wrote: I just ran a runtime performance comparison between 0.9.0-incubating and your als branch. I saw a 1.5x improvement in

Re: possible bug in Spark's ALS implementation...

2014-03-17 Thread Xiangrui Meng
The factor matrix Y is used twice in implicit ALS computation, one to compute global Y^T Y, and another to compute local Y_i^T C_i Y_i. -Xiangrui On Sun, Mar 16, 2014 at 1:18 PM, Matei Zaharia matei.zaha...@gmail.com wrote: On Mar 14, 2014, at 5:52 PM, Michael Allman m...@allman.ms wrote: I

Re: possible bug in Spark's ALS implementation...

2014-03-17 Thread Xiangrui Meng
Hi Michael, I made a couple of changes to implicit ALS. One gives faster construction of YtY (https://github.com/apache/spark/pull/161), which was merged into master. The other caches intermediate matrix factors properly (https://github.com/apache/spark/pull/165). They should give you the same result

Re: possible bug in Spark's ALS implementation...

2014-03-11 Thread Xiangrui Meng
Hi Michael, I can help check the current implementation. Would you please go to https://spark-project.atlassian.net/browse/SPARK and create a ticket about this issue with component MLlib? Thanks! Best, Xiangrui On Tue, Mar 11, 2014 at 3:18 PM, Michael Allman m...@allman.ms wrote: Hi, I'm
