[mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi all, It seems that there is a method to multiply a RowMatrix and a (local) Matrix. However, there is no method in Spark to multiply one large-scale matrix by another. It would be helpful. Does anyone have a plan to add multiplication of large-scale matrices? Or shouldn't we support it in
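
To make the existing case concrete, here is a minimal sketch in plain Python (not the Spark API) of why multiplying a row-distributed matrix by a small local matrix is straightforward: each row can be multiplied independently, so no rows need to move between partitions. The function name `multiply_rows_by_local` and the list-of-partitions layout are assumptions made for illustration only.

```python
from typing import List

Row = List[float]
Partition = List[Row]


def multiply_rows_by_local(partitions: List[Partition],
                           local: List[Row]) -> List[Partition]:
    """Multiply every row of a row-partitioned matrix by a small local matrix.

    `partitions` stands in for a distributed matrix split by rows;
    `local` stands in for a matrix small enough to copy to every worker.
    """
    n_inner = len(local)       # rows of the local matrix
    n_cols = len(local[0])     # columns of the result
    result = []
    for part in partitions:    # each partition is processed on its own
        out_part = []
        for row in part:
            if len(row) != n_inner:
                raise ValueError("row length must match local matrix rows")
            out_part.append([
                sum(row[k] * local[k][j] for k in range(n_inner))
                for j in range(n_cols)
            ])
        result.append(out_part)
    return result
```

The partitioning of the result matches the partitioning of the input, which is what makes this case cheap compared with multiplying two distributed matrices.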

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread RJ Nowling
I think it would be interesting to have a variety of matrix operations (multiplication, addition / subtraction, powers, scalar multiply, etc.) available in Spark. Diagonalization may be more difficult but iterative approximation approaches may be quite amenable. On Fri, Sep 5, 2014 at 5:26 AM,

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi RJ, Thank you for your comment. I am interested in having other matrix operations too. I will create a JIRA issue in the first place. thanks,

Re: amplab jenkins is down

2014-09-05 Thread Nicholas Chammas
Hmm, looks like at least some builds https://amplab.cs.berkeley.edu/jenkins/view/Pull%20Request%20Builders/job/SparkPullRequestBuilder/19804/consoleFull are working now, though this last one was from ~5 hours ago. On Fri, Sep 5, 2014 at 1:02 AM, shane knapp skn...@berkeley.edu wrote: yep.

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Evan R. Sparks
There's some work on this going on in the AMP Lab. Create a ticket and we can update with our progress so that we don't duplicate effort. On Fri, Sep 5, 2014 at 8:18 AM, Yu Ishikawa yuu.ishikawa+sp...@gmail.com wrote: Hi RJ, Thank you for your comment. I am interested in having other matrix

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi Evan, That sounds interesting. Here is the ticket which I created. https://issues.apache.org/jira/browse/SPARK-3416 thanks, -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Add-multiplying-large-scale-matrices-tp8291p8296.html Sent from

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Patrick Wendell
Hey there, I believe this is on the roadmap for the next release, 1.2. But Xiangrui can comment on this. - Patrick On Fri, Sep 5, 2014 at 9:18 AM, Yu Ishikawa yuu.ishikawa+sp...@gmail.com wrote: Hi Evan, That sounds interesting. Here is the ticket which I created.

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Shivaram Venkataraman
FWIW matrix multiplication is extremely communication-intensive when you have two row-partitioned matrices, and there are often other ways to solve problems. Regardless, it would be good to have a more complete matrix library and it would be good to contribute some of the stuff we have done in the
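
The communication cost mentioned above can be illustrated with a small sketch in plain Python (not Spark code). It multiplies two row-partitioned matrices and counts how many rows of the right-hand matrix each left-hand partition would have to fetch from elsewhere. The name `multiply_row_partitioned` and the assumption that partition i of both matrices lives on the same worker are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Row = List[float]
Partition = List[Row]


def multiply_row_partitioned(parts_a: List[Partition],
                             parts_b: List[Partition]
                             ) -> Tuple[List[Partition], int]:
    """Compute A * B for row-partitioned A and B, counting shipped rows of B.

    Assumption: partition i of A and partition i of B are co-located, so
    only the rows of B held in other partitions count as shipped.
    """
    # Every row of A touches every row of B, so each A partition needs all of B.
    b_rows = [row for part in parts_b for row in part]
    n_cols = len(b_rows[0])
    product = []
    shipped = 0
    for i, part_a in enumerate(parts_a):
        local_b = len(parts_b[i]) if i < len(parts_b) else 0
        shipped += len(b_rows) - local_b   # rows of B fetched remotely
        out_part = []
        for row in part_a:
            if len(row) != len(b_rows):
                raise ValueError("columns of A must equal rows of B")
            out_part.append([
                sum(a * b_rows[k][j] for k, a in enumerate(row))
                for j in range(n_cols)
            ])
        product.append(out_part)
    return product, shipped
```

Under these assumptions the shipped-row count grows with the number of partitions times the size of B, which is the cost the message warns about.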

Re: How to kill a Spark job running in local mode programmatically ?

2014-09-05 Thread Marcelo Vanzin
I don't think that's possible at the moment, mainly because SparkSubmit expects to be run from the command line, and not programmatically, so it doesn't return anything that can be used to control what's going on. You may try to interrupt the thread calling into SparkSubmit, but that might not

Re: amplab jenkins is down

2014-09-05 Thread shane knapp
it's looking like everything except the pull request builders is working. i'm going to be working on getting this resolved today. On Fri, Sep 5, 2014 at 8:18 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Hmm, looks like at least some builds

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Jeremy Freeman
Hey all, Definitely agreed this would be nice! In our own work we've done element-wise addition, subtraction, and scalar multiplication of similarly partitioned matrices very efficiently with zipping. We've also done matrix-matrix multiplication with zipping, but that only works in certain
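
As a rough illustration of the zipping idea described above, here is a plain-Python sketch (not Spark code) of element-wise addition for two matrices that are partitioned identically. Because matching partitions hold matching rows, they can be paired one-to-one with no shuffling. The name `zip_add` and the list-of-partitions layout are assumptions made for this example.

```python
from typing import List

Row = List[float]
Partition = List[Row]


def zip_add(parts_a: List[Partition],
            parts_b: List[Partition]) -> List[Partition]:
    """Add two identically partitioned matrices by pairing their partitions."""
    if len(parts_a) != len(parts_b):
        raise ValueError("matrices must have the same number of partitions")
    out = []
    for part_a, part_b in zip(parts_a, parts_b):
        if len(part_a) != len(part_b):
            raise ValueError("paired partitions must hold the same number of rows")
        out_part = []
        for row_a, row_b in zip(part_a, part_b):
            if len(row_a) != len(row_b):
                raise ValueError("paired rows must have the same length")
            out_part.append([x + y for x, y in zip(row_a, row_b)])
        out.append(out_part)
    return out
```

Subtraction and scalar multiplication follow the same pattern, changing only the per-element operation.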

Re: amplab jenkins is down

2014-09-05 Thread Nicholas Chammas
How's it going? It looks like during the last build https://amplab.cs.berkeley.edu/jenkins/view/Pull%20Request%20Builders/job/SparkPullRequestBuilder/lastBuild/console from about 30 min ago Jenkins was still having trouble fetching from GitHub. It also looks like not all requests for testing are

Re: Dependency hell in Spark applications

2014-09-05 Thread Tathagata Das
If the httpclient dependency is coming from Hive, you could build Spark without Hive. Alternatively, have you tried excluding httpclient from the spark-streaming dependency in your sbt/maven project? TD On Thu, Sep 4, 2014 at 6:42 AM, Koert Kuipers ko...@tresata.com wrote: custom spark builds should

Re: Dependency hell in Spark applications

2014-09-05 Thread Ted Yu
From output of dependency:tree: [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ spark-streaming_2.10 --- [INFO] org.apache.spark:spark-streaming_2.10:jar:1.1.0-SNAPSHOT [INFO] +- org.apache.spark:spark-core_2.10:jar:1.1.0-SNAPSHOT:compile [INFO] | +-

Re: amplab jenkins is down

2014-09-05 Thread Josh Rosen
We have successfully purged Jenkins’ build queue.  If you want a PR to be re-tested, please ask Jenkins again. On September 5, 2014 at 5:36:30 PM, shane knapp (skn...@berkeley.edu) wrote: yeah, it was a problem w/the PRB's OAuth key. josh rosen added a new key, and magique! we're about to

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread 顾荣
I missed the dev list in my last email, so I am resending it. Please ignore the duplicate. 2014-09-06 11:22 GMT+08:00 顾荣 gurongwal...@gmail.com: Hi All, This is RongGu from PasaLab at Nanjing University, China. Actually, we have been working on a distributed matrix operations library on Spark this