Re: [VOTE] Designating maintainers for some Spark components

2014-11-05 Thread Liquan Pei
> > on the PMC list, as well as call for an official vote on it on a public list. Basically, as the Spark project scales up, we need to define a model to make sure there is still great oversight of key components (in particular internal architecture and public APIs), and to this end I've proposed implementing a maintainer model for some of these components, similar to other large projects.

--
Liquan Pei
Department of Physics
University of Massachusetts Amherst

Re: Issues with ALS positive definite

2014-10-15 Thread Liquan Pei
> as to be positive definite...
>
> I think the tests are not running any 0.0 regularization tests; otherwise we should have caught it as well...
>
> For the sparse coding NMF variant that I am running, I have to turn off L2 regularization when I run an L1 on products to extract sparse topics...
>
> Thanks.
>
> Deb

--
Liquan Pei
Department of Physics
University of Massachusetts Amherst

Re: Spark SQL question: why build hashtable for both sides in HashOuterJoin?

2014-10-08 Thread Liquan Pei
u Wang wrote:
> Liquan, yes, for full outer join, one hash table on both sides is more efficient.
>
> For the left/right outer join, it looks like one hash table should be enough.
>
> --
> *From:* Liquan Pei [mailto:liquan...@gm

Re: Spark SQL question: why build hashtable for both sides in HashOuterJoin?

2014-09-30 Thread Liquan Pei
> " side, so Spark can iterate through the left side and find matches in the right side from the hash table efficiently. Please comment and suggest, thanks again!
>
> ------
> *From:* Liquan Pei [mailto:liquan...@gmail.com]
> *Se
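
The single-hash-table approach described in the quoted message can be sketched in plain Python. This is only an illustration under stated assumptions, not Spark's HashOuterJoin code: the function name `left_outer_hash_join` and the `(key, value)` row layout are hypothetical, and ordinary lists stand in for the two relations.

```python
from collections import defaultdict


def left_outer_hash_join(left, right):
    """Left outer join of two lists of (key, value) pairs.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    # Build phase: hash only the right side.
    table = defaultdict(list)
    for key, value in right:
        table[key].append(value)

    # Probe phase: iterate through the left side once and look up matches.
    joined = []
    for key, left_value in left:
        matches = table.get(key)  # .get() does not create missing keys
        if matches:
            for right_value in matches:
                joined.append((key, left_value, right_value))
        else:
            # No match on the right: keep the left row, pad with None.
            joined.append((key, left_value, None))
    return joined


left_rows = [(1, "a"), (2, "b"), (1, "c")]
right_rows = [(1, "x"), (1, "y"), (3, "z")]
result = left_outer_hash_join(left_rows, right_rows)
```

Only the right side is held in memory as a hash table; the left side is streamed, and unmatched right rows (key 3 here) are dropped, which is why a left outer join does not need a second table.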

Re: Spark SQL question: why build hashtable for both sides in HashOuterJoin?

2014-09-29 Thread Liquan Pei
> n the partition is big. And it doesn't reduce the iteration on streamed relation, right?
>
> Thanks!

Current status of Sparrow

2014-06-20 Thread Liquan Pei
Hi, What is the current status of Sparrow integration with Spark? I would like to integrate Sparrow with Spark 1.0 on a 100-node cluster. Any suggestions? Thanks a lot for your help! Liquan

Matrix Multiplication of two RDD[Array[Double]]'s

2014-05-17 Thread Liquan Pei
Hi, I am currently implementing an algorithm involving matrix multiplication. Basically, I have matrices represented as RDD[Array[Double]]. For example, if I have A: RDD[Array[Double]] and B: RDD[Array[Double]], what would be the most efficient way to get C = A * B? Both A and B are large, so it w