It is common for double serialization to creep into systems like this as well. My
guess, however, is that the primitive serialization is just much faster than the
vector serialization.
Sent from my iPhone
On Jul 8, 2013, at 22:55, Dmitriy Lyubimov dlie...@gmail.com wrote:
yes that's my working hypothesis. Serializing and combining
RandomAccessSparseVectors is slower than elementwise messages.
On Mon, Jul 8, 2013 at 11:00 PM, Ted Dunning ted.dunn...@gmail.com wrote:
Also, it is likely that the combiner has little effect. This means that you
are essentially using a vector to serialize single elements.
On Jul 8, 2013, at 23:13, Dmitriy Lyubimov dlie...@gmail.com wrote:
that has occurred to me too. we are not really doing any aggregation
here. it may turn out that its use is beneficial with bigger volumes and real
I/O though. hard to tell. anyway i will probably keep both as an option.
On Tue, Jul 9, 2013 at 7:51 AM, Ted Dunning ted.dunn...@gmail.com wrote:
Anybody know how good (or bad) our performance on matrix transpose is? how
long will it take to transpose a matrix with 10M non-zeros with Mahout (if i
wanted to set up a fully distributed but single-node MR cluster)?
Trying to figure if the numbers i see with Bagel-based Mahout matrix
transposition are any good.
Transpose of that small a matrix should happen in memory.
On Jul 8, 2013, at 17:26, Dmitriy Lyubimov dlie...@gmail.com wrote:
yes, but it is just a test and I am trying to extrapolate the results i
see to bigger volumes. sort of. To get some taste of the programming model
performance.
I do get cpu-bound behavior and i hit the spark cache 100% of the time. so in
theory, since i am not having spills and i am not doing sorts,
Ted,
would it make sense to port parts of the QR in-core row-wise Givens solver out
of SSVD to work on any Matrix? I know the Givens method is advertised as stable
but i am not sure if it is the fastest accepted one. I guess they are all about
the same.
If yes, i will also need to port the UpperTriangular
FWIW, Givens streaming QR will be a bit more economical on memory than
Householder's, since it doesn't need the full buffer to compute R and
doesn't need to keep the entire original matrix around.
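The memory argument can be pictured in a few lines: a streaming row-wise Givens QR keeps only the n x n factor R plus the one incoming row, rotating each new row into R as it arrives. The sketch below is illustrative only (plain Scala arrays rather than Mahout's Matrix, and all names are made up), not the SSVD solver itself.

```scala
// Hedged sketch of streaming row-wise Givens QR over dense Array[Double] rows.
// Only the n x n upper-triangular R and the current row are ever held, which
// is why it needs less memory than a Householder QR of the full matrix.
object GivensQR {
  // Apply one Givens rotation that annihilates row(j) against r(j).
  private def rotate(r: Array[Double], row: Array[Double], j: Int): Unit = {
    val a = r(j); val b = row(j)
    val h = math.hypot(a, b)
    if (h != 0.0) {
      val c = a / h; val s = b / h
      var k = j
      while (k < r.length) {
        val rk = r(k); val wk = row(k)
        r(k) = c * rk + s * wk
        row(k) = -s * rk + c * wk
        k += 1
      }
    }
  }

  // Stream rows through an n x n accumulator; returns upper-triangular R.
  def r(rows: Iterator[Array[Double]], n: Int): Array[Array[Double]] = {
    val rAcc = Array.fill(n)(new Array[Double](n))
    for (rowIn <- rows) {
      val row = rowIn.clone()
      var j = 0
      while (j < n) { rotate(rAcc(j), row, j); j += 1 }
    }
    rAcc
  }
}
```

Since Q is never formed, correctness can be checked via R^T R = A^T A.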
On Thu, Jul 4, 2013 at 11:15 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
For anyone good at scala DSLs, the following is the puzzle i can't seem to
figure out at the moment.
I mentioned before that I implemented assignment notation for a row or a
block, e.g. for a row vector: A(5,::) := (1,2,3)
what it really translates into in this particular case is
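One plausible shape for those mechanics (a guess for illustration, not the actual DSL internals; ToyMat, RowView, and the All marker are all hypothetical, with All standing in for the DSL's `::`):

```scala
// Hypothetical sketch of what A(5, ::) := (1, 2, 3) could translate to:
// apply(row, marker) returns a lightweight row view, and := on that view
// writes through to the backing matrix.
object All
class RowView(data: Array[Array[Double]], row: Int) {
  def :=(vs: Double*): Unit =
    for (j <- vs.indices) data(row)(j) = vs(j)
}
class ToyMat(val data: Array[Array[Double]]) {
  // A(i, ::) in the DSL corresponds to apply(i, All) here.
  def apply(i: Int, all: All.type): RowView = new RowView(data, i)
}
```

With this, `a(1, All) := (1.0, 2.0, 3.0)` desugars to `a.apply(1, All).:=(1.0, 2.0, 3.0)`.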
On Fri, Jul 5, 2013 at 1:15 AM, Dmitriy Lyubimov dlie...@gmail.com wrote:
On Fri, Jul 5, 2013 at 1:25 AM, Jake Mannix jake.man...@gmail.com wrote:
at this point i have only a very obvious apply(Double,Double):Double =
m.getQuick(...), i.e. only element reads are supported with that syntax.
I am guessing Jake, if anyone, might have an idea here... thanks.
Hi Dmitry
You can take a look at using the update magic method which is similar to
apply but handles assignment.
If you want to keep the := as assignment I think you could do
def :=(value: Double) = update ...
(I don't have my laptop around at the moment so can't check this works).
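A minimal sketch of that suggestion (ToyMatrix is a hypothetical stand-in, not Mahout code):

```scala
// Sketch of the `update` magic method: just as m(i, j) desugars to
// m.apply(i, j), the assignment m(i, j) = x desugars to m.update(i, j, x).
class ToyMatrix(rows: Int, cols: Int) {
  private val data = Array.fill(rows * cols)(0.0)
  def apply(i: Int, j: Int): Double = data(i * cols + j)
  def update(i: Int, j: Int, v: Double): Unit = data(i * cols + j) = v
  // := kept as an assignment alias; here it fills the whole matrix.
  def :=(v: Double): Unit = java.util.Arrays.fill(data, v)
}
```

Element assignment then reads naturally: `m(0, 1) = 3.0`.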
On Fri, Jul 5, 2013 at 1:40 AM, Nick Pentreath nick.pentre...@gmail.com wrote:
This is pretty exciting!
Thanks Dmitriy.
On Wed, Jul 3, 2013 at 10:12 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
On Wed, Jul 3, 2013 at 6:25 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
On Wed, Jun 19, 2013 at 12:20 AM, Ted Dunning ted.dunn...@gmail.com wrote:
As far as in-memory solvers, we have:
1) LR decomposition (tested and kinda fast)
2) Cholesky decomposition (tested)
3) SVD
Excellent!
so I guess SSVD can be divorced from the apache-math solver then.
Actually it is all shaping up surprisingly well, with a scala DSL for both
in-core and mahout DRMs and spark solvers. I haven't been able to pay as
much attention to this as i hoped due to being pretty sick last month. But
even
Ok, so i was fairly easily able to build some DSL for our matrix
manipulation (similar to breeze) in scala:
inline matrix or vector:
val a = dense((1, 2, 3), (3, 4, 5))
val b:Vector = (1,2,3)
block views and assignments (element/row/vector/block/block of row or
vector)
a(::, 0)
a(1, ::)
a(0
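One way the inline `dense((1, 2, 3), (3, 4, 5))` constructor above could be built (a guess at the mechanism, not the actual DSL source): each row arrives as a tuple, i.e. a scala Product, and its numeric elements are widened to Double.

```scala
// Hypothetical sketch of an inline dense-matrix constructor: each row is a
// tuple (a Product); numeric elements are widened to Double.
def dense(rows: Product*): Array[Array[Double]] =
  rows.map(_.productIterator.map {
    case n: java.lang.Number => n.doubleValue
    case x => sys.error(s"non-numeric matrix element: $x")
  }.toArray).toArray
```

Here the plain Array[Array[Double]] result stands in for whatever Matrix type the real DSL returns.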
Dmitriy,
This is very pretty.
On Mon, Jun 24, 2013 at 6:48 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
Yeah, I'm totally on board with a pretty scala DSL on top of some of our
stuff.
In particular, I've been experimenting with wrapping the
DistributedRowMatrix
in a scalding wrapper, so we can do things like
val matrixAsTypedPipe =
DistributedRowMatrixPipe(new DistributedRowMatrix(numRows,
That looks great Dmitry!
The thing about Breeze that drives the complexity in it is partly
specialization for Float, Double and Int matrices, and partly getting the
syntax to just work for all combinations of matrix types and operands etc.
Mostly it does just work, but occasionally not.
On Mon, Jun 24, 2013 at 1:46 PM, Nick Pentreath nick.pentre...@gmail.com wrote:
I think that contrib modules would be very interesting. Specifically, a good
Scala DSL, Pig integration and so on.
On Mon, Jun 24, 2013 at 9:55 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
On Mon, Jun 24, 2013 at 1:46 PM, Nick Pentreath nick.pentre...@gmail.com wrote:
You're right on that - so far doubles are all I've needed and all I can
currently see needing.
I'll take a look at your project and see how easy it is to integrate with my
Spark ALS and other code - syntax-wise it looks almost the same, so swapping
out the linear algebra backend would be
Well, one fundamental step to get there in the Mahout realm, the way i see it,
is to create DSLs for Mahout's DRMs in spark. That's actually one of the
other reasons i chose not to follow Breeze. When we unwind Mahout DRMs, we
may see sparse or dense slices there with named vectors. To translate that
What does Matrix.iterateAll() contractually do? Practically it seems to be
row-wise iteration for some implementations, but it doesn't seem to state so
contractually in the javadoc. What is MatrixSlice if it is neither a row nor
a column? How can i tell what exactly it is i am iterating over?
I think that this contract has migrated a bit from the first starting point.
My feeling is that there is a de facto contract now that the matrix slice is a
single row.
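That de facto contract can be pictured with a toy (non-Mahout) sketch: iterating a matrix yields one slice per row, each carrying the row index and the row's vector. Slice here is a hypothetical stand-in for Mahout's MatrixSlice.

```scala
// Toy illustration of the de facto contract: iterateAll yields one slice
// per row, carrying the row index and the row vector.
case class Slice(index: Int, vector: Array[Double])
def iterateAll(m: Array[Array[Double]]): Iterator[Slice] =
  m.iterator.zipWithIndex.map { case (row, i) => Slice(i, row) }
```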
On Jun 23, 2013, at 16:32, Dmitriy Lyubimov dlie...@gmail.com wrote:
Thank you.
On Jun 23, 2013 6:16 PM, Ted Dunning ted.dunn...@gmail.com wrote:
Let us know how it went, I'm pretty interested to see how well our stuff
integrates with Spark, especially since Spark is in the process of
joining Apache.
-sebastian
On 19.06.2013 03:14, Dmitriy Lyubimov wrote:
On Wed, Jun 19, 2013 at 5:29 AM, Jake Mannix jake.man...@gmail.com wrote:
Question #2: which in-core solvers are available for Mahout matrices? I
know there's SSVD, probably Cholesky, is there something else? In
particular, i need to be solving linear systems, I guess Cholesky should
be
Hi Dmitriy
I'd be interested to look at helping with this potentially (time
permitting).
I've recently been working on a port of Mahout's ALS implementation to
Spark. I spent a bit of time thinking about how much of mahout-math to use.
For now I found that using the Breeze linear algebra
I have a JBlas version of our ALS solving code lying around [1], feel
free to use it. Would also be interested to see the Spark port.
-sebastian
[1]
https://github.com/sscdotopen/mahout-als/blob/jblas/math/src/main/java/org/apache/mahout/math/als/JBlasAlternatingLeastSquaresSolver.java
Thank you, Ted.
On Wed, Jun 19, 2013 at 12:20 AM, Ted Dunning ted.dunn...@gmail.com wrote:
Thank you, Sebastian.
Actually ALS flavours are indeed one of my first pragmatic goals -- i have
also done a few customizations for my employer -- so i will probably
pragmatically pursue those customizations first. In particular, i do use
Koren-Volinsky confidence weighting, but assume we still
Nick, thank you for the hints and pointers! I will check out Breeze.
Let me take a look.
as far as collaboration, unfortunately i think the only way to go for me
and my employer is to cut it, test it and then (after long negotiations
with CEO) donate if accepted. They are ok with my small
Hello,
so i finally got around to actually do it.
I want to get Mahout sparse vectors and matrices (DRMs) and rebuild some
solvers using spark and Bagel/scala.
I also want to use in-core solvers that run directly on Mahout.
Question #1: which mahout artifacts are best to import if I don't
On Tue, Jun 18, 2013 at 6:14 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
Thank you, Jake. I suspected as much about Colt.
On Jun 18, 2013 8:30 PM, Jake Mannix jake.man...@gmail.com wrote:
On Tue, Jun 18, 2013 at 6:14 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: