saw a lot of these, some still bewildering, but they all related to
non-local mode (different classpaths on backend and front end).
On Fri, Apr 3, 2015 at 1:39 PM, Andrew Palumbo ap@outlook.com wrote:
Has anybody seen an exception like this when running a spark job?
the job completes but
i guess i can run it on the phone.
On Thu, Apr 2, 2015 at 10:30 AM, Dmitriy Lyubimov dlie...@gmail.com wrote:
signed in but can't connect, probably because of filtering.
On Thu, Apr 2, 2015 at 8:10 AM, Andrew Palumbo ap@outlook.com wrote:
it looks like only admins can invite.
On 04/02/2015 11:05 AM, Andrew Palumbo wrote:
I'll try to invite someone, i'm on it now.
On 04/02/2015 10:59 AM, Pat
Pat, duplication of my email to your PR is coincidental. It is not about
your PR. Sorry. I was looking at the master log and posting to @dev.
On Wed, Apr 1, 2015 at 1:31 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
yeah. https://github.com/apache/mahout/commits/master.
This link is MASTER
On Wed, Apr 1, 2015 at 1:01 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
Actually, if git pull brings a merge (somebody pushed something while
you were doing changelog etc.) there'd be a merge. I'd try to rebase in
this case (if it works) to avoid the merge, if possible. or re-do the whole
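The rebase-instead-of-merge flow suggested above can be sketched in a throwaway repo (every path, file name, and commit message below is invented for the demo; nothing is Mahout-specific):

```shell
# Demonstrate that `git pull --rebase` keeps history linear when the remote
# moved while you were committing locally.
set -e
tmp=$(mktemp -d)

# stand-in for the shared upstream repo
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git config user.email dev@example.com
git config user.name dev
echo one > file && git add file && git commit -qm 'initial'

# our working clone
git clone -q "$tmp/upstream" "$tmp/clone"

# somebody pushes to upstream while we are preparing our own commit
echo two > other && git add other && git commit -qm 'upstream change'

cd "$tmp/clone"
git config user.email dev@example.com
git config user.name dev
echo three > local && git add local && git commit -qm 'local change'

# a plain `git pull` would create a merge commit here; `--rebase` replays
# 'local change' on top of 'upstream change' instead
branch=$(git symbolic-ref --short HEAD)
git pull -q --rebase origin "$branch"

git log --merges --oneline   # prints nothing: linear history, no merge commit
```

If the replay hits conflicts, fix them and `git rebase --continue`, or `git rebase --abort` to fall back to a plain merge, which is why rebase is worth trying first.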
, 2015, at 11:53 AM, Dmitriy Lyubimov notificati...@github.com
wrote:
yeah. https://github.com/apache/mahout/commits/master.
we should not see merged master commits there (a clear sign of not
squashing your personal PR history!)
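A sketch of what squashing that personal PR history looks like before it reaches master, in a throwaway repo (commit messages and the MAHOUT-XXXX id are placeholders); `git rebase -i` is the usual interactive route, and `git reset --soft` shown here is a scriptable equivalent:

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/pr" && cd "$tmp/pr"
git config user.email dev@example.com
git config user.name dev
echo base > f && git add f && git commit -qm 'base'

# a typical messy personal PR history
echo 1 >> f && git commit -qam 'feature'
echo 2 >> f && git commit -qam 'oops, typo'
echo 3 >> f && git commit -qam 'review fixes'

# collapse the three PR commits into one, keeping the final tree intact
git reset -q --soft HEAD~3
git commit -qm 'MAHOUT-XXXX: feature (squashed)'

git log --oneline   # two commits: 'base' plus the single squashed commit
```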
On Wed, Apr 1, 2015 at 11:26 AM, Suneel Marthi notificati
happen; at least to me.
On Wed, Apr 1, 2015 at 12:58 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
Pat, actually i did not say I noticed problems in your commits. It was
somebody else :)
On Wed, Apr 1, 2015 at 12:41 PM, Pat Ferrel p...@occamsmachete.com wrote:
Here is my history dump
in the master log is cleaned up. IMO there is no problem
here.
On Apr 1, 2015, at 1:05 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
[
https://issues.apache.org/jira/browse/MAHOUT-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy Lyubimov reassigned MAHOUT-1641:
Assignee: Dmitriy Lyubimov
Add conversion from a RDD[(String, String)] to a Drm
FYI t-digest is now also part of the spark classpath, as part of stream-lib.
On Tue, Mar 31, 2015 at 4:05 PM, Suneel Marthi (JIRA) j...@apache.org
wrote:
[
https://issues.apache.org/jira/browse/MAHOUT-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy Lyubimov updated MAHOUT-1660:
-
Affects Version/s: (was: 0.10.0)
0.10.1
[
https://issues.apache.org/jira/browse/MAHOUT-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14387162#comment-14387162
]
Dmitriy Lyubimov commented on MAHOUT-1660:
--
i have a fix for that. if you don't
[
https://issues.apache.org/jira/browse/MAHOUT-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy Lyubimov reassigned MAHOUT-1660:
Assignee: Dmitriy Lyubimov (was: Suneel Marthi)
Hadoop1HDFSUtil.readDRMHEader
I switched to idea since i started doing mixed projects with scala.
Standalone scala is bearable in eclipse but mixed projects simply don't
work. (and Mahout is likely one of them).
On Mon, Mar 30, 2015 at 3:58 PM, Suneel Marthi suneel.mar...@gmail.com
wrote:
I believe it's only Shannon from
So I can plot with matplotlib from scala here? Any examples ?
On Mar 28, 2015 7:13 AM, Suneel Marthi suneel.mar...@gmail.com wrote:
Here's a gist of an iScala notebook, and has integration with matplotlib
for visualization, could complement well with present scala-shell.
Thoughts?
Shannon,
How difficult would it be to port spectral clustering to our scala alg and
math? We have ssvd there as well.
On Mar 27, 2015 7:26 AM, Shannon Quinn (JIRA) j...@apache.org wrote:
Shannon Quinn created MAHOUT-1659:
-
Summary: Remove
Note also that all these related beasts come in pairs (in-core input -
distributed input):
ssvd - dssvd
spca - dspca
On Fri, Mar 27, 2015 at 3:45 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
But MR version of SSVD is more stable because of the QR differences.
On Fri, Mar 27, 2015 at 3:44 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
Yes. Except it doesn't follow same parallel reordered Givens QR but uses
Cholesky QR (which we call thin QR) as an easy-to-implement shortcut. But
this page makes no mention of QR specifics i think
On Fri, Mar 27, 2015 at 12:57 PM, Andrew Palumbo ap@outlook.com wrote:
math-scala dssvd
The algorithm outline for in-core is exactly the same, except the in-core
version is using Householder Reflections QR (I think). But the logic is
exactly the same.
On Fri, Mar 27, 2015 at 3:58 PM, Andrew Palumbo ap@outlook.com wrote:
On 03/27/2015 06:46 PM, Dmitriy Lyubimov wrote:
Note also
in the implementation than there are steps in the
algorithm.
content in the same place.
On 03/27/2015 06:58 PM, Andrew Palumbo wrote:
On 03/27/2015 06:46 PM, Dmitriy Lyubimov wrote:
Note also that all these related beasts come in pairs (in-core input -
distributed input):
ssvd - dssvd
spca - dspca
yeah I've been thinking that i'd give a less detailed
it is possible to say
import o.a.m.math._
import decompositions._
then it will assume second line as o.a.m.math.decompositions automatically
On Fri, Mar 27, 2015 at 4:09 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
i think there's a typo in package name under usage. It should
and R simulation sources perhaps ...
On Fri, Mar 27, 2015 at 4:57 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
Andrew, thanks a lot!
I think acknowledgement and reference to N. Halko's dissertation from the MR
page is also worthy of mention on this page.
On Fri, Mar 27, 2015 at 4:41
know if you see any other changes that need to be made.
On 03/27/2015 07:06 PM, Dmitriy Lyubimov wrote:
In fact, algorithm just executes the outline formulas. Not always line for
line, but step for step for sure.
On Fri, Mar 27, 2015 at 4:05 PM, Andrew Palumbo ap@outlook.com
wrote
I'd venture a heresy again. what if we put off 1.3 compatibility until
better times and focus on the 0.9...1.2.x compatibility we have now.
Chances are by the time we are done with 0.10.x we'd need to consider 1.4
or 1.5 at the rate this project is bloating. Keep in mind that higher
version != higher
ok... but this will also require reworking jenkins and CI builds... and
build engineering always scared me :)
On Tue, Mar 24, 2015 at 10:54 AM, Stevo Slavic (JIRA) j...@apache.org
wrote:
Stevo Slavic created MAHOUT-1654:
Summary: Migrate from
On Tue, Mar 24, 2015 at 11:21 AM, Andrew Musselman
andrew.mussel...@gmail.com wrote:
Summary of the call; please chime in with any corrections or
clarifications:
(1) Support Lucene 5, Hadoop 2, Java 7, Spark 1.1 1.3
(2) Ensure build is solid, add Scaladocs in poms and in Jenkins
(3)
I had the same idea myself. i like it.
On Tue, Mar 24, 2015 at 1:35 PM, Stevo Slavić ssla...@gmail.com wrote:
If I understand correctly, mrlegacy should remain, just hdfs/non-mr stuff
extracted into separate module, for reuse in math-scala and mahout-spark
module, so they do not depend on
[
https://issues.apache.org/jira/browse/MAHOUT-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14378571#comment-14378571
]
Dmitriy Lyubimov commented on MAHOUT-1648:
--
Re: thin QR: this is also known
[
https://issues.apache.org/jira/browse/MAHOUT-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14378587#comment-14378587
]
Dmitriy Lyubimov commented on MAHOUT-1648:
--
why? they are 100% algebraic
I like math-*. And it is math only there. Or was last time i checked. it
will be what R calls R-base, and I would welcome no other scope there.
all environment things are math. all ML things are math. quasi-newton,
bayesian optimizers, linear search are all math. Stats are math. als,
(d)ssvd,
lemme read this issue really quick.
This looks like a redundant double-contract. Why require implicit
conversions if they are already requiring explicit types? And vice versa.
On Sun, Mar 22, 2015 at 10:17 AM, Pat Ferrel p...@occamsmachete.com wrote:
Due to a bug in spark we have a nasty work
this a bit as well. 'cause spark api has the same
problems.
On Mon, Mar 23, 2015 at 11:06 AM, Dmitriy Lyubimov dlie...@gmail.com
wrote:
We (well, I at least) are extremely appreciative of Anand's effort and
commitment to integrate h2o engine as one of the algebraic backs. Even
10 times more so as it turns out it had nothing to do with 0xdata.
I think among other things we can agree this renders the
suggestion of h2o
[
https://issues.apache.org/jira/browse/MAHOUT-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367581#comment-14367581
]
Dmitriy Lyubimov commented on MAHOUT-1648:
--
Maybe we should really call stuff
[
https://issues.apache.org/jira/browse/MAHOUT-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367801#comment-14367801
]
Dmitriy Lyubimov commented on MAHOUT-1648:
--
looks like algebra and environment
[
https://issues.apache.org/jira/browse/MAHOUT-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367923#comment-14367923
]
Dmitriy Lyubimov commented on MAHOUT-1648:
--
yes i'd go by splitting
it. Is anyone signing up for
that?
i was thinking 0.10.0 mid-april, update 0.10.1 end of spring.
i would suggest feature extraction topics for 0.11.x. Esp. w.r.t.
SchemaRDD aka DataFrame -- vectorizing, hashing, ML schema support,
imputation of missing data, outlier cleanups etc. There's a lot.
Hardware backs integration -- i
IMO deprecated is for something that got released at least once. If
there's no intent to see it released, it just should be purged.
On Tue, Mar 17, 2015 at 12:49 PM, Ted Dunning ted.dunn...@gmail.com wrote:
On Tue, Mar 17, 2015 at 10:14 AM, Pat Ferrel p...@occamsmachete.com wrote:
I’m nervous
On Tue, Mar 17, 2015 at 8:26 AM, Andrew Palumbo ap@outlook.com wrote:
On 03/15/2015 01:42 PM, Pat Ferrel wrote:
Lots of discussion off the record about doing a release but shouldn’t we
plan this?
What has to be in a release of Mahout 0.10?
Seems like we could release as-is but it
... but it would be nice to confirm with them directly here on @dev to
avoid pitfalls of third party hearsay of course.
But I assume if nobody comes forward and the tests are not working
then the issue of releasing the contribution is moot.
On Tue, Mar 17, 2015 at 12:55 PM, Dmitriy Lyubimov dlie
I don't like the term dsl.
It is an algebraic optimizer, folks. Calling it dsl brings in wrong and too
trivial ideas about it.
On Mar 17, 2015 8:27 AM, Andrew Palumbo ap@outlook.com wrote:
On 03/15/2015 01:42 PM, Pat Ferrel wrote:
Lots of discussion off the record about doing a release but
We already discussed this.
honestly, i don't see it as a priority for a number of reasons.
(1) it increases dependency footprint
(2) it only increases speed (i think) of random access vectors
(3) it still will not get us anywhere close in terms of matrix-matrix
operations to where mkl, openblas
my take is legacy is just a module (aka maven artifact). Just like it
is now. we just need to re-route (cut) dependencies on it.
On Fri, Mar 6, 2015 at 2:56 PM, Pat Ferrel p...@occamsmachete.com wrote:
The simplest way to split the project is into engines: hadoop and spark. What
is happening
We are dropping support of ML on MR.
Our backs are now (as it stands) spark and h2o; mostly spark.
Maybe flink in the future.
On Fri, Mar 6, 2015 at 11:31 AM, jay vyas jayunit100.apa...@gmail.com wrote:
Hi Mahout. We are prepping for 0.9 Release of bigtop.
Is anyone in the mahout
nope afaik.
MAHOUT_OPTS is the place to set that (if we are talking about shell).
On Thu, Mar 5, 2015 at 3:50 PM, Andrew Palumbo (JIRA) j...@apache.org wrote:
[
https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy Lyubimov resolved MAHOUT-1603.
--
Resolution: Fixed
Tweaks for Spark 1.0.x
note that with MAHOUT_OPTS you have a choice. You can either set up
env or you can use inline syntax like
MAHOUT_OPTS='-Dk=n' bin/mahout spark-shell
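A sketch of the difference between the two styles, using a tiny stand-in script in place of the real `bin/mahout` (the `-Dk=n` property is just the placeholder from the message above, and the memory setting is invented):

```shell
set -e
tmp=$(mktemp -d)
# stand-in for bin/mahout: just report the options it received
cat > "$tmp/mahout" <<'EOF'
#!/bin/sh
printf '%s' "$MAHOUT_OPTS"
EOF
chmod +x "$tmp/mahout"

# 1) set up the env once; every later invocation sees it
export MAHOUT_OPTS='-Dk=n'
env_result=$("$tmp/mahout")

# 2) inline syntax: applies to this single invocation only
inline_result=$(MAHOUT_OPTS='-Dspark.executor.memory=4g' "$tmp/mahout")

echo "$env_result"      # -Dk=n
echo "$inline_result"   # -Dspark.executor.memory=4g
echo "$MAHOUT_OPTS"     # still -Dk=n: the inline value did not leak out
```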
On Thu, Mar 5, 2015 at 4:50 PM, Andrew Palumbo (JIRA) j...@apache.org wrote:
[
the hack i have only takes MAHOUT_OPTS.
it actually makes more sense to set it there since spark
options are too numerous and too long to enter on command line.
so i'd say we need to support MAHOUT_OPTS at minimum; or both.
On Thu, Mar 5, 2015 at 4:04 PM, Andrew Palumbo (JIRA)
I am willing to +1 any contribution at this point.
my previous company used pmml to serialize simple stuff, but i don't
have first hand experience. Its flexibility is ultimately pretty
limited, isn't it? And xml is ultimately a media which is too ugly and
too verbose at the same time to represent
is the bug also present in 1.2.0? cdh 5.3 is 1.2.0, not 1.2.1
On Wed, Mar 4, 2015 at 1:58 PM, Pat Ferrel p...@occamsmachete.com wrote:
Spark 1.2.1 has a bug that blocks any JavaSerializer without a work around.
It requires the SparkConf to get a path to a jar that exists on all nodes.
So I’ve
(1) no mentors this year.
(2) what was the PR #?
On Wed, Mar 4, 2015 at 2:35 PM, Олег Зотов olegzoto...@gmail.com wrote:
Hi
I want to contribute to Mahout and I have two questions:
1) What about Mahout and Google Summer of Code this year?
2) To take the first step, I fixed one not so
afaik spark is no longer built by sbt by default, but rather by
maven. At least the self-build instructions are for maven only (and the sbt
build changed enough that i can't effectively use it any longer).
but when it was, it was always sbt-for-all.
On Wed, Mar 4, 2015 at 3:45 PM, Pat Ferrel
think h2o has some java
code mixed in).
On Wed, Mar 4, 2015 at 2:41 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
(1) no mentors this year.
(2) what was the PR #?
On Wed, Mar 4, 2015 at 2:35 PM, Олег Зотов olegzoto...@gmail.com wrote:
Hi
I want to contribute to Mahout and I have two
it looks like assigning 0s in a view of SequentialAccessSparseVector
doesn't work, as it internally uses setQuick(), which trims the length
of non-zero elements (?)
which causes invalidation of the iterator state.
in particular, this simple test fails:
val svec = new
...@occamsmachete.com wrote:
I vaguely remember the NonZeroIterator being optimized just as we were
switching to Scala. Something Sebastian was working on but no idea if it was
related to this.
On Mar 2, 2015, at 3:51 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
actually the test error is something
it looks like an attempt to eliminate reusable elements in vector
view's iterators but why? Vector contract already implies element
reusability inside iterators, so why special treatment inside vector
views?
umph.
On Mon, Mar 2, 2015 at 3:35 PM, Dmitriy Lyubimov dlie...@gmail.com wrote
actually the test error is something else but i think vector view
iterator implementation is still wrong. I will scan if that produces
any more errors elsewhere.
On Mon, Mar 2, 2015 at 3:43 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
this test is failing after i remove non-reusable elements
org.apache.mahout.math.VectorBinaryAggregateCostTest
Following CDH releases perhaps helps a bit. they tend to skip buggy releases.
On Fri, Feb 27, 2015 at 2:07 PM, Pat Ferrel p...@occamsmachete.com wrote:
The deserialization thing is a Spark bug. The work around requires that you
put a key/value in the SparkConf to point to a jar on _all_
algebraic optimizer binary should be compatible with a pretty wide range of
spark. At the very least, current head is backward compatible with 1.1.x. The
only thing that locked it to that is using the unpersist api.
Before that it should've been compatible all the way to at least spark
0.8.something
I think a release with some value in it and a talk clarifying status will
suffice for starters.
Name change IMO is immaterial if there's the value and talks clarify
general philosophy sufficiently. Nobody else can tell people better what it
is all about, it is lack of the release and information
-1 on incubation as well. The website and docs and user lists and this
champion and mentor stuff, and logos and promotions for committers
absolutely do not make any sense at this point. From what i hear, people
are pretty busy without having that as it is. It would probably make more
sense to
ASF also mirrors dropping branches, i remember doing that too. but it won't
allow history rewrites.
On Tue, Feb 24, 2015 at 11:35 AM, Dmitriy Lyubimov dlie...@gmail.com
wrote:
what exactly did you try to do?
just resetting HEAD will not work on remote branch -- you need force-sync
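A minimal sketch of what "force-sync" means here, against a throwaway bare repo standing in for the remote (a GitHub-style remote accepts the forced update; ASF git does not):

```shell
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote"
git clone -q "$tmp/remote" "$tmp/work" 2>/dev/null
cd "$tmp/work"
git config user.email dev@example.com
git config user.name dev
echo a > f && git add f && git commit -qm 'good commit'
echo b >> f && git commit -qam 'bad commit'
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# drop the bad commit locally; the remote branch still has it
git reset -q --hard HEAD~1

# a plain `git push` is now rejected as non-fast-forward;
# force-syncing overwrites the remote ref with our rewritten history
git push -q --force origin "$branch"

git ls-remote "$tmp/remote" "refs/heads/$branch"
```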
Branches:
refs/heads/spark-1.2 [created] 901ef03b4
On Tue, Feb 24, 2015 at 11:47 AM, Dmitriy Lyubimov dlie...@gmail.com
wrote:
yeah ok so you pushed the 1.2 branch to asf but it is not yet in github. it
should be there eventually, give it a bit of time.
On Tue, Feb 24, 2015 at 11:35 AM
seems like different builds on client and backend.
shell is using your local spark setup (pointed to with SPARK_HOME). make
sure it points to identical binaries (not just spark version) to what is
used in the backend.
the reason is spark is not binary-canonical w.r.t. release version.
is it local or standalone? local should not have these types of errors. for
anything else it is likely what i said.
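One hedged way to check "identical binaries, not just spark version" is to checksum the assembly jar on both sides and compare; the paths and file contents below are invented stand-ins for the client's and a worker's $SPARK_HOME:

```shell
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/client/lib" "$tmp/worker/lib"
# same jar name and "version" on both sides, but different bits
echo 'build 2015-02-24' > "$tmp/client/lib/spark-assembly-1.2.1.jar"
echo 'build 2015-02-20' > "$tmp/worker/lib/spark-assembly-1.2.1.jar"

client_sum=$(cksum < "$tmp/client/lib/spark-assembly-1.2.1.jar")
worker_sum=$(cksum < "$tmp/worker/lib/spark-assembly-1.2.1.jar")

if [ "$client_sum" = "$worker_sum" ]; then
  echo 'binaries match'
else
  # the client/backend mismatch that produces bewildering runtime errors
  echo 'binaries differ'
fi
```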
On Tue, Feb 24, 2015 at 1:08 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
seems like different builds on client and backend.
shell is using your local spark setup (pointed
Andrew, perhaps you could commit a patch on top of 1.2 branch? much
appreciated.
On Tue, Feb 24, 2015 at 1:25 PM, Andrew Palumbo ap@outlook.com wrote:
sorry- I left out the scala-compiler artifact (at the top); it should read:
<dependency>
  <groupId>org.scala-lang</groupId>
(eventually) get there.
On Tue, Feb 24, 2015 at 11:18 AM, Andrew Musselman
andrew.mussel...@gmail.com wrote:
Does ASF git get mirrored to GitHub? I tried pushing a branch and don't
see it there yet.
On Tue, Feb 24, 2015 at 11:16 AM, Dmitriy Lyubimov dlie...@gmail.com
wrote:
On Tue, Feb 24
issued a revert to head.
On Tue, Feb 24, 2015 at 11:47 AM, Dmitriy Lyubimov dlie...@gmail.com
wrote:
:30 PM, Dmitriy Lyubimov wrote:
Andrew, perhaps you could commit a patch on top of 1.2 branch? much
appreciated.
On Tue, Feb 24, 2015 at 1:25 PM, Andrew Palumbo ap@outlook.com
wrote:
sorry- I left out the scala-compiler artifact (at the top) it should
read:
<dependency>
PS normally i would just reset the head to ^1 but that would require forced
rewrite and asf git doesn't allow this (and for a good reason, really). so
revert on master would be necessary.
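The revert-not-reset point can be sketched in a throwaway repo (the "bad" commit below stands in for the premature upgrade commit on master; file names and messages are invented):

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo" && cd "$tmp/repo"
git config user.email dev@example.com
git config user.name dev
echo 'spark 1.1.0' > pom && git add pom && git commit -qm 'good state'
echo 'spark 1.2.1' > pom && git commit -qam 'premature upgrade'

# `git reset --hard HEAD^` would rewrite published history and need a forced
# push (which ASF git rejects); `git revert` adds an inverse commit instead
git revert --no-edit HEAD >/dev/null

cat pom                     # spark 1.1.0 again
git rev-list --count HEAD   # 3: nothing was rewritten
```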
On Tue, Feb 24, 2015 at 10:51 AM, Dmitriy Lyubimov dlie...@gmail.com
wrote:
i mean roll back #74 and apply
oops.
tests dont test shell startup.
apparently stuff got out of sync with 1.2
On Tue, Feb 24, 2015 at 10:02 AM, Pat Ferrel p...@occamsmachete.com wrote:
Me too and I built with 1.2.1
On Feb 24, 2015, at 9:50 AM, Andrew Musselman andrew.mussel...@gmail.com
wrote:
I've just rebuild mahout
As a remedy, i'd suggest to branch out spark 1.2 work and rollback 1.2.1
commit on master until 1.2 branch is fixed.
On Tue, Feb 24, 2015 at 10:55 AM, Pat Ferrel p...@occamsmachete.com wrote:
to be safe I'd "git reset --hard xyz" to the commit previous to the 1.2.1
As i just explained, that resets are not possible with ASF git. Reverting
is the only option.
-d
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
On 02/24/2015 05:15 PM, Andrew Musselman wrote:
Makes sense; I'm still getting those errors after restarting my rebuilt
spark..
On Tue, Feb 24, 2015 at 2:12 PM, Dmitriy Lyubimov dlie...@gmail.com
wrote:
IIRC MAHOUT_LOCAL doesn't mean a thing in spark mode. It is a purely MR
thing
We had various operational configuration problems with snappy as well so
had to disable it for now completely until somebody has time to figure it
out (which has been like forever)
On Thu, Feb 19, 2015 at 4:26 PM, Pat Ferrel p...@occamsmachete.com wrote:
It seems like after a clean install I
of MLlib Vector and back would
solve my Kmeans use case. You know MLlib better than I so choose the best
level to perform type conversions or inheritance splicing. The point is to
make the two as seamless as possible. Doesn’t this seem a worthy goal?
On Feb 8, 2015, at 4:59 PM, Dmitriy Lyubimov
either/or choices for devs.
On Feb 5, 2015, at 1:32 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
On Thu, Feb 5, 2015 at 1:14 AM, Gokhan Capan gkhn...@gmail.com wrote:
What I am saying is that for certain algorithms including both
engine-specific (such as aggregation) and DSL stuff, what
thank you very much.
Github pull request is what we use these days. Do you think you could put
one up ?
thanks.
-d
On Thu, Feb 5, 2015 at 1:17 AM, Sebastiano Vigna vi...@di.unimi.it wrote:
On 19 Jan 2015, at 22:26, Robin Anil robin.a...@gmail.com wrote:
@Sebastiano, sounds like an easy
On Thu, Feb 5, 2015 at 1:14 AM, Gokhan Capan gkhn...@gmail.com wrote:
What I am saying is that for certain algorithms including both
engine-specific (such as aggregation) and DSL stuff, what is the best way
of handling them?
i) should we add the distributed operations to Mahout codebase as
. Need to think through
how
to use a DataFrame in a streaming case, probably through some
checkpointing
of the window DStream—hmm.
On Feb 4, 2015, at 7:37 AM, Andrew Palumbo ap@outlook.com wrote:
On 02/03/2015 08:22 PM, Dmitriy Lyubimov wrote:
I'd suggest to consider
btw good seq2sparse and seqdirectory ports are the only thing that
separates us from having a bigram, trigram based LSA tutorial.
On Wed, Feb 4, 2015 at 10:35 AM, Dmitriy Lyubimov dlie...@gmail.com wrote:
i think they are debating the details now, not the idea. Like how NA is
different from
at 2:07 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
Spark's DataFrame is obviously not agnostic.
I don't believe there's a good way to abstract it. Unfortunately. I think
getting too much into distributed operation abstraction is a bit dangerous.
I think MLI was one project that attempted
On Wed, Feb 4, 2015 at 1:51 PM, Andrew Palumbo ap@outlook.com wrote:
My thought was not to bring primitive engine specific aggregators,
combiners, etc. into math-scala.
Yeah. +1. I would like to support that as an experiment, see where it goes.
Clearly some distributed use cases are
Re: Gokhan's PR post: here are my thoughts but i did not want to post it
there since they are going beyond the scope of that PR's work to chase the
root of the issue.
on quasi-algebraic methods
What is the dilemma here? don't see any.
I already explained that no more
On Feb 4, 2015, at 7:47 AM, Andrew Palumbo ap@outlook.com wrote:
Just copied over the relevant last few messages to keep the other thread
on topic...
On 02/03/2015 08:22 PM, Dmitriy Lyubimov wrote:
I'd suggest to consider this: remember all this talk about
language-integrated spark
both expressed interest in the distributed
aggregation stuff. It sounds like we are agreeing that
non-algebra—computation method type things can be engine specific.
So does anyone have an objection to Gokhan pushing his PR?
On Feb 4, 2015, at 2:20 PM, Dmitriy Lyubimov dlie...@gmail.com wrote
But first I need to do massive fixes and improvements to the distributed
optimizer itself. Still waiting on green light for that.
On Feb 3, 2015 8:45 AM, Dmitriy Lyubimov dlie...@gmail.com wrote:
On Feb 3, 2015 7:20 AM, Pat Ferrel p...@occamsmachete.com wrote:
BTW what level of difficulty
similarity. Attach Kafka and get evergreen models, if not incrementally
updating models.
On Feb 2, 2015, at 4:54 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
bottom line compile-time dependencies are satisfied with no extra stuff
from mr-legacy or its transitives. This is proven by virtue
On Tue, Feb 3, 2015 at 11:57 AM, Олег Зотов olegzoto...@gmail.com wrote:
Hello.
I develop a recommendation system and use mahout on spark (1.0 snapshot). In
the process I have found that the spark-itemsimilarity driver does not allow
processing more than two action types. After reading the
PS to run the mahout shell, one can use
MASTER=<spark master url> bin/mahout spark-shell
Syntax to load scripts is retained from the Scala shell.
ideally one also needs stuff like MAHOUT_OPTS=-Xmx5g but as i mentioned it
is broken right now, you can do a quick hack
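Putting MASTER and MAHOUT_OPTS together, a launch might look like the fragment below; every value is illustrative, and note it is bin/mahout rather than mahout/bin:

```
export SPARK_HOME=/path/to/spark            # illustrative path
MASTER=spark://host:7077 \
MAHOUT_OPTS='-Xmx5g' \
bin/mahout spark-shell
```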
On Tue, Feb 3, 2015 at 12:06 PM, Dmitriy
and IDF. I'll put them up soon.
Hopefully they'll be of some use.
On Feb 3, 2015, at 8:47 AM, Dmitriy Lyubimov dlie...@gmail.com wrote:
But first I need to do massive fixes and improvements to the distributed
optimizer itself. Still waiting on green light for that.
On Feb 3, 2015 8:45 AM