Classify new data using Decision Forest
---
Key: MAHOUT-323
URL: https://issues.apache.org/jira/browse/MAHOUT-323
Project: Mahout
Issue Type: Improvement
Components: Classification
Affects Ve
[
https://issues.apache.org/jira/browse/MAHOUT-322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842213#action_12842213
]
Jake Mannix commented on MAHOUT-322:
It should actually be noted that Danny's original
[
https://issues.apache.org/jira/browse/MAHOUT-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-314.
Resolution: Fixed
Fix Version/s: 0.3
Committed.
Current implementation is a map-side join
[
https://issues.apache.org/jira/browse/MAHOUT-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-313.
Resolution: Fixed
Fix Version/s: 0.3
Committed, code piggybacks on timesSquared() with a lit
[
https://issues.apache.org/jira/browse/MAHOUT-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-310.
Resolution: Fixed
Fix Version/s: 0.3
committed
> LanczosSolver and DistributedLanczosSolver
[
https://issues.apache.org/jira/browse/MAHOUT-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-312.
Resolution: Fixed
Fix Version/s: 0.3
committed
> DistributedRowMatrix iterateAll() and iter
+1
Sounds right to me.
On Fri, Mar 5, 2010 at 6:33 PM, Drew Farris wrote:
> Unit tests are included and I've regression tested the patch against
> the original implementation on the 20news corpus -- it produces the
> same results.
>
> So, with the group's blessing I will commit.
>
--
Ted Du
Speaking of spinning, Mike, there is a bit of a move afoot to use the 0.3
release to do some *really* big SVD in order to claim a size record of
sorts. The goal is to find some realistic and interesting matrix with about
5 x 10^9 non-zero elements.
On Fri, Mar 5, 2010 at 8:05 PM, Jake Mannix wro
Hi Mike,
Welcome to the long journey down the road of dimensional reduction. :)
On Fri, Mar 5, 2010 at 5:05 PM, mike bowles wrote:
>
> Really large matrices require using one of the randomizing methods to get
> done.
"Require" is a strong term. Really really large (but still sparse!)
matric
In the spirit of Jake's message, would anyone be opposed to a commit
of MAHOUT-317? (https://issues.apache.org/jira/browse/MAHOUT-317)
It is a refactoring of the LLR Collocation work to eliminate
in-memory frequency calculations for n-gram and (n-1)-gram frequencies.
Using a secondary sort eliminates
Mike,
http://issues.apache.org/jira/browse/MAHOUT-180 might be of interest. Jake
has done a fair bit of work beyond that.
Next up is a stochastic decomposition version. You can see the seeds of
that in Jake's other JIRAs.
On Fri, Mar 5, 2010 at 5:05 PM, mike bowles wrote:
> ... I thought i
I've been trying to figure out how to code an svd algorithm and I've seen
some questions about svd algorithms floating around the Mahout mailing
lists. I thought it might be helpful to share what I've found so far.
Really large matrices require using one of the randomizing methods to get
don
Ha! Perpetual code freeze to get new features, now there's a concept!
Three +1's, ok if I don't get any negative feedback before I get back to a
computer, I'll check in.
I've just added more wiki pages for me to write, too I guess...
-jake
On Mar 5, 2010 2:53 PM, "Jeff Eastman" wrote:
Robi
Robin Anil wrote:
Seems we are most productive when it's a code freeze :) +1 from me as well.
You have time till Hadoop resolves 6617, assuming nothing gets broken
Maybe we should just declare a perpetual code freeze. But you
guys are really on a roll so I'm +1 too
Jeff
On Thu, Mar 4, 2010 at 7:41 AM, Robin Anil wrote:
> Based on what I have in mind, the usage will just be
>
> mahout vectorize -i s3://input -o s3://output -tmp hdfs://file (here, there
> is a risk of fixing an exact path and not knowing the hadoop user, I would
> have preferred a relative path)
>
Seems we are most productive when it's a code freeze :) +1 from me as well.
You have time till Hadoop resolves 6617, assuming nothing gets broken
Robin
On Sat, Mar 6, 2010 at 12:50 AM, Jake Mannix wrote:
> Hey all,
>
> Our "flash-freeze" has unthawed considerably, but I've been trying to be
> g
Tentative +1 from me.
On Fri, Mar 5, 2010 at 11:20 AM, Jake Mannix wrote:
> Can I get these in for 0.3? I could commit today if it's ok with the team.
--
Ted Dunning, CTO
DeepDyve
Hey all,
Our "flash-freeze" has unthawed considerably, but I've been trying to be
good and not check in stuff with functionality improvements I've been
wanting to check in.
What do you folks say about me checking in fixes for
MAHOUT-312 DistributedRowMatrix iterateAll() and
iterate() don't wo
[
https://issues.apache.org/jira/browse/MAHOUT-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-315.
Resolution: Fixed
Fix Version/s: (was: 0.4)
0.3
Committed.
> VectorD
Ahh, a JIRA issue, I should have thought of that. Thanks Benson.
Drew
On Fri, Mar 5, 2010 at 12:59 PM, Benson Margulies wrote:
> https://issues.apache.org/jira/browse/HADOOP-6617
>
>
> On Fri, Mar 5, 2010 at 9:32 AM, Grant Ingersoll wrote:
>
>> Has anyone filed a JIRA with them to do so?
>>
>>
https://issues.apache.org/jira/browse/HADOOP-6617
On Fri, Mar 5, 2010 at 9:32 AM, Grant Ingersoll wrote:
> Has anyone filed a JIRA with them to do so?
>
> The Extremely Esteemed PMC Chair (aka Paper Pusher Extraordinaire),
> Grant
>
> On Mar 5, 2010, at 7:28 AM, Benson Margulies wrote:
>
> > Co
Coming Right Up.
On Fri, Mar 5, 2010 at 9:32 AM, Grant Ingersoll wrote:
> Has anyone filed a JIRA with them to do so?
>
> The Extremely Esteemed PMC Chair (aka Paper Pusher Extraordinaire),
> Grant
>
> On Mar 5, 2010, at 7:28 AM, Benson Margulies wrote:
>
> > Could I be stupid for a moment? Our
Has anyone filed a JIRA with them to do so?
The Extremely Esteemed PMC Chair (aka Paper Pusher Extraordinaire),
Grant
On Mar 5, 2010, at 7:28 AM, Benson Margulies wrote:
> Could I be stupid for a moment? Our fellow Apache project, Hadoop, makes
> releases but doesn't bother to stick them into
On Fri, Mar 5, 2010 at 7:28 AM, Benson Margulies wrote:
>
> Are we proposing to just do their work for them, or to publish them under a
> Mahout-specific Maven triple?
The former would be the better of the two.
> If the latter, I would ask our esteemed PMC
> chair to make a personal
Could I be stupid for a moment? Our fellow Apache project, Hadoop, makes
releases but doesn't bother to stick them into the Apache repo where they
will replicate to central?
Are we proposing to just do their work for them, or to publish them under a
Mahout-specific Maven triple? If the latter, I wo
OK, Sean. I still need to take a final pass over the pom:
I'll replace the placeholder version variables with real versions and
switch the group to be org.apache.mahout.hadoop (as before) instead of
org.apache.hadoop. Once the pom's in good shape, I'll switch the
dependency in mahout, do a full re
If you have the mojo ready and working, commit? Sounds OK to me.
On Fri, Mar 5, 2010 at 5:14 AM, Drew Farris wrote:
> Ok, the unit tests completed successfully with this setup. I suspect
> this means we probably want to deploy our own dependency for hadoop
> 0.20.2 with the proper versions specif