Hi,
Is there an API that is available to easily embed Mahout in a java app,
feed data and get output?
PS: Forgive me if this is a noob question. Still trying to figure out
Mahout.
--
Regards,
Tharindu
blog: http://mackiemathew.com/
Mahout is written in Java, so 'yes' you can put it in any Java program
trivially. Why would it have anything to do with an API? I think you need
to be clearer about what you are doing, and probably first have a basic
look at the project.
On Nov 2, 2011 8:49 AM, Tharindu Mathew mcclou...@gmail.com
Doesn't look like they're used anywhere but tests.
In the spirit of removing clutter, I suggest we rip that stuff out! It's
really not that unreasonable to carry around a BiMap<String, Integer>
dictionary to translate between feature labels and featureIds outside a
Matrix rather than set it on the
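
A minimal sketch of what such an external dictionary could look like, assuming plain java.util maps stand in for the BiMap mentioned above. The class name LabelDictionary and its methods are hypothetical, not part of Mahout.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: a two-way label <-> featureId dictionary kept outside the Matrix. */
class LabelDictionary {
    private final Map<String, Integer> idsByLabel = new HashMap<>();
    private final Map<Integer, String> labelsById = new HashMap<>();

    /** Returns the existing id for the label, or assigns the next free one. */
    int idFor(String label) {
        Integer existing = idsByLabel.get(label);
        if (existing != null) {
            return existing;
        }
        int id = idsByLabel.size();   // ids are handed out densely: 0, 1, 2, ...
        idsByLabel.put(label, id);
        labelsById.put(id, label);
        return id;
    }

    /** Returns the label bound to the id, or null if the id is unknown. */
    String labelFor(int id) {
        return labelsById.get(id);
    }

    /** Number of distinct labels seen so far. */
    int size() {
        return idsByLabel.size();
    }
}
```

The ids returned by idFor can then be used as row or column indices of the matrix, while the dictionary itself is stored and shipped separately.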
Hi Sean,
I guess with a proper API it just makes it easier. I was hoping you'd point
me to a code sample or a tutorial.
I could only find material referring to quick starts, which tell how to
run a sample, such as
The wiki has examples of calling most of the code via Java, and javadoc
ought to cover the rest. What are you looking for specifically? Mahout is
not one thing. All of it is callable from Java.
On Nov 2, 2011 9:21 AM, Tharindu Mathew mcclou...@gmail.com wrote:
Hi Sean,
I guess with a proper
On Wed, Nov 2, 2011 at 2:51 PM, Tharindu Mathew mcclou...@gmail.com wrote:
Hi Sean,
I guess with a proper API it just makes it easier. I was hoping you'd point
me to a code sample or a tutorial.
Hi
For detailed code samples and tutorials see the book Mahout in Action.
You will get a
I want to create a java UI tool (based on a web app) that can pick and
apply different algorithms available in Mahout to different data sets.
Hence the embedding with java. Obviously, I understand that everything is
callable from Java since it's written in Java :).
For example, I want to do a
I see, the Java interfaces vary from area to area since different
algos are different things and sometimes take different input.
Generally, the classifiers take in Mahout Vector input, and are
Hadoop-based, so you'd be writing some code to run Mahout jobs on
Hadoop from your GUI app. Not all are
Thanks Sean.
Looks like I'll have to dig into the code; I'll start from MahoutDriver.
Is there a mode that will work for all algorithms? For example, all
algorithms can run on a single node mode or all algorithms run on a hadoop
mode ( I know Hadoop has a local mode, but that's not what I'm
MahoutDriver is the closest thing to a single point of entry for all the
algorithms. It's for command line use but you can see what it does after
parsing args.
Most algorithms use Hadoop, so in general, no, there is not a
Hadoop-free mode. Some bits have non-Hadoop parts, though that's
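
For a UI tool, the "single point of entry" idea can be sketched as a small registry that maps a short job name to something runnable and forwards the remaining arguments. This is only an illustration of the general pattern, not MahoutDriver's actual code; JobRegistry and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical dispatcher: short job name -> job taking its own args and returning an exit code. */
class JobRegistry {
    private final Map<String, Function<String[], Integer>> jobs = new LinkedHashMap<>();

    /** Registers a job under a short name, replacing any previous entry. */
    void register(String shortName, Function<String[], Integer> job) {
        jobs.put(shortName, job);
    }

    /** Treats args[0] as the job name and passes the remaining args to it; returns -1 if unknown. */
    int run(String[] args) {
        if (args.length == 0 || !jobs.containsKey(args[0])) {
            return -1;
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        return jobs.get(args[0]).apply(rest);
    }
}
```

A web front end could register one entry per algorithm it exposes and build the argument array from form fields, instead of shelling out to the command line.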
What's the Minhash key groups value used for in the MinhashDriver? I mean, I
see it is used for building up the key out of the hashed values, but what's the
significance of different values for it? The default is 2, what does it mean
practically speaking if I choose, say, 10? AFAICT, it
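
To make the question concrete, here is a rough sketch of building keys out of groups of min-hash values. It assumes each key is formed by joining keyGroups consecutive values with wrap-around indexing; that layout, the "-" separator, and the names KeyGroupSketch and clusterKeys are assumptions for illustration, not confirmed details of MinhashDriver.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of grouping min-hash values into cluster keys. */
class KeyGroupSketch {
    /** Builds one key per hash position by joining keyGroups consecutive min-hash values. */
    static List<String> clusterKeys(int[] minHashValues, int keyGroups) {
        int n = minHashValues.length;
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            StringBuilder key = new StringBuilder();
            for (int j = 0; j < keyGroups; j++) {
                if (j > 0) {
                    key.append('-');
                }
                // Wrap around so every position yields a full-length key.
                key.append(minHashValues[(i + j) % n]);
            }
            keys.add(key.toString());
        }
        return keys;
    }
}
```

Under this assumption, two items share a key only if they agree on all keyGroups values in that group, so a larger setting such as 10 would demand agreement on more min-hash values per key than the default of 2.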
What functionality, specifically, are you proposing to remove? I know we had a
lot of discussion around some of this stuff way back when as to how best to do
it, but of course, that doesn't mean it has uptake. If it's on the Matrix,
then doesn't it more easily get shipped around via the
On Nov 2, 2011, at 7:17 AM, Tharindu Mathew wrote:
I want to create a java UI tool (based on a web app) that can pick and
apply different algorithms available in Mahout to different data sets.
Very cool! Keep us posted, as this would be immensely useful! Any chance it
will be donated back?
On Wed, Nov 2, 2011 at 7:34 AM, Grant Ingersoll gsing...@apache.org wrote:
What functionality, specifically, are you proposing to remove?
I'm suggesting we kill, from Matrix.java and descendants, all of the
following methods:
Map<String, Integer> getColumnLabelBindings();
Map<String, Integer>
On Nov 2, 2011, at 10:58 AM, Jake Mannix wrote:
On Wed, Nov 2, 2011 at 7:34 AM, Grant Ingersoll gsing...@apache.org wrote:
What functionality, specifically, are you proposing to remove?
I'm suggesting we kill, from Matrix.java and descendants, all of the
following methods:
Ah, ok, I was looking at an older source tree. Then in that case, no
*release* we've had touches them, and nowhere in the codebase does anyone
currently use the bindings, even if it is the case that if you *did* use
them, they would indeed get serialized with the matrix.
Which is why I was
On Tue, Nov 1, 2011 at 8:47 PM, Grant Ingersoll gsing...@apache.org wrote:
On Nov 1, 2011, at 2:16 PM, Patrick Hunt wrote:
On Tue, Nov 1, 2011 at 10:44 AM, Ted Dunning ted.dunn...@gmail.com wrote:
On Tue, Nov 1, 2011 at 9:18 AM, Patrick Hunt ph...@apache.org wrote:
2011/10/31 Ted Dunning
On Wed, Nov 2, 2011 at 10:15 AM, Grant Ingersoll gsing...@apache.org wrote:
On Nov 2, 2011, at 11:50 AM, Jake Mannix wrote:
Ah, ok, I was looking at an older source tree. Then in that case, no
*release* we've had touches them, and nowhere in the codebase does anyone
currently use the
These labels are here by analogy with R data.frames where having the labels
inside the data is really handy.
On Wed, Nov 2, 2011 at 10:15 AM, Grant Ingersoll gsing...@apache.org wrote:
HDFS all nice and safe, and I've got a pile of numeric serialized
(DistributedRow-)Matrix instances which
It seems like a good idea, but it definitely is not impossible to work
around the lack.
Having the labels should make certain forms of cluster dumping easier, but
for all the stuff I do with hashed representations, the hashing destroys
any utility of labels.
It may be that label utility is
Forwarded to mahout list instead of lucene. Let's move the discussion
there.
-- Forwarded message --
From: Sam Cunningham sam_cun...@yahoo.com
Date: Wed, Nov 2, 2011 at 10:33 AM
Subject: Mahout In Action - Bayes/CBayes Classification returns NaN
To: gene...@lucene.apache.org
My
On Wed, Nov 2, 2011 at 11:22 AM, Ted Dunning ted.dunn...@gmail.com wrote:
It seems like a good idea, but it definitely is not impossible to work
around the lack.
And more importantly, it may be a good idea in theory, but has anyone
actually used it, or foresee using it soon?
It's 9 methods
The only thought I have about it is that there's a to-do to make that
stuff actually used and integrate into a wrapper class. I think it's
fine to kill it. If someone goes to all the trouble of re-implementing
it later it will not have been extra work; it probably was to be
redone anyway.
On Wed,
Let's nuke it.
I am the most vocal in favor and I can't get up the enthusiasm to push for
keeping it.
On Wed, Nov 2, 2011 at 11:31 AM, Sean Owen sro...@gmail.com wrote:
The only thought I have about it is that there's a to-do to make that
stuff actually used and integrate into a wrapper
Ok, we can always resurrect it.
I'll leave this thread open until after work tonight (8 hrs or so from
now), and if I don't hear any vociferous complaints or reasoned thoughts on
why this is crazy, I'll chop 'em.
-jake
On Wed, Nov 2, 2011 at 11:34 AM, Ted Dunning ted.dunn...@gmail.com wrote:
Below I am providing some documents regarding the issue. The top 4
documents are sample normalized classes (Entertainment, Health, SciTech, and
Sports). The last document is the model.
http://12.233.16.76/icons/Entertainment.zip
http://12.233.16.76/icons/Health.zip
Sam,
I recommend actually subscribing to the mailing list while you have active
questions. There is a long history of nabble postings not actually making
it to the apache mailing lists.
On Wed, Nov 2, 2011 at 12:19 PM, Sam Cunningham sam_cun...@yahoo.com wrote:
Below I am providing some
In the vein of users become contributors become committers:
It seems there has been some spark of interest in contributing more, so I
thought I would pass along a few pointers:
1. https://cwiki.apache.org/MAHOUT/how-to-contribute.html -- Details how to
submit patches, etc. IDE codestyles at
I can't download these files. The server never responds as far as I can
tell. You may have given out a local address. Or turned the machine off.
Or whatever.
Can you put them onto dropbox or pastebin or S3 or something so that we can
look at these?
On Wed, Nov 2, 2011 at 12:19 PM, Sam
+1 from me too. IIRC this all got added when we were annotating Vectors too and
there we ended up with NamedVector as a wrapper. If this Matrix annotation is
not being used then let's clean it up.
-Original Message-
From: Ted Dunning [mailto:ted.dunn...@gmail.com]
Sent: Wednesday,
On 02.11.2011 Jake Mannix wrote:
I'll leave this thread open until after work tonight (8 hrs or so from
now), and if I don't hear any vociferous complaints or reasoned thoughts on
why this is crazy, I'll chop 'em.
+1 for the cleanup, however if you are leaving the thread open for that
On Wed, Nov 2, 2011 at 5:24 PM, Isabel Drost isa...@apache.org wrote:
On 02.11.2011 Jake Mannix wrote:
I'll leave this thread open until after work tonight (8 hrs or so from
now), and if I don't hear any vociferous complaints or reasoned thoughts on
why this is crazy, I'll chop 'em.
+1
It seems that some of us were not able to get to the URLs. So, I am uploading
the files here.
http://lucene.472066.n3.nabble.com/file/n3475998/Entertainment.zip
Entertainment.zip
http://lucene.472066.n3.nabble.com/file/n3475998/Health.zip Health.zip
Thanks everyone for the encouraging replies.
If it's possible I will work on and contribute a clean API that will ease
the learning curve of applying Mahout.
On Wed, Nov 2, 2011 at 9:40 PM, Matteo Moci mox...@gmail.com wrote:
I just found this [1] project.
It seems a bit old, and I don't know