Running a local single-node Hadoop cluster is not as difficult as you
might think. The setup guide covers it:
http://hadoop.apache.org/common/docs/stable/single_node_setup.html

I don't know whether the decision forest code can be run from the command
line. Look at Classifier and Builder in
org.apache.mahout.classifier.df.mapreduce; those are the classes you would
use to start the MapReduce jobs.
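
If all you want is to try it out, one rough way around the JDBC question is
to dump the table to a text file first and feed that to the jobs. Below is
a minimal sketch in plain Java. It assumes the decision forest loader takes
comma-separated lines and treats "?" as a missing value; check both against
your Mahout version. CsvExport and its methods are made-up names for
illustration, not part of Mahout or Hadoop.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Mahout class.
final class CsvExport {

    // Joins one row into a comma-separated line.
    // Assumption: "?" marks a missing (NULL) value.
    static String toCsvLine(Object[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(row[i] == null ? "?" : row[i].toString().trim());
        }
        return sb.toString();
    }

    // Reads every remaining row of an open ResultSet into CSV lines.
    static List<String> toCsvLines(ResultSet rs) throws SQLException {
        int columns = rs.getMetaData().getColumnCount();
        List<String> lines = new ArrayList<>();
        while (rs.next()) {
            Object[] row = new Object[columns];
            for (int c = 0; c < columns; c++) {
                row[c] = rs.getObject(c + 1); // JDBC columns are 1-based
            }
            lines.add(toCsvLine(row));
        }
        return lines;
    }
}
```

You would run an ordinary SELECT through your Oracle or MS SQL JDBC driver,
pass the ResultSet to toCsvLines, and write the lines to a file. The sketch
does no escaping, so values that contain commas would need extra handling.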

On Thu, Nov 24, 2011 at 6:33 PM, Sturm, Martin <[email protected]> wrote:

> I was afraid of that answer, but thanks anyway.
> Since I only want to try it out "standalone", I was hoping that this was
> possible without any Hadoop setup. Are there any tutorials or examples
> available that show how to load a Dataset? I do not even know what
> file format is expected here. CSV?
>
> -----Original Message-----
> From: Sean Owen [mailto:[email protected]]
> Sent: Thursday, 24 November 2011 18:30
> To: [email protected]
> Subject: Re: Load Dataset and Instances from database
>
> Yes, that's the only point of direct JDBC integration in the project.
>
> For the Hadoop-based bits, and that's most of Mahout, the question is
> really whether Hadoop integrates with JDBC, since the code is a bunch of
> Hadoop jobs and is not tied directly to a data store.
>
> A relational database is not a common data source for Hadoop. It could be
> one, but Hadoop operates by sequentially accessing petabytes of
> potentially unstructured data, and a relational database would be
> expensive overkill for just storing huge blobs.
>
> I would not be surprised if you can find an InputFormat implementation for
> Hadoop that reads from JDBC.
>
> Breaking news: I found DBInputFormat in Hadoop!
>
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/db/DBInputFormat.html
>
> So, the high-level answer is that you would tweak the Mahout
> implementation to use DBInputFormat if you wanted to read stuff out of a
> database.
>
>
>
> On Thu, Nov 24, 2011 at 3:02 PM, Sturm, Martin <[email protected]>
> wrote:
>
> > The only JDBC related classes in Mahout are in the
> > org.apache.mahout.cf.taste.* package to load a JDBCDataModel used for
> > mining preferences.
> > But I need Dataset and Instance, which are used for building a decision
> > tree (in org.apache.mahout.classifier.df.data). Can I somehow get
> > these directly from the DB in a standalone application, without
> > Hadoop?
> >
> > -----Original Message-----
> > From: Shern Shiou Tan [mailto:[email protected]]
> > Sent: Thursday, 24 November 2011 15:53
> > To: [email protected]
> > Subject: Re: Load Dataset and Instances from database
> >
> > Yes, if I am not mistaken, Mahout supports JDBC-based databases through
> > a JDBC driver.
> > ShernShiou
> > On 11/24/2011 10:34 PM, Sturm, Martin wrote:
> > > Hello,
> > > I am relatively new to Mahout, so this may be a quite obvious
> > > question: does Mahout have any database access support besides that
> > > in the recommender-related package?
> > > I want to use decision forest classification and need the data from
> > > an Oracle or MS SQL database. Is there any way to access it so that I
> > > automatically get the Dataset and a list of Instance objects?
> > >
> > > Thanks in advance,
> > > Martin
> > >
> >
> >
> > UC4 Senactive Software GmbH, Hauptstrasse 3C, 3012 Wolfsgraben mit
> > einer weiteren Betriebsstaette in /with an office at
> > Prinz-Eugen-Straße 72, 1040 Wien Firmenbuchnummer/Commercial Register
> > No. 261186y Firmenbuchgericht/Commercial Register Court: Landesgericht
> > St. Poelten This email (including any attachments) may contain
> > information which is privileged, confidential, or protected. If you
> > are not the intended recipient, note that any disclosure, copying,
> > distribution, or use of the contents of this message and attached
> > files is prohibited. If you have received this email in error, please
> > notify the sender and delete this email and any attached files.
> >
