Yes, that's the only point of direct JDBC integration in the project.

For the Hadoop-based bits, and that's most of Mahout, the question is
really whether Hadoop integrates with JDBC, since that code is just a bunch
of Hadoop jobs and is not tied directly to any data store.

A relational database is not a common data source for Hadoop. It could be
one, but Hadoop is built to read through very large volumes of potentially
unstructured data sequentially, and a relational database would be
expensive overkill for just storing huge blobs.

I would not be surprised if you can find an InputFormat implementation for
Hadoop that reads from JDBC.

Breaking news: I found DBInputFormat in Hadoop!
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/db/DBInputFormat.html

So, the high-level answer is that you would tweak the Mahout job in question
to use DBInputFormat as its input format if you wanted to read stuff out of
a database.
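
For the standalone case without Hadoop, one option is to skip InputFormats
entirely and pull the rows over plain JDBC, turning each row into one
comma-separated line of text. This is only a rough sketch: the class and
method names are made up, the "?" marker for NULL is my own choice, and I am
assuming the decision forest loader in your Mahout version can be fed
comma-separated lines, which you should check against its Javadoc.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout or Hadoop.
public class JdbcRowsToCsv {

    /**
     * Joins one row's column values into a comma-separated line.
     * NULL becomes "?" (an assumed missing-value marker). Values that
     * themselves contain commas are not escaped in this sketch.
     */
    public static String toCsvLine(Object[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(row[i] == null ? "?" : row[i].toString().trim());
        }
        return sb.toString();
    }

    /** Runs the query and returns every result row as one text line. */
    public static String[] loadLines(Connection conn, String sql) throws SQLException {
        List<String> lines = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            int cols = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                Object[] row = new Object[cols];
                for (int c = 0; c < cols; c++) {
                    row[c] = rs.getObject(c + 1); // JDBC columns are 1-based
                }
                lines.add(toCsvLine(row));
            }
        }
        return lines.toArray(new String[0]);
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.out.println("usage: JdbcRowsToCsv <jdbc-url> <select-statement>");
            return;
        }
        // The JDBC driver for your database must be on the classpath.
        try (Connection conn = DriverManager.getConnection(args[0], args[1 - 1])) {
            for (String line : loadLines(conn, args[1])) {
                System.out.println(line);
            }
        }
    }
}
```

The resulting String[] (or the printed lines saved to a file) is what you
would then hand to whatever builds the Dataset and Instances. The whole
table is held in memory, so this only makes sense for data that fits on one
machine.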



On Thu, Nov 24, 2011 at 3:02 PM, Sturm, Martin <martin.st...@uc4.com> wrote:

> The only JDBC related classes in Mahout are in the
> org.apache.mahout.cf.taste.* package to load a JDBCDataModel used for
> mining preferences.
> But I need Dataset and Instance, which are used for building a decision
> tree (in org.apache.mahout.classifier.df.data). Can I somehow get these
> directly from the DB as well, with a standalone application without Hadoop?
>
> -----Original Message-----
> From: Shern Shiou Tan [mailto:shernshiou....@mnc.com.my]
> Sent: Thursday, 24 November 2011 15:53
> To: user@mahout.apache.org
> Subject: Re: Load Dataset and Instances from database
>
> Yes, if I'm not mistaken, Mahout supports JDBC-based databases via a JDBC driver.
> ShernShiou
> On 11/24/2011 10:34 PM, Sturm, Martin wrote:
> > Hello,
> > I am relatively new to Mahout, so it is possible that this is a quite
> > obvious question: Does Mahout have any database access support besides
> > that in the recommender-related package?
> > I want to use a decision forest classification and need the data from an
> > Oracle or MS SQL database. Is there any way to access them so that I
> > automatically get the Dataset and a list of Instance?
> >
> > Thanks in advance,
> > Martin
> >
