[ https://issues.apache.org/jira/browse/HADOOP-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Enis Soztutar updated HADOOP-2536:
----------------------------------
Attachment: mapred_jdbc_v3.patch
Since Fredrik said that he cannot continue to work on the patch, I have updated
it with some changes.
The changes include:
# package and class names use the DB prefix instead of database.
# DBInputSplit is now an inner class of DBInputFormat
# instead of a type mapping in the library that converts the data types, a new
DBWritable interface is introduced. Classes implement DBWritable to convert
themselves from/to db tuples.
# DBRecordReader emits <LongWritable, T> pairs, where the key is the record
number and T implements DBWritable.
# DBRecordWriter accepts <K, V> where K implements DBWritable (and hence is
written to the db) and V is discarded.
# writes to the db use JDBC batch updates.
# introduced two ways of setting the input query.
# improved documentation.
# added a sample mapred program that reads data from the db and writes the
results back to the db. The program counts the pageviews in a synthetically
generated access log. The example program uses HSQLDB as an embedded database.
# added a test case running the example job in the MiniCluster.
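To make the DBWritable and batch update items above more concrete, here is a
minimal sketch of how a record class might convert itself from/to db tuples and
how a writer might batch the inserts. It is not taken from the patch: the
method signatures on DBWritable are an assumption, and PageviewRecord and
BatchWriterSketch are hypothetical names used only for illustration.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Assumed shape of the interface; the signatures in the patch may differ.
interface DBWritable {
    // Copy this object's fields into the parameters of an INSERT statement.
    void write(PreparedStatement statement) throws SQLException;

    // Populate this object's fields from the current row of a result set.
    void readFields(ResultSet resultSet) throws SQLException;
}

// Hypothetical record type: one row of (url, pageviews).
class PageviewRecord implements DBWritable {
    String url;
    long pageviews;

    PageviewRecord() {}

    PageviewRecord(String url, long pageviews) {
        this.url = url;
        this.pageviews = pageviews;
    }

    @Override
    public void write(PreparedStatement statement) throws SQLException {
        statement.setString(1, url);
        statement.setLong(2, pageviews);
    }

    @Override
    public void readFields(ResultSet resultSet) throws SQLException {
        url = resultSet.getString(1);
        pageviews = resultSet.getLong(2);
    }
}

// Hypothetical helper showing the JDBC batch update pattern.
class BatchWriterSketch {
    // Queue one batch entry per record, then send them all in a single call.
    static int[] writeAll(PreparedStatement statement,
                          Iterable<? extends DBWritable> records) throws SQLException {
        for (DBWritable record : records) {
            record.write(statement);  // bind this record's column values
            statement.addBatch();     // queue the bound row
        }
        return statement.executeBatch();
    }
}
```

With a real java.sql.Connection, the statement would come from
connection.prepareStatement(...) with a two-parameter INSERT for whatever table
the job writes to. Batching keeps the number of round trips to the database
small compared with executing one INSERT per record.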
> MapReduce for MySQL
> -------------------
>
> Key: HADOOP-2536
> URL: https://issues.apache.org/jira/browse/HADOOP-2536
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Fredrik Hedberg
> Assignee: Fredrik Hedberg
> Priority: Minor
> Attachments: database-2.diff, database.diff, mapred_jdbc_v3.patch
>
>
> Add support for running MapReduce jobs over data residing in a MySQL table.