Re: Saving data in db instead of hdfs

2013-05-02 Thread Ahmed Radwan
You can use the DBOutputFormat to write your job output directly to a DB; see: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/db/DBOutputFormat.html. I'd also recommend looking into Sqoop (http://sqoop.apache.org/) for more capabilities. On Thu, May 2, 2013 at 2:03 PM, Che

Re: Saving data in db instead of hdfs

2013-05-02 Thread Mirko Kämpf
Hi, just use Sqoop to push the data from HDFS to a database via JDBC. Intro to Sqoop: http://blog.cloudera.com/blog/2009/06/introducing-sqoop/ Or use Hive-JDBC to connect to your result data from outside the Hadoop cluster. You can also create your own OutputFormat (with the Java API), which w

Saving data in db instead of hdfs

2013-05-02 Thread Chengi Liu
Hi, I am using the Hadoop streaming API (Python) for some processing. I want the data to be processed via Hadoop, but I want to pipe the output to a DB instead of HDFS. How do I do this? Thanks
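
For the streaming case asked about here, one option not spelled out in the replies is to have the Python reducer write to the database itself. The following is only a minimal sketch under several assumptions: the input is tab-separated "key<TAB>integer" lines sorted by key (the usual streaming reducer input), the table name `results` and the function names are hypothetical, and the standard-library `sqlite3` module stands in for whatever DB-API driver the real database needs. A local SQLite file would not be shared across reducer tasks, so a real job would connect to a networked database instead.

```python
import sqlite3
import sys
from itertools import groupby


def parse(lines):
    """Yield (key, int value) pairs from tab-separated streaming input."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        key, _, value = line.partition("\t")
        yield key, int(value)


def reduce_to_db(lines, conn, batch_size=1000):
    """Sum values per key and write the totals to the DB in batches."""
    # Hypothetical table; a real job would usually create it up front.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results (key TEXT PRIMARY KEY, total INTEGER)"
    )
    # INSERT OR REPLACE is SQLite syntax; it keeps a retried task from
    # failing on rows it already wrote.
    sql = "INSERT OR REPLACE INTO results (key, total) VALUES (?, ?)"
    batch = []
    # groupby relies on the input already being sorted by key.
    for key, group in groupby(parse(lines), key=lambda kv: kv[0]):
        batch.append((key, sum(value for _, value in group)))
        if len(batch) >= batch_size:
            conn.executemany(sql, batch)
            batch = []
    if batch:
        conn.executemany(sql, batch)
    conn.commit()


def main():
    """Entry point when the file is used as the streaming reducer."""
    conn = sqlite3.connect("results.db")  # stand-in connection (assumption)
    try:
        reduce_to_db(sys.stdin, conn)
    finally:
        conn.close()
```

To use it as the reducer script, call `main()` at the bottom of the file. Batching the inserts keeps the number of round trips down, and writing idempotently matters because Hadoop may re-run a failed or speculative task.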