*Sqoop* is often used in this scenario. You might also want to look at the *MongoDB Hadoop Connector* (https://github.com/mongodb/mongo-hadoop). More on its streaming support can be found at http://api.mongodb.org/hadoop/Hadoop+Streaming+Support.html. Each approach has its pros and cons; choose whichever suits you best.
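For the streaming route Kai describes below, here is a minimal sketch of a reducer that stores its results in MongoDB and writes nothing to stdout. Note this is just an illustration: pymongo, the `mongodb://db-host:27017` URI, and the `mydb.wordcounts` collection are placeholders you would swap for your own setup.

```python
#!/usr/bin/env python
# Hypothetical Hadoop streaming reducer: aggregate word counts from stdin,
# then store them in MongoDB instead of emitting key/value pairs.
import sys

def parse_line(line):
    # Hadoop streaming hands the reducer tab-separated "key\tvalue" lines.
    key, _, value = line.rstrip("\n").partition("\t")
    return key, int(value)

def aggregate(lines):
    # Sum the counts per key across the (already sorted) input stream.
    counts = {}
    for line in lines:
        word, n = parse_line(line)
        counts[word] = counts.get(word, 0) + n
    return counts

def main():
    # pymongo, the host, and the collection name are assumptions --
    # adjust them for your cluster.
    from pymongo import MongoClient
    counts = aggregate(sys.stdin)
    client = MongoClient("mongodb://db-host:27017")
    if counts:
        client.mydb.wordcounts.insert_many(
            [{"word": w, "count": c} for w, c in counts.items()]
        )
    # Nothing is printed to stdout, so the job's HDFS output stays empty.

# Hadoop streaming would invoke this script as the -reducer; you still
# have to pass some -output directory, it will simply end up empty.
```

As Kai notes, keep an eye on how many reduce tasks do this in parallel, since each one opens its own connection to the database.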
Pramod N <http://atmachinelearner.blogspot.in>
Bruce Wayne of web
@machinelearner <https://twitter.com/machinelearner>

On Wed, May 29, 2013 at 2:13 AM, Kai Voigt <k...@123.org> wrote:

> You can have your python streaming script simply not write any key/value
> pairs to stdout, so you'll get an empty job output.
>
> Independently, your script could do anything external, such as connecting
> to a remote database and store data in those. You probably want to avoid
> too many tasks doing this in parallel.
>
> But more common would be a regular job which writes data to HDFS, and then
> use Sqoop to store that data into a RDBMS. But it's your choice.
>
> Kai
>
> Am 28.05.2013 um 20:57 schrieb jamal sasha <jamalsha...@gmail.com>:
>
> > Hi,
> > I want to process some text files and then save the output in a db.
> > I am using python (hadoop streaming).
> > I am using mongo as backend server.
> > Is it possible to run hadoop streaming jobs without specifying any
> > output? What is the best way to deal with this.
>
> --
> Kai Voigt
> k...@123.org