Multiple Input

2014-10-13 Thread Pedro Magalhaes
Can anyone help me? http://stackoverflow.com/questions/26341913/hadoop-multipleinputs

Re: multiple input splits from single file

2012-06-10 Thread Karthik Kambatla
> he euclidean coordinates of the cities. We need to pass this single line to each mapper, which will then process it. How can we do this so that we achieve parallelism in a Hadoop cluster? Is there any way to generate multiple input splits from the single input file? Thanks, Sharat

Re: multiple input splits from single file

2012-06-10 Thread Harsh J
Sharat, to answer your specific question ("Is there any way to generate multiple input splits from the single input file?"): Yes, there is. Use the NLineInputFormat class with its lines-per-split value N set to 1. For a single file of M lines (duplicates or not), you should then get M map tasks. On Sun, Jun 10, 2
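To make the effect of that setting concrete, here is a minimal plain-Java sketch of what splitting by line count amounts to. It is not Hadoop's implementation and does not use the Hadoop API; the class name `NLineSplitSketch` and the method `splitByLines` are hypothetical names made up for illustration. In an actual job with the newer MapReduce API, the equivalent configuration would presumably be `job.setInputFormatClass(NLineInputFormat.class)` together with `NLineInputFormat.setNumLinesPerSplit(job, 1)`, but check that against the Hadoop version you are running.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: mimics the grouping idea behind
// NLineInputFormat without depending on Hadoop.
final class NLineSplitSketch {

    // Groups the input lines into consecutive chunks of at most
    // linesPerSplit lines; each chunk stands in for one input split,
    // and so for one map task.
    static List<List<String>> splitByLines(List<String> lines, int linesPerSplit) {
        if (linesPerSplit < 1) {
            throw new IllegalArgumentException("linesPerSplit must be >= 1");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerSplit) {
            int end = Math.min(start + linesPerSplit, lines.size());
            splits.add(new ArrayList<>(lines.subList(start, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Placeholder input: the same line repeated, as in the question.
        List<String> input = Arrays.asList("cities", "cities", "cities");
        List<List<String>> splits = splitByLines(input, 1);
        System.out.println(splits.size() + " splits for " + input.size() + " lines");
    }
}
```

With one line per split, the number of splits equals the number of lines in the file, which is why repeating the single input line once per desired mapper gives that many map tasks.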

multiple input splits from single file

2012-06-10 Thread sharat attupurath
parallelism in a hadoop cluster. Is there any way to generate multiple input splits from the single input file. Thanks Sharat

Re: Multiple input formats and multiple output formats in Hadoop 0.20.2

2011-08-10 Thread Jian Fang
ra.com/cdh/3/hadoop/api/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.html > > Examples of usages are part of API doc. > > Regards, > Dino Kečo > > > On Wed, Aug 10, 2011 at 6:08 PM, Jian Fang > wrote: > >> Hi, >> >> I am working on a pr

Re: Multiple input formats and multiple output formats in Hadoop 0.20.2

2011-08-10 Thread Dino Kečo
doc. Regards, Dino Kečo On Wed, Aug 10, 2011 at 6:08 PM, Jian Fang wrote: > Hi, > > I am working on a project, which requires multiple input formats and > multiple output formats. Basically, I store some sales rank data to a > Cassandra cluster and I get a sales rank update f

Multiple input formats and multiple output formats in Hadoop 0.20.2

2011-08-10 Thread Jian Fang
Hi, I am working on a project that requires multiple input formats and multiple output formats. Basically, I store some sales rank data in a Cassandra cluster, and I get a sales rank update file each day to update the ranks in Cassandra. In the meantime, I need to find all the products

Re: Multiple input files, no reducer, output is "stomped" by just one of the files

2011-01-08 Thread Brett Hoerner
I found my issue, for future readers: I forgot to append the 3rd argument of generateFileNameForKeyValue (the String) to the returned filenames. That String is the "part0001", etc., which gives each mapper a unique filename. On Thu, Jan 6, 2011 at 2:51 PM, Brett Hoerner wrote: > Hello, > > I'm
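For future readers, the difference can be sketched in plain Java without Hadoop. This is not the poster's code: the class `OutputNameSketch`, both methods, the "-" separator, and the leaf names are hypothetical stand-ins. The two methods only imitate the last two parameters' role in `generateFileNameForKeyValue`, where the third argument is the per-task leaf name.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: shows why dropping the per-task
// leaf name makes every map task target the same output file.
final class OutputNameSketch {

    // Buggy variant: ignores leafName, so all tasks return the same name.
    static String buggyName(String bucket, String leafName) {
        return bucket;
    }

    // Fixed variant: appends leafName, so each task gets its own file.
    // The "-" separator is an arbitrary choice for this sketch.
    static String fixedName(String bucket, String leafName) {
        return bucket + "-" + leafName;
    }

    public static void main(String[] args) {
        // Placeholder leaf names standing in for what two map tasks receive.
        List<String> leafNames = Arrays.asList("part-00000", "part-00001");
        Set<String> buggy = new LinkedHashSet<>();
        Set<String> fixed = new LinkedHashSet<>();
        for (String leaf : leafNames) {
            buggy.add(buggyName("file1", leaf));
            fixed.add(fixedName("file1", leaf));
        }
        System.out.println("distinct names, buggy: " + buggy.size());
        System.out.println("distinct names, fixed: " + fixed.size());
    }
}
```

In the buggy variant both tasks resolve to a single name, so one task's output overwrites the other's; in the fixed variant the names stay distinct.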

Multiple input files, no reducer, output is "stomped" by just one of the files

2011-01-06 Thread Brett Hoerner
Hello, I'm running a very simple job that returns the input with a null key and uses no reducer (see below). I'm using MultipleSequenceFileOutputFormat to "split" the input into different files, but for simplicity's sake right now I'm just returning "file1" from generateFileNameForKeyValue so tha