You could add that sometimes the input file is too small to produce enough
splits, so you don't get the desired parallelism.
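
To put rough numbers on the small-input point, here is a back-of-the-envelope sketch. It assumes one map task per input split and a fixed 64 MB split size; both the function name and the split size are assumptions for illustration, and the real split computation in Hadoop has further refinements that this ignores.

```python
import math

# Hypothetical helper, not a Hadoop API: estimates how many input splits
# (and therefore map tasks) a single splittable file would produce.
def estimate_map_tasks(file_size_bytes, split_size_bytes=64 * 1024 * 1024):
    if file_size_bytes <= 0:
        raise ValueError("file size must be positive")
    # Each full or partial split-sized chunk becomes one split.
    return math.ceil(file_size_bytes / split_size_bytes)

small = estimate_map_tasks(10 * 1024 * 1024)   # 10 MB file
large = estimate_map_tasks(10 * 1024 ** 3)     # 10 GB file
print(small, large)
```

Under these assumptions the 10 MB file gives a single map task no matter how many nodes are available, while the 10 GB file gives many.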
Mike Segel
On May 27, 2011, at 12:25 PM, Harsh J wrote:
Mohit,
On Fri, May 27, 2011 at 10:44 PM, Mohit Anchlia wrote:
Actually this link confused me
http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Job+Input
"Clearly, logical splits based on input-size is insufficient for many
applications since record boundaries must be respected. In such cases,
the application should implement a RecordReader,
The query fit into mapreduce-user, since it primarily dealt with how
Map/Reduce operates over data, just to clarify :)
On Fri, May 27, 2011 at 10:38 PM, Mohit Anchlia wrote:
thanks! Just thought it's better to post to multiple groups together
since I didn't know where it belongs :)
On Fri, May 27, 2011 at 10:04 AM, Harsh J wrote:
Mohit,
Please do not cross-post a question to multiple lists unless you're
announcing something.
What you describe does not happen. The way the splitting is done
for Text files is explained in good detail here:
http://wiki.apache.org/hadoop/HadoopMapReduce
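
As a rough illustration of the idea (a simplified simulation in Python, not Hadoop's actual record reader code; the name read_split is made up for this sketch): splits are cut at byte offsets, and the reader for each split restores line boundaries. A reader that does not start at offset 0 skips forward past the first newline, and every reader keeps going past the end of its split to finish the last line it started.

```python
# Hypothetical simulation of line-oriented reading over byte-range splits.
def read_split(data: bytes, start: int, length: int):
    """Return the lines owned by the split covering [start, start + length)."""
    end = start + length
    pos = start
    if start != 0:
        # The line containing (or beginning at) `start` belongs to the
        # previous split, whose reader runs past its own end to finish it.
        nl = data.find(b"\n", start)
        if nl == -1:
            return []
        pos = nl + 1
    lines = []
    # Read every line that begins at or before `end`, even if it finishes later.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        if nl == -1:
            lines.append(data[pos:])  # final line without a trailing newline
            break
        lines.append(data[pos:nl])
        pos = nl + 1
    return lines

data = b"alpha\nbravo\ncharlie\ndelta\n"
split_size = 8  # deliberately tiny so the boundaries fall mid-line
for start in range(0, len(data), split_size):
    length = min(split_size, len(data) - start)
    print(start, read_split(data, start, length))
```

Every line comes out exactly once and whole, even though the 8-byte boundaries cut through "bravo" and "charlie".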
Hope this solves your doubt :)