Re: Block vs FileSplit vs record vs line

2013-03-14 Thread Mohammad Tariq
> by recordreader code. One record can be series of maps or splits or blocks.
>
> Hope this will clear.
>
> Sent from HTC via Rocket! excuse typo.
>
> --
> From: Sai Sai;
> To: user@hadoop.apache.org;
> Subject: Re: Block vs Fil

Re: Block vs FileSplit vs record vs line

2013-03-14 Thread Manish Bhoge
Sai, each file is divided into splits according to the map input format, and each split corresponds to one map task. You rightly stated 1 split = 1 block = 1 map. A record can be a combination of blocks, as defined by the RecordReader code. One record can be a series of maps or splits or blocks. Hope this clears things up. Sent from HTC
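
To make the split and record relationship concrete, here is a minimal sketch in plain Python. It is not Hadoop code: `compute_splits`, `read_records`, and `run_map_tasks` are hypothetical names, and the sketch assumes line-oriented records with a fixed split size. It mimics the convention a line-based record reader such as Hadoop's LineRecordReader is understood to follow: a split that does not start at byte 0 discards its first line, and every split reads past its own end to finish the last line it started. Each split goes to one map task, and the map function would then be called once per record.

```python
def compute_splits(file_size, split_size):
    """Return (start, length) byte ranges, one per split (one map task each)."""
    return [(start, min(split_size, file_size - start))
            for start in range(0, file_size, split_size)]


def read_records(data, start, length):
    """Yield the line records owned by the split [start, start + length).

    Assumed convention: a split that does not begin at byte 0 skips its
    first (possibly partial) line, and every split may read past its end
    to complete the last line it started.
    """
    end = start + length
    pos = start
    if start != 0:
        # The previous split owns the line that crosses or touches `start`.
        newline = data.find(b"\n", start)
        pos = len(data) if newline == -1 else newline + 1
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        stop = len(data) if newline == -1 else newline + 1
        yield data[pos:stop].rstrip(b"\n")
        pos = stop


def run_map_tasks(data, split_size):
    """Return, for each split, the list of records its map task would see."""
    return [list(read_records(data, start, length))
            for start, length in compute_splits(len(data), split_size)]


if __name__ == "__main__":
    data = b"alpha\nbravo\ncharlie\ndelta\n"  # 26 bytes, 4 line records
    for task_id, records in enumerate(run_map_tasks(data, split_size=10)):
        print(task_id, records)
```

With a 10-byte split size, the 26-byte sample yields three splits. The first task sees `alpha` and `bravo` even though `bravo` ends beyond byte 10, the second sees `charlie` and `delta`, and the third sees no records at all. Every line is read exactly once, and no record is cut at a split boundary.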

Re: Block vs FileSplit vs record vs line

2013-03-14 Thread Sai Sai
Just wondering if this is the right way to understand this: a large file is split into multiple blocks, each block is split into multiple file splits, each file split has multiple records, and each record has multiple lines. Each line is processed by one instance of the mapper. Any help is appreciated