The block size affects many things in Hadoop: read parallelism,
scalability, block allocation, and other aspects of operation, either
directly or indirectly.
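
To make the parallelism and allocation effects concrete, here is a rough
sketch, not a measurement. It assumes one map task per block (the usual
default when the input split size equals the block size), and the file
sizes and helper names are made up for illustration.

```python
MB = 1024 * 1024

def blocks_for_file(file_size, block_size):
    """Blocks needed for one file (integer ceiling; an empty file has none)."""
    return (file_size + block_size - 1) // block_size

def summarize(file_sizes, block_size):
    """Rough per-dataset totals for a given block size."""
    blocks = sum(blocks_for_file(s, block_size) for s in file_sizes)
    return {
        "blocks": blocks,                  # block objects the NameNode tracks
        "map_tasks": blocks,               # assumption: one split per block
        "bytes_on_disk": sum(file_sizes),  # per replica; a short last block uses only its real length
    }

# Hypothetical workload: 1000 files of 10 MB plus 10 files of 200 MB.
files = [10 * MB] * 1000 + [200 * MB] * 10

for block_size in (16 * MB, 64 * MB, 128 * MB):
    print(block_size // MB, "MB:", summarize(files, block_size))
```

In this sketch the 10 MB files are one block each at every block size, so
lowering the block size only adds blocks and map tasks for files larger
than the new block size. Bytes on disk stay the same in all three cases.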


On Sun, May 12, 2013 at 10:38 AM, shashwat shriparv <dwivedishash...@gmail.com> wrote:

> The block size is a unit of allocation, not of storage on disk: a file
> smaller than one block takes up only its actual size.
>
> Thanks & Regards
>
> ∞
> Shashwat Shriparv
>
>
>
> On Fri, May 10, 2013 at 8:54 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Thanks. I failed to add: It should be okay to do if those cases are
>> true and the cluster seems under-utilized right now.
>>
>> On Fri, May 10, 2013 at 8:29 PM, yypvsxf19870706
>> <yypvsxf19870...@gmail.com> wrote:
>> > Hi Harsh,
>> >
>> > Yep.
>> >
>> > Regards
>> >
>> > Sent from my iPhone
>> >
>> > On 2013-5-10, at 13:27, Harsh J <ha...@cloudera.com> wrote:
>> >
>> >> Are you looking to decrease it to get more parallel map tasks out of
>> >> the small files? Are you currently CPU bound on processing these small
>> >> files?
>> >>
>> >> On Thu, May 9, 2013 at 9:12 PM, YouPeng Yang <yypvsxf19870...@gmail.com> wrote:
>> >>> Hi all,
>> >>>
>> >>> I am going to set up a new Hadoop environment. Because there are
>> >>> lots of small files, I would like to change the default.block.size
>> >>> to 16MB rather than merging the files into large enough ones (e.g.
>> >>> using SequenceFiles).
>> >>> I want to ask: are there any bad influences or issues?
>> >>>
>> >>> Regards
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>>
>>
>>
>> --
>> Harsh J
>>
>
>
