HADOOP-11601 tightens up the filesystem spec by saying "if len(file) > 0, 
getFileStatus().getBlockSize() > 0"

This is to stop filesystems (most recently s3a) returning 0 as a block size,
which then breaks any analytics work that tries to partition the workload by
blocksize.
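To see why a zero block size is fatal, here is a minimal sketch of the kind of arithmetic a split planner performs; the class and method names are illustrative, not Hadoop's actual code:

```java
// Sketch: why blockSize == 0 breaks split calculation.
// This mirrors the ceiling-division a split planner typically does;
// SplitPlanner/numSplits are hypothetical names, not Hadoop APIs.
public class SplitPlanner {
    static long numSplits(long fileLength, long blockSize) {
        // ceiling division: throws ArithmeticException if blockSize == 0
        return (fileLength + blockSize - 1) / blockSize;
    }
}
```

A 256 MB file with a 128 MB block size yields 2 splits; with a reported block size of 0 the same calculation divides by zero.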

I'm currently changing the markdown text to say

MUST be > 0 for a file of size > 0
MAY be 0 for a file of size == 0

+ the relevant tests to check this.
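The tightened invariant can be expressed as a simple predicate; this is a sketch, and the helper below is hypothetical rather than part of the Hadoop API:

```java
// Sketch of the tightened blocksize invariant from HADOOP-11601.
// validBlockSize is a hypothetical helper, not a Hadoop method.
public class BlockSizeContract {
    // A non-empty file MUST report a positive block size;
    // an empty file MAY report a block size of 0.
    static boolean validBlockSize(long fileLength, long blockSize) {
        if (fileLength > 0) {
            return blockSize > 0;
        }
        return true; // empty file: 0 (or any positive value) is acceptable
    }
}
```

A contract test would assert this predicate over `getFileStatus().getLen()` and `getFileStatus().getBlockSize()` for both empty and non-empty files.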

There's one thing I do want to understand from HDFS first: what about small
files? That is: what does HDFS return as a blocksize if a file is smaller than
its block size?

-Steve
