Doubling the seeks would be a big deal for many of our installations.  In 
practice it means that we would need to fully buffer the two streams (assuming 
they are placed after each other) to avoid the extra seeks.  If this could be 
done in one stream, I think it would be a big improvement.  Is it possible to 
place the data row-oriented, or to use sub-chunks in the stream (e.g. N items 
from a, then N items from b), so you can still stream?
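
To make the sub-chunk idea concrete, here is a rough sketch of what I have in mind. It is only an illustration, not ORC code: the names interleave_chunks and read_interleaved, the chunk size n, and the in-memory list layout are all hypothetical, and a real writer would deal with encoded bytes rather than Python lists.

```python
from itertools import islice


def interleave_chunks(a, b, n):
    """Lay out two logical streams as one: N items from a, then N items from b.

    Hypothetical helper for illustration; returns a list of (chunk_a, chunk_b)
    pairs in the order they would be written to the single physical stream.
    """
    it_a, it_b = iter(a), iter(b)
    layout = []
    while True:
        chunk_a = list(islice(it_a, n))
        chunk_b = list(islice(it_b, n))
        if not chunk_a and not chunk_b:
            break
        layout.append((chunk_a, chunk_b))
    return layout


def read_interleaved(layout):
    """Rebuild both logical streams in a single sequential pass.

    Only one pair of chunks is needed at a time, so the reader never has to
    seek between two separate streams or buffer either one in full.
    """
    a, b = [], []
    for chunk_a, chunk_b in layout:
        a.extend(chunk_a)
        b.extend(chunk_b)
    return a, b


layout = interleave_chunks([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], n=2)
a, b = read_interleaved(layout)
```

The trade-off I would expect is that a smaller N keeps the reader's buffer small, while a larger N keeps longer runs of similar values together for the compressor; I have not measured where the balance lies.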


> On Mar 26, 2018, at 2:24 AM, Xiening Dai <> wrote:
> Where does the 2x IO drop come from? Based on Cheng Xu’s data, Split + Zstd 
> has ~15% improvement over PlainV2 + Zstd in terms of the file size. If I 
> understand correctly, the total number of IO reads is almost the same, but 
> Split will need an additional seek for each read.
> The random IOPS eventually determine the throughput of an HDD. The IO queue 
> can build up quickly when there are too many seeks, and that drastically 
> affects read/write performance. That’s the major concern, and it’s not 
> related to locality. 
>> On Mar 26, 2018, at 2:47 PM, Gopal Vijayaraghavan <> wrote:
>>>  2. Under seek or predicate pushdown scenario, there’s no need to load the 
>>> entire stream.
>> Yes, that is a valid scenario where the reader reads partial-streams & 
>> causes random IO.
>> The current double encoding is actually 2 streams today & will continue to 
>> use 2 streams for the FLIP implementation.
>> The SPLIT implementation will go from the current 2 streams to 4 streams 
>> (i.e. 1+1 -> 1+3 streams) & the total data IO will drop by ~2x or so. More so 
>> if one of the streams can be suppressed (like in my IoT data-set, where the 
>> sign-bit is always +ve for my electric meter data).
>> The trade-offs seem to be working out on regular HDDs with locality & for 
>> LLAP SSD caches - if your use-cases are different, I'd like to hear more 
>> about it.
>> The only significant random IO delays expected seem to be entirely within 
>> the HDFS API network hops (which offers 0% locality when data is erasure 
>> coded or for cloud-storage), which I hope to fix in the Hadoop-3.x branch 
>> with a new API.
>> Cheers,
>> Gopal
