[jira] [Commented] (HADOOP-17038) Support positional read in AbfsInputStream

Anoop Sam John (Jira) Thu, 13 Aug 2020 19:06:58 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177422#comment-17177422
 ]


Anoop Sam John commented on HADOOP-17038:
-----------------------------------------

Thanks [~ste...@apache.org] for those questions. 
I agree that it should not be enabled globally just for HBase, as other 
workloads may suffer.  Even within HBase, it is not good for long range scans 
(which compaction needs anyway).  That is exactly why this config was added 
and set to false by default.  The config can be turned on for HBase alone, via 
hbase-site.xml on the RegionServer side.  HBase itself uses both types of API: 
Hadoop's pread API as well as a normal read() after a seek.  When we do a long 
range scan, we make sure to use the seek + read mode.  HBase has the 
intelligence to switch back.
The per-file open option is interesting, Steve.  It was not there in the older 
versions I have seen; in fact, a way to control this per opened InputStream 
was the first thing I checked for.  Let me read up on it and understand how to 
leverage it, if possible.  If it turns out not to be what is needed, I will 
come back and address your comments.  Appreciate it, Steve.  Thanks.

> Support positional read in AbfsInputStream
> ------------------------------------------
>
>                 Key: HADOOP-17038
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17038
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Major
>              Labels: HBase, abfsactive
>         Attachments: HBase Perf Test Report.xlsx, screenshot-1.png
>
>
> Right now a positional read does a seek to the position, a read, and then a 
> seek back to the old position (as per the implementation in the superclass).
> HBase-style workloads rely mostly on short preads (64 KB by default), so it 
> would be ideal to support a pure positional read API that does not keep the 
> data in a buffer and reads only the data the caller asked for (no reading 
> ahead of more data as per the read size config).
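
To make the two behaviours in the description concrete, here is a minimal sketch in plain Java. It is not the AbfsInputStream code or the patch for this issue: `PreadSketch`, `SeekableBytes`, `preadViaSeek` and `preadDirect` are hypothetical names, and an in-memory byte array stands in for the remote file. It only contrasts a positional read emulated as seek, read, seek back with one that copies just the requested bytes and never touches the stream position.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; an in-memory stand-in, not Hadoop or ABFS code. */
public class PreadSketch {

    /** Minimal seekable stream over a byte array that tracks a current position. */
    static final class SeekableBytes {
        private final byte[] data;
        private int pos;   // current stream position
        int seekCount;     // number of seek() calls, kept for illustration

        SeekableBytes(byte[] data) {
            this.data = data;
        }

        void seek(int newPos) {
            if (newPos < 0 || newPos > data.length) {
                throw new IllegalArgumentException("bad position: " + newPos);
            }
            pos = newPos;
            seekCount++;
        }

        int getPos() {
            return pos;
        }

        /** Sequential read from the current position; advances the position. */
        int read(byte[] buf, int off, int len) {
            if (pos >= data.length) {
                return -1; // end of stream
            }
            int n = Math.min(len, data.length - pos);
            System.arraycopy(data, pos, buf, off, n);
            pos += n;
            return n;
        }

        /** Positional read emulated as: seek to position, read, seek back. */
        int preadViaSeek(int position, byte[] buf, int off, int len) {
            int oldPos = getPos();
            try {
                seek(position);
                return read(buf, off, len);
            } finally {
                seek(oldPos); // restore the caller's position
            }
        }

        /** "Pure" positional read: copies only the requested bytes, never moves pos. */
        int preadDirect(int position, byte[] buf, int off, int len) {
            if (position < 0) {
                throw new IllegalArgumentException("bad position: " + position);
            }
            if (position >= data.length) {
                return -1; // nothing to read at or past the end
            }
            int n = Math.min(len, data.length - position);
            System.arraycopy(data, position, buf, off, n);
            return n;
        }
    }

    public static void main(String[] args) {
        SeekableBytes in = new SeekableBytes("0123456789".getBytes(StandardCharsets.US_ASCII));
        in.seek(2); // the caller is in the middle of a sequential scan

        byte[] buf = new byte[3];
        int n = in.preadViaSeek(6, buf, 0, buf.length);
        System.out.println("via seek: " + new String(buf, 0, n, StandardCharsets.US_ASCII)
                + ", pos=" + in.getPos() + ", seeks=" + in.seekCount);

        n = in.preadDirect(6, buf, 0, buf.length);
        System.out.println("direct:   " + new String(buf, 0, n, StandardCharsets.US_ASCII)
                + ", pos=" + in.getPos() + ", seeks=" + in.seekCount);
    }
}
```

Both calls return the same bytes and leave the position where the caller had it; the difference is that the direct variant needs no seeks and touches no shared position state. In a real remote stream the seek path would also interact with the read buffer and read-ahead, which this sketch does not model.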



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

