[ 
https://issues.apache.org/jira/browse/HDFS-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-10351:
---------------------------------
    Attachment: HDFS-10351-HDFS-7240.002.patch

[~anu] and [~jingzhao], thank you for the code reviews.

bq. Technically this can happen even if there is an overflow. But I would 
rather have this check and fail the write than otherwise

{{OutputStream}} defines a specific contract for rejecting invalid inputs to the 
bulk write: a null array must raise {{NullPointerException}}, and a bad 
offset/length combination must raise {{IndexOutOfBoundsException}}.  To ensure 
adherence to this contract, I adapted the argument checks from OpenJDK's base 
{{OutputStream}} class implementation:

http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/df209f221cca/src/share/classes/java/io/OutputStream.java#l106
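For reference, those argument checks look roughly like this (paraphrased from the OpenJDK source linked above; the class and method names here are hypothetical wrappers for illustration, not the actual ChunkOutputStream code):

```java
// Paraphrase of the bounds checks in java.io.OutputStream#write(byte[], int, int).
// A null array yields NullPointerException; a bad offset/length combination
// yields IndexOutOfBoundsException, per the OutputStream contract.
// The class/method names are illustrative only.
public class BoundsCheck {
    static void checkBounds(byte[] b, int off, int len) {
        if (b == null) {
            throw new NullPointerException();
        }
        if ((off < 0) || (off > b.length) || (len < 0)
                || ((off + len) > b.length) || ((off + len) < 0)) {
            throw new IndexOutOfBoundsException();
        }
    }
}
```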

bq. Where are we using this call from ?

This is called from {{org.apache.hadoop.ozone.web.handlers.KeyHandler#putKey}}, 
which calls {{DistributedStorageHandler}} to get a {{ChunkOutputStream}}, and 
then loops, reading from the input HTTP request and writing the bytes:

{code}
            stream.write(buffer, 0, len);
{code}
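The surrounding loop is essentially a standard stream copy, something like the following sketch (simplified; the buffer size and names are illustrative, not the exact KeyHandler code):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Simplified sketch of the putKey copy loop; names and buffer size are
// illustrative, not the real KeyHandler implementation.
public class CopyLoop {
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int len;
        while ((len = in.read(buffer)) != -1) {
            // This is the call site that now reaches the bulk write override.
            out.write(buffer, 0, len);
            total += len;
        }
        return total;
    }
}
```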

Before my patch, this call would enter the base class bulk 
{{OutputStream#write}} implementation, which is an inefficient loop over the 
single-byte {{write}} method.  After this patch, the call site instead enters 
the more efficient bulk {{write}} override in {{ChunkOutputStream}}.  This 
avoids per-byte method call overhead and allows us to take advantage of the 
bulk {{ByteBuffer}} operations for faster transfer.
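To illustrate the difference: the base-class default degrades to one virtual call per byte, whereas the override can hand the whole range to the {{ByteBuffer}} in a single copy. A minimal sketch, with the Ozone-specific chunk flushing omitted and the buffer size chosen arbitrarily:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Sketch contrasting the byte-at-a-time path with a bulk override.
// This is not the real ChunkOutputStream; buffer size and class are illustrative.
public class BulkWriteSketch extends OutputStream {
    private final ByteBuffer buffer = ByteBuffer.allocate(1 << 20);

    @Override
    public void write(int b) throws IOException {
        // The base-class bulk write would call this once per byte.
        buffer.put((byte) b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // One bulk ByteBuffer copy instead of len virtual calls.
        buffer.put(b, off, len);
    }

    int position() {
        return buffer.position();
    }
}
```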

bq. One minor comment: maybe the following code can be simplified as "int 
writeLen = Math.min(CHUNK_SIZE - buffer.position(), len);".

That's a good idea.  Here is patch v002 with that change.
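For context, the chunking logic with that simplification applied looks roughly like this ({{CHUNK_SIZE}}, the buffer field, and the flush step are stand-ins for the real ChunkOutputStream members, and the flush here just resets the buffer):

```java
import java.nio.ByteBuffer;

// Illustrative chunking loop showing the simplified writeLen computation.
// CHUNK_SIZE, buffer, and the flush step are stand-ins, not the real code.
public class ChunkLoop {
    static final int CHUNK_SIZE = 1024;
    ByteBuffer buffer = ByteBuffer.allocate(CHUNK_SIZE);
    int chunksFlushed = 0;

    void writeBulk(byte[] b, int off, int len) {
        while (len > 0) {
            // The suggested simplification: a single Math.min instead of branching.
            int writeLen = Math.min(CHUNK_SIZE - buffer.position(), len);
            buffer.put(b, off, writeLen);
            off += writeLen;
            len -= writeLen;
            if (buffer.position() == CHUNK_SIZE) {
                buffer.clear();   // stand-in for writing the full chunk out
                chunksFlushed++;
            }
        }
    }
}
```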

> Ozone: Optimize key writes to chunks by providing a bulk write implementation 
> in ChunkOutputStream.
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10351
>                 URL: https://issues.apache.org/jira/browse/HDFS-10351
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HDFS-10351-HDFS-7240.001.patch, 
> HDFS-10351-HDFS-7240.002.patch
>
>
> HDFS-10268 introduced the {{ChunkOutputStream}} class as part of end-to-end 
> integration of Ozone receiving key content and writing it to chunks in a 
> container.  That patch provided an implementation of the mandatory 
> single-byte write method.  We can improve performance by adding an 
> implementation of the bulk write method too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
