[ 
https://issues.apache.org/jira/browse/HDDS-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru reassigned HDDS-4552:
------------------------------------

    Assignee: Hanisha Koneru

> Read data from chunk into ByteBuffer[] instead of single ByteBuffer
> -------------------------------------------------------------------
>
>                 Key: HDDS-4552
>                 URL: https://issues.apache.org/jira/browse/HDDS-4552
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Hanisha Koneru
>            Assignee: Hanisha Koneru
>            Priority: Major
>
> When a ReadChunk operation is performed, all the data to be read from one 
> chunk is read into a single ByteBuffer. 
> {code:java}
> // ChunkUtils#readData()
> public static void readData(File file, ByteBuffer buf,
>     long offset, long len, VolumeIOStats volumeIOStats)
>     throws StorageContainerException {
>   // ... (elided in the original)
>   try {
>     bytesRead = processFileExclusively(path, () -> {
>       // Take a shared lock on the requested range, then do a positional
>       // read into the single caller-supplied buffer.
>       try (FileChannel channel = open(path, READ_OPTIONS, NO_ATTRIBUTES);
>            FileLock ignored = channel.lock(offset, len, true)) {
>         return channel.read(buf, offset);
>       } catch (IOException e) {
>         throw new UncheckedIOException(e);
>       }
>     });
>   } catch (UncheckedIOException e) {
>     throw wrapInStorageContainerException(e.getCause());
>   }
>   // ... (elided in the original)
> {code}
> This Jira proposes to read the data from the channel into an array of 
> ByteBuffers, each with a fixed capacity. This capacity can be made 
> configurable. This would help reduce the memory cached by Ozone 
> InputStreams. Currently, data in ChunkInputStream is cached until either the 
> stream is closed or the chunk EOF is reached. This can lead to up to 4MB 
> (the default ChunkSize) of data being cached in memory per ChunkInputStream. 
> After the proposed change, we can optimize ChunkInputStream to release a 
> ByteBuffer as soon as it has been read instead of waiting for the whole 
> chunk to be read. Read I/O performance will not be affected, as the read 
> from the DN still returns the requested length of data in one go. The only 
> difference is that the data would be returned in an array of ByteBuffers 
> instead of a single ByteBuffer.
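
The proposal above could be sketched roughly as follows. This is a minimal, standalone illustration and not the Ozone implementation: the class name ChunkReadSketch, the readData signature with a bufferCapacity parameter, and the read helper are all hypothetical, and the file lock and processFileExclusively wrapper from the snippet above are left out for brevity. It uses the scattering read that java.nio's FileChannel provides to fill several fixed-capacity buffers, then shows a consumer dropping each buffer as soon as it is drained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Deque;

// Hypothetical sketch; names and signatures are assumptions, not Ozone APIs.
class ChunkReadSketch {

  /** Reads len bytes starting at offset into buffers of at most bufferCapacity bytes. */
  static ByteBuffer[] readData(Path path, long offset, int len, int bufferCapacity) {
    // Allocate ceil(len / bufferCapacity) buffers; the last one may be smaller.
    int count = (len + bufferCapacity - 1) / bufferCapacity;
    ByteBuffer[] buffers = new ByteBuffer[count];
    for (int i = 0; i < count; i++) {
      int remaining = len - i * bufferCapacity;
      buffers[i] = ByteBuffer.allocate(Math.min(bufferCapacity, remaining));
    }
    try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
      channel.position(offset);
      long total = 0;
      // Scattering read: fills the buffers in order. Loop because one call
      // may return fewer bytes than requested.
      while (total < len) {
        long n = channel.read(buffers);
        if (n < 0) {
          break; // EOF reached before len bytes were available
        }
        total += n;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    for (ByteBuffer b : buffers) {
      b.flip(); // switch each buffer from writing to reading
    }
    return buffers;
  }

  /** Copies up to dst.length bytes out of the queue, dropping each drained buffer. */
  static int read(Deque<ByteBuffer> cached, byte[] dst) {
    int copied = 0;
    while (copied < dst.length && !cached.isEmpty()) {
      ByteBuffer head = cached.peekFirst();
      int n = Math.min(head.remaining(), dst.length - copied);
      head.get(dst, copied, n);
      copied += n;
      if (!head.hasRemaining()) {
        cached.pollFirst(); // drop the reference so the buffer can be reclaimed
      }
    }
    return copied;
  }
}
```

With this shape, a stream holding the buffers in a queue keeps at most the unread buffers in memory, rather than the whole chunk until EOF or close.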



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

