Thomas created HADOOP-14535:
-------------------------------

             Summary: Support for random access and seek of block blobs
                 Key: HADOOP-14535
                 URL: https://issues.apache.org/jira/browse/HADOOP-14535
             Project: Hadoop Common
          Issue Type: Improvement
          Components: fs/azure
            Reporter: Thomas
             Fix For: 2.9.0, 3.0.0-alpha4

This change adds a seekable stream for reading block blobs to the wasb:// file system.

If seek() is not used, or if only forward seek() is used, the behavior of read() is unchanged: the stream is optimized for sequential reads and fetches data over the network in chunks of the size specified by "fs.azure.read.request.size" (default is 4 megabytes).

If reverse seek() is used, the behavior of read() changes in favor of reading the actual number of bytes requested in the call to read(), with some constraints. If the size requested is smaller than 16 kilobytes and cannot be satisfied by the internal buffer, the network read will be 16 kilobytes. If the size requested is greater than 4 megabytes, it will be satisfied by sequential 4 megabyte reads over the network.

This change also improves the performance of FSInputStream.seek() by not closing and re-opening the stream, which for block blobs also involves a network operation to read the blob metadata. Now NativeAzureFsInputStream.seek() checks if the stream is seekable and moves the read position.

[^attachment-name.zip]
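
The read-size rules described above can be illustrated with a small standalone sketch. This is not the code from the patch: the class name ReadSizePolicy, the method nextNetworkReadSize, and its parameters are hypothetical and exist only to restate the rules. It also assumes that the 4 megabyte cap on a single network read is the configured "fs.azure.read.request.size" value, and it treats a request as served from the buffer only when the buffer covers it entirely.

```java
// Hypothetical sketch of the read-size rules in the description above.
// None of these names come from the actual patch.
final class ReadSizePolicy {
    // Default of "fs.azure.read.request.size" as stated in the description.
    static final int DEFAULT_READ_REQUEST_SIZE = 4 * 1024 * 1024;
    // Smallest network read once a reverse seek has been observed.
    static final int MIN_RANDOM_READ_SIZE = 16 * 1024;

    private ReadSizePolicy() {}

    /**
     * Returns the size in bytes of the next network read, or 0 when the
     * internal buffer already holds all of the requested bytes.
     *
     * Assumption: the cap on one network read equals readRequestSize.
     */
    public static int nextNetworkReadSize(boolean reverseSeekSeen, int requested,
                                          int buffered, int readRequestSize) {
        if (requested <= buffered) {
            return 0; // served entirely from the internal buffer
        }
        if (!reverseSeekSeen) {
            return readRequestSize; // sequential mode: always a full chunk
        }
        if (requested < MIN_RANDOM_READ_SIZE) {
            return MIN_RANDOM_READ_SIZE; // small random read is rounded up
        }
        // Larger requests are capped per read; the caller repeats until done.
        return Math.min(requested, readRequestSize);
    }

    public static void main(String[] args) {
        int chunk = DEFAULT_READ_REQUEST_SIZE;
        System.out.println(nextNetworkReadSize(false, 100, 0, chunk));             // 4194304
        System.out.println(nextNetworkReadSize(true, 100, 0, chunk));              // 16384
        System.out.println(nextNetworkReadSize(true, 1024 * 1024, 0, chunk));      // 1048576
        System.out.println(nextNetworkReadSize(true, 8 * 1024 * 1024, 0, chunk));  // 4194304
    }
}
```

In this sketch an 8 megabyte request after a reverse seek yields a 4 megabyte first read, and the caller would issue a second 4 megabyte read for the remainder, matching the "sequential 4 megabyte reads" wording above.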