Arnaud Nauwynck created HADOOP-19345:
----------------------------------------
Summary: AzureBlobFileSystem.open() should override readVectored()
much more efficiently for small reads
Key: HADOOP-19345
URL: https://issues.apache.org/jira/browse/HADOOP-19345
Project: Hadoop Common
Issue Type: Improvement
Components: tools
Reporter: Arnaud Nauwynck
In hadoop-azure, there are huge performance problems when reading a file in a
too-fragmented way: many small file fragments, even when requested through the
readVectored() Hadoop API, each result in a distinct HTTPS request (TCP
connection establishment + TLS handshake + request).
Internally, at the lowest level, hadoop-azure uses the HttpURLConnection class
(dating back to JDK 1.0), and the read-ahead threads do not sufficiently solve
the problem.
The hadoop-azure implementation of readVectored() should strike a compromise
between reading extra data that is then discarded (holes between ranges) and
establishing too many HTTPS connections.
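To make that compromise concrete, here is a back-of-the-envelope calculation (a sketch with assumed, not measured, figures: ~150 ms per HTTPS/TLS handshake, ~80 MB/s download throughput) of how large a discarded "hole" can be before a separate request becomes cheaper:

```java
// Back-of-the-envelope: how large a hole of discarded bytes is still cheaper
// to read through than opening a separate HTTPS connection?
// The constants below are assumptions for illustration, not measurements.
public class MergeBreakEven {
    static final double HANDSHAKE_MS = 150.0;             // assumed TCP + TLS setup cost
    static final double THROUGHPUT_BYTES_PER_MS = 80_000; // assumed ~80 MB/s

    /** Largest gap (in bytes) worth downloading instead of issuing a new request. */
    static long breakEvenGapBytes() {
        return (long) (HANDSHAKE_MS * THROUGHPUT_BYTES_PER_MS);
    }

    public static void main(String[] args) {
        // 150 ms * 80,000 bytes/ms = 12,000,000 bytes (~12 MB)
        System.out.println(breakEvenGapBytes());
    }
}
```

Under these assumed numbers, even a gap of several megabytes is cheaper to read through than a fresh connection, which is the intuition behind merging ranges.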
Currently, AzureBlobFileSystem#open() returns a default, inefficient
implementation of readVectored():
```
private FSDataInputStream open(final Path path,
    final Optional<OpenFileParameters> parameters) throws IOException {
  ...
  InputStream inputStream = getAbfsStore().openFileForRead(qualifiedPath,
      parameters, statistics, tracingContext);
  // <== FSDataInputStream does not efficiently override readVectored()!
  return new FSDataInputStream(inputStream);
}
```
See the default implementation of FSDataInputStream.readVectored():
```
public void readVectored(List<? extends FileRange> ranges,
    IntFunction<ByteBuffer> allocate) throws IOException {
  ((PositionedReadable) this.in).readVectored(ranges, allocate);
}
```
It delegates to the underlying AbfsInputStream, which does not override the
PositionedReadable default:
```
default void readVectored(List<? extends FileRange> ranges,
    IntFunction<ByteBuffer> allocate) throws IOException {
  VectoredReadUtils.readVectored(this, ranges, allocate);
}
```
AbfsInputStream should override this method and accept making fewer HTTPS
calls by merging nearby ranges, discarding the returned bytes that fall in the
holes between them.
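The merge step such an override could perform might look like the following sketch: coalesce requested (offset, length) ranges whose gap is below a threshold, so that one HTTPS GET covers several small reads. The Range type here is illustrative only, not Hadoop's FileRange.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: merge sorted read ranges whose gap is small enough that reading
// through the hole is cheaper than a separate HTTPS request.
public class RangeMerger {
    // Illustrative stand-in for Hadoop's FileRange.
    public record Range(long offset, long length) {
        long end() { return offset + length; }
    }

    /** Coalesce sorted ranges separated by at most minSeekGap bytes. */
    public static List<Range> merge(List<Range> sorted, long minSeekGap) {
        List<Range> merged = new ArrayList<>();
        for (Range r : sorted) {
            if (!merged.isEmpty()
                    && r.offset() - merged.get(merged.size() - 1).end() <= minSeekGap) {
                // Extend the previous merged range to cover this one (and the hole).
                Range last = merged.remove(merged.size() - 1);
                merged.add(new Range(last.offset(), r.end() - last.offset()));
            } else {
                merged.add(r);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Range> reads = List.of(
            new Range(0, 1_000), new Range(5_000, 1_000), new Range(10_000_000, 1_000));
        // With a 4 MB gap threshold, the first two reads collapse into one request;
        // the third is far enough away to keep its own request.
        System.out.println(merge(reads, 4 * 1024 * 1024).size()); // 2
    }
}
```

The actual override would then issue one GET per merged range and slice the returned buffer back into the caller's original ranges.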
This amounts to honouring the hint already defined in Hadoop's
PositionedReadable (which FSDataInputStream implements):
```
/**
 * What is the smallest reasonable seek?
 * @return the minimum number of bytes
 */
default int minSeekForVectorReads() {
  return 4 * 1024;
}
```
Even this 4096-byte value is very conservative, and should be redefined by
AbfsInputStream to 4 MB or even 8 MB.
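Such a redefinition is a one-line override. The sketch below uses a local stand-in interface mirroring the PositionedReadable default shown above (the class and interface names are hypothetical, for illustration only):

```java
// Illustrative stand-in for the PositionedReadable default shown above.
interface VectorReadHints {
    default int minSeekForVectorReads() {
        return 4 * 1024; // Hadoop's conservative 4 KB default
    }
}

// Hypothetical ABFS-side override: treat gaps up to 4 MB as worth reading
// through, since a hole that size is cheaper than a new HTTPS connection.
public class AbfsLikeHints implements VectorReadHints {
    @Override
    public int minSeekForVectorReads() {
        return 4 * 1024 * 1024; // 4 MB
    }

    public static void main(String[] args) {
        System.out.println(new AbfsLikeHints().minSeekForVectorReads()); // 4194304
    }
}
```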
Asking ChatGPT: "on Azure Storage, what is the speed of getting 8 MB of a page
blob, compared to the time to establish an HTTPS TLS handshake?"
The response (untrusted, from ChatGPT) says: an HTTPS/TLS handshake takes
~100–300 ms, which is generally slower than downloading 8 MB from a page blob:
~100–200 ms on the Standard tier, ~30–50 ms on the Premium tier.
The Azure ABFS client already sets up, by default, many read-ahead prefetch
threads that each prefetch 4 MB of data, but this is NOT sufficient, and it is
less efficient than simply implementing correctly what is already in the Hadoop
API: readVectored(). Prefetching also has the drawback of reading tons of
useless data (past the requested Parquet blocks) that is never used.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]