[ https://issues.apache.org/jira/browse/HDDS-15183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ivan Andika updated HDDS-15183:
-------------------------------
Description:
This is simply an idea.
Container replication currently still uses buffered writes, meaning that the
replicated data still occupies the page cache. This can cause page cache
pollution. We could perhaps avoid this by using DIRECT_IO, but replication
currently goes through TarContainerPacker, which has to tar and untar the
container (since one container contains multiple blocks, unlike the proposal
in HDDS-12659 to put a single container in a single file). Therefore, we
cannot use DIRECT_IO cleanly.
We can consider using POSIX_FADV_DONTNEED
(https://man7.org/linux/man-pages/man2/posix_fadvise.2.html) after container
replication so that the OS can free these cached pages immediately.
Of course, we need to check whether the initial claim is correct and whether
this is actually beneficial, to avoid premature optimization.
{code:java}
import java.io.FileDescriptor;

import org.apache.hadoop.io.nativeio.NativeIO;
import org.apache.hadoop.io.nativeio.NativeIOException;

// Best-effort hint: ask the OS to drop cached pages for the given file range.
// LOG is assumed to be the enclosing class's logger.
private static void dontNeed(String identifier, FileDescriptor fd,
    long offset, long length) {
  try {
    NativeIO.POSIX.getCacheManipulator().posixFadviseIfPossible(
        identifier,
        fd,
        offset,
        length,
        NativeIO.POSIX.POSIX_FADV_DONTNEED);
  } catch (NativeIOException e) {
    // The advice is only an optimization, so a failure is not fatal.
    LOG.debug("Failed to advise DONTNEED for {}", identifier, e);
  }
}
{code}
{code:java}
try (FileInputStream input = new FileInputStream(file)) {
  IOUtils.copy(input, output, bufferSize);
  // Offset 0 with length 0 covers the whole file.
  dontNeed(file.getAbsolutePath(), input.getFD(), 0, 0);
}
{code}
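The snippet above covers the read side. For the write side that the description
mentions, the posix_fadvise man page notes that POSIX_FADV_DONTNEED leaves pages
that have not yet been written out untouched, so the data would need to be
synced first. Below is a minimal sketch of that ordering, not the actual Ozone
code path. The names WriteThenDropCache, CacheAdvisor and copyAndDrop are
hypothetical; the JDK has no posix_fadvise binding, so the advisor is pluggable
and could delegate to the dontNeed helper shown earlier.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;

final class WriteThenDropCache {

  /**
   * Hypothetical hook for the cache advice. In a datanode this could wrap
   * Hadoop's NativeIO cache manipulator; here it is only an interface.
   */
  @FunctionalInterface
  interface CacheAdvisor {
    void dontNeed(String identifier, FileDescriptor fd, long offset,
        long length) throws IOException;
  }

  /**
   * Copies the stream into target with ordinary buffered writes, forces the
   * data to disk, then asks the advisor to drop the cached pages.
   * Returns the number of bytes written.
   */
  static long copyAndDrop(InputStream in, Path target, int bufferSize,
      CacheAdvisor advisor) throws IOException {
    long total = 0;
    try (FileOutputStream out = new FileOutputStream(target.toFile())) {
      byte[] buffer = new byte[bufferSize];
      int n;
      while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
        total += n;
      }
      // Dirty pages are not affected by DONTNEED, so flush file content
      // to the device before giving the advice.
      out.getChannel().force(false);
      try {
        // Offset 0 with length 0 covers the whole file.
        advisor.dontNeed(target.toString(), out.getFD(), 0, 0);
      } catch (IOException e) {
        // Best effort: the advice is an optimization, so the copy
        // still succeeds when the advice cannot be applied.
      }
    }
    return total;
  }
}
```

The extra sync has its own cost, which is one more thing to measure when
checking whether this is actually beneficial.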
> Use POSIX_FADV_DONTNEED after container replication
> ---------------------------------------------------
>
> Key: HDDS-15183
> URL: https://issues.apache.org/jira/browse/HDDS-15183
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Ivan Andika
> Priority: Major