[jira] [Commented] (HDFS-14211) [Consistent Observer Reads] Allow for configurable "always msync" mode

2019-03-16 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794334#comment-16794334
 ] 

Konstantin Shvachko commented on HDFS-14211:


+1. All tests pass with HADOOP-16192.

> [Consistent Observer Reads] Allow for configurable "always msync" mode
> --
>
> Key: HDFS-14211
> URL: https://issues.apache.org/jira/browse/HDFS-14211
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14211.000.patch, HDFS-14211.001.patch
>
>
> To allow for reads to be serviced from an ObserverNode (see HDFS-12943) in a 
> consistent way, an {{msync}} API was introduced (HDFS-13688) to allow for a 
> client to fetch the latest transaction ID from the Active NN, thereby 
> ensuring that subsequent reads from the ObserverNode will be up-to-date with 
> the current state of the Active.
> Using this properly, however, requires application-side changes: for 
> example, a NodeManager should call {{msync}} before localizing the resources 
> for a client, since it received notification of the existence of those 
> resources via communication that is out-of-band to HDFS, and thus could 
> potentially attempt to localize them before those resources are available 
> on the ObserverNode.
> Until such application-side changes can be made, which will be a longer-term 
> effort, we need to provide a mechanism for unchanged clients to utilize the 
> ObserverNode without exposing such a client to inconsistencies. This is 
> essentially phase 3 of the roadmap outlined in the [design 
> document|https://issues.apache.org/jira/secure/attachment/12915990/ConsistentReadsFromStandbyNode.pdf]
>  for HDFS-12943.
> The design document proposes some heuristics based on understanding of how 
> common applications (e.g. MR) use HDFS for resources. As an initial pass, we 
> can simply have a flag which tells a client to call {{msync}} before _every 
> single_ read operation. This may seem counterintuitive, as it turns every 
> read operation into two RPCs: an {{msync}} to the Active followed by an actual 
> read operation to the Observer. However, the {{msync}} operation is extremely 
> lightweight, as it does not acquire the {{FSNamesystemLock}}, and in 
> experiments we have found that this approach can easily scale to well over 
> 100,000 {{msync}} operations per second on the Active (while still servicing 
> approx. 10,000 write op/s). Combined with the fast-path edit log tailing for 
> standby/observer nodes (HDFS-13150), this "always msync" approach should 
> introduce only a few ms of extra latency to each read call.
> Below are some experimental results collected from experiments which convert 
> a normal RPC workload into one in which all read operations are turned into 
> an {{msync}}. The baseline is a workload of 1.5k write op/s and 25k read op/s.
> ||Rate Multiplier||2||4||6||8||
> |RPC Queue Avg Time (ms)|14|53|110|125|
> |RPC Queue NumOps Avg (k)|51|102|147|177|
> |RPC Queue NumOps Max (k)|148|269|306|312|
> _(numbers are approximate and should be viewed primarily for their trends)_
> Results are promising up to between 4x and 6x of the baseline workload, which 
> is approx. 100-150k read op/s.
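The always-msync flow described above can be sketched as a small simulation: the client first fetches the latest transaction ID from the Active, and the read may be served by the Observer only once it has applied up to that ID. All class and method names below are illustrative, not actual HDFS client or NameNode code.

```java
// Simulation of the "always msync" read path; names are illustrative.
import java.util.concurrent.atomic.AtomicLong;

public class AlwaysMsyncDemo {
    /** Stands in for the Active NameNode, the source of truth for txids. */
    public static class ActiveNN {
        final AtomicLong lastTxId = new AtomicLong(0);
        public long write() { return lastTxId.incrementAndGet(); } // each write bumps the txid
        public long msync() { return lastTxId.get(); }  // lightweight: no FSNamesystemLock
    }

    /** Stands in for an Observer that applies edits with some lag. */
    public static class ObserverNN {
        long appliedTxId = 0;
        public void tailEdits(ActiveNN active) { appliedTxId = active.lastTxId.get(); }
        /** A consistent read may be served only once the observer reaches seenTxId. */
        public boolean canServeRead(long seenTxId) { return appliedTxId >= seenTxId; }
    }

    /** One "always msync" read: RPC 1 (msync) to the Active, RPC 2 (read) to the Observer. */
    public static boolean consistentRead(ActiveNN active, ObserverNN observer) {
        long seen = active.msync();            // fetch the latest txid from the Active
        if (!observer.canServeRead(seen)) {
            observer.tailEdits(active);        // in HDFS the observer catches up on its own
        }
        return observer.canServeRead(seen);    // the read now reflects the msync point
    }

    public static void main(String[] args) {
        ActiveNN active = new ActiveNN();
        ObserverNN observer = new ObserverNN();
        active.write();   // a write the observer has not yet applied
        System.out.println("consistent read served: " + consistentRead(active, observer));
    }
}
```

The per-read cost is the extra msync RPC; the waiting (here, a single catch-up call) is what the fast-path edit log tailing of HDFS-13150 keeps down to a few milliseconds.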



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14355) Implement SCM cache using pure java mapped byte buffer

2019-03-16 Thread Feilong He (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794291#comment-16794291
 ] 

Feilong He edited comment on HDFS-14355 at 3/16/19 3:28 PM:


[~rakeshr], thanks so much for your review and comments! 
 # I have renamed the configs as you suggested.
 # A new way to obtain a MappableBlockLoader instance has been introduced, replacing the previous if-else checks and making it easy to plug in new MappableBlockLoader implementations.
 # The message has been supplemented with the specific property name.
 # An unmap statement has been added in that piece of code.
 # Yes, this assert statement is not necessary and has been removed.
 # The path to which a block is mapped is now described in the corresponding function's javadoc.
 # An annotation has been added for the {{public static void verifyIfValidPmemVolume(File pmemDir)}} function.
 # Annotations have been added for the two new classes.
 # 1) The MLOCK config has been removed. 2) The test class has been moved into the suggested package, so our previous change to FsDatasetImpl is unnecessary.

I have uploaded a new patch. It can be applied directly to the upstream trunk branch. [~rakeshr], please feel free to post further comments if you have more suggestions. Suggestions from other reviewers are also welcome.


was (Author: philohe):
[~rakeshr], thanks so much for your review and comments! 
 # I have renamed the configs as you suggested.
 # A new way to obtain a MappableBlockLoader instance has been introduced, replacing the previous if-else checks and making it easy to plug in new MappableBlockLoader implementations.
 # The message has been supplemented with the specific property name.
 # An unmap statement has been added in that piece of code.
 # Yes, this assert statement is not necessary and has been removed.
 # The path to which a block is mapped is now described in the corresponding function's javadoc.
 # An annotation has been added for the {{public static void verifyIfValidPmemVolume(File pmemDir)}} function.
 # Annotations have been added for the two new classes.
 # 1) The MLOCK config has been removed. 2) The test class has been moved into the suggested package, so our previous change to FsDatasetImpl is unnecessary.

I have uploaded a new patch. It can be applied directly to the upstream trunk branch. [~rakeshr], please feel free to post further comments if you have more suggestions. Suggestions from other reviewers are also welcome.

> Implement SCM cache using pure java mapped byte buffer
> --
>
> Key: HDFS-14355
> URL: https://issues.apache.org/jira/browse/HDFS-14355
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14355.000.patch, HDFS-14355.001.patch, 
> HDFS-14355.002.patch
>
>
> This task is to implement caching to persistent memory using pure 
> {{java.nio.MappedByteBuffer}}, which could be useful where native support 
> isn't available or convenient in some environments or platforms.
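The pure-Java mapping approach described above can be sketched as follows. Method and file names are illustrative, not the patch's actual code; a pmem-backed file would be used in the real feature, but any file works for the mechanics.

```java
// Sketch of block caching via a pure-Java MappedByteBuffer (illustrative names).
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedCacheDemo {
    /** Copy a block's bytes into a memory-mapped cache file, flush, and read back. */
    public static byte[] cacheBlock(Path cacheFile, byte[] blockData) {
        try (FileChannel ch = FileChannel.open(cacheFile,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_WRITE, 0, blockData.length);
            buf.put(blockData);  // write the block through the mapping
            buf.force();         // flush mapped pages to the backing storage
            buf.flip();
            byte[] readBack = new byte[blockData.length];
            buf.get(readBack);   // later reads are served from the mapping
            return readBack;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("block", ".cache");
        byte[] data = "hello-block".getBytes();
        System.out.println(new String(cacheBlock(tmp, data))); // prints "hello-block"
        Files.deleteIfExists(tmp);
    }
}
```

Unlike the native mlock path, nothing here requires JNI, which is the portability benefit the description mentions.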






[jira] [Commented] (HDFS-14355) Implement SCM cache using pure java mapped byte buffer

2019-03-16 Thread Feilong He (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794291#comment-16794291
 ] 

Feilong He commented on HDFS-14355:
---

[~rakeshr], thanks so much for your review and comments! 
 # I have renamed the configs as you suggested.
 # A new way to obtain a MappableBlockLoader instance has been introduced, replacing the previous if-else checks and making it easy to plug in new MappableBlockLoader implementations.
 # The message has been supplemented with the specific property name.
 # An unmap statement has been added in that piece of code.
 # Yes, this assert statement is not necessary and has been removed.
 # The path to which a block is mapped is now described in the corresponding function's javadoc.
 # An annotation has been added for the {{public static void verifyIfValidPmemVolume(File pmemDir)}} function.
 # Annotations have been added for the two new classes.
 # 1) The MLOCK config has been removed. 2) The test class has been moved into the suggested package, so our previous change to FsDatasetImpl is unnecessary.

I have uploaded a new patch. It can be applied directly to the upstream trunk branch. [~rakeshr], please feel free to post further comments if you have more suggestions. Suggestions from other reviewers are also welcome.

> Implement SCM cache using pure java mapped byte buffer
> --
>
> Key: HDFS-14355
> URL: https://issues.apache.org/jira/browse/HDFS-14355
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14355.000.patch, HDFS-14355.001.patch, 
> HDFS-14355.002.patch
>
>
> This task is to implement caching to persistent memory using pure 
> {{java.nio.MappedByteBuffer}}, which could be useful where native support 
> isn't available or convenient in some environments or platforms.






[jira] [Updated] (HDFS-14355) Implement SCM cache using pure java mapped byte buffer

2019-03-16 Thread Feilong He (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14355:
--
Attachment: HDFS-14355.002.patch

> Implement SCM cache using pure java mapped byte buffer
> --
>
> Key: HDFS-14355
> URL: https://issues.apache.org/jira/browse/HDFS-14355
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14355.000.patch, HDFS-14355.001.patch, 
> HDFS-14355.002.patch
>
>
> This task is to implement caching to persistent memory using pure 
> {{java.nio.MappedByteBuffer}}, which could be useful where native support 
> isn't available or convenient in some environments or platforms.






[jira] [Commented] (HDFS-14349) Edit log may be rolled more frequently than necessary with multiple Standby nodes

2019-03-16 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794201#comment-16794201
 ] 

star commented on HDFS-14349:
-

The auto-roll operation is executed by the Active NameNode's own service; it is not triggered by the SNNs.

So it will not degrade NameNode performance as more Standby NameNodes are added.

Related code in FSNamesystem:

{code:java}
// startActiveServices(): the edit log roller daemon runs only on the Active
void startActiveServices() throws IOException {
  ...
  nnEditLogRoller = new Daemon(new NameNodeEditLogRoller(
      editLogRollerThreshold, editLogRollerInterval));
  nnEditLogRoller.start();
  ...
}

// NameNodeEditLogRoller: the Active rolls its own edit log once the open
// segment exceeds the configured threshold
long numEdits = getCorrectTransactionsSinceLastLogRoll();
if (numEdits > rollThreshold) {
  FSNamesystem.LOG.info("NameNode rolling its own edit log because"
      + " number of edits in open segment exceeds threshold of "
      + rollThreshold);
  rollEditLog();
}
{code}

> Edit log may be rolled more frequently than necessary with multiple Standby 
> nodes
> -
>
> Key: HDFS-14349
> URL: https://issues.apache.org/jira/browse/HDFS-14349
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, hdfs, qjm
>Reporter: Erik Krogen
>Assignee: Ekanth Sethuramalingam
>Priority: Major
>
> When HDFS-14317 was fixed, we tackled the problem that in a cluster with 
> in-progress edit log tailing enabled, a Standby NameNode may _never_ roll the 
> edit logs, which can eventually cause data loss.
> Unfortunately, in the process, it was made so that if there are multiple 
> Standby NameNodes, they will all roll the edit logs at their specified 
> frequency, so the edit log will be rolled X times more frequently than it 
> should be (where X is the number of Standby NNs). This is not as bad as the 
> original bug, since rolling frequently does not affect correctness or data 
> availability, but it may degrade performance by creating more edit log 
> segments than necessary.






[jira] [Commented] (HDDS-1215) Change hadoop-runner and apache/hadoop base image to use Java8

2019-03-16 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794169#comment-16794169
 ] 

Elek, Marton commented on HDDS-1215:


The images are created. The only remaining part is to update the simple Ozone 
docker-compose file to use jdk11 instead of jdk8 (= latest):

{code}
docker run apache/hadoop-runner:jdk8 java -version
openjdk version "1.8.0_201"
OpenJDK Runtime Environment (build 1.8.0_201-b09)
OpenJDK 64-Bit Server VM (build 25.201-b09, mixed mode)

docker run apache/hadoop-runner:jdk11 java -version
openjdk version "11.0.2" 2019-01-15 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.2+7-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+7-LTS, mixed mode, sharing)
{code}
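The docker-compose change described above would look roughly like the fragment below; the service names and file layout are illustrative, not the actual Ozone compose file:

```yaml
# Hypothetical sketch: pin the Ozone containers to the jdk11 image tag
# instead of the default jdk8/latest tag.
version: "3"
services:
  datanode:
    image: apache/hadoop-runner:jdk11
  om:
    image: apache/hadoop-runner:jdk11
  scm:
    image: apache/hadoop-runner:jdk11
```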

> Change hadoop-runner and apache/hadoop base image to use Java8
> --
>
> Key: HDDS-1215
> URL: https://issues.apache.org/jira/browse/HDDS-1215
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Elek, Marton
>Priority: Blocker
>
> {code}
> kms_1           | Exception in thread "main" java.lang.NoClassDefFoundError: 
> javax/activation/DataSource
> kms_1           | at 
> com.sun.xml.bind.v2.model.impl.RuntimeBuiltinLeafInfoImpl.(RuntimeBuiltinLeafInfoImpl.java:457)
> kms_1           | at 
> com.sun.xml.bind.v2.model.impl.RuntimeTypeInfoSetImpl.(RuntimeTypeInfoSetImpl.java:65)
> kms_1           | at 
> com.sun.xml.bind.v2.model.impl.RuntimeModelBuilder.createTypeInfoSet(RuntimeModelBuilder.java:133)
> kms_1           | at 
> com.sun.xml.bind.v2.model.impl.RuntimeModelBuilder.createTypeInfoSet(RuntimeModelBuilder.java:85)
> kms_1           | at 
> com.sun.xml.bind.v2.model.impl.ModelBuilder.(ModelBuilder.java:156)
> kms_1           | at 
> com.sun.xml.bind.v2.model.impl.RuntimeModelBuilder.(RuntimeModelBuilder.java:93)
> kms_1           | at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:473)
> kms_1           | at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl.(JAXBContextImpl.java:319)
> kms_1           | at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1170)
> kms_1           | at 
> com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:145)
> kms_1           | at 
> com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:236)
> kms_1           | at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> kms_1           | at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> kms_1           | at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> kms_1           | at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> kms_1           | at 
> javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:186)
> kms_1           | at 
> javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:146)
> kms_1           | at javax.xml.bind.ContextFinder.find(ContextFinder.java:350)
> kms_1           | at 
> javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:446)
> kms_1           | at 
> javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:409)
> kms_1           | at 
> com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.(WadlApplicationContextImpl.java:103)
> kms_1           | at 
> com.sun.jersey.server.impl.wadl.WadlFactory.init(WadlFactory.java:100)
> kms_1           | at 
> com.sun.jersey.server.impl.application.RootResourceUriRules.initWadl(RootResourceUriRules.java:169)
> kms_1           | at 
> com.sun.jersey.server.impl.application.RootResourceUriRules.(RootResourceUriRules.java:106)
> kms_1           | at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._initiate(WebApplicationImpl.java:1359)
> kms_1           | at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.access$700(WebApplicationImpl.java:180)
> kms_1           | at 
> com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:799)
> kms_1           | at 
> com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:795)
> {code}






[jira] [Commented] (HDFS-14374) Expose total number of delegation tokens in AbstractDelegationTokenSecretManager

2019-03-16 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794154#comment-16794154
 ] 

He Xiaoqiao commented on HDFS-14374:


[~crh] Thanks for working on this. Just one suggestion: we could expose the 
number as a metric; that would make it more convenient to monitor. FYI.
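As a rough illustration of this suggestion (names are hypothetical, and the Hadoop metrics2 wiring with @Metric annotations is omitted), the secret manager could maintain the active-token count and expose it through a getter that a gauge polls:

```java
// Hypothetical sketch: track active delegation tokens and expose the count
// for a gauge-style metric. Not the actual AbstractDelegationTokenSecretManager code.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class TokenCountDemo {
    private final ConcurrentMap<String, Long> currentTokens = new ConcurrentHashMap<>();

    public void storeToken(String id, long expiryMs) { currentTokens.put(id, expiryMs); }
    public void removeToken(String id) { currentTokens.remove(id); }

    /** The value a gauge metric would report for observability. */
    public int getCurrentTokensCount() { return currentTokens.size(); }

    public static void main(String[] args) {
        TokenCountDemo mgr = new TokenCountDemo();
        mgr.storeToken("token-1", 1000L);
        mgr.storeToken("token-2", 2000L);
        mgr.removeToken("token-1");
        System.out.println("active tokens: " + mgr.getCurrentTokensCount()); // prints "active tokens: 1"
    }
}
```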

> Expose total number of delegation tokens in 
> AbstractDelegationTokenSecretManager
> 
>
> Key: HDFS-14374
> URL: https://issues.apache.org/jira/browse/HDFS-14374
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14374.001.patch, HDFS-14374.002.patch
>
>
> AbstractDelegationTokenSecretManager should expose total number of active 
> delegation tokens for specific implementations to track for observability.


