[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736421#comment-14736421 ]

Hudson commented on HDFS-8929:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #367 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/367/])
HDFS-8929. Add a metric to expose the timestamp of the last journal (Contributed by surendra singh lilhore) (vinayakumarb: rev 94cf7ab9d28a885181afeb2c181dfe857d158254)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournalNode.java
* hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

> Add a metric to expose the timestamp of the last journal
> --------------------------------------------------------
>
> Key: HDFS-8929
> URL: https://issues.apache.org/jira/browse/HDFS-8929
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: journal-node
> Reporter: Akira AJISAKA
> Assignee: Surendra Singh Lilhore
> Fix For: 2.8.0
>
> Attachments: HDFS-8929-001.patch, HDFS-8929-002.patch, HDFS-8929-003.patch, HDFS-8929-004.patch, HDFS-8929-005.patch
>
> If there are three JNs and only one JN is failing to journal, we can detect
> it by monitoring the difference in the last written transaction ID among JNs
> from the NN WebUI or JN metrics. However, it's difficult to define an alert
> threshold because the rate at which the number of transactions increases
> depends on how busy the cluster is. Therefore I'd like to propose a metric
> to expose the timestamp of the last journal. That way we can easily alert
> if a JN has been failing to journal for some fixed period.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
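The alerting idea in the issue description can be sketched on the monitoring side as follows. This is a minimal illustration, not part of the HDFS-8929 patch: the class `JournalStalenessCheck`, the method `isStale`, the five-minute window, and the sample values are all hypothetical, and the sketch assumes the metric is reported as epoch milliseconds.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical monitoring-side check; not code from the HDFS-8929 patch. */
public class JournalStalenessCheck {

  /**
   * Returns true if the last journal write is older than maxAgeMillis.
   * Assumes both timestamps are epoch milliseconds (an assumption of this sketch).
   */
  public static boolean isStale(long lastJournalTimestamp, long nowMillis, long maxAgeMillis) {
    return nowMillis - lastJournalTimestamp > maxAgeMillis;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long maxAge = TimeUnit.MINUTES.toMillis(5); // assumed alert window

    // Made-up sample values standing in for the metric scraped from three JNs.
    Map<String, Long> lastJournal = new LinkedHashMap<>();
    lastJournal.put("jn1", now - TimeUnit.SECONDS.toMillis(2));
    lastJournal.put("jn2", now - TimeUnit.SECONDS.toMillis(3));
    lastJournal.put("jn3", now - TimeUnit.MINUTES.toMillis(20));

    for (Map.Entry<String, Long> e : lastJournal.entrySet()) {
      if (isStale(e.getValue(), now, maxAge)) {
        System.out.println("ALERT: " + e.getKey() + " has not journaled within the window");
      }
    }
  }
}
```

Unlike a threshold on the transaction ID gap, the fixed time window here does not need to be tuned to how busy the cluster is, which is the motivation given in the description.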
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736419#comment-14736419 ]

Hudson commented on HDFS-8929:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #359 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/359/])
HDFS-8929. Add a metric to expose the timestamp of the last journal (Contributed by surendra singh lilhore) (vinayakumarb: rev 94cf7ab9d28a885181afeb2c181dfe857d158254)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournalNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736392#comment-14736392 ]

Hudson commented on HDFS-8929:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #347 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/347/])
HDFS-8929. Add a metric to expose the timestamp of the last journal (Contributed by surendra singh lilhore) (vinayakumarb: rev 94cf7ab9d28a885181afeb2c181dfe857d158254)
* hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournalNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736377#comment-14736377 ]

Hadoop QA commented on HDFS-8929:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 20m 25s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 2s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 38s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 3m 25s | Site still builds. |
| {color:green}+1{color} | checkstyle | 1m 40s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 23m 56s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 57m 59s | Tests failed in hadoop-hdfs. |
| | | 133m 9s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestParallelShortCircuitRead |
| | hadoop.fs.contract.hdfs.TestHDFSContractMkdir |
| | hadoop.hdfs.server.datanode.TestDataNodeUUID |
| | hadoop.hdfs.server.namenode.TestAllowFormat |
| | hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens |
| | hadoop.hdfs.TestBlockStoragePolicy |
| | hadoop.hdfs.server.datanode.TestRefreshNamenodes |
| | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
| | hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics |
| | hadoop.cli.TestDeleteCLI |
| | hadoop.hdfs.tools.TestDFSZKFailoverController |
| | hadoop.hdfs.TestFileLengthOnClusterRestart |
| | hadoop.hdfs.TestAppendSnapshotTruncate |
| | hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing |
| | hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory |
| | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots |
| | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate |
| | hadoop.cli.TestHDFSCLI |
| | hadoop.hdfs.server.datanode.TestDatanodeStartupOptions |
| | hadoop.hdfs.server.namenode.TestCheckpoint |
| | hadoop.hdfs.TestDFSUpgradeFromImage |
| | hadoop.hdfs.TestReplaceDatanodeOnFailure |
| | hadoop.hdfs.tools.TestGetGroups |
| | hadoop.hdfs.TestRemoteBlockReader2 |
| | hadoop.hdfs.server.namenode.TestStartup |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy |
| | hadoop.hdfs.TestDFSStorageStateRecovery |
| | hadoop.hdfs.server.namenode.TestFSImageWithXAttr |
| | hadoop.hdfs.TestRemoteBlockReader |
| | hadoop.hdfs.TestMultiThreadedHflush |
| | hadoop.fs.contract.hdfs.TestHDFSContractRename |
| | hadoop.hdfs.TestBlockReaderLocal |
| | hadoop.cli.TestCacheAdminCLI |
| | hadoop.hdfs.server.mover.TestMover |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
| | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
| | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits |
| | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
| | hadoop.hdfs.server.namenode.TestNameNodeRecovery |
| | hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir |
| | hadoop.fs.loadGenerator.TestLoadGenerator |
| | hadoop.hdfs.server.namenode.TestFSImageWithAcl |
| | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
| | hadoop.fs.TestFcHdfsSetUMask |
| | hadoop.hdfs.TestPread |
| | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
| | hadoop.hdfs.server.datanode.TestFsDatasetCacheRevocation |
| | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
| | hadoop.hdfs.crypto.TestHdfsCryptoStreams |
| | hadoop.fs.viewfs.TestViewFsFileStatusHdfs |
| | hadoop.hdfs.server.namenode.TestCommitBlockSynchronization |
| | hadoop.hdfs.server.datanode.TestReadOnlySharedStorage |
| | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl |
| | hadoop.hdfs.TestDFSAddressConfig |
| | hadoop.tracing.TestTracingShortCircuitLocalRead |
| | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
| | hadoop.hdfs.server.namenode.TestFSDirectory |
| | hadoop.hdfs.server.datanode.TestBlockScanner |
| | hadoop.hdfs.ser
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736359#comment-14736359 ]

Hudson commented on HDFS-8929:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #1098 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1098/])
HDFS-8929. Add a metric to expose the timestamp of the last journal (Contributed by surendra singh lilhore) (vinayakumarb: rev 94cf7ab9d28a885181afeb2c181dfe857d158254)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournalNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalMetrics.java
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736353#comment-14736353 ]

Hudson commented on HDFS-8929:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2309 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2309/])
HDFS-8929. Add a metric to expose the timestamp of the last journal (Contributed by surendra singh lilhore) (vinayakumarb: rev 94cf7ab9d28a885181afeb2c181dfe857d158254)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournalNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalMetrics.java
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736312#comment-14736312 ]

Hudson commented on HDFS-8929:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2286 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2286/])
HDFS-8929. Add a metric to expose the timestamp of the last journal (Contributed by surendra singh lilhore) (vinayakumarb: rev 94cf7ab9d28a885181afeb2c181dfe857d158254)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournalNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalMetrics.java
* hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736240#comment-14736240 ]

Surendra Singh Lilhore commented on HDFS-8929:
----------------------------------------------

Thanks [~vinayrpet] for the review and commit.
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736225#comment-14736225 ]

Hudson commented on HDFS-8929:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #8419 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8419/])
HDFS-8929. Add a metric to expose the timestamp of the last journal (Contributed by surendra singh lilhore) (vinayakumarb: rev 94cf7ab9d28a885181afeb2c181dfe857d158254)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournalNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736208#comment-14736208 ]

Vinayakumar B commented on HDFS-8929:
-------------------------------------

+1 for the latest update.

> Add a metric to expose the timestamp of the last journal
> --------------------------------------------------------
>
> Key: HDFS-8929
> URL: https://issues.apache.org/jira/browse/HDFS-8929
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: journal-node
> Reporter: Akira AJISAKA
> Assignee: Surendra Singh Lilhore
> Attachments: HDFS-8929-001.patch, HDFS-8929-002.patch, HDFS-8929-003.patch, HDFS-8929-004.patch, HDFS-8929-005.patch
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735680#comment-14735680 ]

Hadoop QA commented on HDFS-8929:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 22m 18s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 46s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 58s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 3m 1s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 25s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 19s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 58s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 162m 36s | Tests failed in hadoop-hdfs. |
| | | 237m 49s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCacheRevocation |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12754679/HDFS-8929-004.patch |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / 970daaa |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12342/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12342/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12342/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12342/console |

This message was automatically generated.

> Add a metric to expose the timestamp of the last journal
> --------------------------------------------------------
>
> Key: HDFS-8929
> URL: https://issues.apache.org/jira/browse/HDFS-8929
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: journal-node
> Reporter: Akira AJISAKA
> Assignee: Surendra Singh Lilhore
> Attachments: HDFS-8929-001.patch, HDFS-8929-002.patch, HDFS-8929-003.patch, HDFS-8929-004.patch
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734747#comment-14734747 ]

Vinayakumar B commented on HDFS-8929:
-------------------------------------

The changes look great. Just one more improvement to the test: the timestamp update can be verified after each batch of edits is sent in TestJournalNode#testJournal(). +1 once addressed.

> Add a metric to expose the timestamp of the last journal
> --------------------------------------------------------
>
> Key: HDFS-8929
> URL: https://issues.apache.org/jira/browse/HDFS-8929
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: journal-node
> Reporter: Akira AJISAKA
> Assignee: Surendra Singh Lilhore
> Attachments: HDFS-8929-001.patch, HDFS-8929-002.patch, HDFS-8929-003.patch
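The kind of check requested in this review comment might look roughly like the sketch below. It is an illustration under stated assumptions, not the actual TestJournalNode#testJournal() code: `FakeJournal`, its `journal` method, and the injected clock are hypothetical stand-ins so the example runs without a Hadoop cluster.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; the class and method names are not from the patch. */
public class LastJournalTimestampSketch {

  /** Stand-in for a journal that records when it last wrote edits. */
  static class FakeJournal {
    private final AtomicLong lastJournalTimestamp = new AtomicLong(0L);

    void journal(byte[] edits, long nowMillis) {
      // A real journal would persist the edits before updating the gauge.
      lastJournalTimestamp.set(nowMillis);
    }

    long getLastJournalTimestamp() {
      return lastJournalTimestamp.get();
    }
  }

  /** Sends several batches of edits and checks the timestamp after each one. */
  public static boolean timestampAdvancesAfterEveryBatch(int batches) {
    FakeJournal journal = new FakeJournal();
    long clock = 1_000L; // injected clock so the check is deterministic
    long previous = journal.getLastJournalTimestamp();
    for (int i = 0; i < batches; i++) {
      clock += 10;
      journal.journal(new byte[] {(byte) i}, clock);
      long current = journal.getLastJournalTimestamp();
      if (current <= previous) {
        return false; // the timestamp did not move forward after this batch
      }
      previous = current;
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(timestampAdvancesAfterEveryBatch(3));
  }
}
```

Checking after every batch, rather than once at the end, would catch a gauge that is set only on the first write.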
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731723#comment-14731723 ]

Brahma Reddy Battula commented on HDFS-8929:
--------------------------------------------

[~surendrasingh] thanks for updating the patch. The latest patch LGTM. [~ajisakaa], do you have any comments on this latest patch?

> Add a metric to expose the timestamp of the last journal
> --------------------------------------------------------
>
> Key: HDFS-8929
> URL: https://issues.apache.org/jira/browse/HDFS-8929
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: journal-node
> Reporter: Akira AJISAKA
> Assignee: Surendra Singh Lilhore
> Attachments: HDFS-8929-001.patch, HDFS-8929-002.patch, HDFS-8929-003.patch
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730520#comment-14730520 ]

Surendra Singh Lilhore commented on HDFS-8929:
----------------------------------------------

The failed test cases are unrelated.

> Add a metric to expose the timestamp of the last journal
> --------------------------------------------------------
>
> Key: HDFS-8929
> URL: https://issues.apache.org/jira/browse/HDFS-8929
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: journal-node
> Reporter: Akira AJISAKA
> Assignee: Surendra Singh Lilhore
> Attachments: HDFS-8929-001.patch, HDFS-8929-002.patch, HDFS-8929-003.patch
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728006#comment-14728006 ]

Hadoop QA commented on HDFS-8929:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 22m 43s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 54s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 11s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 3m 3s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 28s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 25s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 20s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 53s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 164m 20s | Tests failed in hadoop-hdfs. |
| | | 240m 15s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestStorageRestore |
| | hadoop.hdfs.web.TestWebHDFSOAuth2 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12753798/HDFS-8929-003.patch |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7d6687f |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12249/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12249/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12249/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12249/console |

This message was automatically generated.

> Add a metric to expose the timestamp of the last journal
> --------------------------------------------------------
>
> Key: HDFS-8929
> URL: https://issues.apache.org/jira/browse/HDFS-8929
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: journal-node
> Reporter: Akira AJISAKA
> Assignee: Surendra Singh Lilhore
> Attachments: HDFS-8929-001.patch, HDFS-8929-002.patch, HDFS-8929-003.patch
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716071#comment-14716071 ]

Brahma Reddy Battula commented on HDFS-8929:
--------------------------------------------

[~surendrasingh] thanks for updating the patch. You again missed correcting *transections* in the following two lines; it should be *transactions*:

1) `LastJournalTimestamp` | The timestamp of last *successfully written transections*
2) @Metric("The timestamp of last *successfully written transections* ")

3) Please also remove the whitespace.
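For readers following the thread, the alerting idea behind `LastJournalTimestamp` (flag a JournalNode that has not journaled for some fixed period) could be sketched roughly as below. This is only an illustration, not code from any of the attached patches: the class `JournalStalenessCheck`, the method `isStale`, and the 5-minute threshold are hypothetical, and the sketch assumes the metric value is a wall-clock timestamp in epoch milliseconds that a monitoring system has already fetched from the JN.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical monitoring-side helper; not part of Hadoop or of the HDFS-8929 patch.
 * Assumes LastJournalTimestamp is reported in epoch milliseconds.
 */
public final class JournalStalenessCheck {

    private JournalStalenessCheck() {
    }

    /**
     * Returns true when the last successful journal write is older than maxAgeMs.
     *
     * @param lastJournalTimestampMs value read from the JN metric (assumed epoch millis)
     * @param nowMs                  current time in epoch millis
     * @param maxAgeMs               fixed alerting period chosen by the operator
     */
    public static boolean isStale(long lastJournalTimestampMs, long nowMs, long maxAgeMs) {
        return nowMs - lastJournalTimestampMs > maxAgeMs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long maxAge = TimeUnit.MINUTES.toMillis(5); // illustrative threshold only

        // A JN whose last journal write was 10 minutes ago exceeds the threshold.
        System.out.println(isStale(now - TimeUnit.MINUTES.toMillis(10), now, maxAge));
        // A JN that journaled 30 seconds ago does not.
        System.out.println(isStale(now - TimeUnit.SECONDS.toMillis(30), now, maxAge));
    }
}
```

Because the check compares against a fixed time period rather than a transaction-id gap, the threshold does not have to be tuned to how busy the cluster is, which is the motivation given in the issue description.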
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14714478#comment-14714478 ]

Hadoop QA commented on HDFS-8929:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 21m 53s | Pre-patch trunk has 4 extant Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 4s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 3m 1s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 36s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 24s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 59s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 0m 25s | Tests failed in hadoop-hdfs. |
| | | 75m 40s | |
\\
\\
|| Reason || Tests ||
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12752499/HDFS-8929-002.patch |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / a4d9acc |
| Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12138/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12138/artifact/patchprocess/whitespace.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12138/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12138/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12138/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12138/console |

This message was automatically generated.
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712919#comment-14712919 ]

Brahma Reddy Battula commented on HDFS-8929:
--------------------------------------------

[~surendrasingh] thanks for working on this issue. The patch looks good overall. Some minor nits: can you correct the following typos?

1) `LastJournalTimestamp` | The timestamp of last *successfuly written transections*
2) @Metric("The timestamp of last *successfuly written transections*")
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710586#comment-14710586 ]

Surendra Singh Lilhore commented on HDFS-8929:
----------------------------------------------

The failed test cases are unrelated.
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710374#comment-14710374 ]

Hadoop QA commented on HDFS-8929:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 22m 21s | Pre-patch trunk has 1 extant Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 10m 35s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 11m 49s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 26s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 3m 42s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 52s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 39s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 5m 9s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests | 25m 4s | Tests failed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 84m 17s | Tests failed in hadoop-hdfs. |
| | | 168m 39s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.security.token.delegation.web.TestWebDelegationToken |
| Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12752079/HDFS-8929-001.patch |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / 48774d0 |
| Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12096/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12096/artifact/patchprocess/whitespace.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12096/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12096/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12096/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12096/console |

This message was automatically generated.
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706056#comment-14706056 ]

Akira AJISAKA commented on HDFS-8929:
-------------------------------------

I haven't started yet. I'd appreciate your contribution.
[jira] [Commented] (HDFS-8929) Add a metric to expose the timestamp of the last journal
[ https://issues.apache.org/jira/browse/HDFS-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704544#comment-14704544 ]

Surendra Singh Lilhore commented on HDFS-8929:
----------------------------------------------

I am interested in working on this. Please feel free to reassign if you have already started working on it.