[jira] [Commented] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436339#comment-15436339 ]

Charles Lamb commented on HDFS-4210:
------------------------------------

bq. I am taking over the jira in order to push it over the finish line. Hope that is ok with you.

No problem [~jzhuge]. Thanks for taking over.

> NameNode Format should not fail for DNS resolution on minority of JournalNode
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-4210
>                 URL: https://issues.apache.org/jira/browse/HDFS-4210
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, journal-node, namenode
>    Affects Versions: 2.6.0
>            Reporter: Damien Hardy
>            Assignee: John Zhuge
>            Priority: Trivial
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-4210.001.patch
>
> Setting:
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
> cdh4master01 and cdh4master02 JournalNodes up and running,
> cdh4worker03 not yet provisioned (no DNS entry).
> With this, `hadoop namenode -format` fails with:
> 12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>     ... 5 more
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>     at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>     at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161)
>     at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>     at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>     at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>     at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104)
>     at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93)
>     ... 10 more
> I suggest that if the quorum is up, format should not fail.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
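[Editor's note] The HDFS-4210 suggestion above — don't fail format while a quorum of JournalNodes resolves — boils down to a majority check over DNS resolution. A minimal, hypothetical sketch of that check (class and method names are invented for illustration; this is not the actual patch, and the real fix also has to cope with IPCLoggerChannelMetrics building a metrics name from the unresolved address, which is where the NPE in the trace originates):

```java
import java.net.InetSocketAddress;
import java.util.List;

// Hypothetical sketch (not HDFS API): during format, tolerate unresolvable
// JournalNode hostnames as long as a strict majority still resolves, since
// a quorum is all that quorum-journal edit logging needs.
public class QuorumResolutionCheck {

    /** Returns true if strictly more than half of the "host:port" entries resolve. */
    public static boolean quorumResolvable(List<String> journalNodes) {
        int resolved = 0;
        for (String hostPort : journalNodes) {
            String host = hostPort.split(":")[0];
            int port = Integer.parseInt(hostPort.split(":")[1]);
            // The InetSocketAddress constructor attempts DNS resolution;
            // isUnresolved() reports whether the lookup failed.
            if (!new InetSocketAddress(host, port).isUnresolved()) {
                resolved++;
            }
        }
        return resolved > journalNodes.size() / 2;
    }
}
```

In the reporter's setting (two of three JournalNodes resolvable), this check passes, so format could proceed instead of aborting on the unresolvable minority.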
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: HDFS-7847.004.patch [~cmccabe], I've rebased the patch. Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, HDFS-7847.004.patch, HDFS-7847.004.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: (was: HDFS-7847.004.patch) Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, HDFS-7847.004.patch, HDFS-7847.005.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: HDFS-7847.005.patch Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, HDFS-7847.004.patch, HDFS-7847.005.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8292) Move conditional in fmt_time from dfs-dust.js to status.html
[ https://issues.apache.org/jira/browse/HDFS-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522637#comment-14522637 ] Charles Lamb commented on HDFS-8292: Yes, I tested it manually. Thanks Andrew. Move conditional in fmt_time from dfs-dust.js to status.html Key: HDFS-8292 URL: https://issues.apache.org/jira/browse/HDFS-8292 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.8.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8292.000.patch Per [~wheat9]'s comment in HDFS-8214, move the check for 0 from dfs-dust.js to status.html. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8292) Move conditional in fmt_time from dfs-dust.js to status.html
[ https://issues.apache.org/jira/browse/HDFS-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8292: --- Attachment: HDFS-8292.000.patch [~andrew.wang], The attached patch moves the check for 0 from fmt_time to status.html. Please take a look when you get a chance. Thanks. Move conditional in fmt_time from dfs-dust.js to status.html Key: HDFS-8292 URL: https://issues.apache.org/jira/browse/HDFS-8292 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.8.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-8292.000.patch Per [~wheat9]'s comment in HDFS-8214, move the check for 0 from dfs-dust.js to status.html. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8292) Move conditional in fmt_time from dfs-dust.js to status.html
[ https://issues.apache.org/jira/browse/HDFS-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8292: --- Status: Patch Available (was: Open) Move conditional in fmt_time from dfs-dust.js to status.html Key: HDFS-8292 URL: https://issues.apache.org/jira/browse/HDFS-8292 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.8.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-8292.000.patch Per [~wheat9]'s comment in HDFS-8214, move the check for 0 from dfs-dust.js to status.html. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Status: Patch Available (was: Reopened) Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, HDFS-7847.004.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb reopened HDFS-7847: Porting to trunk. .004 submitted. Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, HDFS-7847.004.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519304#comment-14519304 ] Charles Lamb commented on HDFS-8214: The test failure is unrelated. The checkstyle issue has already been discussed above. Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Target Version/s: 2.8.0 (was: HDFS-7836) Status: In Progress (was: Patch Available) Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, HDFS-7847.004.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charles Lamb updated HDFS-7923:
-------------------------------
    Attachment: HDFS-7923.002.patch

Thanks for the review and comments [~cmccabe].

{code}
public static final String DFS_NAMENODE_MAX_CONCURRENT_BLOCK_REPORTS_KEY =
    "dfs.namenode.max.concurrent.block.reports";
public static final int DFS_NAMENODE_MAX_CONCURRENT_BLOCK_REPORTS_DEFAULT =
    Integer.MAX_VALUE;
{code}

bq. It seems like this should default to something less than the default number of RPC handler threads, not to MAX_INT. Given that dfs.namenode.handler.count = 10, it seems like this should be no more than 5 or 6, right? The main point here is to avoid having the NN handler threads completely choked with block reports, and that is defeated if the value is MAX_INT. I realize that you probably intended this to be configured. But it seems like we should have a reasonable default that works for most people.

Actually, my intent was to not have this feature kick in unless it was configured, but you have said that you want it enabled by default. I've changed the default for the above setting to 6.

{code}
+  /* Number of block reports currently being processed. */
+  private final AtomicInteger blockReportProcessingCount = new AtomicInteger(0);
{code}

bq. I'm not sure an AtomicInteger makes sense here. We only modify this variable (write to it) when holding the FSN lock in write mode, right? And we only read from it when holding the FSN in read mode. So, there isn't any need to add atomic ops.

Actually, it is incremented outside the FSN lock; otherwise it could never be 1.

bq. I think we need to track which datanodes we gave the green light to, and not decrement the counter until they either send that report, or some timeout expires. (We need the timeout in case datanodes go away after requesting permission-to-send.) The timeout can probably be as short as a few minutes. If you can't manage to send an FBR in a few minutes, there's more problems going on.

I've added a map called 'pendingBlockReports' to BlockManager to track the datanodes that we've given the ok to, as well as when we gave it to them. There's also a method to clean the table.

{code}
public static final String DFS_BLOCKREPORT_MAX_DEFER_MSEC_KEY =
    "dfs.blockreport.max.deferMsec";
public static final long DFS_BLOCKREPORT_MAX_DEFER_MSEC_DEFAULT = Long.MAX_VALUE;
{code}

bq. Do we really need this config key?

I've added a TreeBidiMap called lastBlockReportTime to track this. I would have used Guava instead of apache.commons.collections, but Guava doesn't have a sorted BidiMap.

The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
-----------------------------------------------------------------------------------------------

                Key: HDFS-7923
                URL: https://issues.apache.org/jira/browse/HDFS-7923
            Project: Hadoop HDFS
         Issue Type: Sub-task
           Reporter: Colin Patrick McCabe
           Assignee: Charles Lamb
        Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, HDFS-7923.002.patch

The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
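[Editor's note] The scheme discussed in the HDFS-7923 review above — cap the number of concurrent full block reports and expire grants whose DataNode never follows through — can be sketched as follows. This is an illustration of the idea only, not the actual patch; the class and method names are invented:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of heartbeat-driven FBR rate limiting: grant a
// DataNode permission to send a full block report only while fewer than
// maxConcurrent reports are outstanding, and expire stale grants after a
// timeout so a crashed DataNode cannot pin a slot forever.
public class BlockReportLeases {
    private final int maxConcurrent;
    private final long leaseTimeoutMs;
    // DataNode id -> time the grant was issued.
    private final Map<String, Long> pendingReports = new HashMap<>();

    public BlockReportLeases(int maxConcurrent, long leaseTimeoutMs) {
        this.maxConcurrent = maxConcurrent;
        this.leaseTimeoutMs = leaseTimeoutMs;
    }

    /** Called when a heartbeat requests permission to send an FBR. */
    public synchronized boolean requestPermission(String datanodeId, long nowMs) {
        // Expire grants whose DataNode never sent its report.
        pendingReports.values().removeIf(granted -> nowMs - granted > leaseTimeoutMs);
        if (pendingReports.containsKey(datanodeId)) {
            return true; // this DataNode already holds a valid grant
        }
        if (pendingReports.size() >= maxConcurrent) {
            return false; // ask again on a later heartbeat
        }
        pendingReports.put(datanodeId, nowMs);
        return true;
    }

    /** Called once the full block report has been processed. */
    public synchronized void reportCompleted(String datanodeId) {
        pendingReports.remove(datanodeId);
    }
}
```

With a cap of 6 (the default settled on in the comment above), at most six handler threads can be tied up in block reports at once, while the timeout handles DataNodes that request a slot and then disappear.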
[jira] [Created] (HDFS-8292) Move conditional in fmt_time from dfs-dust.js to status.html
Charles Lamb created HDFS-8292: -- Summary: Move conditional in fmt_time from dfs-dust.js to status.html Key: HDFS-8292 URL: https://issues.apache.org/jira/browse/HDFS-8292 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.8.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Per [~wheat9]'s comment in HDFS-8214, move the check for 0 from dfs-dust.js to status.html. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520613#comment-14520613 ] Charles Lamb commented on HDFS-8214: I created HDFS-8292 for this. Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520603#comment-14520603 ] Charles Lamb commented on HDFS-8214: [~wheat9], I'll make the change in a followup-jira. Thanks for the review. Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: HDFS-7847.004.patch .004 is rebased onto trunk. Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, HDFS-7847.004.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8214: --- Attachment: HDFS-8214.003.patch [~andrew.wang], Thanks for the review. The .003 patch makes your suggested changes. I also added a 0 check to dfs-dust.js. Perhaps it should just return "" instead of "unknown". Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514367#comment-14514367 ] Charles Lamb commented on HDFS-8214: The test failure is spurious. I ran the failed test (TestDiskspaceQuotaUpdate) and it passed on my machine. The checkstyle warning is {quote} <error line="56" column="3" severity="error" message="Redundant &apos;public&apos; modifier." source="com.puppycrawl.tools.checkstyle.checks.modifier.RedundantModifierCheck"/> {quote} This is because I added the new getLastCheckpointDeltaMs() method. It is complaining about 'public' being redundant. I could remove it, but keeping it there maintains the existing style of the other getters. Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8214: --- Attachment: HDFS-8214.001.patch [~yzhang], could you please take a look at this? We could either have Last Checkpoint: be a relative time (e.g. "26 secs ago", as the current SecondaryNameNode#toString does), or we could have it be a wallclock time. Unfortunately, having it be a relative time means that the JS for secondary/status.html would have to compute it in the browser client's local TZ, which is problematic. In fact, Start Time is already in wallclock time, so it feels best to mimic that. This does, however, mean that the #toString method has to be changed (back) to have Last Checkpoint be a wallclock time rather than the relative time that HDFS-5591 changed it to. Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
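[Editor's note] The bug class behind HDFS-8214 is easy to demonstrate: a monotonic clock reading is a duration since an arbitrary origin (often JVM or host uptime), not an epoch timestamp, so rendering it as a date shows times just after 1970-01-01. A minimal illustration; the class below is invented for demonstration and is not HDFS code:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Demonstrates why feeding a Time.monotonicNow()-style value to a date
// formatter produces "weird times, just after the epoch": the value is a
// small elapsed-time number, not milliseconds since 1970.
public class CheckpointTimeDemo {

    /** Render a millisecond value as a UTC year, the way a UI date field would. */
    public static String year(long millis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(millis));
    }
}
```

A monotonic-style value of, say, two days of uptime renders as January 1970, while a wall-clock value from System.currentTimeMillis() renders the current year — which is why the fix switches the display to wall-clock time.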
[jira] [Updated] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8214: --- Status: Patch Available (was: Open) Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8214: --- Attachment: HDFS-8214.002.patch [~andrew.wang], Thanks for the review. Please check out the attached. Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7923: --- Attachment: HDFS-7923.001.patch [~cmccabe], attached is a patch that is rebased onto the trunk. The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages --- Key: HDFS-7923 URL: https://issues.apache.org/jira/browse/HDFS-7923 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511161#comment-14511161 ] Charles Lamb commented on HDFS-8214: No test is needed since it's just a change to a display message. Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-8214.001.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
Charles Lamb created HDFS-8214: -- Summary: Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8099) Change DFSInputStream has been closed already message to debug log level
[ https://issues.apache.org/jira/browse/HDFS-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487805#comment-14487805 ] Charles Lamb commented on HDFS-8099: The test failure is unrelated. No new tests were included since it's just a change from INFO to DEBUG, and I manually tested it with the CLI. Change DFSInputStream has been closed already message to debug log level -- Key: HDFS-8099 URL: https://issues.apache.org/jira/browse/HDFS-8099 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-8099.000.patch, HDFS-8099.001.patch The hadoop fs -get command always shows this warning: {noformat} $ hadoop fs -get /data/schemas/sfdc/BusinessHours-2014-12-09.avsc 15/04/06 06:22:19 WARN hdfs.DFSClient: DFSInputStream has been closed already {noformat} This was introduced by HDFS-7494. The easiest thing is to just remove the warning from the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487558#comment-14487558 ] Charles Lamb commented on HDFS-7240: [~jnp] et al, This is very interesting. Thanks for posting it. Is the 1KB key size limit a hard limit or just a design/implementation target? There will be users who want keys that can be arbitrarily large (e.g. 10's to 100's of KB). So although it may be acceptable to degrade above 1KB, I don't think you want to make it a hard limit. You could argue that they could just hash their keys, or that they could have some sort of key map, but then it would be hard to do secondary indices in the future. The details of partitions are kind of lacking beyond the second to last paragraph on page 4. Are partitions and storage containers 1:1? (A storage container can contain a maximum of one partition...). Obviously a storage container holds more than just a partition. Perhaps a little more detail about partitions and how they are located, etc. is warranted. In the call flow diagram on page 6, it looks like there's a lot going on in terms of network traffic. There's the initial REST call, then an RPC to get the bucket metadata, then one to read the bucket metadata, then another to get the object's container location, then back to the client who gets redirected. That's a lot of REST/RPCs just to get to the actual data. Will any of this be cached, perhaps in the Ozone Handler or maybe even on the client (I realize that's a bit hard with a REST based protocol). For instance, if it were possible to cache some of the hash in the client, then that would cut some RPCs to the Ozone Handler. If the cache were out of date, then the actual call to the data (step (9) in the diagram) could be rejected, the cache invalidated, and the entire call sequence (1) - (8) could be executed to get the right location. IWBNI there was some description of the protocols used between all these moving parts. 
I know that it's REST from client to Ozone Handler, but what about the other network calls in the diagram? Will it be more REST, or Hadoop RPC, or something else? You talk about security at the end so I guess the authentication will be Kerberos based? Or will you allow more authentication options such as those that HDFS currently has? Hash partitioning can also suffer from hotspots depending on the semantics of the key. That's not to say that it's the wrong decision to use it, only that it can have similar drawbacks as key partitioning. Since it looks like you have two separate hashes, one for buckets, and then one for the object key within the bucket, it is possible that there could be hotspots based on a particular bucket. Presumably some sort of caching would help here since the bucket mapping is relatively immutable. Secondary indexing will not be easy in a distributed sharded system, especially the consistency issues in dealing with updates. That said, I am reasonably certain that you will find that many users will need this feature relatively soon such that it is high on the roadmap. You don't say much about ACLs other than to include them in the REST API. I suppose they'll be implemented in the Ozone Handler, but what will they look like? HDFS/Linux ACLs? In the Cluster Level APIs, presumably DELETE Storage Volume is only allowed by the admin. What about GET? How are quotas enabled and set? I don't see it in the API anywhere. There's mention early on that they're set up by the administrator. Perhaps it's via some http jsp thing to the Ozone Handler or Storage Container Manager? Who enforces them? no guarantees on partially written objects - Does this also mean that there are no block-order guarantees during write? Are holey objects allowed or will the only inconsistencies be at the tail of an object. This is obviously important for log-based storage systems. 
In the Size requirements section on page 3 you say Number of objects per bucket: 1 million, and then later on you say A bucket can have millions of objects. You may want to shore that up a little. Also in the Size requirements section you say Object Size: 5G, but then later it says The storage container needs to store object data that can vary from a few hundred KB to hundreds of megabytes. I'm not sure those are necessarily inconsistent, but I'm also not sure how to reconcile them. Perhaps you could include a diagram showing how an object maps to partitions and storage containers and then onto DNs. In other words, a general diagram showing all the various storage concepts (objects, partitions, storage containers, hash tables, transactions, etc.) We plan to re-use Namenode's block management implementation for container management, as much as possible. I'd love to see more detail on what can be reused, what high level changes to the BlkMgr code will be needed, what
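The bucket-then-key hashing concern raised earlier in this comment can be illustrated with a small sketch. Everything here is made up for illustration — the design doc does not specify the hash function, the slot count, or the key format — but it shows why a popular bucket funnels all of its traffic through one bucket-level slot:

```java
import java.nio.charset.StandardCharsets;

public class TwoLevelHashSketch {
    // Toy hash: stands in for whatever bucket/key hash the design actually uses.
    static int slot(String key, int numSlots) {
        int h = 0;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h = 31 * h + (b & 0xff);
        }
        return Math.floorMod(h, numSlots);
    }

    public static void main(String[] args) {
        // Level 1: the bucket name picks a single metadata slot.
        int bucketSlot = slot("volume1/hot-bucket", 64);
        // Level 2: object keys spread across slots within the bucket, but
        // every lookup still resolves bucketSlot first -- so a hot bucket
        // concentrates its combined load on one slot, which is why caching
        // the (relatively immutable) bucket mapping helps.
        int objA = slot("obj-a", 64);
        int objB = slot("obj-b", 64);
        System.out.println(bucketSlot);
        System.out.println(objA);
        System.out.println(objB);
    }
}
```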
[jira] [Created] (HDFS-8099) Remove extraneous warning from DFSInputStream.close()
Charles Lamb created HDFS-8099: -- Summary: Remove extraneous warning from DFSInputStream.close() Key: HDFS-8099 URL: https://issues.apache.org/jira/browse/HDFS-8099 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor The hadoop fs -get command always shows this warning: {noformat} $ hadoop fs -get /data/schemas/sfdc/BusinessHours-2014-12-09.avsc 15/04/06 06:22:19 WARN hdfs.DFSClient: DFSInputStream has been closed already {noformat} This was introduced by HDFS-7494. The easiest thing is to just remove the warning from the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8099) Remove extraneous warning from DFSInputStream.close()
[ https://issues.apache.org/jira/browse/HDFS-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8099: --- Attachment: HDFS-8099.000.patch Remove extraneous warning from DFSInputStream.close() - Key: HDFS-8099 URL: https://issues.apache.org/jira/browse/HDFS-8099 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-8099.000.patch The hadoop fs -get command always shows this warning: {noformat} $ hadoop fs -get /data/schemas/sfdc/BusinessHours-2014-12-09.avsc 15/04/06 06:22:19 WARN hdfs.DFSClient: DFSInputStream has been closed already {noformat} This was introduced by HDFS-7494. The easiest thing is to just remove the warning from the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8099) Remove extraneous warning from DFSInputStream.close()
[ https://issues.apache.org/jira/browse/HDFS-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8099: --- Status: Patch Available (was: Open) Remove extraneous warning from DFSInputStream.close() - Key: HDFS-8099 URL: https://issues.apache.org/jira/browse/HDFS-8099 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-8099.000.patch The hadoop fs -get command always shows this warning: {noformat} $ hadoop fs -get /data/schemas/sfdc/BusinessHours-2014-12-09.avsc 15/04/06 06:22:19 WARN hdfs.DFSClient: DFSInputStream has been closed already {noformat} This was introduced by HDFS-7494. The easiest thing is to just remove the warning from the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8099) Remove extraneous warning from DFSInputStream.close()
[ https://issues.apache.org/jira/browse/HDFS-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-8099: --- Attachment: HDFS-8099.001.patch [~cmccabe], good idea. New patch attached. Tested manually: {code} [cwl@localhost hadoop]$ rm hosts;hadoop fs -get /hosts [cwl@localhost hadoop]$ {code} Remove extraneous warning from DFSInputStream.close() - Key: HDFS-8099 URL: https://issues.apache.org/jira/browse/HDFS-8099 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-8099.000.patch, HDFS-8099.001.patch The hadoop fs -get command always shows this warning: {noformat} $ hadoop fs -get /data/schemas/sfdc/BusinessHours-2014-12-09.avsc 15/04/06 06:22:19 WARN hdfs.DFSClient: DFSInputStream has been closed already {noformat} This was introduced by HDFS-7494. The easiest thing is to just remove the warning from the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
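For readers following the thread, the shape of the fix can be sketched as follows. This is a simplified stand-in with made-up names (CloseOnceSketch, releaseCount), not the actual DFSInputStream code; the real change is in the attached patches:

```java
// A second close() becomes a silent no-op (or at most a DEBUG-level log)
// instead of a user-visible WARN, since a double-close is benign here.
public class CloseOnceSketch {
    private boolean closed = false;
    int releaseCount = 0; // stands in for releasing buffers/sockets

    public void close() {
        if (closed) {
            // Previously: LOG.warn("DFSInputStream has been closed already");
            // Don't alarm the user over a benign repeat close.
            return;
        }
        closed = true;
        releaseCount++;
    }

    public static void main(String[] args) {
        CloseOnceSketch in = new CloseOnceSketch();
        in.close();
        in.close(); // the copy path behind 'hadoop fs -get' can close twice
        System.out.println(in.releaseCount); // 1
    }
}
```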
[jira] [Commented] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481715#comment-14481715 ] Charles Lamb commented on HDFS-7923: Here is a description of the heuristic that my patch has implemented for the NN to determine what to send back in response to the "should I send a BR?" question. In the vein of keeping it relatively simple, let's consider 3 parameters:
* The max # of FBR requests that the NN is willing to process at any given time (to be called 'dfs.namenode.max.concurrent.block.reports', with a default of Integer.MAX_VALUE)
* The DN's configured block report interval (dfs.blockreport.intervalMsec). This parameter already exists.
* The max time we ever want the NN to go without receiving an FBR from a given DN ('dfs.blockreport.max.deferMsec').
If the time since the last FBR received from the DN is less than dfs.blockreport.intervalMsec, then the NN returns false (no, don't send an FBR). In theory, this should never happen if the DN is obeying dfs.blockreport.intervalMsec. If the number of block reports currently being processed by the NN is less than dfs.namenode.max.concurrent.block.reports, and the time since it last received an FBR from the DN sending the heartbeat is greater than dfs.blockreport.intervalMsec, then the NN answers true (yes, send along an FBR). If the number of BRs being processed by the NN is greater than or equal to dfs.namenode.max.concurrent.block.reports when it receives the heartbeat, then it checks the last time that it received an FBR from the DN sending the heartbeat, and if that is greater than dfs.blockreport.max.deferMsec, it returns true (yes, send along an FBR). If the time-since-last-FBR is less than dfs.blockreport.max.deferMsec, it returns false. 
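The three-parameter heuristic described in the comment above can be condensed into a single predicate. This is a sketch of the logic as described, not the patch code; the class name, method name, and the millisecond units of the arguments are mine:

```java
// Sketch of the NN-side "should the DN send a full block report?" decision.
public class FbrThrottleSketch {
    /**
     * @param inFlightFbrs      FBRs the NN is currently processing
     * @param maxConcurrentFbrs dfs.namenode.max.concurrent.block.reports
     * @param msSinceLastFbr    time since the NN last received an FBR from this DN
     * @param intervalMs        dfs.blockreport.intervalMsec
     * @param maxDeferMs        dfs.blockreport.max.deferMsec
     */
    public static boolean shouldSendFbr(int inFlightFbrs, int maxConcurrentFbrs,
            long msSinceLastFbr, long intervalMs, long maxDeferMs) {
        if (msSinceLastFbr < intervalMs) {
            return false; // DN is inside its normal interval; never ask early
        }
        if (inFlightFbrs < maxConcurrentFbrs) {
            return true;  // NN has spare capacity; take the FBR now
        }
        // NN is saturated: defer unless this DN has already waited too long
        return msSinceLastFbr > maxDeferMs;
    }

    public static void main(String[] args) {
        // DN inside its interval: no
        System.out.println(shouldSendFbr(0, 10, 1000, 6000, 60000));   // false
        // interval elapsed and NN has capacity: yes
        System.out.println(shouldSendFbr(5, 10, 7000, 6000, 60000));   // true
        // NN saturated, DN not yet past the defer limit: hold off
        System.out.println(shouldSendFbr(10, 10, 7000, 6000, 60000));  // false
        // NN saturated but DN has waited past the defer limit: yes anyway
        System.out.println(shouldSendFbr(10, 10, 61000, 6000, 60000)); // true
    }
}
```

Note how dfs.blockreport.max.deferMsec bounds the staleness of any DN's block state even when the NN is saturated, which is the point of the third parameter.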
The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages --- Key: HDFS-7923 URL: https://issues.apache.org/jira/browse/HDFS-7923 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7923.000.patch The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7923: --- Attachment: HDFS-7923.000.patch Attached is a patch that implements the behavior I described. The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages --- Key: HDFS-7923 URL: https://issues.apache.org/jira/browse/HDFS-7923 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7923.000.patch The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-7923 started by Charles Lamb. -- The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages --- Key: HDFS-7923 URL: https://issues.apache.org/jira/browse/HDFS-7923 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7923.000.patch The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8040) Able to move encryption zone to Trash
[ https://issues.apache.org/jira/browse/HDFS-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb resolved HDFS-8040. Resolution: Not a Problem Able to move encryption zone to Trash - Key: HDFS-8040 URL: https://issues.apache.org/jira/browse/HDFS-8040 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Sumana Sathish Users can remove encryption directory using the FsShell remove commands without -skipTrash option. {code} /usr/hdp/current/hadoop-hdfs-client/bin/hdfs dfs -D fs.trash.interval=60 -rm -r /user/hrt_qa/encryptionZone_1 2015-04-01 19:19:46,510|beaver.machine|INFO|654|140309507495680|MainThread|15/04/01 19:19:46 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes. 2015-04-01 19:19:46,534|beaver.machine|INFO|654|140309507495680|MainThread|Moved: 'hdfs://sumana-dal-secure-4.novalocal:8020/user/hrt_qa/encryptionZone_1' to trash at: hdfs://sumana-dal-secure-4.novalocal:8020/user/hrt_qa/.Trash/Current 2015-04-01 19:19:46,863|test_TDE_trash|INFO|654|140309507495680|MainThread|Checking if the encryption zone is in Trash or not 2015-04-01 19:19:46,864|beaver.machine|INFO|654|140309507495680|MainThread|RUNNING: /usr/hdp/current/hadoop-client/bin/hadoop dfs -ls -R /user/hrt_qa/.Trash/Current 2015-04-01 19:19:46,892|beaver.machine|INFO|654|140309507495680|MainThread|DEPRECATED: Use of this script to execute hdfs command is deprecated. 2015-04-01 19:19:46,893|beaver.machine|INFO|654|140309507495680|MainThread|Instead use the hdfs command for it. 
2015-04-01 19:19:46,893|beaver.machine|INFO|654|140309507495680|MainThread| 2015-04-01 19:19:50,289|beaver.machine|INFO|654|140309507495680|MainThread|drwx-- - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user 2015-04-01 19:19:50,292|beaver.machine|INFO|654|140309507495680|MainThread|drwx-- - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa 2015-04-01 19:19:50,296|beaver.machine|INFO|654|140309507495680|MainThread|drwxr-xr-x - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa/encryptionZone_1 2015-04-01 19:19:50,326|beaver.machine|INFO|654|140309507495680|MainThread|-rw-r--r-- 3 hrt_qa hrt_qa 3273 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa/encryptionZone_1/file_to_get.txt {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8040) Able to move encryption zone to Trash
[ https://issues.apache.org/jira/browse/HDFS-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391411#comment-14391411 ] Charles Lamb commented on HDFS-8040: Hi [~ssath...@hortonworks.com], I tried reproducing this: {code} [cwl@localhost hadoop]$ hdfs crypto -listZones /ez mykey [cwl@localhost hadoop]$ hdfs dfs -ls / Found 1 items drwxr-xr-x - cwl supergroup 0 2015-04-01 15:41 /ez [cwl@localhost hadoop]$ hdfs dfs -ls /ez Found 1 items -rw-r--r-- 3 cwl supergroup158 2015-04-01 15:41 /ez/hosts [cwl@localhost hadoop]$ hdfs dfs -D fs.trash.interval=60 -rm -r /ez 15/04/01 16:41:15 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 60 minutes, Emptier interval = 0 minutes. rm: Failed to move to trash: hdfs://localhost/ez: /ez can't be moved from an encryption zone. [cwl@localhost hadoop]$ hdfs dfs -ls -R / drwxr-xr-x - cwl supergroup 0 2015-04-01 15:41 /ez -rw-r--r-- 3 cwl supergroup158 2015-04-01 15:41 /ez/hosts drwx-- - cwl supergroup 0 2015-04-01 16:41 /user drwx-- - cwl supergroup 0 2015-04-01 16:41 /user/cwl drwx-- - cwl supergroup 0 2015-04-01 16:41 /user/cwl/.Trash drwx-- - cwl supergroup 0 2015-04-01 16:41 /user/cwl/.Trash/Current [cwl@localhost hadoop]$ hdfs dfs -ls -R /user/cwl/.Trash drwx-- - cwl supergroup 0 2015-04-01 16:41 /user/cwl/.Trash/Current [cwl@localhost hadoop]$ hdfs dfs -ls -R /user/cwl/.Trash/Current {code} Do you see any difference between what you did and what I did? Able to move encryption zone to Trash - Key: HDFS-8040 URL: https://issues.apache.org/jira/browse/HDFS-8040 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: sumana sathish Users can remove encryption directory using the FsShell remove commands without -skipTrash option. 
{code} /usr/hdp/current/hadoop-hdfs-client/bin/hdfs dfs -D fs.trash.interval=60 -rm -r /user/hrt_qa/encryptionZone_1 2015-04-01 19:19:46,510|beaver.machine|INFO|654|140309507495680|MainThread|15/04/01 19:19:46 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes. 2015-04-01 19:19:46,534|beaver.machine|INFO|654|140309507495680|MainThread|Moved: 'hdfs://sumana-dal-secure-4.novalocal:8020/user/hrt_qa/encryptionZone_1' to trash at: hdfs://sumana-dal-secure-4.novalocal:8020/user/hrt_qa/.Trash/Current 2015-04-01 19:19:46,863|test_TDE_trash|INFO|654|140309507495680|MainThread|Checking if the encryption zone is in Trash or not 2015-04-01 19:19:46,864|beaver.machine|INFO|654|140309507495680|MainThread|RUNNING: /usr/hdp/current/hadoop-client/bin/hadoop dfs -ls -R /user/hrt_qa/.Trash/Current 2015-04-01 19:19:46,892|beaver.machine|INFO|654|140309507495680|MainThread|DEPRECATED: Use of this script to execute hdfs command is deprecated. 2015-04-01 19:19:46,893|beaver.machine|INFO|654|140309507495680|MainThread|Instead use the hdfs command for it. 2015-04-01 19:19:46,893|beaver.machine|INFO|654|140309507495680|MainThread| 2015-04-01 19:19:50,289|beaver.machine|INFO|654|140309507495680|MainThread|drwx-- - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user 2015-04-01 19:19:50,292|beaver.machine|INFO|654|140309507495680|MainThread|drwx-- - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa 2015-04-01 19:19:50,296|beaver.machine|INFO|654|140309507495680|MainThread|drwxr-xr-x - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa/encryptionZone_1 2015-04-01 19:19:50,326|beaver.machine|INFO|654|140309507495680|MainThread|-rw-r--r-- 3 hrt_qa hrt_qa 3273 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa/encryptionZone_1/file_to_get.txt {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8040) Able to move encryption zone to Trash
[ https://issues.apache.org/jira/browse/HDFS-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391656#comment-14391656 ] Charles Lamb commented on HDFS-8040: [~xyao], [~ssath...@hortonworks.com], This is actually correct behavior. If you have an EZ rooted at /user/hrt_qa/encryptionZone_1, it is ok to move the entire EZ to another directory, in this case to /user/hrt_qa/.Trash/Current. That's what HDFS-7530 fixed. Hence, the -rm -r command is effectively a rename of /user/hrt_qa/encryptionZone_1 to /user/hrt_qa/.Trash/Current. Since you're picking up the entire EZ, that's allowed. Does this make sense? Able to move encryption zone to Trash - Key: HDFS-8040 URL: https://issues.apache.org/jira/browse/HDFS-8040 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: sumana sathish Users can remove encryption directory using the FsShell remove commands without -skipTrash option. {code} /usr/hdp/current/hadoop-hdfs-client/bin/hdfs dfs -D fs.trash.interval=60 -rm -r /user/hrt_qa/encryptionZone_1 2015-04-01 19:19:46,510|beaver.machine|INFO|654|140309507495680|MainThread|15/04/01 19:19:46 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes. 
2015-04-01 19:19:46,534|beaver.machine|INFO|654|140309507495680|MainThread|Moved: 'hdfs://sumana-dal-secure-4.novalocal:8020/user/hrt_qa/encryptionZone_1' to trash at: hdfs://sumana-dal-secure-4.novalocal:8020/user/hrt_qa/.Trash/Current 2015-04-01 19:19:46,863|test_TDE_trash|INFO|654|140309507495680|MainThread|Checking if the encryption zone is in Trash or not 2015-04-01 19:19:46,864|beaver.machine|INFO|654|140309507495680|MainThread|RUNNING: /usr/hdp/current/hadoop-client/bin/hadoop dfs -ls -R /user/hrt_qa/.Trash/Current 2015-04-01 19:19:46,892|beaver.machine|INFO|654|140309507495680|MainThread|DEPRECATED: Use of this script to execute hdfs command is deprecated. 2015-04-01 19:19:46,893|beaver.machine|INFO|654|140309507495680|MainThread|Instead use the hdfs command for it. 2015-04-01 19:19:46,893|beaver.machine|INFO|654|140309507495680|MainThread| 2015-04-01 19:19:50,289|beaver.machine|INFO|654|140309507495680|MainThread|drwx-- - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user 2015-04-01 19:19:50,292|beaver.machine|INFO|654|140309507495680|MainThread|drwx-- - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa 2015-04-01 19:19:50,296|beaver.machine|INFO|654|140309507495680|MainThread|drwxr-xr-x - hrt_qa hrt_qa 0 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa/encryptionZone_1 2015-04-01 19:19:50,326|beaver.machine|INFO|654|140309507495680|MainThread|-rw-r--r-- 3 hrt_qa hrt_qa 3273 2015-04-01 19:19 /user/hrt_qa/.Trash/Current/user/hrt_qa/encryptionZone_1/file_to_get.txt {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376532#comment-14376532 ] Charles Lamb commented on HDFS-6658: [~daryn], I spent a couple of hours making a first pass through the patch. * The BlockReplicaId encodings seem sufficiently large for the foreseeable future. * As you point out in the .jpg of your whiteboard, +1 on getting rid of the triplets. * You solve the issue of sparse block ids by converting them to scalars and maintaining the skipBitSet. Why did you roll your own bitset instead of using the Java bitset? I'd like to hear more about concurrency in the overall data structure since that's a problem that [~cmccabe] and I are trying to tackle. Would you be able to have a phone conversation on Thursday or Friday this week to discuss it? Namenode memory optimization - Block replicas list --- Key: HDFS-6658 URL: https://issues.apache.org/jira/browse/HDFS-6658 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.4.1 Reporter: Amir Langer Assignee: Daryn Sharp Attachments: BlockListOptimizationComparison.xlsx, BlocksMap redesign.pdf, HDFS-6658.patch, HDFS-6658.patch, HDFS-6658.patch, Namenode Memory Optimizations - Block replicas list.docx, New primative indexes.jpg, Old triplets.jpg Part of the memory consumed by every BlockInfo object in the Namenode is a linked list of block references for every DatanodeStorageInfo (called triplets). We propose to change the way we store the list in memory. Using primitive integer indexes instead of object references will reduce the memory needed for every block replica (when compressed oops is disabled) and in our new design the list overhead will be per DatanodeStorageInfo and not per block replica. see attached design doc. for details and evaluation results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
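On the java.util.BitSet question raised in the comment above: for the skip-set use case described, the stock class already supports efficient sparse iteration via nextClearBit/nextSetBit. A generic illustration (not code from the patch):

```java
import java.util.BitSet;

public class SkipBitSetSketch {
    public static void main(String[] args) {
        // Sparse block ids mapped to dense scalar slots; the BitSet marks
        // "skipped" (unused) slots so iteration can jump over them.
        BitSet skip = new BitSet();
        skip.set(3);
        skip.set(7);
        // Walk only the live slots in [0, 10).
        StringBuilder live = new StringBuilder();
        for (int i = skip.nextClearBit(0); i < 10; i = skip.nextClearBit(i + 1)) {
            live.append(i).append(' ');
        }
        System.out.println(live.toString().trim()); // 0 1 2 4 5 6 8 9
    }
}
```

A home-grown bitset would only be warranted if the patch needs something java.util.BitSet lacks (e.g. fixed off-heap layout or lock-free concurrent mutation), which is part of what the question is probing.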
[jira] [Resolved] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb resolved HDFS-7847. Resolution: Fixed Fix Version/s: HDFS-7836 Committed to HDFS-7836 branch. Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: HDFS-7847.003.patch Thanks for the review [~cmccabe]. I moved those two methods over to DFSTestUtils.java in .003. Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: HDFS-7847.002.patch @cmccabe, @stack, thanks for the review!
bq. DFSClient.java: this change adds three new fields to DFSClient. But they only seem to be used by unit tests. It seems like we should just put these inside the unit test(s) that are using these-- if necessary, by adding a helper method. There's no reason to add more fields to DFSClient. Also remember that when using FileContext, we create new DFSClients all the time.
Good point. I've left the existing {code}ClientProtocol namenode{code} field alone. The other 3 proxies are created on-demand by their getters. That means no change in DFSClient instance size.
bq. It seems kind of odd to have NameNodeProxies#createProxy create a proxy to the datanode.
It's actually a proxy to the NN for the DatanodeProtocol. That's the same protocol that the DN uses to speak with the NN when it's sending (among other things) block reports. But with some ideas from @stack, I got rid of the changes to NameNodeProxies.
bq. Of course the NameNode may or may not be remote here. It seems like --nnuri or just --namenode or something like that would be more descriptive.
Yeah, I agree. I changed it to -namenode.
bq. Instead of this boilerplate, just use StringUtils#popOptionWithArgument.
Changed. I was just trying to match the existing code, but using StringUtils is better. {code} - replication, BLOCK_SIZE, null); + replication, BLOCK_SIZE, CryptoProtocolVersion.supported()); {code}
bq. This fix is a little bit separate, right? I suppose we can do it in this JIRA, though.
Without this, the relevant PBHelper.convert code throws NPE on the supportVersions arg. 
Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7846) Create off-heap BlocksMap and BlockData structures
[ https://issues.apache.org/jira/browse/HDFS-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355210#comment-14355210 ] Charles Lamb commented on HDFS-7846: Colin, this looks pretty good. A few questions and comments. Yi mentioned unused imports, but there are also unnecessary java.lang.{String,ClassCastException} imports. BlockId.equals: constructing a ClassCastException, and especially the resulting call to fillInStackTrace, is an expensive way of checking the type. I would think instanceof is preferred. Are you planning on doing something with Shard.name in the future? The indentation of the assignment to htable is off a bit. Jenkins will ask you this question, but why no unit tests? Create off-heap BlocksMap and BlockData structures -- Key: HDFS-7846 URL: https://issues.apache.org/jira/browse/HDFS-7846 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7846-scl.001.patch Create off-heap BlocksMap, BlockInfo, and DataNodeInfo structures. The BlocksMap will use the off-heap hash table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7846) Create off-heap BlocksMap and BlockData structures
[ https://issues.apache.org/jira/browse/HDFS-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355217#comment-14355217 ] Charles Lamb commented on HDFS-7846: Oh, I forgot to mention there are three places where git apply flags the patch for adding trailing whitespace. Create off-heap BlocksMap and BlockData structures -- Key: HDFS-7846 URL: https://issues.apache.org/jira/browse/HDFS-7846 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7846-scl.001.patch Create off-heap BlocksMap, BlockInfo, and DataNodeInfo structures. The BlocksMap will use the off-heap hash table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357672#comment-14357672 ] Charles Lamb commented on HDFS-6658: Hi [~daryn], Colin and I read over the design doc. I confess that I still need to read over the patch, but I will do that. Do you think it will be possible to create a safe mode to run this in so that inconsistencies can be detected? I'm also wondering what the field widths are, but I can find those when I read the patch. Namenode memory optimization - Block replicas list --- Key: HDFS-6658 URL: https://issues.apache.org/jira/browse/HDFS-6658 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.4.1 Reporter: Amir Langer Assignee: Daryn Sharp Attachments: BlockListOptimizationComparison.xlsx, BlocksMap redesign.pdf, HDFS-6658.patch, HDFS-6658.patch, HDFS-6658.patch, Namenode Memory Optimizations - Block replicas list.docx, New primative indexes.jpg, Old triplets.jpg Part of the memory consumed by every BlockInfo object in the Namenode is a linked list of block references for every DatanodeStorageInfo (called triplets). We propose to change the way we store the list in memory. Using primitive integer indexes instead of object references will reduce the memory needed for every block replica (when compressed oops is disabled) and in our new design the list overhead will be per DatanodeStorageInfo and not per block replica. see attached design doc. for details and evaluation results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: HDFS-7847.001.patch @cmccabe, @stack, thanks for the review! bq. DFSClient.java: this change adds three new fields to DFSClient. But they only seem to be used by unit tests. It seems like we should just put these inside the unit test(s) that are using these-- if necessary, by adding a helper method. There's no reason to add more fields to DFSClient. Also remember that when using FileContext, we create new DFSClients all the time. Good point. I've left the existing {code}ClientProtocol namenode{code} field alone. The other 3 proxies are created on-demand by their getters. That means no change in DFSClient instance size. bq. It seems kind of odd to have NameNodeProxies#createProxy create a proxy to the datanode. It's actually a proxy to the NN for the DatanodeProtocol. That's the same protocol that the DN uses to speak with the NN when it's sending (among other things) block reports. bq. In general, when you see NameNodeProxies I think proxies used by the NameNode and this doesn't fit with that. These are actually proxies used to talk to the NN, not proxies used by the NN. I didn't make the name. bq. Can you give a little more context about why this is a good idea (as opposed to just having some custom code in the unit test or in a unit test util class that creates a proxy) While the name DatanodeProtocol makes us think of an RPC protocol to the datanode, it is in fact yet another one of the many protocols to the namenode which is embodied in the NamenodeProtocols (plural) omnibus interface. The problem this is addressing is that when we are talking to an in-process NN in the NNThroughputBenchmark, then it's easy to get our hands on a NamenodeProtocols instance -- you simply call NameNode.getRpcServer(). 
However, the idea of this patch is to let you run the benchmark against a non-in-process NN, so there's no NameNode instance to use. That means we have to create RPC proxy objects for each of the NN protocols that we need to use. It would be nice if we could create a single proxy for the omnibus NamenodeProtocols interface, but we can't. Instead, we have to pick and choose the different namenode protocols that we want to use -- ClientProtocol, NamenodeProtocol, RefreshUserMappingProtocol, and DatanodeProtocol -- and create proxies for them. Code to create proxies for the first three of these already existed in NameNodeProxies.java, but we have to add a few new lines to create the DatanodeProtocol proxy. @stack I looked into your (offline) suggestion to try calling through the TinyDatanode, but it's just doing the same thing that my patch does -- it uses the same ClientProtocol instance that the rest of the test uses. TinyDatanode is really just a skeleton and doesn't really borrow much code from the real DN.
bq. Of course the NameNode may or may not be remote here. It seems like --nnuri or just --namenode or something like that would be more descriptive.
Yeah, I agree. I changed it to -namenode.
bq. Instead of this boilerplate, just use StringUtils#popOptionWithArgument.
Changed. I was just trying to match the existing code, but using StringUtils is better. {code} - replication, BLOCK_SIZE, null); + replication, BLOCK_SIZE, CryptoProtocolVersion.supported()); {code}
bq. This fix is a little bit separate, right? I suppose we can do it in this JIRA, though.
Without this, the relevant PBHelper.convert code throws NPE on the supportVersions arg. 
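For context on the StringUtils suggestion in the review above: org.apache.hadoop.util.StringUtils#popOptionWithArgument removes a "-flag value" pair from an argument list and returns the value (or null if the flag is absent). The following is a self-contained mimic of those semantics for illustration, not the Hadoop implementation itself; the class name and sample arguments are made up:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class PopOptionSketch {
    // Mimics StringUtils#popOptionWithArgument: scan for the flag, remove
    // it and its value from the list, and return the value.
    static String popOptionWithArgument(String name, List<String> args) {
        for (int i = 0; i < args.size(); i++) {
            if (args.get(i).equals(name)) {
                args.remove(i); // drop the flag
                if (i >= args.size()) {
                    throw new RuntimeException(
                        "option " + name + " requires an argument");
                }
                return args.remove(i); // drop and return its value
            }
        }
        return null; // flag not present
    }

    public static void main(String[] args) {
        List<String> argv = new LinkedList<>(Arrays.asList(
                "-namenode", "hdfs://nn1:8020", "-op", "create"));
        String nn = popOptionWithArgument("-namenode", argv);
        System.out.println(nn);   // hdfs://nn1:8020
        System.out.println(argv); // [-op, create]
    }
}
```

The win over hand-rolled flag parsing is that the remaining list can then be validated as "everything left over is positional", which is why the reviewer preferred it over the boilerplate loop.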
Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A follow-on Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353386#comment-14353386 ] Charles Lamb commented on HDFS-7836: JOIN WEBEX MEETING https://cloudera.webex.com/join/clamb | 622 867 972 JOIN BY PHONE 1-650-479-3208 Call-in toll number (US/Canada) Access code: 622 867 972 Global call-in numbers: https://cloudera.webex.com/cloudera/globalcallin.php?serviceType=MC&ED=342142257&tollFree=0 Can't join the meeting? Contact support here: https://cloudera.webex.com/mc IMPORTANT NOTICE: Please note that this WebEx service allows audio and other information sent during the session to be recorded, which may be discoverable in a legal matter. By joining this session, you automatically consent to such recordings. If you do not consent to being recorded, discuss your concerns with the host or do not join the session. BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Attachments: BlockManagerScalabilityImprovementsDesign.pdf Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14353455#comment-14353455 ] Charles Lamb commented on HDFS-7836: If you are planning on attending the meeting in-person, please drop me an email so I have an idea of how large a CR to book. Thanks. BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Attachments: BlockManagerScalabilityImprovementsDesign.pdf Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Description: Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. (was: Write a junit test to simulate a heavy BlockManager load. Quantify native and java heap sizes, and some latency numbers.) Summary: Modify NNThroughputBenchmark to be able to operate on a remote NameNode (was: Write a junit test to simulate a heavy BlockManager load) Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7910) Modify NNThroughputBenchmark to be able to provide some metrics
Charles Lamb created HDFS-7910: -- Summary: Modify NNThroughputBenchmark to be able to provide some metrics Key: HDFS-7910 URL: https://issues.apache.org/jira/browse/HDFS-7910 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Charles Lamb Assignee: Charles Lamb Modify NNThroughputBenchmark to quantify native and java heap sizes, as well as some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7847) Write a junit test to simulate a heavy BlockManager load
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14353850#comment-14353850 ] Charles Lamb commented on HDFS-7847: Thanks [~aagarwal]. I'll take a look at them. Write a junit test to simulate a heavy BlockManager load Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: make_blocks.tar.gz Write a junit test to simulate a heavy BlockManager load. Quantify native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: HDFS-7847.000.patch Add a new -remoteNamenode option to the CLI which takes a URI of a remote NN. The existing NNThroughputBenchmark uses the umbrella NamenodeProtocols (plural) interface, but you can only create proxies for the underlying RPC interfaces. This separates all the calls made in NNThroughputBenchmark out into the smaller sub-interfaces. Modify DFSClient so that proxies for each of the required interfaces can be created. Minor typo fixes encountered along the way. Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7847.000.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-7847 started by Charles Lamb. -- Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Attachments: HDFS-7847.000.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14351163#comment-14351163 ] Charles Lamb commented on HDFS-7844: I applied your latest patch and set breakpoints at all of the exceptional throws in ByteArrayMemoryManager.java. Then I ran the unit test. The following lines did not trigger: 91, 94, 117, 129, 135, 165, 171, 190, 203, 245, 251. I think those are the exceptions in allocate, free, one of the ones in putShort, and all of the throws in the getters. Create an off-heap hash table implementation Key: HDFS-7844 URL: https://issues.apache.org/jira/browse/HDFS-7844 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, HDFS-7844-scl.003.patch Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351073#comment-14351073 ] Charles Lamb commented on HDFS-7844: [~cmccabe], This is a nice piece of work! Here are some comments: General: Several lines bust the 80 char limit. Many unused imports throughout. I guess Yi got this already. What happens if someone runs this with -d32 passed to the JVM? Do we need to make that check and throw accordingly? ProbingHashSet.java: A small enhancement might be: {code}close(boolean force){code} which will close unconditionally. The line in #getSlot which is {code}hash = -hash{code} is in fact tested by your unit tests, but I don't think it's tested by design in the test. You might want to put in an explicit test for that particular line. #expandTable: using {code}catch(Throwable){code} feels like a rather wide net to cast, but I guess it's the right thing. I debated whether all you needed was catch (Error), but I guess you can't be sure that the callers above you won't just keep going after some RuntimeException gets into their hands. The comment for #capacity() total number of slots is either misleading or wrong. MemoryManager.java: Any reason not to have get/putShort along with the existing byte/int/long? Should #toString() be declared as {code}@Override public String toString(){code}? NativeMemoryManager.java: The comments say nothing about whether it's thread safe or not. Ditto for ByteArrayMemoryManager. ByteArrayMemoryManager: There is no test coverage for the failure case of {code}BAMM.close(){code} s/valiation/validation/ (Yi caught this) Why does curAddress start at 1000? s/2^^31/2^31/ For all of the put/get/byte/int/long routines, it wouldn't be hard to move all of the {code}if() { throw new RuntimeException }{code} snippets into their own routine. Maybe that's not worth the trouble, but it feels like there's a lot of repeated code.
TestMemoryManager.java The indentation of #testMemoryManagerCreate formals is messed up. #testCatchInvalidPuts: you test putByte against freed memory, but not int or long. the Assert.fail messages should be different for each fail() call. The exception checks in getByte/Int/Long are not tested. None of the entry==null exceptions are tested in putByte/Long/Int I tried running TestMemoryManager.testNativeMemoryManagerCreate and it failed like this: {code} 2015-03-06 17:10:22,430 ERROR offheap.MemoryManager$Factory (MemoryManager.java:create(91)) - Unable to create org.apache.hadoop.util.offheap.NativeMemoryManager. Falling back on org.apache.hadoop.util.offheap.ByteArrayMemoryManager java.lang.IllegalArgumentException: wrong number of arguments at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.util.offheap.MemoryManager$Factory.create(MemoryManager.java:89) at org.apache.hadoop.util.offheap.TestMemoryManager.testMemoryManagerCreate(TestMemoryManager.java:135) at org.apache.hadoop.util.offheap.TestMemoryManager.testNativeMemoryManagerCreate(TestMemoryManager.java:151) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) org.junit.ComparisonFailure: Expected :org.apache.hadoop.util.offheap.NativeMemoryManager Actual :org.apache.hadoop.util.offheap.ByteArrayMemoryManager Click to see difference at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.util.offheap.TestMemoryManager.testMemoryManagerCreate(TestMemoryManager.java:137) at org.apache.hadoop.util.offheap.TestMemoryManager.testNativeMemoryManagerCreate(TestMemoryManager.java:151) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
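On the repeated-validation point in the review above, a minimal sketch of pulling the duplicated {code}if () { throw new RuntimeException }{code} checks into one shared helper. The class and method names here are hypothetical, not the actual ByteArrayMemoryManager code:

```java
// Illustrative sketch: one shared range-check routine instead of a copy
// of the throw logic inside every put/get accessor.
public class RangeCheckDemo {
    private final int slabSize;

    RangeCheckDemo(int slabSize) { this.slabSize = slabSize; }

    // Consolidated validation: every accessor calls this instead of
    // repeating its own if/throw snippet.
    private void checkAccess(long addr, int width) {
        if (addr < 0 || addr + width > slabSize) {
            throw new RuntimeException("access at " + addr + " (width "
                + width + ") outside slab of " + slabSize + " bytes");
        }
    }

    public void putLong(byte[] slab, long addr, long val) {
        checkAccess(addr, 8);
        for (int i = 0; i < 8; i++) {
            slab[(int) addr + i] = (byte) (val >>> (56 - 8 * i));
        }
    }

    public long getLong(byte[] slab, long addr) {
        checkAccess(addr, 8);
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (slab[(int) addr + i] & 0xffL);
        }
        return v;
    }

    public static void main(String[] args) {
        RangeCheckDemo m = new RangeCheckDemo(16);
        byte[] slab = new byte[16];
        m.putLong(slab, 0, 42L);
        System.out.println(m.getLong(slab, 0)); // prints 42
    }
}
```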
[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14351282#comment-14351282 ] Charles Lamb commented on HDFS-7844: Thanks Colin, +1, I'll file a follow up jira for the coverage. Create an off-heap hash table implementation Key: HDFS-7844 URL: https://issues.apache.org/jira/browse/HDFS-7844 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, HDFS-7844-scl.003.patch Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14349438#comment-14349438 ] Charles Lamb commented on HDFS-7836: We'll hold a design review meeting and discussion of this project next Weds, March 11th, 10am to 1pm (PDT) at the Cloudera offices in Palo Alto. I'll post webex information on this Jira before then. If you plan on attending in person, please send me a private email so I know how many people to expect. BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Attachments: BlockManagerScalabilityImprovementsDesign.pdf Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14345012#comment-14345012 ] Charles Lamb commented on HDFS-7836: Yes, there would definitely be a webex available. BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Attachments: BlockManagerScalabilityImprovementsDesign.pdf Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient
[ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14345311#comment-14345311 ] Charles Lamb commented on HDFS-7435: Hi @daryn, The new patch looks pretty good to me. Just a few nits. FsDatasetImpl still has one line that exceeds the 80 chars, and there are a couple of unused imports in the new test TestBlockListAsLongs. Also in that test, IWBNI you could use the specific Mockito static imports that are needed rather than the * import. PB encoding of block reports is very inefficient Key: HDFS-7435 URL: https://issues.apache.org/jira/browse/HDFS-7435 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch Block reports are encoded as a PB repeating long. Repeating fields use an {{ArrayList}} with default capacity of 10. A block report containing tens or hundreds of thousand of longs (3 for each replica) is extremely expensive since the {{ArrayList}} must realloc many times. Also, decoding repeating fields will box the primitive longs which must then be unboxed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7845) Compress block reports
[ https://issues.apache.org/jira/browse/HDFS-7845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14345372#comment-14345372 ] Charles Lamb commented on HDFS-7845: bq. Charles Lamb did some tests with a block report and got around 50% (if I'm remembering correctly?) Charles Lamb, can you comment on whether those tests were done with vints or regular integers? Yes, 50% is about what I saw. Those were done on the array of longs, not vints, using plain lz4. Compress block reports -- Key: HDFS-7845 URL: https://issues.apache.org/jira/browse/HDFS-7845 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb We should optionally compress block reports using a low-cpu codec such as lz4 or snappy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
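The measurement described above, compressing the raw array of longs, can be reproduced in miniature. This sketch substitutes the JDK's Deflater for lz4 (which does not ship with the JDK), so the ratio it reports only illustrates the approach, not the ~50% lz4 figure:

```java
import java.nio.ByteBuffer;
import java.util.zip.Deflater;

// Sketch: compress a block report encoded as an array of longs.
// The actual proposal uses a low-CPU codec (lz4 or snappy);
// java.util.zip.Deflater stands in here because it is in the JDK.
public class BlockReportCompressDemo {
    public static byte[] compress(long[] report) {
        ByteBuffer buf = ByteBuffer.allocate(report.length * 8);
        for (long l : report) {
            buf.putLong(l); // 3 longs per replica: id, genstamp, length
        }
        Deflater d = new Deflater(Deflater.BEST_SPEED);
        d.setInput(buf.array());
        d.finish();
        byte[] out = new byte[buf.capacity() + 64];
        int n = d.deflate(out);
        d.end();
        byte[] trimmed = new byte[n];
        System.arraycopy(out, 0, trimmed, 0, n);
        return trimmed;
    }

    public static void main(String[] args) {
        // Synthetic report: sequential block ids compress far better than
        // real ones would, so the printed ratio is only illustrative.
        long[] report = new long[3 * 10000];
        for (int i = 0; i < report.length; i++) {
            report[i] = 1000000L + i;
        }
        byte[] packed = compress(report);
        System.out.println(report.length * 8 + " -> " + packed.length + " bytes");
    }
}
```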
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338394#comment-14338394 ] Charles Lamb commented on HDFS-7836: Hi [~arpit99], Thanks for reading over the design doc and commenting on it. bq. The DataNode can now split block reports per storage directory (post HDFS-2832), controlled by DFS_BLOCKREPORT_SPLIT_THRESHOLD_KEY. Did you get a chance to try it out and see if it helps? Splitting reports addresses all of the above. (edit: does not address network bandwidth gains from compression though) I think you may mean your work on HDFS-5153, right? If I understand that correctly, it sends one report per storage. We have seen block reports in the 100MB+ sizes so we suspect that an even smaller chunk size than a storage may yield benefits. That said, I am also watching [~daryn]'s work on HDFS-7435 which addresses a large piece of this Jira's proposal. I think that once HDFS-7435 is committed, we will make some measurements and see if anything else in the area of chunking is necessary. As you point out, compression should also help. bq. Do you have any estimates for startup time overhead due to GCs? We know of at least one large deployment which experiences a full GC pause during startup. I'm not sure of the time, but in general, the off-heaping will help with NN throughput just by reducing the number of objects on the heap. bq. How does this affect block report processing? We cannot assume DataNodes will sort blocks by target stripe. Will the NameNode sort received reports or will it acquire+release a lock per block? If the former, then there should probably be some randomization of order across threads to avoid unintended serialization e.g. lock convoys. The idea is that currently, processing a block report requires taking the FSN lock. So this proposal is two-part. First, use better locking semantics so that we don't have to take the FSN lock.
Next, shard the blocksMap structure so that multiple threads can operate concurrently on that structure. Even if we continue to process BRs under one big happy FSN lock, having multiple threads operate concurrently will yield benefits. The sharding (stripes) is along arbitrary boundaries. For instance, the design doc suggests that it could be striped by doing blockId % nStripes. nStripes would be configurable to a relatively small number (the design doc suggests 4 to 16), and if the modulo calculation is used, then nStripes would be a prime that is roughly equal to the number of threads available. As long as block report processing per block does not need to access more than one shard at a time, this will be fine -- multiple threads can process blocks in parallel. It is a technique that Berkeley DB Java Edition uses for its lock table to improve concurrency. BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Attachments: BlockManagerScalabilityImprovementsDesign.pdf Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
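A minimal sketch of that striping scheme, assuming the blockId % nStripes layout from the design doc. The value type and per-stripe ReentrantLocks here are placeholders, not the actual BlocksMap code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: the blocksMap is split into nStripes shards selected by
// blockId % nStripes, each guarded by its own lock, so threads working
// on different stripes never contend with each other.
public class StripedBlocksMap {
    private final int nStripes;
    private final Map<Long, String>[] stripes; // value type is illustrative
    private final ReentrantLock[] locks;

    @SuppressWarnings("unchecked")
    StripedBlocksMap(int nStripes) {
        this.nStripes = nStripes; // design doc suggests a small prime, 4 to 16
        this.stripes = (Map<Long, String>[]) new Map[nStripes];
        this.locks = new ReentrantLock[nStripes];
        for (int i = 0; i < nStripes; i++) {
            stripes[i] = new HashMap<>();
            locks[i] = new ReentrantLock();
        }
    }

    // floorMod keeps the stripe index non-negative for negative block ids.
    private int stripeOf(long blockId) {
        return (int) Math.floorMod(blockId, (long) nStripes);
    }

    public void put(long blockId, String info) {
        int s = stripeOf(blockId);
        locks[s].lock();          // per-stripe lock, not a global FSN lock
        try {
            stripes[s].put(blockId, info);
        } finally {
            locks[s].unlock();
        }
    }

    public String get(long blockId) {
        int s = stripeOf(blockId);
        locks[s].lock();
        try {
            return stripes[s].get(blockId);
        } finally {
            locks[s].unlock();
        }
    }

    public static void main(String[] args) {
        StripedBlocksMap map = new StripedBlocksMap(7);
        map.put(1073741825L, "replica on dn1");
        System.out.println(map.get(1073741825L));
    }
}
```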
[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient
[ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338768#comment-14338768 ] Charles Lamb commented on HDFS-7435: @daryn, This looks really good. I like the new approach and your current patch does a pre-emptive strike on several of the comments I was going to make on the .002 patch. I really only have nits. The patch needs to be rebased. There was one .rej when I applied it (obviously I worked past that for my review). BlockListAsLongs.java: BlockListAsLongs(Collection) needs an @param for the javadoc. In #BlockListAsLongs(Collection), the ReplicaState is being written as a varint64. I realize it's a varint, but since it's really only a single byte in the implementation, it seems a little heavy-handed to write it to the cos as a varint64. I also realize that it will need to be a long on the way back out for the uc long[]. If you don't want to change it from being a varint64 in the cos, then perhaps just add a comment saying that you know it's a byte (actually int) in the impl but for consistency you're using a varint64? Since you throw UnsupportedOperationException from multiple #remove methods, you might want to add the class name to the message, e.g. Sorry, remove not implemented for BlockReportListIterator. In a similar vein, would it be appropriate to add a message to BlockReportReplica.getVisibleLength, getStorageUuid, and isOnTransientStorage's UnsupportedOperationException? BlockReportTestBase.java: getBlockReports has one line that exceeds the 80 char width. DatanodeProtocolClientSideTranslatorPB.java: the import of NameNodeLayoutVersion is unused. the decl of useBlocksBuffer busts the 80 char width. DatanodeProtocolServerSideTranslatorPB.java: import ByteString is unused. FsDatasetImpl.java: in #getBlockReports, the line under case RUR busts the 80 char limit.
NameNodeLayoutVersion.java: Perhaps s/Protobuf optimized/Optimized protobuf/ NNThroughputBenchmark.java: Thanks for fixing the formatting in here. TestBlockHasMultipleReplicasOnSameDN.java: blocks.add(... busts the 80 char limit. PB encoding of block reports is very inefficient Key: HDFS-7435 URL: https://issues.apache.org/jira/browse/HDFS-7435 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.patch, HDFS-7435.patch Block reports are encoded as a PB repeating long. Repeating fields use an {{ArrayList}} with default capacity of 10. A block report containing tens or hundreds of thousand of longs (3 for each replica) is extremely expensive since the {{ArrayList}} must realloc many times. Also, decoding repeating fields will box the primitive longs which must then be unboxed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
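On the varint64 point in the review above: protobuf-style base-128 varints spend one byte per 7 payload bits, so a ReplicaState ordinal (well under 128) still occupies exactly one byte on the wire even when written as a varint64. A standalone sketch of the standard encoding (not Hadoop's actual serializer):

```java
import java.io.ByteArrayOutputStream;

// Standard protobuf base-128 varint encoding: each byte carries 7 payload
// bits, with the high bit set on every byte except the last. Values below
// 128 therefore encode to a single byte even as a varint64.
public class VarintDemo {
    public static byte[] writeVarint64(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(writeVarint64(4).length);   // prints 1
        System.out.println(writeVarint64(300).length); // prints 2
    }
}
```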
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338901#comment-14338901 ] Charles Lamb commented on HDFS-7836: bq. Are you proposing that off-heaping is an opt-in feature that must be explicitly enabled in configuration, or are you proposing that off-heaping will be the new default behavior? Arguably, jumping to off-heaping as the default could be seen as a backwards-incompatibility, because it might be unsafe to deploy the feature without simultaneous down-tuning the NameNode max heap size. Some might see that as backwards-incompatible with existing configurations. The proposal is to have an option that lets the offheap code allocate slabs using 'new byte[]' rather than malloc. This would be used for debugging purposes and not in a normal deployment. BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Attachments: BlockManagerScalabilityImprovementsDesign.pdf Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
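The debugging option described above, allocating slabs with 'new byte[]' instead of malloc, can be sketched roughly as follows. All class and method names here are hypothetical, and direct ByteBuffers stand in for malloc'd native memory:

```java
import java.nio.ByteBuffer;

// Illustrative only: a slab-allocator factory with a debug option that
// backs slabs with plain Java byte arrays instead of native memory.
interface SlabAllocator {
    ByteBuffer allocate(int bytes);
}

public class SlabAllocatorDemo {
    // Heap-backed allocator: every slab is visible to the GC and to
    // ordinary Java debugging tools.
    static class ByteArraySlabAllocator implements SlabAllocator {
        public ByteBuffer allocate(int bytes) {
            return ByteBuffer.wrap(new byte[bytes]);
        }
    }

    // Off-heap allocator: direct buffers stand in for malloc'd memory.
    static class DirectSlabAllocator implements SlabAllocator {
        public ByteBuffer allocate(int bytes) {
            return ByteBuffer.allocateDirect(bytes);
        }
    }

    static SlabAllocator create(boolean debugOnHeap) {
        return debugOnHeap ? new ByteArraySlabAllocator()
                           : new DirectSlabAllocator();
    }

    public static void main(String[] args) {
        ByteBuffer slab = create(true).allocate(64);
        slab.putLong(0, 42L);
        System.out.println(slab.getLong(0)); // prints 42
    }
}
```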
[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient
[ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339390#comment-14339390 ] Charles Lamb commented on HDFS-7435: bq. I mildly disagree with overly verbose messages for UnsupportedOperationExceptions since the JDK rarely uses messages and the class method is in the stack trace. Yeah, I figured there was a reason for not having a message in the UOE, so it won't cause me any heartburn if you don't put them in. bq. I think the whole fragmented over-the-write buffer implementation is over engineered. Oh, I kind of liked it. I'll try to take a look at the new patch soon. Thanks for the reply. PB encoding of block reports is very inefficient Key: HDFS-7435 URL: https://issues.apache.org/jira/browse/HDFS-7435 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch Block reports are encoded as a PB repeating long. Repeating fields use an {{ArrayList}} with default capacity of 10. A block report containing tens or hundreds of thousands of longs (3 for each replica) is extremely expensive since the {{ArrayList}} must realloc many times. Also, decoding repeating fields will box the primitive longs which must then be unboxed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339385#comment-14339385 ] Charles Lamb commented on HDFS-7836: bq. it would be useful to see some perf comparison before we add that complexity. We definitely plan on getting some baseline measurements and sharing them; we want to know what the before and after effects of any changes are. As an aside, I worked on a case where we had to increase the RPC limit to 192MB in order to get block reports handled correctly, so I know these types of deployments are out there. bq. I'll see if I can clean up and post what I used on HDFS-7847. That would be much appreciated. I'm starting to look at HDFS-7847 (subtask of this Jira) and maybe that could come into play somehow. bq. These two sound contradictory. Yes, they do, but aren't meant to be. The first level would be to do concurrent processing under the FSN lock. That would at least get some parallelism. The second step would be to make a more lockless blocksMap which wouldn't require the big FSN lock to be held. BTW, since you have edit privs, if you want to get rid of my redundant reply to you above that would be great. My browser suckered me into hitting Add twice. Thanks. BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Attachments: BlockManagerScalabilityImprovementsDesign.pdf Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335747#comment-14335747 ] Charles Lamb commented on HDFS-7836: Problem Statement The number of blocks stored by the largest HDFS clusters continues to increase. This increase adds pressure to the BlockManager, that part of the NameNode which handles block data from across the cluster. Full block reports are problematic. The more blocks each DataNode has, the longer it takes to process a full block report from that DataNode. Storage densities have roughly doubled each year for the past few years. Meanwhile, increases in CPU power have come mostly in the form of additional cores rather than faster clock speeds. Currently, the NameNode cannot use these additional cores because full block reports are processed while holding the namesystem lock. The BlockManager stores all blocks in memory and this contributes to a large heap size. As the NameNode Java heap size has grown, full garbage collection events have started to take several minutes. Although it is often possible to avoid full GCs by re-using Java objects, they remain an operational concern for administrators. They also contribute to a long NameNode startup time, sometimes measured in tens of minutes for the biggest clusters. Goals We need to improve the BlockManager to handle the challenges of the next few years. Our specific goals for this project are to: * Reduce lock contention for the FSNamesystem lock * Enable concurrent processing of block reports * Reduce the Java heap size of the NameNode * Optimize the use of network resources [~cmccabe] and I will be working on this Jira. We propose doing this work on a separate branch. If there is interest in a community meeting to discuss these changes, then perhaps Tuesday 3/10/15 at Cloudera in Palo Alto, CA would work? 
I suggest that date because I will be in the bay area that day and would like to meet with other interested community members in person. I'll also be around 3/11 and 3/12 if we need an alternate date. BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
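One of the stated goals above is concurrent processing of block reports without the single FSNamesystem lock. As a rough, hypothetical sketch of that direction (not the design in the attached PDF), a blocksMap built on a concurrent map lets several report-processing threads proceed in parallel:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of a "more lockless" blocksMap: a concurrent map lets
// several block-report threads update state without one global
// FSNamesystem-style lock. Class and method names are illustrative only.
class ConcurrentBlocksMapSketch {
    private final ConcurrentMap<Long, String> blockToStorage =
        new ConcurrentHashMap<>();

    // Safe to call from multiple report-processing threads concurrently;
    // the map's internal striping stands in for the big lock.
    void processReportedBlock(long blockId, String storageId) {
        blockToStorage.put(blockId, storageId);
    }

    int size() {
        return blockToStorage.size();
    }
}
```

The real BlockManager has far more per-block state and ordering constraints than this; the sketch only shows why a striped structure removes the need to serialize whole reports behind one lock.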
[jira] [Created] (HDFS-7836) BlockManager Scalability Improvements
Charles Lamb created HDFS-7836: -- Summary: BlockManager Scalability Improvements Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Bug Reporter: Charles Lamb Assignee: Charles Lamb Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7836) BlockManager Scalability Improvements
[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7836: --- Attachment: BlockManagerScalabilityImprovementsDesign.pdf BlockManager Scalability Improvements - Key: HDFS-7836 URL: https://issues.apache.org/jira/browse/HDFS-7836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Charles Lamb Assignee: Charles Lamb Attachments: BlockManagerScalabilityImprovementsDesign.pdf Improvements to BlockManager scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7682: --- Attachment: HDFS-7682.003.patch [~jingzhao], Thanks for the comments. I think the latest patch addresses them by changing the test to a check for the src path being a snapshotted file. Charles {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch, HDFS-7682.001.patch, HDFS-7682.002.patch, HDFS-7682.003.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. The reason why this happens is because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7682: --- Attachment: HDFS-7682.002.patch Rebased. {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch, HDFS-7682.001.patch, HDFS-7682.002.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. The reason why this happens is because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305452#comment-14305452 ] Charles Lamb commented on HDFS-7704: Hi [~shahrs87], I only have a few nits on the latest rev: BPServiceActor.java: Line continuations are 4 spaces. At line 257 you've introduced a line containing spaces. Also, you've removed the last newline of the file. BPServiceActorAction.java: Line continuations are 4 spaces (statements in a new block are indented 2 spaces). ErrorReportAction.java: s/A ErrorReportAction/An ErrorReportAction/ Can LOG be private? ReportBadBlockAction.java: Can LOG be private? s/to namenode :/to namenode: / The block comment in that catch should be: /* * One common reason ... */ DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704-v2.patch, HDFS-7704-v3.patch, HDFS-7704-v4.patch, HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e. reportBadBlocks and trySendErrorReport) which wait for both of the actor threads to process these calls. These calls are made with the writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous since these reports don't have any specific deadlines, so an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305742#comment-14305742 ] Charles Lamb commented on HDFS-7704: Oh, sorry, one more comment. In the test, to be consistent with the code that is already there, you can add an import static for Assert.assertTrue rather than importing (non-static) Assert. Or, the opposite (eliminate the import statics). DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704-v2.patch, HDFS-7704-v3.patch, HDFS-7704-v4.patch, HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e. reportBadBlocks and trySendErrorReport) which wait for both of the actor threads to process these calls. These calls are made with the writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous since these reports don't have any specific deadlines, so an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301835#comment-14301835 ] Charles Lamb commented on HDFS-7704: Hi [~shahrs87], I don't understand why you create the BPServiceActorAction class with subtypes (ErrorReportAction and ReportBadBlockAction) and then don't do a method dispatch on the class. Instead you're using an enum and case to do the dispatch. This seems rather un-OO-like, no? Other nits: BPOfferService.java: {code} ErrorReportAction errorReportAction = new ErrorReportAction (BPServiceActorAction.ActionEnum.TRYSENDERRORREPORT,errCode, errMsg); {code} Needs a space before errCode. BPServiceActor.java: s/synchronized(/synchronized (/ s/switch(/switch (/ ErrorReportAction.java: s/BPServiceActorAction{/BPServiceActorAction {/ ReportBadBlockAction.java: s/BPServiceActorAction{/BPServiceActorAction {/ Charles DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704-v2.patch, HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e. reportBadBlocks and trySendErrorReport) which wait for both of the actor threads to process these calls. These calls are made with the writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous since these reports don't have any specific deadlines, so an extra few seconds of delay should be acceptable. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
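The subtype-dispatch alternative raised in the review comment above might look roughly like this. These class shapes are hypothetical illustrations, not the actual patch:

```java
// Hypothetical shapes for method dispatch on the action class: each subtype
// implements its own report(), so no ActionEnum and no switch are needed.
abstract class Action {
    abstract String report();
}

class ErrorReport extends Action {
    private final int errCode;
    ErrorReport(int errCode) { this.errCode = errCode; }
    @Override
    String report() { return "error:" + errCode; }
}

class BadBlockReport extends Action {
    private final long blockId;
    BadBlockReport(long blockId) { this.blockId = blockId; }
    @Override
    String report() { return "badBlock:" + blockId; }
}
```

The caller simply invokes {{action.report()}} and the JVM picks the right implementation; adding a new action type then means adding one subclass rather than extending an enum and every switch over it.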
[jira] [Commented] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301885#comment-14301885 ] Charles Lamb commented on HDFS-7682: [~jingzhao], Did you have any more comments on this Jira? BTW, the test failure is unrelated. Thanks. Charles {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch, HDFS-7682.001.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. The reason why this happens is because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296801#comment-14296801 ] Charles Lamb commented on HDFS-7423: bq. Is it correct? Yes, it's ok because statistics is only declared and never used (except there). Hence, it's always null. Probably a better change would have been to just eliminate statistics completely from the file. Thanks for the commit [~hitliuyi]. various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7702) Move metadata across namenode - Effort to a real distributed namenode
[ https://issues.apache.org/jira/browse/HDFS-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297657#comment-14297657 ] Charles Lamb commented on HDFS-7702: Hi [~xiyunyue], I read over your proposal and have some high level questions. I am unclear about your proposal's failure scenarios. If a source or target NN or one or more of the DNs fails in the middle of a migration, how are things restarted? Why use Kryo and not protobuf for serialization? Why use Kryo and not the existing Hadoop/HDFS protocols and infrastructure for network communications between the various nodes? Is the transfer granularity blockpool only? I infer that from this statement: bq. The target namenode will notify datanode remove blockpool id which belong to the source namenode, but then this statement: bq. it will mark delete the involved sub-tree from its own namespace leads me to believe that it's sub-trees in the namespace. Could you please clarify this statement: bq. all read and write operation regarding the same namespace sub-tree is forwarding to the target namenode. Who does the forwarding, the client or the source NN? Move metadata across namenode - Effort to a real distributed namenode - Key: HDFS-7702 URL: https://issues.apache.org/jira/browse/HDFS-7702 Project: Hadoop HDFS Issue Type: New Feature Reporter: Ray Assignee: Ray Implement a tool that can show the in-memory namespace tree structure with weight (size) and an API that can move metadata across different namenodes. The purpose is to move data efficiently and quickly, without moving blocks on the datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297887#comment-14297887 ] Charles Lamb commented on HDFS-7704: Hi [~shahrs87], A couple of quick comments: {code} public void bpThreadEnqueue(DatanodeCommand datanodeCommand) { if (bpThreadQueue != null) { bpThreadQueue.add(datanodeCommand); } } {code} When would bpThreadQueue be null? Don't you want to use Preconditions here? Several lines exceed the 80 char limit. s/if(/if (/ I'll wait for your second version with [~kihwal]'s comments addressed. DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e. reportBadBlocks and trySendErrorReport) which wait for both of the actor threads to process these calls. These calls are made with the writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous since these reports don't have any specific deadlines, so an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
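The fail-fast suggestion in the review comment above could look like this sketch, which uses {{java.util.Objects#requireNonNull}} standing in for Guava's {{Preconditions}}. Names mirror the snippet under review but are illustrative only:

```java
import java.util.Objects;
import java.util.Queue;

// Sketch of the fail-fast alternative to a silent null check: throw
// immediately if the queue was never initialized instead of quietly dropping
// the command. Uses java.util.Objects in place of Guava's Preconditions.
class BpThreadQueueSketch {
    private final Queue<String> bpThreadQueue;

    BpThreadQueueSketch(Queue<String> queue) {
        this.bpThreadQueue = queue;
    }

    public void bpThreadEnqueue(String datanodeCommand) {
        // requireNonNull returns its argument, so the check and the add chain.
        Objects.requireNonNull(bpThreadQueue, "bpThreadQueue not initialized")
            .add(datanodeCommand);
    }

    int pending() {
        return bpThreadQueue.size();
    }
}
```

With Guava on the classpath, {{Preconditions.checkNotNull(bpThreadQueue)}} is the drop-in equivalent; either way a misconfigured actor fails loudly at the enqueue site rather than losing commands.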
[jira] [Updated] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7423: --- Attachment: HDFS-7423.004.patch Hi [~hitliuyi], The .004 is rebased for the trunk. Let's wait for the jenkins run. Once that completes, I'll upload the branch-2 rebase diffs. various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Attachments: HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295475#comment-14295475 ] Charles Lamb commented on HDFS-7423: Test failures unrelated. various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7423: --- Attachment: HDFS-7423-branch-2.004.patch branch-2 diffs attached. various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293543#comment-14293543 ] Charles Lamb commented on HDFS-7423: Thank you for the review [~ste...@apache.org]. various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Attachments: HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-6571) NameNode should delete intermediate fsimage.ckpt when checkpoint fails
[ https://issues.apache.org/jira/browse/HDFS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb reassigned HDFS-6571: -- Assignee: Charles Lamb NameNode should delete intermediate fsimage.ckpt when checkpoint fails -- Key: HDFS-6571 URL: https://issues.apache.org/jira/browse/HDFS-6571 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Akira AJISAKA Assignee: Charles Lamb When checkpoint fails in getting a new fsimage from standby NameNode or SecondaryNameNode, intermediate fsimage (fsimage.ckpt_txid) is left and never to be cleaned up. If fsimage is large and fails to checkpoint many times, the growing intermediate fsimage may cause out of disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7682: --- Attachment: HDFS-7682.001.patch Hi [~jingzhao], Thanks for looking at this. isLastBlockComplete() covers the case where it's a snapshot path as well as a closed non-snapshot path. The file length is correct in both those cases so it's ok to use that. In the case of a still-being-written file, then isLastBlockComplete() returns false and the code works just same as it does today. The particular case that this patch is fixing is that a snapshotted file is frozen, so the file length is the limit of what should be checksummed, not the block lengths (which include the non-snapshotted portion). I've added more assertions in the test to demonstrate this. In other words, the behavior for non-snapshotted files that are still open (and possibly being appended to) is not changed by this patch, only that of snapshotted files, for which isLastBlockComplete() is a valid check. HDFS-5343 took a similar approach. {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch, HDFS-7682.001.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. The reason why this happens is because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
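The clamping idea described in the comment above (the snapshot's file length, not the block lengths, bounds what gets checksummed) can be sketched as a small computation. This is illustrative only, not the actual {{DistributedFileSystem#getFileChecksum}} patch:

```java
// Illustrative sketch: clamp the bytes checksummed from each block to the
// snapshotted file's length, so data appended to the last block after the
// snapshot was taken is excluded from the checksum.
class SnapshotChecksumSketch {
    // Returns, per block, how many bytes belong to the snapshotted file.
    static long[] bytesPerBlock(long[] blockLengths, long snapshotFileLength) {
        long[] out = new long[blockLengths.length];
        long remaining = snapshotFileLength;
        for (int i = 0; i < blockLengths.length; i++) {
            out[i] = Math.min(blockLengths[i], remaining);
            remaining -= out[i];
        }
        return out;
    }
}
```

For example, if the live blocks are 10/10/10 bytes but the snapshot was taken at file length 25, only 5 bytes of the last block count toward the snapshot's checksum; the 5 appended bytes are excluded.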
[jira] [Updated] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7682: --- Status: Patch Available (was: Open) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. The reason why this happens is because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7682: --- Attachment: HDFS-7682.000.patch Posting patch for a jenkins run. {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. The reason why this happens is because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
Charles Lamb created HDFS-7682: -- Summary: {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. The reason why this happens is because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7667) Various typos and improvements to HDFS Federation doc
[ https://issues.apache.org/jira/browse/HDFS-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7667: --- Attachment: HDFS-7667.001.patch [~aw], Thanks for looking it over. The .001 version makes those two changes. Various typos and improvements to HDFS Federation doc - Key: HDFS-7667 URL: https://issues.apache.org/jira/browse/HDFS-7667 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7667.000.patch, HDFS-7667.001.patch Fix several incorrect commands, typos, grammatical errors, etc. in the HDFS Federation doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7667) Various typos and improvements to HDFS Federation doc
[ https://issues.apache.org/jira/browse/HDFS-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7667: --- Attachment: HDFS-7667.000.patch Diffs attached. Various typos and improvements to HDFS Federation doc - Key: HDFS-7667 URL: https://issues.apache.org/jira/browse/HDFS-7667 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7667.000.patch Fix several incorrect commands, typos, grammatical errors, etc. in the HDFS Federation doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7667) Various typos and improvements to HDFS Federation doc
[ https://issues.apache.org/jira/browse/HDFS-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7667: --- Status: Patch Available (was: Open) Various typos and improvements to HDFS Federation doc - Key: HDFS-7667 URL: https://issues.apache.org/jira/browse/HDFS-7667 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7667.000.patch Fix several incorrect commands, typos, grammatical errors, etc. in the HDFS Federation doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7667) Various typos and improvements to HDFS Federation doc
Charles Lamb created HDFS-7667: -- Summary: Various typos and improvements to HDFS Federation doc Key: HDFS-7667 URL: https://issues.apache.org/jira/browse/HDFS-7667 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Fix several incorrect commands, typos, grammatical errors, etc. in the HDFS Federation doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7667) Various typos and improvements to HDFS Federation doc
[ https://issues.apache.org/jira/browse/HDFS-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14290002#comment-14290002 ] Charles Lamb commented on HDFS-7667: [~aw], Thanks for the review. I started out intending to just fix a few minor errors (missing articles, obviously wrong typos in commands, etc.). Then I couldn't help myself so I made some slightly larger grammatical changes and tightened up a few things. Please stop me before I kill any more and commit this. Thanks! Of course we still have not heard from Mr. Jenkins... I wonder where he is today. Various typos and improvements to HDFS Federation doc - Key: HDFS-7667 URL: https://issues.apache.org/jira/browse/HDFS-7667 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7667.000.patch, HDFS-7667.001.patch Fix several incorrect commands, typos, grammatical errors, etc. in the HDFS Federation doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7667) Various typos and improvements to HDFS Federation doc
[ https://issues.apache.org/jira/browse/HDFS-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14290032#comment-14290032 ] Charles Lamb commented on HDFS-7667: Thanks for the review and the commit [~aw]. If you're bored, HDFS-7644 is a 3 char fix. Various typos and improvements to HDFS Federation doc - Key: HDFS-7667 URL: https://issues.apache.org/jira/browse/HDFS-7667 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Fix For: 3.0.0 Attachments: HDFS-7667.000.patch, HDFS-7667.001.patch Fix several incorrect commands, typos, grammatical errors, etc. in the HDFS Federation doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7644) minor typo in HttpFS doc
[ https://issues.apache.org/jira/browse/HDFS-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14290048#comment-14290048 ] Charles Lamb commented on HDFS-7644: Gee, here I am fixing all these typos and I can't even get the Jira title correct. Thanks for the review and the commit [~aw]. minor typo in HttpFS doc Key: HDFS-7644 URL: https://issues.apache.org/jira/browse/HDFS-7644 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7644.000.patch In hadoop-httpfs/src/site/apt/index.apt.vm, s/seening/seen/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7644) minor typo in HffpFS doc
[ https://issues.apache.org/jira/browse/HDFS-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289203#comment-14289203 ] Charles Lamb commented on HDFS-7644: The FB warnings are spurious. minor typo in HffpFS doc Key: HDFS-7644 URL: https://issues.apache.org/jira/browse/HDFS-7644 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Attachments: HDFS-7644.000.patch In hadoop-httpfs/src/site/apt/index.apt.vm, s/seening/seen/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7644) minor typo in HffpFS doc
[ https://issues.apache.org/jira/browse/HDFS-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7644: --- Status: Patch Available (was: Open) minor typo in HffpFS doc Key: HDFS-7644 URL: https://issues.apache.org/jira/browse/HDFS-7644 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Attachments: HDFS-7644.000.patch In hadoop-httpfs/src/site/apt/index.apt.vm, s/seening/seen/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6874) Add GET_BLOCK_LOCATIONS operation to HttpFS
[ https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286613#comment-14286613 ] Charles Lamb commented on HDFS-6874: [~lianggz], Thanks for working on this. In general the patch looks good. I have a few minor comments. The patch on the trunk needs to be rebased. I didn't check the branch-2 patch, so it may need to be rebased too. In general, lots of lines exceed the 80 char limit. FSOperations.java s/private static Map blockLocationsToJSON/private static Map blockLocationsToJSON/ You may want to add java doc for the @param, and @return of that method. HttpFSFileSystem.java getFileBlockLocations should have javadoc for the @return. In this method, the call to HttpFSUtils.validateResponse should probably be changed to HttpExceptionUtils.validateResponse(). HttpFSServer.java s/offset,len/offset, len/ Is it correct that passing a len=0 implies Long.MAX_VALUE? JsonUtil.java The javadoc formatting for toBlockLocations is messed up a little. s/IOException{/IOException {/ WebHdfsFileSystem.java for isWebHDFSJson, s/json){/json) {/ and s/m!=null/m != null/. Also, the javadoc needs filling in. Charles Add GET_BLOCK_LOCATIONS operation to HttpFS --- Key: HDFS-6874 URL: https://issues.apache.org/jira/browse/HDFS-6874 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1 Reporter: Gao Zhong Liang Assignee: Gao Zhong Liang Attachments: HDFS-6874-branch-2.6.0.patch, HDFS-6874.patch GET_BLOCK_LOCATIONS operation is missing in HttpFS, which is already supported in WebHDFS. For the request of GETFILEBLOCKLOCATIONS in org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far: ... case GETFILEBLOCKLOCATIONS: { response = Response.status(Response.Status.BAD_REQUEST).build(); break; } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7644) minor typo in HffpFS doc
Charles Lamb created HDFS-7644: -- Summary: minor typo in HffpFS doc Key: HDFS-7644 URL: https://issues.apache.org/jira/browse/HDFS-7644 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial In hadoop-httpfs/src/site/apt/index.apt.vm, s/seening/seen/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7644) minor typo in HffpFS doc
[ https://issues.apache.org/jira/browse/HDFS-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7644: --- Attachment: HDFS-7644.000.patch minor typo in HffpFS doc Key: HDFS-7644 URL: https://issues.apache.org/jira/browse/HDFS-7644 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Attachments: HDFS-7644.000.patch In hadoop-httpfs/src/site/apt/index.apt.vm, s/seening/seen/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7637) Fix the check condition for reserved path
[ https://issues.apache.org/jira/browse/HDFS-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14282453#comment-14282453 ] Charles Lamb commented on HDFS-7637: LGTM [~hitliuyi]. Charles Fix the check condition for reserved path - Key: HDFS-7637 URL: https://issues.apache.org/jira/browse/HDFS-7637 Project: Hadoop HDFS Issue Type: Bug Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: HDFS-7637.001.patch Currently the {{.reserved}} path check function is: {code} public static boolean isReservedName(String src) { return src.startsWith(DOT_RESERVED_PATH_PREFIX); } {code} And {{DOT_RESERVED_PATH_PREFIX}} is {{/.reserved}}; it should be {{/.reserved/}}. For example, if some other directory name has the prefix _/.reserved_, say _/.reservedpath_, then the check wrongly matches it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
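The prefix bug described in HDFS-7637 can be sketched in isolation. The constant name comes from the report; the fixed variant is a minimal illustration, not the committed patch:

```java
public class ReservedPathCheck {
    static final String DOT_RESERVED_PATH_PREFIX = "/.reserved";

    // Current (buggy) check: also matches unrelated names such as "/.reservedpath".
    static boolean isReservedNameBuggy(String src) {
        return src.startsWith(DOT_RESERVED_PATH_PREFIX);
    }

    // Sketched fix: match "/.reserved" exactly, or any path under "/.reserved/".
    static boolean isReservedName(String src) {
        return src.equals(DOT_RESERVED_PATH_PREFIX)
            || src.startsWith(DOT_RESERVED_PATH_PREFIX + "/");
    }
}
```

The buggy variant returns true for "/.reservedpath" while the fixed one does not, which is exactly the case the issue calls out.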
[jira] [Updated] (HDFS-7633) When Datanode has too many blocks, BlockPoolSliceScanner.getNewBlockScanTime throws IllegalArgumentException
[ https://issues.apache.org/jira/browse/HDFS-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7633: --- Summary: When Datanode has too many blocks, BlockPoolSliceScanner.getNewBlockScanTime throws IllegalArgumentException (was: When Datanode has too many blocks, BlockPoolSliceScanner.getNewBlockScanTime thows IllegalArgumentException) When Datanode has too many blocks, BlockPoolSliceScanner.getNewBlockScanTime throws IllegalArgumentException Key: HDFS-7633 URL: https://issues.apache.org/jira/browse/HDFS-7633 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: h7633_20150116.patch issue: when the total block count on one of my DNs reaches 33554432, it refuses to accept more blocks; this is the ERROR: 2015-01-16 15:21:44,571 | ERROR | DataXceiver for client at /172.1.1.8:50490 [Receiving block BP-1976278848-172.1.1.2-1419846518085:blk_1221043436_147936990] | datasight-198:25009:DataXceiver error processing WRITE_BLOCK operation src: /172.1.1.8:50490 dst: /172.1.1.11:25009 | org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250) java.lang.IllegalArgumentException: n must be positive at java.util.Random.nextInt(Random.java:300) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.getNewBlockScanTime(BlockPoolSliceScanner.java:263) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.addBlock(BlockPoolSliceScanner.java:276) at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.addBlock(DataBlockScanner.java:193) at org.apache.hadoop.hdfs.server.datanode.DataNode.closeBlock(DataNode.java:1733) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:765) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232) at java.lang.Thread.run(Thread.java:745) analysis: in the function org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.getNewBlockScanTime(), when blockMap.size() is too big: Math.max(blockMap.size(),1) * 600 is int type, and negative; Math.max(blockMap.size(),1) * 600 * 1000L is long type, and negative; (int)period is Integer.MIN_VALUE; Math.abs((int)period) is Integer.MIN_VALUE, which is negative; DFSUtil.getRandom().nextInt(periodInt) will throw IllegalArgumentException. I use Java HotSpot (build 1.7.0_05-b05). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
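The overflow chain in that analysis can be reproduced in isolation. This is an illustrative sketch, not the attached patch; the method names are made up, and only the arithmetic (600 * 1000 ms per block, block count 33554432) comes from the report:

```java
public class ScanPeriodOverflow {
    // Mirrors the buggy arithmetic: the int-by-int multiplication overflows
    // before being widened to long, and Math.abs(Integer.MIN_VALUE) is still
    // negative, so the result can be a negative bound for Random.nextInt(n).
    static int buggyPeriodInt(int blockCount) {
        long period = Math.max(blockCount, 1) * 600 * 1000L;
        return Math.abs((int) period);
    }

    // One possible fix: keep the arithmetic entirely in long, then clamp
    // before narrowing so the bound stays positive.
    static int fixedPeriodInt(int blockCount) {
        long period = Math.max(blockCount, 1) * 600L * 1000L;
        return (int) Math.min(period, Integer.MAX_VALUE);
    }
}
```

With blockCount = 33554432 the buggy version yields Integer.MIN_VALUE, exactly the negative argument that makes java.util.Random.nextInt(n) throw "n must be positive".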
[jira] [Commented] (HDFS-7067) ClassCastException while using a key created by keytool to create encryption zone.
[ https://issues.apache.org/jira/browse/HDFS-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14277619#comment-14277619 ] Charles Lamb commented on HDFS-7067: [~cmccabe], bq. Charles, is the TestKeyProviderFactory failure due to this patch? Correct. test-patch.sh doesn't apply the hdfs7067.keystore file to hadoop-common/hadoop-common/src/test/resources and so the new test (which depends on it) will fail. The test passes when I apply the patch and the .keystore file in a fresh clone. ClassCastException while using a key created by keytool to create encryption zone. --- Key: HDFS-7067 URL: https://issues.apache.org/jira/browse/HDFS-7067 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 2.6.0 Reporter: Yi Yao Assignee: Charles Lamb Priority: Minor Attachments: HDFS-7067.001.patch, HDFS-7067.002.patch, hdfs7067.keystore I'm using transparent encryption. If I create a key for the KMS keystore via keytool and use that key to create an encryption zone, I get a ClassCastException rather than an exception with a decent error message. I know we should use 'hadoop key create' to create a key. It's better to provide a decent error message to remind the user to use the right way to create a KMS key. [LOG] ERROR[user=hdfs] Method:'GET' Exception:'java.lang.ClassCastException: javax.crypto.spec.SecretKeySpec cannot be cast to org.apache.hadoop.crypto.key.JavaKeyStoreProvider$KeyMetadata' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
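The improvement the reporter asks for amounts to checking the keystore entry's type before casting. A minimal, hypothetical sketch; the KeyMetadata interface and asKeyMetadata helper here are stand-ins, not the actual JavaKeyStoreProvider API:

```java
public class KeyMetadataCheck {
    // Stand-in for JavaKeyStoreProvider$KeyMetadata; illustrative only.
    interface KeyMetadata {}

    // Replace a raw cast with an instanceof check that produces a
    // user-actionable message instead of a bare ClassCastException.
    static KeyMetadata asKeyMetadata(Object entry) {
        if (!(entry instanceof KeyMetadata)) {
            throw new IllegalArgumentException(
                "Keystore entry has type " + entry.getClass().getName()
                + "; expected a key created with 'hadoop key create', not keytool.");
        }
        return (KeyMetadata) entry;
    }
}
```

A keytool-created entry (a javax.crypto.spec.SecretKeySpec, per the log above) would then fail with a message telling the user to recreate the key the supported way.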