[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166676#comment-16166676 ]

Ajay Kumar commented on HDFS-12420:

The failed tests seem unrelated. The two tests below fail irrespective of the patch:

TestNameNodeMetrics
TestLeaseRecoveryStriped

All other tests passed when I ran them locally.

> Disable Namenode format when data already exists
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ajay Kumar
> Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch, HDFS-12420.02.patch, HDFS-12420.03.patch, HDFS-12420.04.patch
>
> Disable NameNode format to avoid accidental formatting of Namenode in production cluster. If someone really wants to delete the complete fsImage, they can first delete the metadata dir and then run {code} hdfs namenode -format{code} manually.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165847#comment-16165847 ]

Hadoop QA commented on HDFS-12420:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 41s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 49s | trunk passed |
| +1 | compile | 1m 0s | trunk passed |
| +1 | checkstyle | 0m 46s | trunk passed |
| +1 | mvnsite | 1m 8s | trunk passed |
| +1 | findbugs | 1m 55s | trunk passed |
| +1 | javadoc | 0m 43s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 1s | the patch passed |
| +1 | compile | 0m 57s | the patch passed |
| +1 | javac | 0m 57s | the patch passed |
| +1 | checkstyle | 0m 42s | the patch passed |
| +1 | mvnsite | 1m 2s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 4s | the patch passed |
| +1 | javadoc | 0m 41s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 128m 14s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 158m 35s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
| | hadoop.hdfs.web.TestWebHDFSXAttr |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.server.balancer.TestBalancer |
| | hadoop.hdfs.TestReconstructStripedFile |
| | hadoop.hdfs.web.TestWebHDFSAcl |
| | hadoop.hdfs.web.TestHttpsFileSystem |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup |
| | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
| | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
| | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12420 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887028/HDFS-12420.04.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 4543e7d6009c 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e0b3c64 |
| Default Java | 1.8.0_144 |
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165628#comment-16165628 ]

Hadoop QA commented on HDFS-12420:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 14m 9s | trunk passed |
| +1 | compile | 0m 46s | trunk passed |
| +1 | checkstyle | 0m 41s | trunk passed |
| +1 | mvnsite | 0m 52s | trunk passed |
| +1 | findbugs | 1m 36s | trunk passed |
| +1 | javadoc | 0m 39s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 43s | the patch passed |
| +1 | javac | 0m 43s | the patch passed |
| -0 | checkstyle | 0m 36s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 6 new + 616 unchanged - 1 fixed = 622 total (was 617) |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 1m 44s | the patch passed |
| +1 | javadoc | 0m 38s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 88m 10s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 114m 0s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestGenericJournalConf |
| | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
| | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
| | hadoop.hdfs.qjournal.TestNNWithQJM |
| | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
| | hadoop.hdfs.TestRollingUpgrade |
| | hadoop.hdfs.TestRollingUpgradeRollback |
| | hadoop.hdfs.TestDFSInotifyEventInputStream |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
| | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
| | hadoop.hdfs.server.namenode.ha.TestStandbyInProgressTail |
| | hadoop.hdfs.TestCrcCorruption |
| | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain |
| | hadoop.hdfs.tools.TestDFSAdminWithHA |
| | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
| | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.TestWriteReadStripedFile |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12420 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12886992/HDFS-12420.03.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 7a5daedd93e7
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165515#comment-16165515 ]

Allen Wittenauer commented on HDFS-12420:

bq. current format functionality is broken itself. It deletes the metadata while doing nothing about the data stored in data-nodes.

Just like mkfs. And just like it, the fact that it doesn't delete the actual data is a feature, not a bug. If I restore the fsimage back then my data should come back too. (mostly... new data ofc is likely to be missing, etc) It's why making a copy of the fsimage is Hadoop Ops 101.

Some key advice I give to admins: you can try to prevent mistakes, but they'll still happen despite your best efforts. After low-hanging warnings, the energy is better spent on how to quickly recover. But that's a problem that's outside of the core code.

For the record, yes, I've made HUGE mistakes like this in my career. Every admin has. In my case, I brought down an entire hospital once. Even with that experience, I still think requiring metadata deletion outside of the tool set is way overkill.

bq. maybe being able to tag a cluster as "production" like discussed above is a better idea?

Yeah, sure, whatever. All that's going to happen is:

{code}
hdfs --config /tmp/mymodifiedconfig namenode -format -force
{code}

If a user is too lazy/impatient/distracted to check that they are on a live system before hitting y, they'll just change the flag and then format. But if that makes folks happy, fine. It still sounds like the console output needs some work though if a user couldn't "see" it. (Not sure I agree with that either, but whatever.)

BTW, a quick search for how the equivalent problem is solved in databases is interesting. Almost all of them that I looked at: don't give the user access. So yes, enough rope to hang themselves seems to be the expectation operationally.
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165484#comment-16165484 ]

Anu Engineer commented on HDFS-12420:

[~aw] Thanks for your comments.

bq. The argument here is the same as "newfs should fail if it detects a partition table. You'll need to dd onto the raw disk to wipe it out first". If you ask any experienced admin, 9/10 they're going to tell you that makes zero sense.

Makes sense; let us not proceed down this path. The only difference is that in the case of Hadoop the damage a command can do is multiplied by the number of data nodes. Having seen that accidental formats can happen, maybe being able to tag a cluster as "production" like discussed above is a better idea?
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165482#comment-16165482 ]

Ajay Kumar commented on HDFS-12420:

Hi [~aw],

What you said is true, but as [~arpitagarwal] has pointed out, current format functionality is broken itself. It deletes the metadata while doing nothing about the data stored in data-nodes.

We can keep the existing functionality as it is and add a new property to identify a prod cluster. By default this property will be set to non-prod. If someone marks their cluster as a prod cluster, then this can be an additional safeguard. This will maintain backward compatibility and hopefully will address your concerns as well.
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165444#comment-16165444 ]

Allen Wittenauer commented on HDFS-12420:

bq. cluster owner, who was visibly distressed.

Well sure. They screwed up. They can either own up to the fact they made a mistake and learn from it or try to push blame off onto someone or something else, like their vendor. Besides, who *doesn't* make a copy of the fsimage data on a regular basis? That's Hadoop Ops 101.

That said: there comes a point where it becomes impossible to protect every admin from every mistake they may possibly make. -format is the functional equivalent of newfs. The argument here is the same as "newfs should fail if it detects a partition table. You'll need to dd onto the raw disk to wipe it out first". If you ask any experienced admin, 9/10 they're going to tell you that makes zero sense. The same thing here.

The code specifically warns the user that they are about to delete live data. Could the messaging be improved? Sure, and that's probably what should be happening if users are confused enough to file this drastic overreaction. But the warning is there all the same. It is up to the user to act upon that information and determine whether it is safe or not to continue with the operation. If they blindly -force it, well, that's on them.

Users might remove data they need by always doing -skipTrash. So we should remove it, right? Of course not. One of the key principles of operations is that admins have enough rope to hang themselves. This is exactly the same case. In this instance, the admin did exactly that: hung themselves because they weren't careful.

bq. How can you delete the shared edits dir in journal nodes manually?

I'm really glad you asked that question because it's a key one. It's sort of ridiculous to have admins go hunt down where Hadoop might be stuffing metadata. Add in the complexity of HA and it is even more ludicrous.

bq. That said, if you have examples of automated deployments that will be broken by this change and that we haven't thought of, we can abandon the idea.

I have clients that do this on a regular basis. They regularly roll out small, short term clusters to external groups. Yes, this change will break them horribly.
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165177#comment-16165177 ]

Arpit Agarwal commented on HDFS-12420:

Allen, thanks for bringing up the automation concern. We certainly don't want to break any deployment scripts.

This patch will not break scripted deployment of new clusters since it eliminates the prompt completely.

Formatting clusters with pre-existing data was a bad idea in the first place. It deletes the NameNode metadata and leaves the cluster in an unusable state since DataNodes cannot connect anymore. I don't think any existing automation can depend on this behavior since it is functionally broken.
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165068#comment-16165068 ]

Vinayakumar B commented on HDFS-12420:

bq. In spite of the -force option or the prompt for Y/N, admins do make mistakes and end up losing data. In a real production cluster with real data, why would someone want to do a format? In dev/qa clusters, I can see the need for format.

Yes, I agree, an admin can make mistakes. In a real cluster the 'format' command (especially with -force) should be used with the utmost attention (same as 'rm -r' in Linux).

bq. Another option is to configure the cluster as "production" mode, where format will not be allowed. Dev/test clusters can be configured with 'dev' mode, where format is allowed.

Still, if you insist on disallowing format in 'production' clusters, this option looks good, provided the default value is set to 'dev' mode to keep the current 'prompt' behavior as is.
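The mode-based guard discussed in this comment could look roughly like the sketch below. It is a minimal illustration in Python, not Hadoop code: the property name `example.cluster.mode` and the function `format_allowed` are hypothetical, and refusing only when metadata already exists is an assumption taken from the issue title rather than from any patch.

```python
# Hypothetical property name used only for this sketch; it is not a real HDFS key.
MODE_KEY = "example.cluster.mode"


def format_allowed(conf, has_existing_metadata):
    """Decide whether a format request may continue to the usual prompt.

    conf is a plain dict standing in for the cluster configuration.
    The default mode is 'dev', so an unconfigured cluster keeps today's
    prompt-based behaviour.
    """
    mode = conf.get(MODE_KEY, "dev")
    # Assumption: a 'production' cluster refuses only when metadata exists,
    # so a brand-new production cluster can still be formatted once.
    if mode == "production" and has_existing_metadata:
        return False
    return True
```

With an empty configuration the guard never blocks, which is the backward-compatible default asked for in the thread; only an explicit `production` setting combined with existing metadata stops the format.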
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165043#comment-16165043 ]

Jitendra Nath Pandey commented on HDFS-12420:

In spite of the -force option or the prompt for Y/N, admins do make mistakes and end up losing data. In a real production cluster with real data, why would someone want to do a format? In dev/qa clusters, I can see the need for format.

Another option is to configure the cluster as "production" mode, where format will not be allowed. Dev/test clusters can be configured with 'dev' mode, where format is allowed.
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165029#comment-16165029 ]

Vinayakumar B commented on HDFS-12420:

bq. Don't we already have the y/n check when data exists? Why do we need another?

Yes. We do have the prompt, which is the exact line being removed in the patch: {{fsImage.confirmFormat(force, isInteractive)}}. A user can format the existing data by passing the -force flag or by answering 'y' to the prompt.

I too wanted to understand the real need for completely disabling format.

bq. If someone really wants to delete the complete fsImage, they can first delete the metadata dir

How can you delete the shared edits dir in journal nodes manually?

I think the current behavior of format works fine. The -force option should not be used too lightly.
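The confirmation flow described in this comment can be modelled as a small decision function. This is a hedged sketch in Python, not the Java implementation behind {{fsImage.confirmFormat(force, isInteractive)}}: the name `confirm_format`, the `ask` callback, the prompt text, and the choice to refuse when running non-interactively without -force are all assumptions made for illustration.

```python
def confirm_format(has_existing_data, force, is_interactive, ask=input):
    """Return True if the format should go ahead, False if it is aborted.

    ask is injectable so the prompt can be exercised without a terminal.
    """
    if not has_existing_data:
        # Nothing to lose, so no confirmation is needed.
        return True
    if force:
        # -force skips the prompt entirely.
        return True
    if not is_interactive:
        # Assumption: with no terminal and no -force, refuse rather than guess.
        return False
    answer = ask("Re-format filesystem? (Y or N) ")
    return answer.strip().lower() == "y"
```

The sketch makes the disagreement in the thread concrete: the patch under discussion removes this check in favour of always refusing when data exists, while the objection is that the `force` and `y` branches are what existing automation relies on.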
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165023#comment-16165023 ]

Anu Engineer commented on HDFS-12420:

bq. Don't we already have the y/n check when data exists? Why do we need another?

We do, but the fact that it is not very clear among lots of other text on the screen was pointed out by a cluster owner, who was visibly distressed. We are just trying to avoid losing data by operator mistake.

I thought that you might have a concern with automation; that is why I flagged it for your consideration. Let me try to understand that a bit more: do you think people automate formatting the clusters? If they do, then preventing accidental data loss is all the more important.

With an HDFS user hat on, I think this is a good improvement to have. I would expect HDFS to refuse to format a cluster with data. But with a sysadmin/developer hat on, I do like the fact that I can format a cluster with data. I do that when I test and develop. So in my mind, the question boils down to easier dev/ops cycles vs. user safety. The reason why this is filed for 3.0 is that it might be our last opportunity to make this change.

bq. Completely breaks automation. Automation MUST work.

I see that you are voting with the devops hat on, and I do not disagree. But this is a place where breaking the automation might avoid a disaster for some poor user. One more data point: this JIRA is based on real feedback from a real large cluster. I am not apologizing for sloppy operation but trying to understand what we can do to prevent a user from making such a mistake.

I am presuming (please correct me if I am wrong) that you are not objecting to the change or the intent per se, but more to the fact that we are outright refusing to format a cluster with Namenode metadata. Do you think adding a flag which says *-DothisIamReallySmart* would address the automation concern?
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165000#comment-16165000 ]

Allen Wittenauer commented on HDFS-12420:

The more I think about this, the more I'm -1:

Completely breaks automation. Automation MUST work.

bq. Let's also make the -force option a no-op. We can continue to accept it but it should have no effect and we should print a warning saying that the force option is being ignored.

This just makes it worse. HDFS-5138 was a disaster for automation when -finalize was made a no-op. See HDFS-8241 for the follow-up to clean it up.
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164970#comment-16164970 ]

Allen Wittenauer commented on HDFS-12420:

Don't we already have the y/n check when data exists? Why do we need another?
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164907#comment-16164907 ]

Anu Engineer commented on HDFS-12420:

[~aw] This does break backward compatibility, so I wanted to hear your thoughts on this. The reason why we are doing this is that people are capable of formatting clusters with data on them :). Just wondering how big of an issue this would be if we put it in 3.0? Appreciate any comments you might have.
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164067#comment-16164067 ]

Hadoop QA commented on HDFS-12420:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 14m 19s | trunk passed |
| +1 | compile | 0m 49s | trunk passed |
| +1 | checkstyle | 0m 38s | trunk passed |
| +1 | mvnsite | 0m 53s | trunk passed |
| +1 | findbugs | 1m 41s | trunk passed |
| +1 | javadoc | 0m 41s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 0s | the patch passed |
| +1 | compile | 1m 0s | the patch passed |
| +1 | javac | 1m 0s | the patch passed |
| -0 | checkstyle | 0m 40s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 184 unchanged - 1 fixed = 189 total (was 185) |
| +1 | mvnsite | 1m 3s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 2s | the patch passed |
| +1 | javadoc | 0m 38s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 95m 35s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 122m 53s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestClusterId |
| | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
| | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
| | hadoop.hdfs.server.namenode.TestGenericJournalConf |
| | hadoop.hdfs.qjournal.TestNNWithQJM |
| | hadoop.hdfs.server.namenode.TestAllowFormat |
| | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.server.namenode.TestNameEditsConfigs |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestDistributedFileSystem |
| | hadoop.hdfs.TestReplaceDatanodeOnFailure |
| | hadoop.hdfs.TestReconstructStripedFile |
| | hadoop.hdfs.TestLease |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12420 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12886764/HDFS-12420.02.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux d719cb91ece5 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f4b6267 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| checkstyle |
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163963#comment-16163963 ]

Ajay Kumar commented on HDFS-12420:

[~rushabh.shah], [~arpitagarwal], [~vagarychen] thanks for the review. Attaching a new patch with the suggested changes.

> Attachments: HDFS-12420.01.patch, HDFS-12420.02.patch
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163342#comment-16163342 ]

Chen Liang commented on HDFS-12420:

Thanks for taking care of this [~ajayydv]! Some of the failed tests seem related. For example, {{TestSaveNamespace#testTxIdPersistence}} seems to fail because it tries to formatName while the previous test has left data in the test dir, so the format is aborted. Consequently, the txid assertion that follows in this test also fails. We will need to delete the test directory content for certain tests.

> Attachments: HDFS-12420.01.patch
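The cleanup suggested above could look roughly like the following sketch. It is an illustration, not the change made in the patch: the helper name `TestDirCleanupSketch.deleteContents` is hypothetical, and it uses only `java.nio.file` so that it stands alone, whereas the real tests may well rely on an existing Hadoop utility instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical test helper; not taken from the HDFS-12420 patches. */
class TestDirCleanupSketch {

  /** Deletes everything under dir but keeps dir itself. */
  static void deleteContents(Path dir) throws IOException {
    if (!Files.isDirectory(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order visits children before their parent directories,
      // so each directory is already empty when it is deleted.
      paths.sorted(Comparator.reverseOrder())
          .filter(p -> !p.equals(dir))
          .forEach(p -> {
            try {
              Files.delete(p);
            } catch (IOException e) {
              throw new UncheckedIOException(e);
            }
          });
    }
  }
}
```

A test that formats the name directories would call such a helper in its setup, so that data left behind by an earlier test cannot cause the format to abort.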
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163283#comment-16163283 ]

Arpit Agarwal commented on HDFS-12420:

Thanks for this improvement [~ajayydv]. A couple of comments, in addition to the test case suggested by Rushabh:
# Let's also make the -force option a no-op. We can continue to accept it, but it should have no effect and we should print a warning saying that the force option is being ignored.
# Same with the -nonInteractive option.

> Attachments: HDFS-12420.01.patch
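One way to read the two suggestions above is sketched below. Everything here is an assumption made for illustration: `FormatOptionsSketch` and `ignoredOptionWarnings` are hypothetical names, the warning text is invented, and the case-insensitive matching is a guess at how the options would be compared. It is not the NameNode's actual argument parser.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of accepting, but ignoring, two format options. */
class FormatOptionsSketch {

  /**
   * Returns one warning per option that is still accepted on the command
   * line but no longer has any effect. Other arguments are left alone.
   */
  static List<String> ignoredOptionWarnings(String[] args) {
    List<String> warnings = new ArrayList<>();
    for (String arg : args) {
      // Assumption: option names are matched case-insensitively.
      if (arg.equalsIgnoreCase("-force")
          || arg.equalsIgnoreCase("-nonInteractive")) {
        warnings.add("WARNING: option " + arg
            + " is ignored and has no effect on format.");
      }
    }
    return warnings;
  }
}
```

Accepting the flags rather than rejecting them keeps existing scripts from failing on argument parsing, while the warning makes it visible that they no longer do anything.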
[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
[ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163071#comment-16163071 ]

Rushabh S Shah commented on HDFS-12420:

[~ajaykumar] Can you please write a test case for the new behavior?

> Attachments: HDFS-12420.01.patch