[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-14 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166676#comment-16166676
 ] 

Ajay Kumar commented on HDFS-12420:
---

The failed tests seem unrelated. The two tests below fail irrespective of the patch:
TestNameNodeMetrics
TestLeaseRecoveryStriped

All other tests passed when I tested them locally.

> Disable Namenode format when data already exists
> 
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ajay Kumar
> Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch, HDFS-12420.02.patch, 
> HDFS-12420.03.patch, HDFS-12420.04.patch
>
>
> Disable NameNode format to avoid accidental formatting of the NameNode in a 
> production cluster. If someone really wants to delete the complete fsImage, 
> they can first delete the metadata dir and then run 
> {code}hdfs namenode -format{code} manually.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165847#comment-16165847
 ] 

Hadoop QA commented on HDFS-12420:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 41s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 49s | trunk passed |
| +1 | compile | 1m 0s | trunk passed |
| +1 | checkstyle | 0m 46s | trunk passed |
| +1 | mvnsite | 1m 8s | trunk passed |
| +1 | findbugs | 1m 55s | trunk passed |
| +1 | javadoc | 0m 43s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 1s | the patch passed |
| +1 | compile | 0m 57s | the patch passed |
| +1 | javac | 0m 57s | the patch passed |
| +1 | checkstyle | 0m 42s | the patch passed |
| +1 | mvnsite | 1m 2s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 4s | the patch passed |
| +1 | javadoc | 0m 41s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 128m 14s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 158m 35s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
|   | hadoop.hdfs.web.TestWebHDFSXAttr |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.web.TestWebHDFSAcl |
|   | hadoop.hdfs.web.TestHttpsFileSystem |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12420 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887028/HDFS-12420.04.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux 4543e7d6009c 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e0b3c64 |
| Default Java | 1.8.0_144 |
| 

[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165628#comment-16165628
 ] 

Hadoop QA commented on HDFS-12420:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 14m 9s | trunk passed |
| +1 | compile | 0m 46s | trunk passed |
| +1 | checkstyle | 0m 41s | trunk passed |
| +1 | mvnsite | 0m 52s | trunk passed |
| +1 | findbugs | 1m 36s | trunk passed |
| +1 | javadoc | 0m 39s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 43s | the patch passed |
| +1 | javac | 0m 43s | the patch passed |
| -0 | checkstyle | 0m 36s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 6 new + 616 unchanged - 1 fixed = 622 total (was 617) |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 1m 44s | the patch passed |
| +1 | javadoc | 0m 38s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 88m 10s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 114m 0s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestGenericJournalConf |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.qjournal.TestNNWithQJM |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.TestRollingUpgradeRollback |
|   | hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyInProgressTail |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain |
|   | hadoop.hdfs.tools.TestDFSAdminWithHA |
|   | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestWriteReadStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12420 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12886992/HDFS-12420.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux 7a5daedd93e7 

[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165515#comment-16165515
 ] 

Allen Wittenauer commented on HDFS-12420:
-

bq. current format functionality is broken itself. It deletes the metadata 
while doing nothing about the data stored in data-nodes. 

Just like mkfs.  And just like it, the fact that it doesn't delete the actual 
data is a feature, not a bug.  If I restore the fsimage back then my data 
should come back too.  (mostly... new data ofc is likely to be missing, etc) 
It's why making a copy of the fsimage is Hadoop Ops 101. 

Some key advice I give to admins:  you can try to prevent mistakes, but they'll 
still happen despite your best efforts.  After low hanging warnings, the energy 
is better spent on how to quickly recover. But that's a problem that's outside 
of the core code.

For the record, yes, I've made HUGE mistakes like this in my career.  Every 
admin has. In my case, I brought down an entire hospital once.  Even with that 
experience, I still think requiring metadata deletion outside of the tool set 
is way overkill.

bq. may be being able to tag a cluster as "production" like discussed above is 
a better idea?

Yeah, sure, whatever.  All that's going to happen is:

{code}
hdfs --config /tmp/mymodifiedconfig namenode -format -force
{code}

If a user is too lazy/impatient/distracted to check that they are on a live 
system before hitting y, they'll just change the flag and then format.  But if 
that makes folks happy, fine.  It still sounds like the console output needs 
some work though if a user couldn't "see" it.  (Not sure I agree with that 
either, but whatever.)

BTW, a quick search for how the equivalent problem is solved in databases is 
interesting. Almost all of them that I looked at: don't give the user access. 
So yes, enough rope to hang themselves seems to be the expectation 
operationally.



[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165484#comment-16165484
 ] 

Anu Engineer commented on HDFS-12420:
-

[~aw] Thanks for your comments.

bq. The argument here is the same as "newfs should fail if it detects a 
partition table. You'll need to dd onto the raw disk to wipe it out first". If 
you ask any experienced admin, 9/10 they're going to tell you that makes zero 
sense.

Makes sense; let us not proceed down this path. The only difference is that in 
the case of Hadoop, the damage that a command can do is multiplied by the number 
of data nodes.

Having seen that accidental formats can happen, maybe being able to tag a 
cluster as "production", as discussed above, is a better idea?





[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165482#comment-16165482
 ] 

Ajay Kumar commented on HDFS-12420:
---

Hi [~aw], what you said is true, but as [~arpitagarwal] has pointed out, the 
current format functionality is itself broken. It deletes the metadata while 
doing nothing about the data stored on the DataNodes. 
We can keep the existing functionality as it is and add a new property to 
identify a prod cluster. By default this property will be set to non-prod. If 
someone marks their cluster as a prod cluster, then this can be an additional 
safeguard. This will maintain backward compatibility and hopefully will 
address your concerns as well. 
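
To make the proposal concrete, here is a minimal sketch of such a safeguard. It is only an illustration under stated assumptions, not the patch itself: the class `ProdFormatGuardSketch`, the method `formatAllowed`, and the property name `sketch.cluster.production` are all hypothetical, and the sketch assumes that "safeguard" means refusing to format when metadata already exists on a cluster marked as prod.

```java
import java.util.Properties;

// Illustrative sketch only; none of these names exist in Hadoop.
class ProdFormatGuardSketch {

    // Hypothetical property name, chosen just for this example.
    static final String PROD_KEY = "sketch.cluster.production";

    /**
     * Returns true if a format may go ahead.
     * Assumption: a prod cluster with existing metadata is never formatted;
     * every other combination keeps today's behavior.
     */
    static boolean formatAllowed(Properties conf, boolean dataExists) {
        // Defaults to non-prod, so existing deployments are unaffected.
        boolean isProd =
            Boolean.parseBoolean(conf.getProperty(PROD_KEY, "false"));
        return !(isProd && dataExists);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(formatAllowed(conf, true));   // default: non-prod

        conf.setProperty(PROD_KEY, "true");
        System.out.println(formatAllowed(conf, true));   // prod, data present
        System.out.println(formatAllowed(conf, false));  // prod, empty dirs
    }
}
```

Because the default is non-prod, scripted deployments that never set the property would see no change in behavior under this sketch.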



[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165444#comment-16165444
 ] 

Allen Wittenauer commented on HDFS-12420:
-

bq.  cluster owner, who was visibly distressed. 

Well sure. They screwed up.  They can either own up to the fact they made a 
mistake and learn from it or try to push blame off onto someone or something 
else, like their vendor.  Besides, who *doesn't* make a copy of the fsimage 
data on a regular basis?  That's Hadoop Ops 101.

That said: there comes a point where it becomes impossible to protect every 
admin from every mistake they may possibly make.

-format is the functional equivalent of newfs.  The argument here is the same 
as "newfs should fail if it detects a partition table.  You'll need to dd onto 
the raw disk to wipe it out first".  If you ask any experienced admin, 9/10 
they're going to tell you that makes zero sense.

The same thing here.  The code specifically warns the user that they are about 
to delete live data.  Could the messaging be improved? Sure and that's probably 
what should be happening if users are confused enough to file this drastic 
overreaction.  But the warning is there all the same.  It is up to the user to 
act upon that information and determine whether it is safe to continue with the 
operation.  If they blindly -force it, well, that's on them.  Users might 
remove data they need by always doing -skipTrash.  So we should remove it, 
right?  Of course not.

One of the key principles of operations is that admins have enough rope to hang 
themselves.  This is exactly the same case.  In this instance, the admin did 
exactly that: hung themselves because they weren't careful.

bq. How you can delete the shared edits dir in journal nodes manually?

I'm really glad you asked that question because it's a key one. It's sort of 
ridiculous to have admins go hunt down where Hadoop might be stuffing metadata. 
 Add in the complexity of HA and it is even more ludicrous.

bq. That said, if you have examples of automated deployments that will be 
broken by this change and that we haven't thought of, we can abandon the idea.

I have clients that do this on a regular basis. They regularly roll out small, 
short term clusters to external groups. Yes, this change will break them 
horribly.  




[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165177#comment-16165177
 ] 

Arpit Agarwal commented on HDFS-12420:
--

Allen, thanks for bringing up the automation concern. We certainly don't want 
to break any deployment scripts. This patch will not break scripted deployment 
of new clusters since it eliminates the prompt completely.

Formatting clusters with pre-existing data was a bad idea in the first place. 
It deletes the NameNode metadata and leaves the cluster in an unusable state 
since DataNodes cannot connect anymore. I don't think any existing automation 
can depend on this behavior since it is functionally broken.



[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165068#comment-16165068
 ] 

Vinayakumar B commented on HDFS-12420:
--

bq. In spite of the -force option or the prompt for Y/N, admins do make 
mistakes and end up losing data. In a real production cluster with real data, 
why would someone want to do a format? In dev/qa clusters, I can see the need 
for format.
Yes, I agree, an admin can make mistakes. In a real cluster, the 'format' command 
(especially with -force) should be used with the utmost attention (same as 'rm -r' 
in Linux).
bq.  Another option is to configure the cluster as "production" mode, where 
format will not be allowed. Dev/test clusters can be configured with 'dev' 
mode, where format is allowed. 
Still, if you insist on disallowing format in 'production' clusters, this option 
looks good, provided the default value is set to 'dev' mode to keep the current 
'prompt' behavior as is.



[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165043#comment-16165043
 ] 

Jitendra Nath Pandey commented on HDFS-12420:
-

In spite of the -force option or the prompt for Y/N, admins do make mistakes 
and end up losing data. In a real production cluster with real data, why would 
someone want to do a format? In dev/qa clusters, I can see the need for format. 
Another option is to configure the cluster as "production" mode, where format 
will not be allowed. Dev/test clusters can be configured with 'dev' mode, where 
format is allowed. 



[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165029#comment-16165029
 ] 

Vinayakumar B commented on HDFS-12420:
--

bq. Don't we already have the y/n check when data exists? Why do we need 
another?
Yes. We do have the prompt, which is the exact line being removed in the patch: 
{{fsImage.confirmFormat(force, isInteractive)}}.
A user can format the existing data by passing the -force flag or by answering 
'y' to the prompt.
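
For readers following along, the decision just described can be sketched roughly as below. This is a simplified illustration and not the Hadoop source: the class `FormatPromptSketch`, the method `shouldProceed`, and its parameters are hypothetical names, and the handling of the non-interactive case and of which answers count as "yes" are assumptions.

```java
// Simplified sketch of the confirmation behavior described above.
// This is NOT the Hadoop source; every name here is hypothetical.
class FormatPromptSketch {

    /**
     * Decides whether a format may overwrite a storage directory.
     *
     * @param dataExists  true if NameNode metadata is already present
     * @param force       true if the -force flag was passed
     * @param interactive true if the user can be prompted
     * @param answer      the user's reply to the prompt (may be null)
     */
    static boolean shouldProceed(boolean dataExists, boolean force,
                                 boolean interactive, String answer) {
        if (!dataExists) {
            return true;   // nothing would be overwritten
        }
        if (force) {
            return true;   // -force skips the prompt entirely
        }
        if (!interactive) {
            return false;  // assumption: no prompt possible, so refuse
        }
        // Assumption: only "y" (in any case) counts as confirmation.
        return answer != null && answer.trim().equalsIgnoreCase("y");
    }

    public static void main(String[] args) {
        System.out.println(shouldProceed(true, false, true, "y"));
        System.out.println(shouldProceed(true, false, true, "n"));
        System.out.println(shouldProceed(true, true, false, null));
    }
}
```

The point of the sketch is that existing data is only ever overwritten after an explicit signal from the user, either the flag or the typed answer.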

I too wanted to understand the real need for completely disabling format.
bq. If someone really wants to delete the complete fsImage, they can first 
delete the metadata dir
How can you delete the shared edits dir in journal nodes manually?

I think the current behavior of format works fine. The -force option should not 
be used too lightly.



[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165023#comment-16165023
 ] 

Anu Engineer commented on HDFS-12420:
-

bq. Don't we already have the y/n check when data exists? Why do we need 
another? 
We do, but a cluster owner, who was visibly distressed, pointed out that it is 
not very clear among all the other text on the screen. 

We are just trying to avoid losing data by operator mistake. I thought that you 
might have a concern with automation, which is why I flagged it for your 
consideration. Let me try to understand that a bit more: do you think people 
automate formatting the clusters? If they do, then preventing accidental data 
loss is all the more important.

With an HDFS user hat on, I think this is a good improvement to have. I would 
expect HDFS to refuse to format a cluster with data. But with a 
sysadmin/developer hat on, I do like the fact that I can format a cluster 
with data. I do that when I test and develop. 

So in my mind, the question boils down to easier dev/ops cycles vs. user 
safety. The reason why this is filed for 3.0 is that it might be our last 
opportunity to make this change.

bq. Completely breaks automation. Automation MUST work. 
I see that you are voting with the devops hat on, and I do not disagree. But 
this is a place where breaking the automation might avoid a disaster for some 
poor user. One more data point, this JIRA is based on real feedback from a real 
large cluster.  I am not apologizing for sloppy operation but trying to 
understand what we can do to prevent a user from making such a mistake.

I am presuming (please correct me if I am wrong) that you are not objecting to 
the change or the intent per se, but rather to the fact that we are outright 
refusing to format a cluster with Namenode metadata. Do you think adding a flag 
which says *-DothisIamReallySmart* would address the automation concern?





[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165000#comment-16165000
 ] 

Allen Wittenauer commented on HDFS-12420:
-

The more I think about this, the more I'm -1:

Completely breaks automation. Automation MUST work.  

bq. Let's also make the -force option a no-op. We can continue to accept it but 
it should have no effect and we should print a warning saying that the force 
option is being ignored.

This just makes it worse.  HDFS-5138 was a disaster for automation when 
-finalize was made a no-op.  See HDFS-8241 for the follow-up to clean it up.  





[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164970#comment-16164970
 ] 

Allen Wittenauer commented on HDFS-12420:
-

Don't we already have the y/n check when data exists?  Why do we need another?

> Disable Namenode format when data already exists
> 
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch, HDFS-12420.02.patch
>
>
> Disable NameNode format to avoid accidental formatting of Namenode in 
> production cluster. If someone really wants to delete the complete fsImage, 
> they can first delete the metadata dir and then run {code} hdfs namenode 
> -format{code} manually.






[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-13 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164907#comment-16164907
 ] 

Anu Engineer commented on HDFS-12420:
-

[~aw] This does break backward compatibility, so I wanted to hear your thoughts 
on this. 
The reason we are doing this is that people are capable of formatting 
clusters with data on them :). How big of an issue would it be 
if we put this in 3.0? I'd appreciate any comments you might have.


> Disable Namenode format when data already exists
> 
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch, HDFS-12420.02.patch
>
>
> Disable NameNode format to avoid accidental formatting of Namenode in 
> production cluster. If someone really wants to delete the complete fsImage, 
> they can first delete the metadata dir and then run {code} hdfs namenode 
> -format{code} manually.






[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164067#comment-16164067
 ] 

Hadoop QA commented on HDFS-12420:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 184 unchanged - 1 fixed = 189 total (was 185) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 35s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 53s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestClusterId |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.namenode.TestGenericJournalConf |
|   | hadoop.hdfs.qjournal.TestNNWithQJM |
|   | hadoop.hdfs.server.namenode.TestAllowFormat |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.server.namenode.TestNameEditsConfigs |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestLease |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12420 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12886764/HDFS-12420.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux d719cb91ece5 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f4b6267 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-12 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163963#comment-16163963
 ] 

Ajay Kumar commented on HDFS-12420:
---

[~rushabh.shah], [~arpitagarwal], [~vagarychen] thanks for the review. Attaching 
a new patch with the suggested changes.

> Disable Namenode format when data already exists
> 
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch, HDFS-12420.02.patch
>
>
> Disable NameNode format to avoid accidental formatting of Namenode in 
> production cluster. If someone really wants to delete the complete fsImage, 
> they can first delete the metadata dir and then run {code} hdfs namenode 
> -format{code} manually.






[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-12 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163342#comment-16163342
 ] 

Chen Liang commented on HDFS-12420:
---

Thanks for taking care of this [~ajayydv]!

Some of the failed tests seem related. For example, 
{{TestSaveNamespace#testTxIdPersistence}} seems to fail because it tries to 
formatName, but the previous test left data in the test dir, so the format is 
aborted. Consequently, the subsequent txid assertion in this test also fails. 
We will need to delete the test directory contents for certain tests.
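
The failure mode described above can be reproduced in miniature. The following is a minimal Python sketch rather than Hadoop test code: {{format_name_dir}}, {{reset_dir}}, {{demo}}, and the {{VERSION}} marker file are hypothetical stand-ins, and the "refuse to format a non-empty directory" rule is an assumption about the patched behavior.

```python
import shutil
import tempfile
from pathlib import Path


def format_name_dir(name_dir: Path) -> bool:
    """Hypothetical format step that refuses to touch a non-empty directory."""
    if name_dir.exists() and any(name_dir.iterdir()):
        return False  # aborted: data left behind by an earlier test
    name_dir.mkdir(parents=True, exist_ok=True)
    (name_dir / "VERSION").write_text("formatted\n")  # placeholder marker file
    return True


def reset_dir(name_dir: Path) -> None:
    """Delete the directory tree so the next format starts from a clean state."""
    shutil.rmtree(name_dir, ignore_errors=True)


def demo() -> list:
    """Run format twice on a shared dir, then clean up and run it again."""
    base = Path(tempfile.mkdtemp())
    name_dir = base / "name"
    results = [format_name_dir(name_dir)]      # first test: clean directory
    results.append(format_name_dir(name_dir))  # second test: leftovers present
    reset_dir(name_dir)                        # per-test cleanup
    results.append(format_name_dir(name_dir))  # format works again
    shutil.rmtree(base, ignore_errors=True)
    return results
```

In this model the second format is refused until the cleanup runs, which is why tests sharing a directory would need a reset step before formatting.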

> Disable Namenode format when data already exists
> 
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch
>
>
> Disable NameNode format to avoid accidental formatting of Namenode in 
> production cluster. If someone really wants to delete the complete fsImage, 
> they can first delete the metadata dir and then run {code} hdfs namenode 
> -format{code} manually.






[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163283#comment-16163283
 ] 

Arpit Agarwal commented on HDFS-12420:
--

Thanks for this improvement [~ajayydv]. A couple of comments, in addition to the 
test case suggested by Rushabh:
# Let's also make the -force option a no-op. We can continue to accept it but 
it should have no effect and we should print a warning saying that the force 
option is being ignored.
# Same with the -nonInteractive option.
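
The two suggestions above can be sketched as follows. This is an illustrative Python model, not the patch itself: {{plan_format}}, its return values, and the warning text are hypothetical, and it assumes the patched rule is "abort whenever the name directory already holds data".

```python
def plan_format(dir_has_data, force=False, non_interactive=False):
    """Return (action, warnings) for a hypothetical patched format command."""
    warnings = []
    # Both flags are still accepted so old command lines keep parsing,
    # but they no longer influence the decision.
    if force:
        warnings.append("-force is ignored")
    if non_interactive:
        warnings.append("-nonInteractive is ignored")
    # Assumed rule from the proposal: never format over existing data.
    action = "abort" if dir_has_data else "format"
    return action, warnings
```

Note that under this model a script passing {{-force}} against a populated directory gets an abort plus a warning instead of a format.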

> Disable Namenode format when data already exists
> 
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch
>
>
> Disable NameNode format to avoid accidental formatting of Namenode in 
> production cluster. If someone really wants to delete the complete fsImage, 
> they can first delete the metadata dir and then run {code} hdfs namenode 
> -format{code} manually.






[jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists

2017-09-12 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163071#comment-16163071
 ] 

Rushabh S Shah commented on HDFS-12420:
---

[~ajaykumar] Can you please write a test case for the new behavior?

> Disable Namenode format when data already exists
> 
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch
>
>
> Disable NameNode format to avoid accidental formatting of Namenode in 
> production cluster. If someone really wants to delete the complete fsImage, 
> they can first delete the metadata dir and then run {code} hdfs namenode 
> -format{code} manually.


