[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-11 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048215#comment-13048215
 ] 

Todd Lipcon commented on HDFS-988:
--

+1 on the trunk patch, once you've run the full test suite through jcarder 
(with the "lockclasses" branch that detects rwlock issues). Also looks like it 
needs a rebase

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: 988-fixups.txt, HDFS-988_fix_synchs.patch, 
> hdfs-988-2.patch, hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, 
> hdfs-988-6.patch, hdfs-988-b22-1.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-11 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048213#comment-13048213
 ] 

Todd Lipcon commented on HDFS-988:
--

+1 on the 0.22 patch

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: 988-fixups.txt, HDFS-988_fix_synchs.patch, 
> hdfs-988-2.patch, hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, 
> hdfs-988-6.patch, hdfs-988-b22-1.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046397#comment-13046397
 ] 

Hadoop QA commented on HDFS-988:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12481892/hdfs-988-6.patch
  against trunk revision 1133476.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 33 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.TestHDFSTrash

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/747//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/747//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/747//console

This message is automatically generated.

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: 988-fixups.txt, HDFS-988_fix_synchs.patch, 
> hdfs-988-2.patch, hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, 
> hdfs-988-6.patch, hdfs-988-b22-1.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046389#comment-13046389
 ] 

Hadoop QA commented on HDFS-988:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12481892/hdfs-988-6.patch
  against trunk revision 1133476.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 33 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/746//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/746//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/746//console

This message is automatically generated.

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: 988-fixups.txt, HDFS-988_fix_synchs.patch, 
> hdfs-988-2.patch, hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, 
> hdfs-988-6.patch, hdfs-988-b22-1.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-09 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046369#comment-13046369
 ] 

Eli Collins commented on HDFS-988:
--

Btw test-patch is +1 for hdfs-988-b22-1.patch on branch 22 and all the tests 
pass. I think it's good to go, mind reviewing it?  It's scope is limited to 
just the issue in the description (vs the latest patch for trunk).  

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: 988-fixups.txt, HDFS-988_fix_synchs.patch, 
> hdfs-988-2.patch, hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, 
> hdfs-988-6.patch, hdfs-988-b22-1.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042615#comment-13042615
 ] 

Hadoop QA commented on HDFS-988:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12481197/988-fixups.txt
  against trunk revision 1130381.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/684//console

This message is automatically generated.

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: 988-fixups.txt, HDFS-988_fix_synchs.patch, 
> hdfs-988-2.patch, hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, 
> hdfs-988-b22-1.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042600#comment-13042600
 ] 

Hadoop QA commented on HDFS-988:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12481173/hdfs-988-5.patch
  against trunk revision 1130339.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  
org.apache.hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens
  org.apache.hadoop.hdfs.server.namenode.TestCheckpoint
  org.apache.hadoop.hdfs.server.namenode.TestEditLogRace
  org.apache.hadoop.hdfs.server.namenode.TestParallelImageWrite
  org.apache.hadoop.hdfs.server.namenode.TestSaveNamespace
  org.apache.hadoop.hdfs.server.namenode.TestStartup
  org.apache.hadoop.hdfs.TestDFSFinalize
  org.apache.hadoop.hdfs.TestDFSRollback
  org.apache.hadoop.hdfs.TestDFSStartupVersions
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
  org.apache.hadoop.hdfs.TestDFSUpgrade
  org.apache.hadoop.hdfs.TestListFilesInDFS
  org.apache.hadoop.hdfs.TestListFilesInFileContext
  
org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/679//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/679//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/679//console

This message is automatically generated.

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, hdfs-988-b22-1.patch, 
> hdfs-988.txt, saveNamespace.txt, saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042591#comment-13042591
 ] 

Hadoop QA commented on HDFS-988:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12481191/hdfs-988-b22-1.patch
  against trunk revision 1130381.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/683//console

This message is automatically generated.

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988-5.patch, hdfs-988-b22-1.patch, 
> hdfs-988.txt, saveNamespace.txt, saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-01 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042445#comment-13042445
 ] 

Eli Collins commented on HDFS-988:
--

ELOS#flush calls ELFOS#flushAndSync which does a force on the underlying file 
channel.

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-01 Thread Bharath Mundlapudi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042432#comment-13042432
 ] 

Bharath Mundlapudi commented on HDFS-988:
-

I am just wondering, if we are calling os sync at all on this code path. All i 
see is flush call which flushes from EditLogOutputStream (java buffers) to 
kernel buffers.  

Shouldn't we be doing the following?

eStream.flush();
eStream.getFileOutputStream().getFD().sync();

This will make sure the edits are actually written to disk. Is there any reason 
for not doing this? 


> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-06-01 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042351#comment-13042351
 ] 

Eli Collins commented on HDFS-988:
--

It looks like most of the unprotected* methods take the rwlock, but don't need 
to because either because their caller takes the lock or they are called from 
loading the edit log (which is why we originally had unprotected versions). Do 
people mind if I fix that up (remove the locking from these methods, make sure 
the unprotected versions are only called when loading the log) in this change 
or do people want that done in a separate change?

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-05-31 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041962#comment-13041962
 ] 

dhruba borthakur commented on HDFS-988:
---

we should not logSync after a nextGenerationStamp() because it is invoked for 
every file creation, isn't it? and before the file-create call returns from the 
NN, we do invoke logSync(), so we shud be safe.

The other time we invoke nextGenerationStamp() is for pipeline error recovery, 
and that code path should also be ok, let me think.

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log, apparently due to race conditions

2011-05-31 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041934#comment-13041934
 ] 

Todd Lipcon commented on HDFS-988:
--

Didn't go through the new tests yet, but here are some comments from a first 
pass through FSN:

- checks for if (auditLog.isInfoEnabled()) should probably now be 
(auditLog.isInfoEnabled() && isExternalInvocation()) -- otherwise we're doing a 
needless directory traversal for fsck
- The following methods currently do logSync() while holding the writeLock, 
which is expensive:
-- setPermission 
-- setOwner
-- commitBlockSynchronization (in some exit paths)
-- updatePipeline
- These methods should probably just set a local boolean within the 
synchronized section, then logSync() in the finally clause if it's flagged

- seems strange that some of the xInternal() methods take the write lock 
themselves (eg setReplicationInternal) whereas others assume the caller takes 
the write lock (eg createSymlinkInternal). We should be consistent

- for those methods that don't explicitly take the write lock, we should either 
add an {{assert hasWriteLock()}} or a comment explaining why the lock is not 
necessary (eg internalReleaseLease, reassignLease, 
finalizeINodeFileUnderConstruction)

- why doesn't getListing need the read lock?
- comment for endCheckpoint says "not started" but should say "not ended"
- same with updatePipeline

- I noticed that nextGenerationStamp() doesn't logSync() -- that seems 
dangerous, since after a restart we might hand out a duplicate genstamp.

> saveNamespace can corrupt edits log, apparently due to race conditions
> --
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988-3.patch, hdfs-988-4.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log

2011-05-18 Thread Matt Foley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035690#comment-13035690
 ] 

Matt Foley commented on HDFS-988:
-

Out of the seven test failures, the only one that might have to do with this 
patch is
* org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage.testInjection 
But I think it's unlikely.

In case it wasn't clear, I'm offering this patch file as a possibly useful 
portion of the solution for this bug, not as a solution in its own right.  Feel 
free to incorporate all or parts of it.  Or not. :-)

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988.txt, saveNamespace.txt, saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log

2011-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035653#comment-13035653
 ] 

Hadoop QA commented on HDFS-988:


-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12479629/HDFS-988_fix_synchs.patch
  against trunk revision 1124364.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage
  org.apache.hadoop.tools.TestJMXGet

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/559//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/559//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/559//console

This message is automatically generated.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: HDFS-988_fix_synchs.patch, hdfs-988-2.patch, 
> hdfs-988.txt, saveNamespace.txt, saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log

2011-05-18 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035233#comment-13035233
 ] 

dhruba borthakur commented on HDFS-988:
---

yes, they should all be using RWLock, absolutely.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: hdfs-988-2.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log

2011-05-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035201#comment-13035201
 ] 

Todd Lipcon commented on HDFS-988:
--

Looking at trunk I see a lot of other strange synchronization issues. The 
following methods all are synchronized on the FSNamesystem instance:
- getNamespaceInfo 
- setQuota 
- renewLease
- isInSafeMode
- isInStartupSafeMode
- isPopulatingReplQueues
- nextGenerationStamp

I think all of these should probably be using the new rwlock... Dhruba, what do 
you think?

Maybe we need something more like a stress/fuzz test against FSNamesystem 
rather than trying to target the specific cases mentioned above?


> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: hdfs-988-2.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log

2011-05-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029803#comment-13029803
 ] 

Hadoop QA commented on HDFS-988:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478366/hdfs-988-2.patch
  against trunk revision 1100054.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/462//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/462//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/462//console

This message is automatically generated.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: hdfs-988-2.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log

2011-05-05 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029783#comment-13029783
 ] 

dhruba borthakur commented on HDFS-988:
---

1. we should not throw an exception is unable to update accesstime because NN 
is in safemode. This will prevent a adminstrator from running read-only 
operations (dfs -cat) on a cluster that is just starting up and is still in 
safemode. It shud just log a warning message and continue

2. Unit test: Create a cluster, open a file for write and and put cluster in 
safemode. verify that getAdditionalBlock, close, commitBlockSync etc fails when 
NN is in safemode 

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: hdfs-988-2.patch, hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-988) saveNamespace can corrupt edits log

2011-04-09 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017997#comment-13017997
 ] 

Nigel Daley commented on HDFS-988:
--

Todd, any update on this for 0.22?

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append, 0.22.0
>
> Attachments: hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-07-01 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884403#action_12884403
 ] 

dhruba borthakur commented on HDFS-988:
---

It is not a easily repeatable scenario: it is a race condition.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append
>
> Attachments: hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-07-01 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884380#action_12884380
 ] 

Brahma Reddy Battula commented on HDFS-988:
---

hi dhruba borthakur ,

I am using the 20.1 version..
I put NameNode in safemode and then i executed the save namespace,,But editlogs 
are not corrupted..
can u please give exact scenario to reproduce this.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append
>
> Attachments: hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-06-19 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880597#action_12880597
 ] 

dhruba borthakur commented on HDFS-988:
---

hi todd, would appreciate it if u can write some sort of a unit test for this 
one, that will help getting this into trunk.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20-append, 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.20-append
>
> Attachments: hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-06-08 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876798#action_12876798
 ] 

Todd Lipcon commented on HDFS-988:
--

Nicolas: by seqnum you're referring to the generation stamp on the block being 
written, as the client continues to initiate recovery during shutdown? Do we 
also need to pull in HDFS-1145 then? (I have this patch applied on my append 
branch, but not 1145, and not seen errors, just seems to make sense)

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0, 0.22.0, 0.20-append
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-988.txt, saveNamespace.txt, 
> saveNamespace_20-append.patch
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-05-11 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866081#action_12866081
 ] 

dhruba borthakur commented on HDFS-988:
---

Code looks excellent. can we add a unit test that does the following:

* Create a cluster, open a file for write and and put in safemode. verify that 
getAdditionalBlock, close etc fails when NN is in safemode



> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-988.txt, saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866070#action_12866070
 ] 

Todd Lipcon commented on HDFS-988:
--

dhruba, can you take a look at this blocker please?

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-988.txt, saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-04-30 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862838#action_12862838
 ] 

Todd Lipcon commented on HDFS-988:
--

The failed test is unrelated (been failing all patch builds lately)

Can anyone suggest a unit test for this issue? I can't think of any to add 
since functionality wasn't changed, we just cleaned up some potential races.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-988.txt, saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861756#action_12861756
 ] 

Hadoop QA commented on HDFS-988:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12443047/hdfs-988.txt
  against trunk revision 938791.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/331/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/331/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/331/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/331/console

This message is automatically generated.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-988.txt, saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-04-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860918#action_12860918
 ] 

Todd Lipcon commented on HDFS-988:
--

Hey Dhruba, yep, I am planning on doing this this week so it gets in the 
release. Also, I agree re commitBlockSynchronization, though I think we should 
probably add a couple more tests in this area.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
> Attachments: saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-04-26 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860870#action_12860870
 ] 

dhruba borthakur commented on HDFS-988:
---

Hi todd, is it possible to make this one into the re-release of 0.21?

I do not think we should allow FSNamesystem.commitBlockSynchronization() in 
safe mode. Do you visualize a case when this could cause a problem? I like it 
safer when I can say that no new transactions can make it to the edits log when 
the namenode is in safemode.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
> Attachments: saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-04-14 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856961#action_12856961
 ] 

Todd Lipcon commented on HDFS-988:
--

Once HDFS-909 is committed I'll rebase this on trunk and take care of 
Konstantin's review comments

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
> Attachments: saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-02-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837060#action_12837060
 ] 

Konstantin Shvachko commented on HDFS-988:
--

# {{enterSafeMode()}} looks good.
# {{FSNamesystem.getAdditionalBlock()}} checking {{isInSafeMode()}} should be 
before calling {{chooseTargets()}}. I would not change {{getAdditionalBlock()}} 
at all.
# I think it is fine to {{FSNamesystem.commitBlockSynchronization()}} in safe 
mode. In any case it probably deserves a separate issue and discussion.
# {{renewLease()}} shouldn't be under FSNamesystem lock? {{leaseManeger}} has 
its own lock.
# Your changes to permission methods incorporate HDFS-133.
# Dhruba, are you going to promote changes to trunk and 0.21?

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
> Attachments: saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-02-19 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836021#action_12836021
 ] 

Todd Lipcon commented on HDFS-988:
--

Hi Dhruba,

I still think we should fix this in the other issues and then backport to 20. 
But I'll do a review of this patch here since you've already uploaded it:

- in setPermission, the audit logging has moved outside the synchronized block. 
Thus dir.getFileInfo may actually return incorrect info (or even fail if it 
races with someone deleting the file)
- same goes for setOwner
- I think it's OK, but can you verify that the top synchronized block in 
getAdditionalBlock can never have side effects? I don't know the lease 
management code well enough - checkLease is guaranteed side-effect free?



> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
> Attachments: saveNamespace.txt
>
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-02-18 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835488#action_12835488
 ] 

Todd Lipcon commented on HDFS-988:
--

HDFS-909's fixes cover two issues. One of the two is in 20, and other doesn't 
appear to be. But we may as well just have one JIRA number. The patch to 20 
will just be a subset of the changes from trunk - no sense in diverging fix 
implementations.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-02-18 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835486#action_12835486
 ] 

dhruba borthakur commented on HDFS-988:
---

I agree that HDFS-956 could be fixed. But HDFS-909 does not apply to 0.20, 
where this one does, isn't it?

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-02-18 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835472#action_12835472
 ] 

Todd Lipcon commented on HDFS-988:
--

Does this need to be a separate JIRA from HDFS-909? This code is covered by 
that patch.

Also I think HDFS-956 needs to be fixed for this to truly catch all cases.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-02-18 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835406#action_12835406
 ] 

dhruba borthakur commented on HDFS-988:
---

I think the wait time is theoretically unbounded, but waiting for 5 minutes 
seem like a pretty fool-proof idea to me.

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-02-18 Thread Brian Bockelman (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835402#action_12835402
 ] 

Brian Bockelman commented on HDFS-988:
--

How long can logSyncs be pending for?  Is this corruption still possible if the 
sysadmin waits, say, 5 minutes?

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-988) saveNamespace can corrupt edits log

2010-02-18 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835403#action_12835403
 ] 

dhruba borthakur commented on HDFS-988:
---

My proposal is to make the enterSafeMode method wait for all pending 
transactions to get flushed.

{code}
  synchronized void FSNamesystem.enterSafeMode() throws IOException {
if (!isInSafeMode()) {
  safeMode = new SafeModeInfo();
  return;
}
safeMode.setManual();
getEditLog().logSyncAll();  <=== new code here
NameNode.stateChangeLog.info("STATE* Safe mode is ON. "
+ safeMode.getTurnOffTip());
  }

  synchronized void FSEditLog.logSyncAll() throws IOException {
TransactionId id = myTransactionId.get();
id.txid = txid;
logSync();
  } 

{code}






> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
>
> The adminstrator puts the namenode is safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that  
> when the NN enters safemode, there could still be pending logSycs occuring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> a edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.