[jira] [Updated] (HDFS-1845) symlink comes up as directory after namenode restart
[ https://issues.apache.org/jira/browse/HDFS-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated HDFS-1845: -- Attachment: HDFS-1845-2.patch Attaching Yahoo!-specific patch for the bug. $ ant test-core -Dtestcase=TestCheckpoint .. .. checkfailure: BUILD SUCCESSFUL Total time: 37 seconds symlink comes up as directory after namenode restart Key: HDFS-1845 URL: https://issues.apache.org/jira/browse/HDFS-1845 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1845-2.patch, HDFS-1845-apache-2.patch, HDFS-1845-apache-3.patch, HDFS-1845-apache.patch, hdfs-1845-branch22-1.patch When a symlink is first created, it gets added to EditLogs. When the namenode is restarted, it reads from this editlog, represents the symlink correctly, and saves this information to its image. If the namenode is restarted again, it reads from this FSImage, but thinks that a symlink is a directory. This is because it uses Block[] blocks to determine whether an INode is a directory, a file, or a symlink. Since both a directory and a symlink have blocks as null, it thinks that a symlink is a directory. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
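The misclassification described in HDFS-1845 can be illustrated with a minimal, hypothetical sketch (the class and method names below are invented for illustration and are not the actual FSImage code): a discriminator based only on whether `blocks == null` cannot tell a directory from a symlink, since neither stores blocks, so a distinct marker such as a link target must be consulted first.

```java
// Hypothetical sketch of the bug: both a directory and a symlink carry no
// blocks, so "blocks == null" alone misreads a symlink as a directory.
public class InodeKindSketch {
    // Buggy discriminator: the failure mode reported in this issue.
    static String buggyKind(long[] blocks) {
        return (blocks == null) ? "directory" : "file"; // symlinks fall in here too
    }

    // Fixed discriminator: consult an explicit symlink marker first.
    static String fixedKind(long[] blocks, String symlinkTarget) {
        if (symlinkTarget != null) return "symlink";
        return (blocks == null) ? "directory" : "file";
    }

    public static void main(String[] args) {
        System.out.println(buggyKind(null));            // "directory" (wrong for a symlink)
        System.out.println(fixedKind(null, "/target")); // "symlink"
    }
}
```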
[jira] [Assigned] (HDFS-1475) Want a -d flag in hadoop dfs -ls : Do not expand directories
[ https://issues.apache.org/jira/browse/HDFS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp reassigned HDFS-1475: - Assignee: Daryn Sharp Want a -d flag in hadoop dfs -ls : Do not expand directories Key: HDFS-1475 URL: https://issues.apache.org/jira/browse/HDFS-1475 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.20.1 Environment: any Reporter: Greg Connor Assignee: Daryn Sharp Priority: Minor I would really love it if dfs -ls had a -d flag, like unix ls -d, which would list the directories matching the name or pattern but *not* their contents. Current behavior is to expand every matching dir and list its contents, which is awkward if I just want to see the matching dirs themselves (and their permissions). Worse, if a directory exists but is empty, -ls simply returns no output at all, which is unhelpful. So far we have used some ugly workarounds to this in various scripts, such as:
-ls /path/to | grep dir    # wasteful, and problematic if dir is a substring of the path
-stat /path/to/dir Exists  # stat has no way to get back the full path, sadly
-count /path/to/dir        # works but is probably overkill
Really there is no reliable replacement for ls -d -- the above hacks will work but only for certain isolated contexts. (I'm not a java programmer, or else I would probably submit a patch for this, or make my own jar file to do this since I need it a lot.) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1822) Editlog opcodes overlap between 20 security and later releases
[ https://issues.apache.org/jira/browse/HDFS-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022177#comment-13022177 ] Suresh Srinivas commented on HDFS-1822: --- bq. Doesn't seem like this keeps branch-specific hackery confined to the branch. It does. We no longer need code for conflicting opcodes in later releases. The new check that is being added is for version compatibility. Editlog opcodes overlap between 20 security and later releases -- Key: HDFS-1822 URL: https://issues.apache.org/jira/browse/HDFS-1822 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0, 0.22.0, 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1822.patch The same opcodes are used for different operations between 0.20.security, 0.22 and 0.23. This results in failure to load editlogs on later releases, especially during upgrades. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1850) DN should transmit absolute failed volume count rather than increments to the NN
[ https://issues.apache.org/jira/browse/HDFS-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-1850: -- Hadoop Flags: (was: [Incompatible change]) Good point. Flag removed. DN should transmit absolute failed volume count rather than increments to the NN Key: HDFS-1850 URL: https://issues.apache.org/jira/browse/HDFS-1850 Project: Hadoop HDFS Issue Type: Bug Components: data-node, name-node Reporter: Eli Collins Assignee: Eli Collins Fix For: 0.23.0 The API added in HDFS-811 for the DN to report volume failures to the NN is inc(DN). However the given sequence of events will result in the NN forgetting about reported failed volumes: # DN loses a volume and reports it # NN restarts # DN re-registers to the new NN A more robust interface would be to have the DN report the total number of volume failures to the NN each heart beat (the same way other volume state is transmitted). This will likely be an incompatible change since it requires changing the Datanode protocol. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
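The sequence of events in HDFS-1850 can be sketched as follows. This is a hypothetical illustration, not the Datanode protocol code: with incremental reports the NN accumulates state that a restart wipes out, while an absolute count carried on every heartbeat is idempotent, so the next heartbeat restores it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch contrasting the two reporting styles discussed above.
public class FailedVolumeReporting {
    // NN-side view of failed volume counts, keyed by datanode id.
    final Map<String, Integer> failedVolumesPerDn = new HashMap<>();

    // HDFS-811 style: the DN reports "one more volume failed".
    void reportIncrement(String dnId) {
        failedVolumesPerDn.merge(dnId, 1, Integer::sum);
    }

    // Proposed style: the DN reports its total failed-volume count each heartbeat.
    void heartbeat(String dnId, int totalFailedVolumes) {
        failedVolumesPerDn.put(dnId, totalFailedVolumes);
    }

    int failedVolumes(String dnId) {
        return failedVolumesPerDn.getOrDefault(dnId, 0);
    }

    public static void main(String[] args) {
        FailedVolumeReporting nn = new FailedVolumeReporting();
        nn.reportIncrement("dn1");        // DN loses a volume and reports it
        nn = new FailedVolumeReporting(); // NN restarts; DN re-registers
        System.out.println(nn.failedVolumes("dn1")); // 0 -- the failure is forgotten
        nn.heartbeat("dn1", 1);           // absolute count heals on the next heartbeat
        System.out.println(nn.failedVolumes("dn1")); // 1
    }
}
```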
[jira] [Commented] (HDFS-1842) Cannot upgrade 0.20.203 to 0.21 with an editslog present
[ https://issues.apache.org/jira/browse/HDFS-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022193#comment-13022193 ] Suresh Srinivas commented on HDFS-1842: --- bq. You just need to demand that the edits is empty. This was my first choice as well. My concern was that it might throw this error after partially upgrading to 204, and might require a rollback to go back to the previous release to save the namespace. I looked at the code more closely and rollback is not required. There are a couple of choices on how this can be done:
# If the editlog file size == 0, treat it as having no edits. I am reluctant to go this route. With the new editlog changes, could the editlog size be != 0 while still containing no file system operation entries?
# While loading editlogs, wait for numEdits to go to 1 and then throw an error. This means the entire fsimage is loaded before the error is thrown. If we are doing this we might as well go with the current patch. The opcode conversion code then remains only in the 2xx release.
Cannot upgrade 0.20.203 to 0.21 with an editslog present Key: HDFS-1842 URL: https://issues.apache.org/jira/browse/HDFS-1842 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.20.203.0 Reporter: Allen Wittenauer Priority: Blocker Attachments: HDFS-1842.rel203.patch, HDFS-1842.rel204.patch If a user installs 0.20.203 and then upgrades to 0.21 with an editslog present, 0.21 will corrupt the file system due to opcode re-usage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1840) Terminate LeaseChecker when all writing files are closed.
[ https://issues.apache.org/jira/browse/HDFS-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1309#comment-1309 ] Suresh Srinivas commented on HDFS-1840: --- +1 for the patch. Terminate LeaseChecker when all writing files are closed. - Key: HDFS-1840 URL: https://issues.apache.org/jira/browse/HDFS-1840 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1840_20110418.patch, h1840_20110419.patch, h1840_20110419b.patch In {{DFSClient}}, when there are files opened for write, a {{LeaseChecker}} thread is started for updating the leases periodically. However, it never terminates when all writing files are closed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1840) Terminate LeaseChecker when all writing files are closed.
[ https://issues.apache.org/jira/browse/HDFS-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1840: - Resolution: Fixed Fix Version/s: 0.23.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I have committed this. Terminate LeaseChecker when all writing files are closed. - Key: HDFS-1840 URL: https://issues.apache.org/jira/browse/HDFS-1840 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1840_20110418.patch, h1840_20110419.patch, h1840_20110419b.patch In {{DFSClient}}, when there are files opened for write, a {{LeaseChecker}} thread is started for updating the leases periodically. However, it never terminates when all writing files are closed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1562) Add rack policy tests
[ https://issues.apache.org/jira/browse/HDFS-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022261#comment-13022261 ] Matt Foley commented on HDFS-1562: -- Hi Eli, didn't realize you were going to look at TestDatanodeBlockScanner too. I did extensive mods to it as part of HDFS-1295. Let's take a look and compare. Add rack policy tests - Key: HDFS-1562 URL: https://issues.apache.org/jira/browse/HDFS-1562 Project: Hadoop HDFS Issue Type: Test Components: name-node, test Affects Versions: 0.23.0 Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-1562-1.patch, hdfs-1562-2.patch, hdfs-1562-3.patch The existing replication tests (TestBlocksWithNotEnoughRacks, TestPendingReplication, TestOverReplicatedBlocks, TestReplicationPolicy, TestUnderReplicatedBlocks, and TestReplication) are missing tests for rack policy violations. This jira adds the following tests which I created when generating a new patch for HDFS-15.
* Test that blocks that have a sufficient number of total replicas, but are not replicated cross rack, get replicated cross rack when a rack becomes available.
* Test that new blocks for an underreplicated file will get replicated cross rack.
* Mark a block as corrupt, test that when it is re-replicated it is still replicated across racks.
* Reduce the replication factor of a file, making sure that the only block that is across racks is not removed when deleting replicas.
* Test that when a block is replicated because a replica is lost due to host failure, the rack policy is preserved.
* Test that when the excess replicas of a block are reduced due to a node re-joining the cluster, the rack policy is not violated.
* Test that rack policy is still respected when blocks are replicated due to node decommissioning.
* Test that rack policy is still respected when blocks are replicated due to node decommissioning, even when the blocks are over-replicated.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-148) timeout when writing dfs file causes infinite loop when closing the file
[ https://issues.apache.org/jira/browse/HDFS-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022266#comment-13022266 ] John Meagher commented on HDFS-148: --- It looks like this was fixed by HDFS-278. timeout when writing dfs file causes infinite loop when closing the file Key: HDFS-148 URL: https://issues.apache.org/jira/browse/HDFS-148 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.2 Reporter: Nigel Daley Assignee: Sameer Paranjpye Priority: Critical If, when writing to a dfs file, I get a timeout exception: 06/11/29 11:16:05 WARN fs.DFSClient: Error while writing. java.net.SocketTimeoutException: timed out waiting for rpc response at org.apache.hadoop.ipc.Client.call(Client.java:469) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:164) at org.apache.hadoop.dfs.$Proxy0.reportWrittenBlock(Unknown Source) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.internalClose(DFSClient.java:1220) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1175) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.flush(DFSClient.java:1121) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.write(DFSClient.java:1103) at org.apache.hadoop.examples.NNBench2.createWrite(NNBench2.java:107) at org.apache.hadoop.examples.NNBench2.main(NNBench2.java:247) then the close() operation on the file appears to go into an infinite loop of retrying: 06/11/29 13:11:19 INFO fs.DFSClient: Could not complete file, retrying... 06/11/29 13:11:20 INFO fs.DFSClient: Could not complete file, retrying... 06/11/29 13:11:21 INFO fs.DFSClient: Could not complete file, retrying... 06/11/29 13:11:23 INFO fs.DFSClient: Could not complete file, retrying... 06/11/29 13:11:24 INFO fs.DFSClient: Could not complete file, retrying... ... -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
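The failure mode in HDFS-148 is a retry loop with no exit condition. The sketch below is hypothetical and not the actual DFSClient code: it shows how bounding the retries around a "complete file" call turns an indefinite hang into a reportable error (the interface and method names are invented stand-ins for the NN RPC).

```java
import java.io.IOException;

// Hedged sketch: close() retries "complete file" a bounded number of times
// instead of looping forever when the namenode never acknowledges the file.
public class BoundedCompleteRetry {
    interface CompleteCall { boolean complete(); } // stand-in for the NN RPC

    static void closeWithRetries(CompleteCall nn, int maxRetries) throws IOException {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (nn.complete()) return; // file completed successfully
            System.out.println("Could not complete file, retrying...");
        }
        // With no bound, the loop above is the infinite retry seen in the logs.
        throw new IOException("could not complete file after " + maxRetries + " retries");
    }

    public static void main(String[] args) {
        try {
            closeWithRetries(() -> false, 3); // NN never completes the file
        } catch (IOException e) {
            System.out.println("close failed: " + e.getMessage());
        }
    }
}
```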
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022362#comment-13022362 ] John George commented on HDFS-1788: --- To me it looks like this bug has two parts:
1. The ability for FsShell to show a symlink as 'l' with the link target ('link -> target'), just like in Linux. This seems to be a straightforward change in FsShell.
2. In order for FsShell to be able to do this, it needs to know that it is dealing with a symlink. As of now, it looks like FsShell uses FileSystem to check if a given path is a symlink or not. The FileSystem class does not entirely support symlinks. So, in order to fix this bug, ls (FsShell) should either a) start using FileContext (HADOOP-6424) or b) FileSystem should be fixed to be able to deal with symlinks. In order for FileSystem to support symlinks, it should either implement getFileLinkStatus(), or getFileStatus() should itself be able to handle symlinks. The fastest/easiest way seems to be getting getFileStatus() to also return the FileStatus of links. The best solution (though not the fastest) seems to be to let FsShell use FileContext. Would it even make sense to let getFileStatus() return the status of symlinks as well (in cases where the underlying filesystem supports symlinks) so that ls or any other command that uses FileSystem (as of today) can also deal with symlinks? Comments and suggestions welcome. FsShell ls: Show symlinks properties Key: HDFS-1788 URL: https://issues.apache.org/jira/browse/HDFS-1788 Project: Hadoop HDFS Issue Type: Improvement Components: tools Reporter: Jonathan Eagles Assignee: John George Priority: Minor ls FsShell command implementation has been consistent with the linux implementations of ls \-l. With the addition of symlinks, I would expect the ability to show file type 'd' for directory, '\-' for file, and 'l' for symlink. 
In addition, following the linkname entry for symlinks, I would expect the ability to show '-> link target'. In Linux, the default is to show the properties of the link and not of the link target. In Linux, the '-L' option allows for the dereferencing of symlinks to show link target properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
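Part (1) of the discussion above, rendering the 'l' type character and the "link -> target" suffix, can be sketched as below. This is a hypothetical illustration with invented names, not the actual FsShell ls implementation; the real change would draw the two booleans from a FileStatus obtained for the link itself.

```java
// Hypothetical sketch of the ls -l style rendering discussed above.
public class LsSymlinkFormat {
    // Leading type character, as in Linux ls -l: 'd', '-', or 'l'.
    static char typeChar(boolean isDir, boolean isSymlink) {
        if (isSymlink) return 'l'; // report the link itself, not its target
        return isDir ? 'd' : '-';
    }

    // Append " -> target" after the linkname entry, symlinks only.
    static String formatName(String path, boolean isSymlink, String target) {
        return isSymlink ? path + " -> " + target : path;
    }

    public static void main(String[] args) {
        System.out.println(typeChar(false, true) + " "
            + formatName("/user/foo/link", true, "/user/foo/real"));
        // l /user/foo/link -> /user/foo/real
    }
}
```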
[jira] [Commented] (HDFS-1842) Cannot upgrade 0.20.203 to 0.21 with an editslog present
[ https://issues.apache.org/jira/browse/HDFS-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022365#comment-13022365 ] Konstantin Shvachko commented on HDFS-1842: --- Yes, rollback is not needed, as image and edits loading does not change any files or directories. Both choices work for me. (1) provides faster failure. The size of the empty edits is only sizeof(long), which is the layoutVersion size. (2) is also fine and would be my preference. Informed admins will do saveNamespace() before upgrading, so the edits will be empty. But if they forget, the upgrade will fail after 10 minutes, which is on the order of the time to restart the name-node with 203 and then upgrade again. Cannot upgrade 0.20.203 to 0.21 with an editslog present Key: HDFS-1842 URL: https://issues.apache.org/jira/browse/HDFS-1842 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.20.203.0 Reporter: Allen Wittenauer Priority: Blocker Attachments: HDFS-1842.rel203.patch, HDFS-1842.rel204.patch If a user installs 0.20.203 and then upgrades to 0.21 with an editslog present, 0.21 will corrupt the file system due to opcode re-usage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
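Choice (1) from the discussion above can be sketched as a size check. This is a hedged illustration, not the namenode code: per the comment, an "empty" edits file still holds its stored layout version (sizeof(long)), so the check is whether anything follows that header. The class name, constant name, and header size here are assumptions for illustration.

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch: treat the edits file as empty when nothing follows
// the layout-version header (header size taken from the comment above).
public class EmptyEditsCheck {
    static final long HEADER_BYTES = Long.BYTES; // layoutVersion, per the comment

    static boolean hasEdits(File edits) {
        return edits.length() > HEADER_BYTES;
    }

    public static void main(String[] args) throws IOException {
        File edits = File.createTempFile("edits", null);
        edits.deleteOnExit();
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(edits))) {
            out.writeLong(-38L); // header only: a stand-in layout version value
        }
        System.out.println(hasEdits(edits)); // false -- no operations, safe to upgrade
    }
}
```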
[jira] [Commented] (HDFS-1842) Cannot upgrade 0.20.203 to 0.21 with an editslog present
[ https://issues.apache.org/jira/browse/HDFS-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022370#comment-13022370 ] Suresh Srinivas commented on HDFS-1842: --- OK I will go with (2) then. Cannot upgrade 0.20.203 to 0.21 with an editslog present Key: HDFS-1842 URL: https://issues.apache.org/jira/browse/HDFS-1842 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.20.203.0 Reporter: Allen Wittenauer Priority: Blocker Attachments: HDFS-1842.rel203.patch, HDFS-1842.rel204.patch If a user installs 0.20.203 and then upgrades to 0.21 with an editslog present, 0.21 will corrupt the file system due to opcode re-usage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1052) HDFS scalability with multiple namenodes
[ https://issues.apache.org/jira/browse/HDFS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1052: -- Attachment: (was: HDFS-1052.patch) HDFS scalability with multiple namenodes Key: HDFS-1052 URL: https://issues.apache.org/jira/browse/HDFS-1052 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.22.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: Block pool proposal.pdf, HDFS-1052.patch, Mulitple Namespaces5.pdf, high-level-design.pdf HDFS currently uses a single namenode that limits scalability of the cluster. This jira proposes an architecture to scale the nameservice horizontally using multiple namenodes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1052) HDFS scalability with multiple namenodes
[ https://issues.apache.org/jira/browse/HDFS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1052: -- Attachment: HDFS-1052.patch Latest patch. HDFS scalability with multiple namenodes Key: HDFS-1052 URL: https://issues.apache.org/jira/browse/HDFS-1052 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.22.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: Block pool proposal.pdf, HDFS-1052.patch, Mulitple Namespaces5.pdf, high-level-design.pdf HDFS currently uses a single namenode that limits scalability of the cluster. This jira proposes an architecture to scale the nameservice horizontally using multiple namenodes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1843) Discover file not found early for file append
[ https://issues.apache.org/jira/browse/HDFS-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Mundlapudi updated HDFS-1843: - Attachment: HDFS-1843-2.patch Thanks for code review, Jitendra. I have incorporated the changes. Discover file not found early for file append -- Key: HDFS-1843 URL: https://issues.apache.org/jira/browse/HDFS-1843 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1843-1.patch, HDFS-1843-2.patch For the append call, discover file not found exception early and avoid extra server call. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022404#comment-13022404 ] Eli Collins commented on HDFS-1788: --- I think it makes sense to move FsShell over to FileContext (HADOOP-6424). That's substantially less work than supporting symlinks in FileSystem and work we need to do anyway. FsShell ls: Show symlinks properties Key: HDFS-1788 URL: https://issues.apache.org/jira/browse/HDFS-1788 Project: Hadoop HDFS Issue Type: Improvement Components: tools Reporter: Jonathan Eagles Assignee: John George Priority: Minor ls FsShell command implementation has been consistent with the linux implementations of ls \-l. With the addition of symlinks, I would expect the ability to show file type 'd' for directory, '\-' for file, and 'l' for symlink. In addition, following the linkname entry for symlinks, I would expect the ability to show '-> link target'. In Linux, the default is to show the properties of the link and not of the link target. In Linux, the '-L' option allows for the dereferencing of symlinks to show link target properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1845) symlink comes up as directory after namenode restart
[ https://issues.apache.org/jira/browse/HDFS-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022406#comment-13022406 ] Eli Collins commented on HDFS-1845: --- bq. Attaching Yahoo! specific patch for the bug. Do you mean for branch-0.20-security? I think HDFS-1845-2.patch is equivalent to hdfs-1845-branch22-1.patch. symlink comes up as directory after namenode restart Key: HDFS-1845 URL: https://issues.apache.org/jira/browse/HDFS-1845 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1845-2.patch, HDFS-1845-apache-2.patch, HDFS-1845-apache-3.patch, HDFS-1845-apache.patch, hdfs-1845-branch22-1.patch When a symlink is first created, it gets added to EditLogs. When the namenode is restarted, it reads from this editlog, represents the symlink correctly, and saves this information to its image. If the namenode is restarted again, it reads from this FSImage, but thinks that a symlink is a directory. This is because it uses Block[] blocks to determine whether an INode is a directory, a file, or a symlink. Since both a directory and a symlink have blocks as null, it thinks that a symlink is a directory. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1052) HDFS scalability with multiple namenodes
[ https://issues.apache.org/jira/browse/HDFS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022409#comment-13022409 ] Hadoop QA commented on HDFS-1052: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12476941/HDFS-1052.patch against trunk revision 1095461. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 322 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause tar ant target to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: -1 contrib tests. The patch failed contrib unit tests. -1 system test framework. The patch failed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/393//testReport/ Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/393//console This message is automatically generated. HDFS scalability with multiple namenodes Key: HDFS-1052 URL: https://issues.apache.org/jira/browse/HDFS-1052 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.22.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: Block pool proposal.pdf, HDFS-1052.patch, Mulitple Namespaces5.pdf, high-level-design.pdf HDFS currently uses a single namenode that limits scalability of the cluster. This jira proposes an architecture to scale the nameservice horizontally using multiple namenodes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1295) Improve namenode restart times by short-circuiting the first block reports from datanodes
[ https://issues.apache.org/jira/browse/HDFS-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022414#comment-13022414 ] Suresh Srinivas commented on HDFS-1295: --- Comments:
# Minor: TestDatanodeBlockScanner - could you LOG.info or LOG.debug instead of System.out?
# Is it worth retaining printDatanodeAssignments() and printDatanodeBlockReports(), which were probably added as debug code?
# In the test we have TIMEOUT set to 20s. Is it reasonably long enough so that tests do not fail?
# In block report time, why is the report creation time not included in metrics?
Improve namenode restart times by short-circuiting the first block reports from datanodes - Key: HDFS-1295 URL: https://issues.apache.org/jira/browse/HDFS-1295 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: dhruba borthakur Assignee: Matt Foley Fix For: 0.23.0 Attachments: IBR_shortcut_v2a.patch, IBR_shortcut_v3atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v6atrunk.patch, shortCircuitBlockReport_1.txt The namenode restart is dominated by the performance of processing block reports. On a 2000 node cluster with 90 million blocks, block report processing takes 30 to 40 minutes. The namenode diffs the contents of the incoming block report with the contents of the blocks map, and then applies these diffs to the blocksMap, but in reality there is no need to compute the diff because this is the first block report from the datanode. This code change improves block report processing time by 300%. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1463) accessTime updates should not occur in safeMode
[ https://issues.apache.org/jira/browse/HDFS-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022424#comment-13022424 ] Hudson commented on HDFS-1463: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) accessTime updates should not occur in safeMode --- Key: HDFS-1463 URL: https://issues.apache.org/jira/browse/HDFS-1463 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.23.0 Attachments: accessTimeSafeMode.txt, accessTimeSafeMode.txt FSNamesystem.getBlockLocations sometimes need to update the accessTime of files. If the namenode is in safemode, this call should fail. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1360) TestBlockRecovery should bind ephemeral ports
[ https://issues.apache.org/jira/browse/HDFS-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022425#comment-13022425 ] Hudson commented on HDFS-1360: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestBlockRecovery should bind ephemeral ports - Key: HDFS-1360 URL: https://issues.apache.org/jira/browse/HDFS-1360 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.23.0 Attachments: hdfs-1360.txt TestBlockRecovery starts up a DN, but doesn't configure the various ports to be ephemeral, so the test fails if run on a machine where another DN is already running. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1561) BackupNode listens on default host
[ https://issues.apache.org/jira/browse/HDFS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022426#comment-13022426 ] Hudson commented on HDFS-1561: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) BackupNode listens on default host -- Key: HDFS-1561 URL: https://issues.apache.org/jira/browse/HDFS-1561 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Priority: Blocker Fix For: 0.22.0 Attachments: BNAddress.patch, BNAddress.patch Currently BackupNode uses DNS to find its default host name, and then starts RPC server listening on that address ignoring the address specified in the configuration. Therefore, there is no way to start BackupNode on a particular ip or host address. BackupNode should use the address specified in the configuration instead. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1666) TestAuthorizationFilter is failing
[ https://issues.apache.org/jira/browse/HDFS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022428#comment-13022428 ] Hudson commented on HDFS-1666: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1666. Disable failing hdfsproxy test TestAuthorizationFilter. Contributed by Todd Lipcon TestAuthorizationFilter is failing -- Key: HDFS-1666 URL: https://issues.apache.org/jira/browse/HDFS-1666 Project: Hadoop HDFS Issue Type: Bug Components: contrib/hdfsproxy Affects Versions: 0.22.0, 0.23.0 Reporter: Konstantin Boudnik Priority: Blocker Attachments: hdfs-1666-disable-tests.txt two test cases were failing for a number of builds (see attached logs) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1560) dfs.data.dir permissions should default to 700
[ https://issues.apache.org/jira/browse/HDFS-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022427#comment-13022427 ] Hudson commented on HDFS-1560: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) dfs.data.dir permissions should default to 700 -- Key: HDFS-1560 URL: https://issues.apache.org/jira/browse/HDFS-1560 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1560.txt Currently, dfs.data.dir defaults to 755 permissions, which isn't necessary for any reason, and is a security issue if not changed on a secured cluster. We should default to 700 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1602) Fix HADOOP-4885 for it is doesn't work as expected.
[ https://issues.apache.org/jira/browse/HDFS-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022433#comment-13022433 ] Hudson commented on HDFS-1602: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Fix HADOOP-4885 for it is doesn't work as expected. --- Key: HDFS-1602 URL: https://issues.apache.org/jira/browse/HDFS-1602 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0, 0.23.0 Reporter: Konstantin Boudnik Assignee: Boris Shkolnik Fix For: 0.22.0 Attachments: HDFS-1602-1.patch, HDFS-1602.patch, HDFS-1602v22.patch NameNode storage restore functionality doesn't work (as HDFS-903 demonstrated). This needs to be either disabled, or removed, or fixed. This feature also fails HDFS-1496 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1406) TestCLI fails on Ubuntu with default /etc/hosts
[ https://issues.apache.org/jira/browse/HDFS-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022438#comment-13022438 ] Hudson commented on HDFS-1406: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestCLI fails on Ubuntu with default /etc/hosts --- Key: HDFS-1406 URL: https://issues.apache.org/jira/browse/HDFS-1406 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Konstantin Boudnik Priority: Minor Fix For: 0.20.3 Attachments: HDFS-1406.0.20.patch, HDFS-1406.patch, Test.java Depending on the order of entries in /etc/hosts, TestCLI can fail. This is because it sets fs.default.name to localhost, and then the bound IPC socket on the NN side reports its hostname as foobar-host if the entry for 127.0.0.1 lists foobar-host before localhost. This seems to be the default in some versions of Ubuntu. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1665) Balancer sleeps inadequately
[ https://issues.apache.org/jira/browse/HDFS-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022429#comment-13022429 ] Hudson commented on HDFS-1665: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Balancer sleeps inadequately Key: HDFS-1665 URL: https://issues.apache.org/jira/browse/HDFS-1665 Project: Hadoop HDFS Issue Type: Bug Components: balancer Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.21.1, Federation Branch, 0.22.0, 0.23.0 Attachments: h1665_20110225.patch, h1665_20110225b.patch, h1665_20110225b_fed.patch The value of {{dfs.heartbeat.interval}} is in seconds. The Balancer seems to have misused it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
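The bug class behind HDFS-1665 is easy to reproduce in a sketch: a config value measured in seconds gets handed to an API that expects milliseconds. `heartbeatIntervalMillis` below is a hypothetical helper, not the actual Balancer code, which reads the value via `Configuration`:

```java
// Sketch, not the actual Balancer fix: dfs.heartbeat.interval is measured
// in seconds, so it must be scaled to milliseconds before being passed to
// Thread.sleep(). Sleeping for N milliseconds when N seconds were intended
// makes the Balancer sleep 1000x too briefly.
public class HeartbeatSleep {
    // Hypothetical helper; the real code obtains the interval from the
    // Hadoop Configuration object.
    static long heartbeatIntervalMillis(long heartbeatIntervalSeconds) {
        return heartbeatIntervalSeconds * 1000L;
    }

    public static void main(String[] args) {
        long intervalSec = 3; // the default dfs.heartbeat.interval is 3 seconds
        long sleepMs = heartbeatIntervalMillis(intervalSec);
        System.out.println(sleepMs); // 3000
        // A correct caller would do: Thread.sleep(sleepMs);
    }
}
```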
[jira] [Commented] (HDFS-1606) Provide a stronger data guarantee in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022432#comment-13022432 ] Hudson commented on HDFS-1606: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Provide a stronger data guarantee in the write pipeline --- Key: HDFS-1606 URL: https://issues.apache.org/jira/browse/HDFS-1606 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client, name-node Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1606_20110210.patch, h1606_20110211.patch, h1606_20110217.patch, h1606_20110228.patch, h1606_20110404.patch, h1606_20110405.patch, h1606_20110405b.patch, h1606_20110406.patch, h1606_20110406b.patch, h1606_20110407.patch, h1606_20110407b.patch, h1606_20110407c.patch, h1606_20110408.patch, h1606_20110408b.patch In the current design, if there is a datanode/network failure in the write pipeline, DFSClient will try to remove the failed datanode from the pipeline and then continue writing with the remaining datanodes. As a result, the number of datanodes in the pipeline is decreased. Unfortunately, it is possible that DFSClient may incorrectly remove a healthy datanode but leave the failed datanode in the pipeline because failure detection may be inaccurate under erroneous conditions. We propose to have a new mechanism for adding new datanodes to the pipeline in order to provide a stronger data guarantee. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1601) Pipeline ACKs are sent as lots of tiny TCP packets
[ https://issues.apache.org/jira/browse/HDFS-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022435#comment-13022435 ] Hudson commented on HDFS-1601: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Pipeline ACKs are sent as lots of tiny TCP packets -- Key: HDFS-1601 URL: https://issues.apache.org/jira/browse/HDFS-1601 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1601.txt, hdfs-1601.txt I noticed in an hbase benchmark that the packet counts in my network monitoring seemed high, so took a short pcap trace and found that each pipeline ACK was being sent as five packets, the first four of which only contain one byte. We should buffer these bytes and send the PipelineAck as one TCP packet. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
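The buffering idea in HDFS-1601 can be sketched in a few lines. This is not the actual PipelineAck serialization (the field layout below is a made-up stand-in, and a `ByteArrayOutputStream` stands in for the socket stream); it only shows how a buffered stream turns several tiny writes into one:

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch of the buffering idea, not the actual PipelineAck code: writing
// several small fields straight to a socket stream can emit one TCP packet
// per write; wrapping the stream in a buffer coalesces the whole ack into
// a single write on flush().
public class AckBuffering {
    // Hypothetical ack layout: a long seqno followed by one short per reply.
    static byte[] writeAck(long seqno, short[] replies) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the socket
        try (DataOutputStream out =
                 new DataOutputStream(new BufferedOutputStream(sink, 64))) {
            out.writeLong(seqno);   // buffered, not yet "on the wire"
            for (short r : replies) {
                out.writeShort(r);  // still buffered
            }
            out.flush();            // everything leaves as one write
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for a byte-array sink
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] wire = writeAck(42L, new short[] {0, 0, 0});
        System.out.println(wire.length); // 8-byte seqno + three 2-byte replies = 14
    }
}
```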
[jira] [Commented] (HDFS-1600) editsStored.xml cause release audit warning
[ https://issues.apache.org/jira/browse/HDFS-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022436#comment-13022436 ] Hudson commented on HDFS-1600: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) editsStored.xml cause release audit warning --- Key: HDFS-1600 URL: https://issues.apache.org/jira/browse/HDFS-1600 Project: Hadoop HDFS Issue Type: Bug Components: build, test Reporter: Tsz Wo (Nicholas), SZE Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: h1600_20110126.patch, hadoop-1600.txt The file {{src/test/hdfs/org/apache/hadoop/hdfs/tools/offlineEditsViewer/editsStored.xml}} causes a release audit warning for any new patch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1189) Quota counts missed between clear quota and set quota
[ https://issues.apache.org/jira/browse/HDFS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022434#comment-13022434 ] Hudson commented on HDFS-1189: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Quota counts missed between clear quota and set quota - Key: HDFS-1189 URL: https://issues.apache.org/jira/browse/HDFS-1189 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Kang Xiao Assignee: John George Labels: hdfs, quota Fix For: 0.20.204.0, 0.21.1, Federation Branch, 0.22.0, 0.23.0 Attachments: HDFS-1189-for_20.204.patch, HDFS-1189-for_20.204.patch, HDFS-1189.patch, HDFS-1189.patch, HDFS-1189.patch, hdfs-1189-1.patch HDFS quota counts can be missed between a clear quota operation and a set quota operation. When setting quota for a dir, the INodeDirectory will be replaced by INodeDirectoryWithQuota and dir.isQuotaSet() becomes true. When INodeDirectoryWithQuota is newly created, quota counting will be performed. However, when clearing quota, the quota conf is set to -1 and dir.isQuotaSet() becomes false, while INodeDirectoryWithQuota will NOT be replaced back to INodeDirectory. FSDirectory.updateCount only updates the quota count for inodes whose isQuotaSet() is true. So after clearing quota for a dir, its quota counts will not be updated, which is reasonable. But when re-setting quota for this dir, quota counting will not be performed and some counts will be missed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1550) NPE when listing a file with no location
[ https://issues.apache.org/jira/browse/HDFS-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022437#comment-13022437 ] Hudson commented on HDFS-1550: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) NPE when listing a file with no location Key: HDFS-1550 URL: https://issues.apache.org/jira/browse/HDFS-1550 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.22.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Priority: Blocker Fix For: 0.22.0 Attachments: nullLocatedBlocks.patch The lines listed below will cause a NullPointerException in DFSUtil.locatedBlocks2Locations (line 208) because EMPTY_BLOCK_LOCS returns null when blocks.getLocatedBlocks() is called {noformat} /** a default LocatedBlocks object, its content should not be changed */ private final static LocatedBlocks EMPTY_BLOCK_LOCS = new LocatedBlocks(); {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
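The defensive pattern for this class of NPE is a null-to-empty guard at the call site. `safeBlocks` is a hypothetical helper (the real logic lives in `DFSUtil.locatedBlocks2Locations`); it only illustrates treating a null block list as empty:

```java
import java.util.Collections;
import java.util.List;

// Sketch, not the committed HDFS-1550 fix: a shared "empty" LocatedBlocks
// object can report a null block list, so callers must treat null as an
// empty list instead of dereferencing it.
public class NullBlockList {
    // Hypothetical helper standing in for the guard inside
    // DFSUtil.locatedBlocks2Locations.
    static <T> List<T> safeBlocks(List<T> maybeNull) {
        return maybeNull == null ? Collections.<T>emptyList() : maybeNull;
    }

    public static void main(String[] args) {
        // A null list (as returned by EMPTY_BLOCK_LOCS.getLocatedBlocks())
        // now yields an empty result instead of an NPE.
        System.out.println(safeBlocks(null).size()); // 0
    }
}
```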
[jira] [Commented] (HDFS-1806) TestBlockReport.blockReport_08() and _09() are timing-dependent and likely to fail on fast servers
[ https://issues.apache.org/jira/browse/HDFS-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022430#comment-13022430 ] Hudson commented on HDFS-1806: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1806. TestBlockReport.blockReport_08() and _09() are timing-dependent and likely to fail on fast servers. Contributed by Matt Foley. TestBlockReport.blockReport_08() and _09() are timing-dependent and likely to fail on fast servers -- Key: HDFS-1806 URL: https://issues.apache.org/jira/browse/HDFS-1806 Project: Hadoop HDFS Issue Type: Bug Components: data-node, name-node Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Matt Foley Fix For: 0.22.0, 0.23.0 Attachments: TestBlockReport.java.patch, blockReport_08_failure_log.html Method waitForTempReplica() polls every 100ms during block replication, attempting to catch a datanode in the state of having a TEMPORARY replica. But examination of a current Hudson test failure log shows that the replica goes from start to TEMPORARY to FINALIZED in only 50ms, so of course the poll usually misses it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1502) TestBlockRecovery triggers NPE in assert
[ https://issues.apache.org/jira/browse/HDFS-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022442#comment-13022442 ] Hudson commented on HDFS-1502: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestBlockRecovery triggers NPE in assert Key: HDFS-1502 URL: https://issues.apache.org/jira/browse/HDFS-1502 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Eli Collins Assignee: Hairong Kuang Priority: Minor Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1502.patch, fixTestBlockRecovery.patch {noformat} Testcase: testRBW_RWRReplicas took 10.333 sec Caused an ERROR null java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1881) at org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery.testSyncReplicas(TestBlockRecovery.java:144) at org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery.testRBW_RWRReplicas(TestBlockRecovery.java:305) {noformat} {noformat} Block reply = r.datanode.updateReplicaUnderRecovery( r.rInfo, recoveryId, newBlock.getNumBytes()); assert reply.equals(newBlock) && reply.getNumBytes() == newBlock.getNumBytes() : "Updated replica must be the same as the new block."; // line 1881 {noformat} Not sure how reply could be null since updateReplicaUnderRecovery always returns a newly instantiated object. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1506) Refactor fsimage loading code
[ https://issues.apache.org/jira/browse/HDFS-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022444#comment-13022444 ] Hudson commented on HDFS-1506: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Refactor fsimage loading code - Key: HDFS-1506 URL: https://issues.apache.org/jira/browse/HDFS-1506 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 Attachments: refactorImageLoader.patch, refactorImageLoader1.patch I plan to do some code refactoring to make HDFS-1070 simpler. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1559) Add missing UGM overrides to TestRefreshUserMappings
[ https://issues.apache.org/jira/browse/HDFS-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022446#comment-13022446 ] Hudson commented on HDFS-1559: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Add missing UGM overrides to TestRefreshUserMappings Key: HDFS-1559 URL: https://issues.apache.org/jira/browse/HDFS-1559 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1559.txt The commit of HADOOP-6864 added new methods to GroupMappingServiceProvider and broke trunk compilation for HDFS. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1557) Separate Storage from FSImage
[ https://issues.apache.org/jira/browse/HDFS-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022443#comment-13022443 ] Hudson commented on HDFS-1557: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Separate Storage from FSImage - Key: HDFS-1557 URL: https://issues.apache.org/jira/browse/HDFS-1557 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.21.0 Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 0.23.0 Attachments: 1557-suggestions.txt, HDFS-1557-branch-0.22.diff, HDFS-1557-branch-0.22.diff, HDFS-1557-trunk.diff, HDFS-1557-trunk.diff, HDFS-1557-trunk.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff FSImage currently derives from Storage and FSEditLog has to call methods directly on FSImage to access the filesystem. This JIRA is to separate the Storage class out into NNStorage so that FSEditLog is less dependent on FSImage. From this point, the other parts of the circular dependency should be easy to fix. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-884) DataNode makeInstance should report the directory list when failing to start up
[ https://issues.apache.org/jira/browse/HDFS-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022448#comment-13022448 ] Hudson commented on HDFS-884: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) DataNode makeInstance should report the directory list when failing to start up --- Key: HDFS-884 URL: https://issues.apache.org/jira/browse/HDFS-884 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 0.22.0 Attachments: HDFS-884.patch, HDFS-884.patch, InvalidDirs.patch, InvalidDirs.patch, InvalidDirs.patch When {{Datanode.makeInstance()}} cannot work with one of the directories in dfs.data.dir, it logs this at warn level (while losing the stack trace). It should include the nested exception for better troubleshooting. Then, when all dirs in the list fail, an exception is thrown, but this exception does not include the list of directories. It should list the absolute path of every missing/failing directory, so that whoever sees the exception can see where to start looking for problems: either the filesystem or the configuration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1540) Make Datanode handle errors to namenode.register call more elegantly
[ https://issues.apache.org/jira/browse/HDFS-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022451#comment-13022451 ] Hudson commented on HDFS-1540: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Make Datanode handle errors to namenode.register call more elegantly Key: HDFS-1540 URL: https://issues.apache.org/jira/browse/HDFS-1540 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20.1, 0.20.2, 0.21.0 Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.23.0 Attachments: datanodeException1.txt, datanodeException2.txt, datanodeException3.txt, datanodeException4.txt, datanodeException5.txt, datanodeException5.txt When a datanode receives a "Connection reset by peer" from namenode.register(), it exits. This causes many datanodes to die. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1541) Not marking datanodes dead When namenode in safemode
[ https://issues.apache.org/jira/browse/HDFS-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022450#comment-13022450 ] Hudson commented on HDFS-1541: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Not marking datanodes dead When namenode in safemode Key: HDFS-1541 URL: https://issues.apache.org/jira/browse/HDFS-1541 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.20.204.0, 0.23.0 Attachments: deadnodescheck.patch, deadnodescheck1.patch, deadnodescheck1_0.20-security.patch In a big cluster, when namenode starts up, it takes a long time for namenode to process block reports from all datanodes. Because heartbeat processing gets delayed, some datanodes are erroneously marked as dead, then later on they have to register again, thus wasting time. It would speed up starting time if the checking of dead nodes is disabled while the namenode is in safemode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1551) fix the pom template's version
[ https://issues.apache.org/jira/browse/HDFS-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022441#comment-13022441 ] Hudson commented on HDFS-1551: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) fix the pom template's version -- Key: HDFS-1551 URL: https://issues.apache.org/jira/browse/HDFS-1551 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan Fix For: 0.23.0 Attachments: hdfs-1551.patch pom templates in the ivy folder should be updated to the latest versions of the hadoop-common dependencies. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1552) Remove java5 dependencies from build
[ https://issues.apache.org/jira/browse/HDFS-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022440#comment-13022440 ] Hudson commented on HDFS-1552: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Remove java5 dependencies from build Key: HDFS-1552 URL: https://issues.apache.org/jira/browse/HDFS-1552 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 0.21.1, Federation Branch Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.21.1 Attachments: HDFS-1552.patch As the first short-term step let's remove JDK5 dependency from build(s) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1486) Generalize CLITest structure and interfaces to facilitate upstream adoption (e.g. for web testing)
[ https://issues.apache.org/jira/browse/HDFS-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022447#comment-13022447 ] Hudson commented on HDFS-1486: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Generalize CLITest structure and interfaces to facilitate upstream adoption (e.g. for web testing) -- Key: HDFS-1486 URL: https://issues.apache.org/jira/browse/HDFS-1486 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 0.23.0 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.23.0 Attachments: HDFS-1486.patch, HDFS-1486.patch, HDFS-1486.patch, HDFS-1486.patch, HDFS-1486.patch HDFS part of HADOOP-7014. HDFS side of TestCLI doesn't require any special changes but needs to be aligned with Common -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1504) FSImageSaver should catch all exceptions, not just IOE
[ https://issues.apache.org/jira/browse/HDFS-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022445#comment-13022445 ] Hudson commented on HDFS-1504: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) FSImageSaver should catch all exceptions, not just IOE -- Key: HDFS-1504 URL: https://issues.apache.org/jira/browse/HDFS-1504 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.22.0 Attachments: hdfs-1504.txt FSImageSaver currently just catches IOE. This means that if some other error like OOME or failed assert happens in saving one of the images, the coordinating thread won't know there was a problem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
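The failure mode in HDFS-1504 — a worker thread dying on a non-IOException error without the coordinator noticing — can be shown with a small sketch. `SaverSketch` and its fields are hypothetical, not the FSImageSaver code:

```java
// Sketch of the idea (hypothetical names, not the FSImageSaver code): a
// worker Runnable that only catches IOException dies silently on an
// OutOfMemoryError or a failed assert, so the coordinating thread never
// learns the save failed. Catching Throwable and recording it lets the
// coordinator detect the failure after join().
public class SaverSketch implements Runnable {
    private volatile Throwable failure;

    private void saveImage() {
        // Simulate a failure that is NOT an IOException.
        throw new AssertionError("simulated failure during save");
    }

    @Override public void run() {
        try {
            saveImage();
        } catch (Throwable t) {   // not just IOException
            failure = t;          // coordinator inspects this after join()
        }
    }

    // Run one saver to completion and report what (if anything) it threw.
    static Throwable runOnce() {
        SaverSketch saver = new SaverSketch();
        Thread t = new Thread(saver);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return saver.failure;
    }

    public static void main(String[] args) {
        System.out.println(runOnce() instanceof AssertionError); // true
    }
}
```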
[jira] [Commented] (HDFS-1476) listCorruptFileBlocks should be functional while the name node is still in safe mode
[ https://issues.apache.org/jira/browse/HDFS-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022458#comment-13022458 ] Hudson commented on HDFS-1476: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) listCorruptFileBlocks should be functional while the name node is still in safe mode Key: HDFS-1476 URL: https://issues.apache.org/jira/browse/HDFS-1476 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Patrick Kling Assignee: Patrick Kling Fix For: 0.23.0 Attachments: HDFS-1476.2.patch, HDFS-1476.3.patch, HDFS-1476.4.patch, HDFS-1476.5.patch, HDFS-1476.patch This would allow us to detect whether missing blocks can be fixed using Raid and if that is the case exit safe mode earlier. One way to make listCorruptFileBlocks available before the name node has exited from safe mode would be to perform a scan of the blocks map on each call to listCorruptFileBlocks to determine if there are any blocks with no replicas. This scan could be parallelized by dividing the space of block IDs into multiple intervals that can be scanned independently. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1548) Fault-injection tests are executed multiple times if invoked with run-test-hdfs-fault-inject target
[ https://issues.apache.org/jira/browse/HDFS-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022454#comment-13022454 ] Hudson commented on HDFS-1548: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Fault-injection tests are executed multiple times if invoked with run-test-hdfs-fault-inject target --- Key: HDFS-1548 URL: https://issues.apache.org/jira/browse/HDFS-1548 Project: Hadoop HDFS Issue Type: Bug Components: build, test Affects Versions: 0.21.1 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.21.1 Attachments: HDFS-1548.patch, HDFS-1548.patch, HDFS-1548.patch When invoked with {{run-test-hdfs-fault-inject target}} fault injection tests are getting executed 4 times. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1511) 98 Release Audit warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/HDFS-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022455#comment-13022455 ] Hudson commented on HDFS-1511: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) 98 Release Audit warnings on trunk and branch-0.22 -- Key: HDFS-1511 URL: https://issues.apache.org/jira/browse/HDFS-1511 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Jakob Homan Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1511.patch, HDFS-1511.patch, HDFS-1511.patch, releaseauditWarnings.txt There are 98 release audit warnings on trunk. See attached txt file. These must be fixed or filtered out to get back to a reasonably small number of warnings. The OK_RELEASEAUDIT_WARNINGS property in src/test/test-patch.properties should also be set appropriately in the patch that fixes this issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1543) Reduce dev. cycle time by moving system testing artifacts from default build and push to maven for HDFS
[ https://issues.apache.org/jira/browse/HDFS-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022452#comment-13022452 ] Hudson commented on HDFS-1543: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Reduce dev. cycle time by moving system testing artifacts from default build and push to maven for HDFS --- Key: HDFS-1543 URL: https://issues.apache.org/jira/browse/HDFS-1543 Project: Hadoop HDFS Issue Type: Bug Reporter: Arun C Murthy Assignee: Luke Lu Fix For: 0.20.3 Attachments: HDFS-1543.patch, hdfs-1543-trunk-v1.patch, hdfs-1543-trunk-v2.patch The current build always generates system testing artifacts and pushes them to Maven. Most developers have no need for these artifacts and no users need them. Also, fault-injection tests seem to be run multiple times, which increases the length of testing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1473) Refactor storage management into separate classes than fsimage file reading/writing
[ https://issues.apache.org/jira/browse/HDFS-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022457#comment-13022457 ] Hudson commented on HDFS-1473: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Refactor storage management into separate classes than fsimage file reading/writing --- Key: HDFS-1473 URL: https://issues.apache.org/jira/browse/HDFS-1473 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1473-followup.2.txt, hdfs-1473-followup.3.txt, hdfs-1473-followup.txt, hdfs-1473-prelim.txt, hdfs-1473.txt, hdfs-1473.txt, hdfs-1473.txt Currently the FSImage class is responsible both for storage management (eg moving around files, tracking file names, the VERSION file, etc) as well as for the actual serialization and deserialization of the fsimage file within the storage directory. I'd like to refactor the loading and saving code into new classes. This will make testing easier and also make the major changes in HDFS-1073 easier to understand. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1335) HDFS side of HADOOP-6904: first step towards inter-version communications between dfs client and NameNode
[ https://issues.apache.org/jira/browse/HDFS-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022459#comment-13022459 ] Hudson commented on HDFS-1335: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS side of HADOOP-6904: first step towards inter-version communications between dfs client and NameNode - Key: HDFS-1335 URL: https://issues.apache.org/jira/browse/HDFS-1335 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client, name-node Affects Versions: 0.22.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 Attachments: hdfsRPC.patch, hdfsRpcVersion.patch The idea is that for getProtocolVersion, NameNode checks if the client and server versions are compatible if the server version is greater than the client version. If no, it throws a VersionIncompatible exception; otherwise, it returns the server version. On the dfs client side, when creating a NameNode proxy, the client catches the VersionMismatch exception and then checks if the client version and the server version are compatible if the client version is greater than the server version. If not compatible, it throws a VersionIncompatible exception; otherwise, it records the server version and continues. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1531) Clean up stack traces due to duplicate MXBean registration
[ https://issues.apache.org/jira/browse/HDFS-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022462#comment-13022462 ] Hudson commented on HDFS-1531: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Clean up stack traces due to duplicate MXBean registration -- Key: HDFS-1531 URL: https://issues.apache.org/jira/browse/HDFS-1531 Project: Hadoop HDFS Issue Type: Bug Components: data-node, name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 0.22.0 Attachments: hdfs-1531.txt In the minicluster unit tests, we try to register MXBeans for each DN, but since the JMX context is JVM-wide, we get a InstanceAlreadyExistsException for all but the first. This stack trace clutters test logs a lot. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
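The collision described above comes from the JVM-wide MBean server rejecting duplicate `ObjectName`s. A minimal sketch, assuming a made-up domain and key property (this is illustrative, not the committed HDFS-1531 change): giving each instance a distinguishing key avoids the `InstanceAlreadyExistsException` and its stack trace.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Sketch, not the committed fix: the platform MBean server is JVM-wide,
// so registering the same ObjectName twice throws
// InstanceAlreadyExistsException. A per-instance key property (here a
// hypothetical "id") keeps the names distinct.
public class MXBeanNames {
    public interface DemoMXBean { int getValue(); }
    public static class Demo implements DemoMXBean {
        public int getValue() { return 1; }
    }

    // Hypothetical naming scheme; "HadoopSketch" is not a real Hadoop domain.
    static String nameFor(int id) {
        return "HadoopSketch:service=DataNode,id=" + id;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(new Demo(), new ObjectName(nameFor(1)));
        // Distinct name: no InstanceAlreadyExistsException for the second DN.
        server.registerMBean(new Demo(), new ObjectName(nameFor(2)));
        System.out.println("ok");
    }
}
```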
[jira] [Commented] (HDFS-1521) Persist transaction ID on disk between NN restarts
[ https://issues.apache.org/jira/browse/HDFS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022461#comment-13022461 ] Hudson commented on HDFS-1521: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Persist transaction ID on disk between NN restarts -- Key: HDFS-1521 URL: https://issues.apache.org/jira/browse/HDFS-1521 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: Edit log branch (HDFS-1073) Attachments: FSImageFormat.patch, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, hdfs-1521.3.txt, hdfs-1521.4.txt, hdfs-1521.5.txt, hdfs-1521.txt, hdfs-1521.txt, hdfs-1521.txt For HDFS-1073 and other future work, we'd like to have the concept of a transaction ID that is persisted on disk with the image/edits. We already have this concept in the NameNode but it resets to 0 on restart. We can also use this txid to replace the _checkpointTime_ field, I believe. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1536) Improve HDFS WebUI
[ https://issues.apache.org/jira/browse/HDFS-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022467#comment-13022467 ] Hudson commented on HDFS-1536: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Improve HDFS WebUI -- Key: HDFS-1536 URL: https://issues.apache.org/jira/browse/HDFS-1536 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 Attachments: missingBlocksWebUI.patch, missingBlocksWebUI1.patch 1. Make the missing blocks count accurate; 2. Make the under replicated blocks count excluding missing blocks. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1120) Make DataNode's block-to-device placement policy pluggable
[ https://issues.apache.org/jira/browse/HDFS-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022469#comment-13022469 ] Hudson commented on HDFS-1120: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Make DataNode's block-to-device placement policy pluggable -- Key: HDFS-1120 URL: https://issues.apache.org/jira/browse/HDFS-1120 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Jeff Hammerbacher Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: HDFS-1120.r1.diff, HDFS-1120.r2.diff, HDFS-1120.r3.diff, HDFS-1120.r4.diff As discussed on the mailing list, as the number of disk drives per server increases, it would be useful to allow the DataNode's policy for new block placement to grow in sophistication from the current round-robin strategy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
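One way to read "pluggable" here is an interface the DataNode consults for each new block, with round-robin as the default implementation. A sketch under that assumption — the interface and class names are illustrative, not the actual HDFS-1120 API:

```java
import java.util.List;

public class Placement {
    /** A policy object decides which volume (disk) receives the next block. */
    public interface VolumeChoosingPolicy {
        int chooseVolume(List<Long> freeSpacePerVolume, long blockSize);
    }

    /** The current default behavior: rotate through volumes that have room. */
    public static class RoundRobinPolicy implements VolumeChoosingPolicy {
        private int next = 0;

        public int chooseVolume(List<Long> free, long blockSize) {
            for (int i = 0; i < free.size(); i++) {
                int candidate = (next + i) % free.size();
                if (free.get(candidate) >= blockSize) {
                    next = (candidate + 1) % free.size();
                    return candidate;
                }
            }
            throw new IllegalStateException(
                "no volume has " + blockSize + " bytes free");
        }
    }
}
```

A smarter policy (e.g. most-free-space first) would simply be another implementation of the same interface, selected by configuration.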
[jira] [Commented] (HDFS-1534) Fix some incorrect logs in FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022471#comment-13022471 ] Hudson commented on HDFS-1534: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Fix some incorrect logs in FSDirectory -- Key: HDFS-1534 URL: https://issues.apache.org/jira/browse/HDFS-1534 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 0.23.0 Attachments: hdfs-1534-1.patch FSDirectory#removeBlock has the wrong debug log; it was copied from the add-block log. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1728) SecondaryNameNode.checkpointSize is in byte but not MB.
[ https://issues.apache.org/jira/browse/HDFS-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022470#comment-13022470 ] Hudson commented on HDFS-1728: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) SecondaryNameNode.checkpointSize is in byte but not MB. --- Key: HDFS-1728 URL: https://issues.apache.org/jira/browse/HDFS-1728 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.1 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.21.1, Federation Branch, 0.22.0, 0.23.0 Attachments: h1728_20110307.patch, h1728_20110307_0.21.patch The unit of SecondaryNameNode.checkpointSize is bytes, not MB as stated in the following comment.
{code}
//SecondaryNameNode.java
private long checkpointSize; // size (in MB) of current Edit Log
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
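The practical point is a unit question: the field holds bytes and is compared against fs.checkpoint.size, which is also in bytes (default 67108864, i.e. 64 MB). A minimal illustration of the corrected reading, not the actual SecondaryNameNode code:

```java
public class CheckpointTrigger {
    // corrected comment: size (in *bytes*, not MB) of the current edit log
    private long checkpointSize;

    // default fs.checkpoint.size: 64 MB expressed in bytes
    static final long DEFAULT_CHECKPOINT_SIZE = 64L * 1024 * 1024;  // 67108864

    /** True once the edit log has grown past the configured byte threshold. */
    static boolean shouldCheckpoint(long editLogBytes) {
        // bytes compared to bytes; reading checkpointSize as MB would
        // delay the size-triggered checkpoint by a factor of ~10^6
        return editLogBytes >= DEFAULT_CHECKPOINT_SIZE;
    }
}
```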
[jira] [Commented] (HDFS-1675) Transfer RBW between datanodes
[ https://issues.apache.org/jira/browse/HDFS-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022468#comment-13022468 ] Hudson commented on HDFS-1675: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Transfer RBW between datanodes -- Key: HDFS-1675 URL: https://issues.apache.org/jira/browse/HDFS-1675 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1675_20110228.patch, h1675_20110228b.patch, h1675_20110308.patch, h1675_20110310.patch, h1675_20110310b.patch, h1675_20110310c.patch, h1675_20110311.patch, h1675_20110313.patch This is the step \(*) described [here|https://issues.apache.org/jira/browse/HDFS-1606?focusedCommentId=12991321page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12991321]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1529) Incorrect handling of interrupts in waitForAckedSeqno can cause deadlock
[ https://issues.apache.org/jira/browse/HDFS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022463#comment-13022463 ] Hudson commented on HDFS-1529: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Incorrect handling of interrupts in waitForAckedSeqno can cause deadlock Key: HDFS-1529 URL: https://issues.apache.org/jira/browse/HDFS-1529 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.22.0 Attachments: Test.java, hdfs-1529.txt, hdfs-1529.txt, hdfs-1529.txt In HDFS-895 the handling of interrupts during hflush/close was changed to preserve interrupt status. This ends up creating an infinite loop in waitForAckedSeqno if the waiting thread gets interrupted, since Object.wait() has a strange semantic that it doesn't give up the lock even momentarily if the thread is already in interrupted state at the beginning of the call. We should decide what the correct behavior is here - if a thread is interrupted while it's calling hflush() or close() should we (a) throw an exception, perhaps InterruptedIOException (b) ignore, or (c) wait for the flush to finish but preserve interrupt status on exit? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
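The Object.wait() semantic at the heart of this bug can be demonstrated in isolation: when the calling thread's interrupt flag is already set, wait() throws InterruptedException immediately instead of blocking, so a loop that catches the exception, restores the flag, and retries the wait spins forever. A self-contained demo:

```java
public class InterruptedWaitDemo {
    // Returns true if wait() threw immediately because the thread's
    // interrupt status was already set when wait() was entered.
    public static boolean waitThrowsImmediatelyWhenInterrupted() {
        Object lock = new Object();
        Thread.currentThread().interrupt();  // pre-set interrupt status
        synchronized (lock) {
            try {
                lock.wait(10_000);           // would block 10s if not interrupted
                return false;
            } catch (InterruptedException e) {
                // thrown without any real waiting; note the interrupt
                // status is cleared at this point, which is why code that
                // re-interrupts itself before retrying wait() loops forever
                return true;
            }
        }
    }
}
```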
[jira] [Commented] (HDFS-1527) SocketOutputStream.transferToFully fails for blocks >= 2GB on 32 bit JVM
[ https://issues.apache.org/jira/browse/HDFS-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022465#comment-13022465 ] Hudson commented on HDFS-1527: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) SocketOutputStream.transferToFully fails for blocks >= 2GB on 32 bit JVM Key: HDFS-1527 URL: https://issues.apache.org/jira/browse/HDFS-1527 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.22.0 Environment: 32 bit JVM Reporter: Patrick Kling Assignee: Patrick Kling Fix For: 0.22.0 Attachments: HDFS-1527.2.patch, HDFS-1527.patch On 32 bit JVM, SocketOutputStream.transferToFully() fails if the block size is >= 2GB. We should fall back to a normal transfer in this case.
{code}
2010-12-02 19:04:23,490 ERROR datanode.DataNode (BlockSender.java:sendChunks(399)) - BlockSender.sendChunks() exception: java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:418)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:519)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:204)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:386)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:475)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.opReadBlock(DataXceiver.java:196)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opReadBlock(DataTransferProtocol.java:356)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:328)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:130)
    at java.lang.Thread.run(Thread.java:619)
{code}
-- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
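A related general technique — capping each transferTo() call so a single request never reaches 2 GB — is sketched below. Note this is not the HDFS-1527 patch itself (which falls back to a normal transfer on 32-bit JVMs); the chunk limit and method name are illustrative:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

public class ChunkedTransfer {
    // Keep every individual sendfile request under 2 GB.
    static final long MAX_CHUNK = Integer.MAX_VALUE;

    /** Transfer exactly `count` bytes, looping over bounded chunks. */
    public static void transferFully(FileChannel in, WritableByteChannel out,
                                     long position, long count) throws IOException {
        while (count > 0) {
            long chunk = Math.min(count, MAX_CHUNK);
            long sent = in.transferTo(position, chunk, out);  // may send less
            if (sent <= 0) {
                throw new IOException("transferTo returned " + sent);
            }
            position += sent;
            count -= sent;
        }
    }
}
```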
[jira] [Commented] (HDFS-1533) A more elegant FileSystem#listCorruptFileBlocks API (HDFS portion)
[ https://issues.apache.org/jira/browse/HDFS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022472#comment-13022472 ] Hudson commented on HDFS-1533: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) A more elegant FileSystem#listCorruptFileBlocks API (HDFS portion) -- Key: HDFS-1533 URL: https://issues.apache.org/jira/browse/HDFS-1533 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Patrick Kling Assignee: Patrick Kling Fix For: 0.23.0 Attachments: HDFS-1533.2.patch, HDFS-1533.patch This is the HDFS portion of HADOOP-7060. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1523) TestLargeBlock is failing on trunk
[ https://issues.apache.org/jira/browse/HDFS-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022473#comment-13022473 ] Hudson commented on HDFS-1523: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestLargeBlock is failing on trunk -- Key: HDFS-1523 URL: https://issues.apache.org/jira/browse/HDFS-1523 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.22.0 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1523.patch, HDFS-1523.patch TestLargeBlock has been failing for more than a week now on 0.22 and trunk with
{noformat}
java.io.IOException: Premeture EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:118)
    at org.apache.hadoop.hdfs.BlockReader.readChunk(BlockReader.java:275)
{noformat}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1797) New findbugs warning introduced by HDFS-1120
[ https://issues.apache.org/jira/browse/HDFS-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022475#comment-13022475 ] Hudson commented on HDFS-1797: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) New findbugs warning introduced by HDFS-1120 Key: HDFS-1797 URL: https://issues.apache.org/jira/browse/HDFS-1797 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1797.txt HDFS-1120 introduced a new findbugs warning: Unread field: org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.curVolume This JIRA is to fix the simple error. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1731) Allow using a file to exclude certain tests from build
[ https://issues.apache.org/jira/browse/HDFS-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022474#comment-13022474 ] Hudson commented on HDFS-1731: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Allow using a file to exclude certain tests from build -- Key: HDFS-1731 URL: https://issues.apache.org/jira/browse/HDFS-1731 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.23.0 Attachments: hdfs-1731-cygwin-fix.txt, hdfs-1731.txt It would be nice to be able to exclude certain tests when running builds. For example, when a test is known flaky, you may want to exclude it from the main Hudson job, but not actually disable it in the codebase (so that it still runs as part of another Hudson job, for example). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1630) Checksum fsedits
[ https://issues.apache.org/jira/browse/HDFS-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022476#comment-13022476 ] Hudson commented on HDFS-1630: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Checksum fsedits Key: HDFS-1630 URL: https://issues.apache.org/jira/browse/HDFS-1630 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 Attachments: editsChecksum.patch, editsChecksum1.patch, editsChecksum2.patch HDFS-903 adds an MD5 checksum to a saved image, so that we can verify the integrity of the image at load time. The other half of the story is how to verify fsedits. Similarly we could use the checksum approach. But since a fsedit file is growing constantly, a checksum per file does not work. I am thinking of adding a checksum per transaction. Is it doable or too expensive? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
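A per-transaction checksum could work as sketched here: frame each serialized transaction as [length][payload][CRC32-of-payload] and verify on replay. This is a sketch under that assumption, not the actual editsChecksum patch (which may use a different checksum or record layout):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

public class EditChecksum {
    /** Frame one transaction as [len][payload][crc32-of-payload]. */
    public static byte[] frame(byte[] txn) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(txn, 0, txn.length);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(txn.length);
        out.write(txn);
        out.writeLong(crc.getValue());
        return bos.toByteArray();
    }

    /** Read one framed transaction back, failing loudly on corruption. */
    public static byte[] readFrame(DataInputStream in) throws IOException {
        byte[] txn = new byte[in.readInt()];
        in.readFully(txn);
        long stored = in.readLong();
        CRC32 crc = new CRC32();
        crc.update(txn, 0, txn.length);
        if (crc.getValue() != stored) {
            throw new IOException("edit log transaction checksum mismatch");
        }
        return txn;
    }
}
```

The cost is one CRC update per record plus 8 bytes on disk, which is the "is it too expensive?" question the reporter raises.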
[jira] [Commented] (HDFS-1736) Break dependency between DatanodeJspHelper and FsShell
[ https://issues.apache.org/jira/browse/HDFS-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022477#comment-13022477 ] Hudson commented on HDFS-1736: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Break dependency between DatanodeJspHelper and FsShell -- Key: HDFS-1736 URL: https://issues.apache.org/jira/browse/HDFS-1736 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Minor Labels: hadoop Fix For: 0.22.0 Attachments: HDFS-1736.patch Original Estimate: 24h Remaining Estimate: 24h DatanodeJspHelper has an artificial dependency on a date formatter field in FsShell. A pending bug is reorganizing the FsShell commands so this field will no longer be available. The dependency should be broken by having DataNodeJspHelper contain its own independent date formatter. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1738) change hdfs jmxget to return an empty string instead of null when an attribute value is not available.
[ https://issues.apache.org/jira/browse/HDFS-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022478#comment-13022478 ] Hudson commented on HDFS-1738: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) change hdfs jmxget to return an empty string instead of null when an attribute value is not available. -- Key: HDFS-1738 URL: https://issues.apache.org/jira/browse/HDFS-1738 Project: Hadoop HDFS Issue Type: Improvement Components: tools Reporter: Tanping Wang Assignee: Tanping Wang Priority: Minor Attachments: HDFS-1738.patch Currently the tool, hdfs jmxget, returns null when an attribute value is not available. A null pointer exception is thrown and the values of the remaining attributes are not printed. It makes more sense to return an empty string and continue to print out the values of the remaining attributes. Example of current behavior:
$ hdfs jmxget -server hostname.com -port 8004 -service NameNode,name=NameNodeActivity
jmx name: name=NameNodeActivity,service=NameNode
tag.ProcessName=NameNode
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.tools.JMXGet.printAllValues(JMXGet.java:106)
    at org.apache.hadoop.hdfs.tools.JMXGet.main(JMXGet.java:329)
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
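The proposed behavior — substitute an empty string for a null attribute value so the remaining attributes still print — is a one-line change in spirit. A sketch with hypothetical names (the real code lives in JMXGet.printAllValues):

```java
import java.util.Map;

public class JmxPrinter {
    /** Format attributes, printing "" instead of dying on a null value. */
    public static String format(Map<String, Object> attrs) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : attrs.entrySet()) {
            Object v = e.getValue();
            sb.append(e.getKey()).append('=')
              .append(v == null ? "" : v.toString())  // the proposed fix
              .append('\n');
        }
        return sb.toString();
    }
}
```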
[jira] [Commented] (HDFS-1840) Terminate LeaseChecker when all writing files are closed.
[ https://issues.apache.org/jira/browse/HDFS-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022479#comment-13022479 ] Hudson commented on HDFS-1840: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1840. In DFSClient, terminate the lease renewing thread when all files being written are closed for a grace period, and start a new thread when new files are opened for write. Terminate LeaseChecker when all writing files are closed. - Key: HDFS-1840 URL: https://issues.apache.org/jira/browse/HDFS-1840 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1840_20110418.patch, h1840_20110419.patch, h1840_20110419b.patch In {{DFSClient}}, when there are files opened for write, a {{LeaseChecker}} thread is started for updating the leases periodically. However, it never terminates even when all writing files are closed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
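The lifecycle described in the commit message — exit after a grace period with no open files, start a fresh thread when writing resumes — can be sketched like this. All names and the grace-period mechanics are illustrative, not the h1840 patch:

```java
public class LeaseRenewer implements Runnable {
    private final long gracePeriodMs;
    private int openFiles = 0;
    private long lastFileClosedAt = System.currentTimeMillis();
    private Thread thread;

    public LeaseRenewer(long gracePeriodMs) { this.gracePeriodMs = gracePeriodMs; }

    public synchronized void fileOpened() {
        openFiles++;
        if (thread == null || !thread.isAlive()) {  // restart on new writes
            thread = new Thread(this, "LeaseRenewer");
            thread.setDaemon(true);
            thread.start();
        }
    }

    public synchronized void fileClosed() {
        openFiles--;
        lastFileClosedAt = System.currentTimeMillis();
    }

    private synchronized boolean shouldRun() {
        // keep running while files are open, or within the grace period
        return openFiles > 0
            || System.currentTimeMillis() - lastFileClosedAt < gracePeriodMs;
    }

    public void run() {
        while (shouldRun()) {
            // a real renewer would call renewLeases() here
            try { Thread.sleep(10); } catch (InterruptedException e) { return; }
        }
        // falls off the end: the thread terminates after the grace period
    }

    public synchronized boolean isRunning() {
        return thread != null && thread.isAlive();
    }
}
```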
[jira] [Commented] (HDFS-1845) symlink comes up as directory after namenode restart
[ https://issues.apache.org/jira/browse/HDFS-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022481#comment-13022481 ] Hudson commented on HDFS-1845: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1845. Symlink comes up as directory after namenode restart. Contributed by John George symlink comes up as directory after namenode restart Key: HDFS-1845 URL: https://issues.apache.org/jira/browse/HDFS-1845 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1845-2.patch, HDFS-1845-apache-2.patch, HDFS-1845-apache-3.patch, HDFS-1845-apache.patch, hdfs-1845-branch22-1.patch When a symlink is first created, it gets added to the EditLogs. When the namenode is restarted, it reads from this editlog, represents the symlink correctly, and saves this information to its image. If the namenode is restarted again, it reads from this FSImage, but thinks that the symlink is a directory. This is because it uses Block[] blocks to determine whether an INode is a directory, a file, or a symlink. Since both a directory and a symlink have blocks set to null, it thinks that a symlink is a directory. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1844) Move -fs usage tests from hdfs into common
[ https://issues.apache.org/jira/browse/HDFS-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022482#comment-13022482 ] Hudson commented on HDFS-1844: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Move -fs usage tests from hdfs into common -- Key: HDFS-1844 URL: https://issues.apache.org/jira/browse/HDFS-1844 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1844.patch The -fs usage tests are in hdfs, which causes an unnecessary synchronization of a common & hdfs bug when changing the text. The usages have no ties to hdfs, so they should be moved into common. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-900) Corrupt replicas are not tracked correctly through block report from DN
[ https://issues.apache.org/jira/browse/HDFS-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022484#comment-13022484 ] Hudson commented on HDFS-900: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Corrupt replicas are not tracked correctly through block report from DN --- Key: HDFS-900 URL: https://issues.apache.org/jira/browse/HDFS-900 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Konstantin Shvachko Priority: Blocker Fix For: 0.22.0 Attachments: log-commented, reportCorruptBlock.patch, to-reproduce.patch This one is tough to describe, but essentially the following order of events is seen to occur: # A client marks one replica of a block to be corrupt by telling the NN about it # Replication is then scheduled to make a new replica of this node # The replication completes, such that there are now 3 good replicas and 1 corrupt replica # The DN holding the corrupt replica sends a block report. Rather than telling this DN to delete the node, the NN instead marks this as a new *good* replica of the block, and schedules deletion on one of the good replicas. I don't know if this is a dataloss bug in the case of 1 corrupt replica with dfs.replication=2, but it seems feasible. I will attach a debug log with some commentary marked by '', plus a unit test patch which I can get to reproduce this behavior reliably. (it's not a proper unit test, just some edits to an existing one to show it) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-613) TestBalancer and TestBlockTokenWithDFS fail Balancer assert
[ https://issues.apache.org/jira/browse/HDFS-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022486#comment-13022486 ] Hudson commented on HDFS-613: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestBalancer and TestBlockTokenWithDFS fail Balancer assert --- Key: HDFS-613 URL: https://issues.apache.org/jira/browse/HDFS-613 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.22.0 Reporter: Konstantin Shvachko Assignee: Todd Lipcon Fix For: 0.22.0 Attachments: hdfs-613.txt Running TestBalancer with asserts on. The asserts in {{Balancer.chooseNode()}} is triggered and the test fails. We do not see it in the builds because asserts are off there. So either the assert is irrelevant or there is another bug in the Balancer code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1591) Fix javac, javadoc, findbugs warnings
[ https://issues.apache.org/jira/browse/HDFS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022489#comment-13022489 ] Hudson commented on HDFS-1591: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Fix javac, javadoc, findbugs warnings - Key: HDFS-1591 URL: https://issues.apache.org/jira/browse/HDFS-1591 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Po Cheung Assignee: Po Cheung Fix For: 0.22.0 Attachments: hdfs-1591-trunk.patch Split from HADOOP-6642 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-981) test-contrib fails due to test-cactus failure
[ https://issues.apache.org/jira/browse/HDFS-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022483#comment-13022483 ] Hudson commented on HDFS-981: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) test-contrib fails due to test-cactus failure - Key: HDFS-981 URL: https://issues.apache.org/jira/browse/HDFS-981 Project: Hadoop HDFS Issue Type: Test Components: contrib/hdfsproxy Affects Versions: 0.22.0 Reporter: Eli Collins Assignee: Konstantin Boudnik Priority: Blocker Fix For: 0.22.0 Attachments: HDFS-981.patch Relevant output from a recent run http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/232/console
[exec] BUILD FAILED
[exec] /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/build.xml:568: The following error occurred while executing this line:
[exec] /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/src/contrib/build.xml:48: The following error occurred while executing this line:
[exec] /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/src/contrib/hdfsproxy/build.xml:292: org.codehaus.cargo.container.ContainerException: Failed to download [http://apache.osuosl.org/tomcat/tomcat-6/v6.0.18/bin/apache-tomcat-6.0.18.zip]
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1597) Batched edit log syncs can reset synctxid throw assertions
[ https://issues.apache.org/jira/browse/HDFS-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022490#comment-13022490 ] Hudson commented on HDFS-1597: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Batched edit log syncs can reset synctxid throw assertions -- Key: HDFS-1597 URL: https://issues.apache.org/jira/browse/HDFS-1597 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1597.txt, hdfs-1597.txt, hdfs-1597.txt, illustrate-test-failure.txt The top of FSEditLog.logSync has the following assertion:
{code}
assert editStreams.size() > 0 : "no editlog streams";
{code}
which should actually come after checking to see if the sync was already batched in by another thread. This is related to a second bug in which the same case causes synctxid to be reset to 0 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
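The ordering fix implied above — return early for an already-batched sync before asserting on the stream list — in miniature. Names and structure are illustrative, not the actual FSEditLog:

```java
public class SyncSketch {
    static long synctxid = 0;     // highest txid already synced
    static int editStreamCount = 1;

    /** Returns true if this call actually performed a sync. */
    static boolean logSync(long mytxid) {
        if (mytxid <= synctxid) {
            return false;  // batched in by another thread: nothing to do,
                           // and crucially no assertion is evaluated
        }
        // only now is an empty stream list actually an error
        assert editStreamCount > 0 : "no editlog streams";
        synctxid = Math.max(synctxid, mytxid);  // advance, never reset to 0
        return true;
    }
}
```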
[jira] [Commented] (HDFS-1831) HDFS equivalent of HADOOP-7223 changes to handle FileContext createFlag combinations
[ https://issues.apache.org/jira/browse/HDFS-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022491#comment-13022491 ] Hudson commented on HDFS-1831: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS equivalent of HADOOP-7223 changes to handle FileContext createFlag combinations Key: HDFS-1831 URL: https://issues.apache.org/jira/browse/HDFS-1831 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0, 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1831.1.patch, HDFS-1831.4.patch, HDFS-1831.patch During file creation with FileContext, the expected behavior is not clearly defined for combination of createFlag EnumSet. This is HDFS equivalent of HADOOP-7223 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1598) ListPathsServlet excludes .*.crc files
[ https://issues.apache.org/jira/browse/HDFS-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022492#comment-13022492 ] Hudson commented on HDFS-1598: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) ListPathsServlet excludes .*.crc files -- Key: HDFS-1598 URL: https://issues.apache.org/jira/browse/HDFS-1598 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.2 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.21.1, Federation Branch, 0.22.0, 0.23.0 Attachments: h1598_20110126.patch, h1598_20110126_0.20.patch The {{.*.crc}} files are excluded by default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-996) JUnit tests should never depend on anything in conf
[ https://issues.apache.org/jira/browse/HDFS-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022493#comment-13022493 ] Hudson commented on HDFS-996: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) JUnit tests should never depend on anything in conf --- Key: HDFS-996 URL: https://issues.apache.org/jira/browse/HDFS-996 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.1 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Priority: Blocker Fix For: 0.21.1 Attachments: HDFS-996.patch Similar to MAPREDUCE-1369 we need to make sure that nothing in conf is used in the unit tests. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1833) Refactor BlockReceiver
[ https://issues.apache.org/jira/browse/HDFS-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022494#comment-13022494 ] Hudson commented on HDFS-1833: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Refactor BlockReceiver -- Key: HDFS-1833 URL: https://issues.apache.org/jira/browse/HDFS-1833 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.23.0 Attachments: h1833_20110412.patch, h1833_20110413.patch There is repeated code for creating log/error messages in BlockReceiver. Also, some comments in the code are incorrect, e.g.
{code}
private int numTargets; // number of downstream datanodes including myself
{code}
but the count indeed excludes the current datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1596) Move secondary namenode checkpoint configs from core-default.xml to hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022495#comment-13022495 ] Hudson commented on HDFS-1596: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Move secondary namenode checkpoint configs from core-default.xml to hdfs-default.xml Key: HDFS-1596 URL: https://issues.apache.org/jira/browse/HDFS-1596 Project: Hadoop HDFS Issue Type: Improvement Components: documentation, name-node Reporter: Patrick Angeles Assignee: Harsh J Chouraria Fix For: 0.21.1, 0.22.0, 0.23.0 Attachments: HDFS-7117.r1.diff, HDFS-7117.r2.diff The following configs are in core-default.xml, but are really read by the Secondary Namenode. These should be moved to hdfs-default.xml for consistency.

<property>
  <name>fs.checkpoint.dir</name>
  <value>${hadoop.tmp.dir}/dfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary
  name node should store the temporary images to merge. If this is a
  comma-delimited list of directories then the image is replicated in all
  of the directories for redundancy.</description>
</property>

<property>
  <name>fs.checkpoint.edits.dir</name>
  <value>${fs.checkpoint.dir}</value>
  <description>Determines where on the local filesystem the DFS secondary
  name node should store the temporary edits to merge. If this is a
  comma-delimited list of directories then the edits are replicated in all
  of the directories for redundancy. Default value is same as
  fs.checkpoint.dir</description>
</property>

<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
  <description>The number of seconds between two periodic checkpoints.</description>
</property>

<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>
  <description>The size of the current edit log (in bytes) that triggers a
  periodic checkpoint even if the fs.checkpoint.period hasn't expired.</description>
</property>

-- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1445) Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file
[ https://issues.apache.org/jira/browse/HDFS-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022497#comment-13022497 ] Hudson commented on HDFS-1445: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file -- Key: HDFS-1445 URL: https://issues.apache.org/jira/browse/HDFS-1445 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20.2 Reporter: Matt Foley Assignee: Matt Foley Fix For: 0.23.0 Attachments: HDFS-1445-trunk.v22_hdfs_2-of-2.patch It was a bit of a puzzle why we can do a full scan of a disk in about 30 seconds during FSDir() or getVolumeMap(), but the same disk took 11 minutes to do Upgrade replication via hardlinks. It turns out that the org.apache.hadoop.fs.FileUtil.createHardLink() method does an outcall to Runtime.getRuntime().exec(), to utilize native filesystem hardlink capability. So it is forking a full-weight external process, and we call it on each individual file to be replicated. As a simple check on the possible cost of this approach, I built a Perl test script (under Linux on a production-class datanode). Perl also uses a compiled and optimized p-code engine, and it has both native support for hardlinks and the ability to do exec.
- A simple script to create 256,000 files in a directory tree organized like the Datanode, took 10 seconds to run.
- Replicating that directory tree using hardlinks, the same way as the Datanode, took 12 seconds using native hardlink support.
- The same replication using outcalls to exec, one per file, took 256 seconds!
- Batching the calls, and doing 'exec' once per directory instead of once per file, took 16 seconds.
Obviously, your mileage will vary based on the number of blocks per volume. A volume with less than about 4000 blocks will have only 65 directories. 
A volume with more than 4K and less than about 250K blocks will have 4200 directories (more or less). And there are two files per block (the data file and the .meta file). So the average number of files per directory may vary from 2:1 to 500:1. A node with 50K blocks and four volumes will have 25K files per volume, or an average of about 6:1. So this change may be expected to take it down from, say, 12 minutes per volume to 2. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
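The batching idea described above can be sketched as follows. This is a minimal illustration, not the actual DataStorage/FileUtil code (the method and path names here are hypothetical): rather than forking `ln src dst` once per file, build one `ln file1 file2 ... destdir` command per directory, so the external process is forked once per directory.

```java
public class BatchHardLink {
    // Build a single shell command that hard-links every named file from
    // srcDir into dstDir. GNU ln accepts multiple source operands when the
    // final operand is a directory, so one exec covers the whole directory.
    static String batchLinkCommand(String srcDir, String dstDir, String[] fileNames) {
        StringBuilder cmd = new StringBuilder("cd ").append(srcDir).append(" && ln");
        for (String name : fileNames) {
            cmd.append(' ').append(name);
        }
        return cmd.append(' ').append(dstDir).toString();
    }

    public static void main(String[] args) {
        // hypothetical block-file names for illustration
        System.out.println(batchLinkCommand("/data/current/subdir0",
                "/data/previous/subdir0",
                new String[] { "blk_1", "blk_1.meta" }));
    }
}
```

With 500 files in a directory, this turns 500 process forks into one, which matches the 256s-to-16s improvement measured in the Perl experiment.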
[jira] [Commented] (HDFS-1750) fs -ls hftp://file not working
[ https://issues.apache.org/jira/browse/HDFS-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022498#comment-13022498 ] Hudson commented on HDFS-1750: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) fs -ls hftp://file not working -- Key: HDFS-1750 URL: https://issues.apache.org/jira/browse/HDFS-1750 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.1 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.20.204.0, 0.21.1, 0.22.0, 0.23.0 Attachments: h1750_20110314.patch, h1750_20110314_0.20-security.patch, h1750_20110314_0.21.patch {noformat} hadoop dfs -touchz /tmp/file1 # create file. OK hadoop dfs -ls /tmp/file1 # OK hadoop dfs -ls hftp://namenode:50070/tmp/file1 # FAILED: not seeing the file {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1611) Some logical issues need to address.
[ https://issues.apache.org/jira/browse/HDFS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022499#comment-13022499 ] Hudson commented on HDFS-1611: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Some logical issues need to address. Key: HDFS-1611 URL: https://issues.apache.org/jira/browse/HDFS-1611 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.20.2 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1611.1.patch, HDFS-1611.patch Title: Some code level logical issues. Description:
1. DFSClient: Consider the case below; if we enable only info, then this log will never be logged.
{code}
if (ClientDatanodeProtocol.LOG.isDebugEnabled()) {
  ClientDatanodeProtocol.LOG.info("ClientDatanodeProtocol addr=" + addr);
}
{code}
2. org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerMBean()
{code}
catch (NotCompliantMBeanException e) {
  e.printStackTrace();
}
{code}
We can avoid using printStackTrace(). Better to add a log message. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
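The first issue above is a level mismatch: the guard checks the debug level but the call logs at info, so with logging configured at INFO the guard is false and the message is silently dropped. A minimal sketch of the corrected pattern, using java.util.logging for self-containment (Hadoop itself uses commons-logging, and the logger name here mirrors the one in the report):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogGuard {
    private static final Logger LOG = Logger.getLogger("ClientDatanodeProtocol");

    // The guard and the call must use the same level: check FINE (~debug)
    // and log at FINE. The broken version checked debug but logged at info,
    // so the message could never appear under an INFO-only configuration.
    static void logAddr(String addr) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("ClientDatanodeProtocol addr=" + addr);
        }
    }
}
```

The guard itself is only an optimization to skip the string concatenation when debug logging is off; matching levels keeps it from changing which messages are emitted.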
[jira] [Commented] (HDFS-1448) Create multi-format parser for edits logs file, support binary and XML formats initially
[ https://issues.apache.org/jira/browse/HDFS-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022501#comment-13022501 ] Hudson commented on HDFS-1448: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Create multi-format parser for edits logs file, support binary and XML formats initially Key: HDFS-1448 URL: https://issues.apache.org/jira/browse/HDFS-1448 Project: Hadoop HDFS Issue Type: New Feature Components: tools Affects Versions: 0.22.0 Reporter: Erik Steffl Assignee: Erik Steffl Fix For: 0.23.0 Attachments: HDFS-1448-0.22-1.patch, HDFS-1448-0.22-2.patch, HDFS-1448-0.22-3.patch, HDFS-1448-0.22-4.patch, HDFS-1448-0.22-5.patch, HDFS-1448-0.22.patch, Viewer hierarchy.pdf, editsStored Create multi-format parser for edits logs file, support binary and XML formats initially. Parsing should work from any supported format to any other supported format (e.g. from binary to XML and from XML to binary). The binary format is the format used by FSEditLog class to read/write edits file. Primary reason to develop this tool is to help with troubleshooting, the binary format is hard to read and edit (for human troubleshooters). Longer term it could be used to clean up and minimize parsers for fsimage and edits files. Edits parser OfflineEditsViewer is written in a very similar fashion to OfflineImageViewer. Next step would be to merge OfflineImageViewer and OfflineEditsViewer and use the result in both FSImage and FSEditLog. This is subject to change, specifically depending on adoption of avro (which would completely change how objects are serialized as well as provide ways to convert files to different formats). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1757) Don't compile fuse-dfs by default
[ https://issues.apache.org/jira/browse/HDFS-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022504#comment-13022504 ] Hudson commented on HDFS-1757: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Don't compile fuse-dfs by default - Key: HDFS-1757 URL: https://issues.apache.org/jira/browse/HDFS-1757 Project: Hadoop HDFS Issue Type: Improvement Components: contrib/fuse-dfs Affects Versions: 0.23.0 Reporter: Eli Collins Assignee: Eli Collins Fix For: 0.23.0 Attachments: hdfs-1757-1.patch The infra machines don't have fuse headers, therefore we shouldn't compile fuse-dfs by default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1588) Add dfs.hosts.exclude to DFSConfigKeys and use constant in stead of hardcoded string
[ https://issues.apache.org/jira/browse/HDFS-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022507#comment-13022507 ] Hudson commented on HDFS-1588: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Add dfs.hosts.exclude to DFSConfigKeys and use constant in stead of hardcoded string Key: HDFS-1588 URL: https://issues.apache.org/jira/browse/HDFS-1588 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Erik Steffl Assignee: Erik Steffl Fix For: 0.23.0 Attachments: HDFS-1588-0.23.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1585) HDFS-1547 broke MR build
[ https://issues.apache.org/jira/browse/HDFS-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022509#comment-13022509 ] Hudson commented on HDFS-1585: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1547 broke MR build Key: HDFS-1585 URL: https://issues.apache.org/jira/browse/HDFS-1585 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.23.0 Attachments: hdfs-1585.txt Added a parameter to startDatanodes without maintaining old API -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1586) Add InterfaceAudience annotation to MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022511#comment-13022511 ] Hudson commented on HDFS-1586: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Add InterfaceAudience annotation to MiniDFSCluster -- Key: HDFS-1586 URL: https://issues.apache.org/jira/browse/HDFS-1586 Project: Hadoop HDFS Issue Type: Improvement Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-1586.1.patch, HDFS-1586.patch MiniDFSCluster is used both by hdfs and mapreduce. Annotation needs to be added to this class to reflect this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1618) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/HDFS-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022510#comment-13022510 ] Hudson commented on HDFS-1618: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) configure files that are generated as part of the released tarball need to have executable bit set --- Key: HDFS-1618 URL: https://issues.apache.org/jira/browse/HDFS-1618 Project: Hadoop HDFS Issue Type: Improvement Components: build Affects Versions: 0.22.0 Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 0.22.0 Attachments: HDFS-1618.patch Currently the configure files that are packaged in a tarball are -rw-rw-r-- -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1824) delay instantiation of file system object until it is needed (linked to HADOOP-7207)
[ https://issues.apache.org/jira/browse/HDFS-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022514#comment-13022514 ] Hudson commented on HDFS-1824: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) delay instantiation of file system object until it is needed (linked to HADOOP-7207) Key: HDFS-1824 URL: https://issues.apache.org/jira/browse/HDFS-1824 Project: Hadoop HDFS Issue Type: Bug Reporter: Boris Shkolnik Assignee: Boris Shkolnik Attachments: HDFS-1824-1-22.patch, HDFS-1824-1.patch, HDFS-1824.patch Also refactor the code a little bit to avoid checking for instanceof DFS in multiple places. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1582) Remove auto-generated native build files
[ https://issues.apache.org/jira/browse/HDFS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022513#comment-13022513 ] Hudson commented on HDFS-1582: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Remove auto-generated native build files Key: HDFS-1582 URL: https://issues.apache.org/jira/browse/HDFS-1582 Project: Hadoop HDFS Issue Type: Improvement Components: contrib/libhdfs Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 0.22.0, 0.23.0 Attachments: HADOOP-6436.patch, HDFS-1582.diff Original Estimate: 24h Remaining Estimate: 24h The repo currently includes the automake and autoconf generated files for the native build. Per discussion on HADOOP-6421 let's remove them and use the host's automake and autoconf. We should also do this for libhdfs and fuse-dfs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1821) FileContext.createSymlink with kerberos enabled sets wrong owner
[ https://issues.apache.org/jira/browse/HDFS-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022508#comment-13022508 ] Hudson commented on HDFS-1821: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) FileContext.createSymlink with kerberos enabled sets wrong owner Key: HDFS-1821 URL: https://issues.apache.org/jira/browse/HDFS-1821 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Environment: Kerberos enabled on cluster Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1821-2.patch, HDFS-1821.patch TEST SETUP Using attached sample hdfs java program that illustrates the issue. Using cluster with Kerberos enabled on cluster
# Compile instructions
$ javac Symlink.java -cp `hadoop classpath`
$ jar -cfe Symlink.jar Symlink Symlink.class
# create test file for symlink to use
1. hadoop fs -touchz /user/username/filetest
# Create symlink using file context
2. hadoop jar Symlink.jar ln /user/username/filetest /user/username/testsymlink
# Verify owner of test file
3. hadoop jar Symlink.jar ls /user/username/
-rw-r--r-- username hdfs /user/jeagles/filetest
-rwxrwxrwx usern...@xx..x.xxx hdfs /user/username/testsymlink
RESULTS
1. Owner shows 'usern...@xx..x.xxx' for the symlink, expecting 'username'.
2. Symlink is corrupted and can't be removed, since it was created with an invalid user.
Sample program to create Symlink:
FileContext fc = FileContext.getFileContext(getConf());
fc.createSymlink(target, symlink, false);
--- -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1817) Split TestFiDataTransferProtocol.java into two files
[ https://issues.apache.org/jira/browse/HDFS-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022515#comment-13022515 ] Hudson commented on HDFS-1817: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Split TestFiDataTransferProtocol.java into two files Key: HDFS-1817 URL: https://issues.apache.org/jira/browse/HDFS-1817 Project: Hadoop HDFS Issue Type: Improvement Components: test Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Trivial Fix For: 0.23.0 Attachments: h1817_20110407.patch {{TestFiDataTransferProtocol}} has tests from pipeline_Fi_01 to _16 and pipeline_Fi_39 to _51. It is natural to split them into two files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1782) FSNamesystem.startFileInternal(..) throws NullPointerException
[ https://issues.apache.org/jira/browse/HDFS-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022516#comment-13022516 ] Hudson commented on HDFS-1782: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) FSNamesystem.startFileInternal(..) throws NullPointerException -- Key: HDFS-1782 URL: https://issues.apache.org/jira/browse/HDFS-1782 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1782.patch I'm observing that when one balancer is already running, trying to run another one results in a java.lang.NullPointerException. I was hoping to see the message "Another balancer is running. Exiting ...". This is a reproducible issue. Details:
1) Cluster -elrond [hdfs@]$ hadoop version
2) Run first balancer [hdfs]$ hdfs balancer 1 through XX.XX.XX.XX:1004 is succeeded. [hdfs@]$ hdfs balancer 11/03/09 16:34:32 INFO balancer.Balancer: namenodes =
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1400)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1284)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:779)
    at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:346)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1399)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1395)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1094)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1393)
. Exiting ... Balancing took 1.366 seconds -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1763) Replace hard-coded option strings with variables from DFSConfigKeys
[ https://issues.apache.org/jira/browse/HDFS-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022517#comment-13022517 ] Hudson commented on HDFS-1763: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Replace hard-coded option strings with variables from DFSConfigKeys --- Key: HDFS-1763 URL: https://issues.apache.org/jira/browse/HDFS-1763 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 0.23.0 Attachments: hdfs-1763-1.patch, hdfs-1763-2.patch There are some places in the code where we use hard-coded strings instead of the equivalent DFSConfigKeys define, and a couple places where the default is defined multiple places (once in DFSConfigKeys and once elsewhere, though both have the same value). This is error-prone, and also a pain in that it prevents eclipse from easily showing you all the places where a particular config option is used. Let's replace all the uses of the hard-coded option strings with uses of the corresponding variables in DFSConfigKeys. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
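The pattern HDFS-1763 asks for can be sketched in a few lines. This is an illustration of the DFSConfigKeys convention, not the actual class (the key and default shown are hypothetical examples of the shape): each option gets one named constant for its key and one for its default, so every call site references the constant and an IDE can find all usages.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigKeysSketch {
    // One definition per option: a typo here fails to compile at the call
    // site, whereas a typo in a repeated string literal fails silently.
    public static final String DFS_HOSTS_EXCLUDE_KEY = "dfs.hosts.exclude";
    public static final String DFS_HOSTS_EXCLUDE_DEFAULT = "";

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // instead of conf.getOrDefault("dfs.hosts.exclude", "") repeated
        // at each call site:
        String exclude = conf.getOrDefault(DFS_HOSTS_EXCLUDE_KEY, DFS_HOSTS_EXCLUDE_DEFAULT);
        System.out.println("exclude file: '" + exclude + "'");
    }
}
```

Keeping the default next to the key also prevents the drift the issue mentions, where the same default was defined both in DFSConfigKeys and at a call site.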
[jira] [Commented] (HDFS-1760) problems with getFullPathName
[ https://issues.apache.org/jira/browse/HDFS-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022518#comment-13022518 ] Hudson commented on HDFS-1760: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) problems with getFullPathName - Key: HDFS-1760 URL: https://issues.apache.org/jira/browse/HDFS-1760 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1760-2.patch, HDFS-1760.patch FSDirectory's getFullPathName method is flawed. Given a list of inodes, it starts at index 1 instead of 0 (based on the assumption that inode[0] is always the root inode) and then builds the string with /+inode[i]. This means the empty string is returned for the root, or when requesting the full path of the parent dir for top level items. In addition, it's not guaranteed that the list of inodes starts with the root inode. The inode lookup routine will only fill the inode array with the last n-many inodes of a path if the array is smaller than the path. In these cases, getFullPathName will skip the first component of the relative path, and then assume the second component starts at the root. ex. a/b/c becomes /b/c. There are a few places in the code where the issue was hacked around by assuming that a 0-length path meant a hardcoded / instead of Path.SEPARATOR. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
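The flaw described above is easiest to see in a standalone sketch (this is not the FSDirectory code; the method and input shape are simplified for illustration): build the path from component 0, and return "/" explicitly for the root instead of an empty string.

```java
public class FullPathName {
    // Joins path components into "/a/b/c". The flawed version started at
    // index 1, assuming component 0 was always the root, so it returned ""
    // for the root and dropped the first component of relative lookups
    // (e.g. a/b/c became /b/c).
    static String fullPathName(String[] components) {
        if (components.length == 0) {
            return "/"; // the root is "/", never the empty string
        }
        StringBuilder b = new StringBuilder();
        for (String name : components) {
            b.append('/').append(name);
        }
        return b.toString();
    }
}
```

Returning "/" for the empty case also removes the need for the hardcoded-slash workarounds the report mentions at call sites that received a 0-length path.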
[jira] [Commented] (HDFS-560) Proposed enhancements/tuning to hadoop-hdfs/build.xml
[ https://issues.apache.org/jira/browse/HDFS-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022519#comment-13022519 ] Hudson commented on HDFS-560: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Proposed enhancements/tuning to hadoop-hdfs/build.xml -- Key: HDFS-560 URL: https://issues.apache.org/jira/browse/HDFS-560 Project: Hadoop HDFS Issue Type: Improvement Components: build Affects Versions: 0.21.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 0.23.0 Attachments: HDFS-560.patch sibling list of HADOOP-6206, enhancements to the hdfs build for easier single-system build/test -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1785) Cleanup BlockReceiver and DataXceiver
[ https://issues.apache.org/jira/browse/HDFS-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022521#comment-13022521 ] Hudson commented on HDFS-1785: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Cleanup BlockReceiver and DataXceiver - Key: HDFS-1785 URL: https://issues.apache.org/jira/browse/HDFS-1785 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1785_20110324.patch, h1785_20110325.patch {{clientName.length()}} is used multiple times for determining whether the source is a client or a datanode.
{code}
if (clientName.length() == 0) {
  //it is a datanode
}
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
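One natural cleanup for the repeated check is to name it. A minimal sketch (the helper names are hypothetical, not the patch's actual API): wrap the empty-client-name convention in a predicate so the intent is stated once.

```java
public class SourceKind {
    // Convention from the report: an empty client name means the request
    // came from another datanode rather than from a DFS client.
    static boolean isDatanode(String clientName) {
        return clientName.isEmpty();
    }

    static boolean isClient(String clientName) {
        return !clientName.isEmpty();
    }
}
```

Callers then read `if (isDatanode(clientName))` instead of repeating `clientName.length() == 0` with a comment at every site.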
[jira] [Commented] (HDFS-1818) TestHDFSCLI is failing on trunk
[ https://issues.apache.org/jira/browse/HDFS-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022522#comment-13022522 ] Hudson commented on HDFS-1818: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestHDFSCLI is failing on trunk --- Key: HDFS-1818 URL: https://issues.apache.org/jira/browse/HDFS-1818 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1818.0.txt The commit of HADOOP-7202 changed the output of a few FsShell commands. Since HDFS tests rely on the precise format of this output, TestHDFSCLI is now failing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1628) AccessControlException should display the full path
[ https://issues.apache.org/jira/browse/HDFS-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022523#comment-13022523 ] Hudson commented on HDFS-1628: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) AccessControlException should display the full path --- Key: HDFS-1628 URL: https://issues.apache.org/jira/browse/HDFS-1628 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Ramya R Assignee: John George Priority: Minor Fix For: Federation Branch, 0.23.0 Attachments: HDFS-1628.patch, HDFS-1628.patch, HDFS-1628.patch org.apache.hadoop.security.AccessControlException should display the full path for which the access is denied. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1625) TestDataNodeMXBean fails if disk space usage changes during test run
[ https://issues.apache.org/jira/browse/HDFS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022524#comment-13022524 ] Hudson commented on HDFS-1625: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestDataNodeMXBean fails if disk space usage changes during test run Key: HDFS-1625 URL: https://issues.apache.org/jira/browse/HDFS-1625 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Todd Lipcon Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: Federation Branch, 0.22.0, 0.23.0 Attachments: h1625_20110228.patch, h1625_20110301.patch I've seen this on our internal hudson - we get failures like: null expected:...:{freeSpace:857683[43552],usedSpace:28672,... but was:...:{freeSpace:857683[59936],usedSpace:28672,... because some other build on the same build slave used up some disk space during the middle of the test. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira