[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2011-01-04 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977309#action_12977309
 ] 

Hairong Kuang commented on HDFS-1539:
-

+1. The patch looks good.

A minor comment is that I do not think the unit test is of much use, because 
the bug occurs when a machine is powered off and that is hard to simulate.

 prevent data loss when a cluster suffers a power loss
 -

 Key: HDFS-1539
 URL: https://issues.apache.org/jira/browse/HDFS-1539
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, hdfs client, name-node
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: syncOnClose1.txt, syncOnClose2.txt


 We have seen an instance where an external outage caused many datanodes to 
 reboot at around the same time. This resulted in many corrupted blocks. 
 These were recently written blocks; the current implementation of the HDFS 
 Datanode does not sync the data of a block file when the block is closed.
 1. Have a cluster-wide config setting that causes the datanode to sync a 
 block file when a block is finalized.
 2. Introduce a new parameter to FileSystem.create() to trigger the new 
 behaviour, i.e. cause the datanode to sync a block-file when it is finalized.
 3. Implement the FSDataOutputStream.hsync() to cause all data written to the 
 specified file to be written to stable storage.
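 For illustration, a minimal client-side sketch of how item 3 might be used, 
 assuming the proposed hsync() lands on FSDataOutputStream (the path and file 
 contents here are made up):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HsyncExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Hypothetical file name, for illustration only.
    FSDataOutputStream out = fs.create(new Path("/tmp/durable.txt"));
    out.write("must survive a power loss".getBytes("UTF-8"));
    // The proposed hsync() (item 3 above) would force the written data to
    // stable storage on the datanodes, not just to their OS buffers.
    out.hsync();
    out.close();
  }
}
{code}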




[jira] Reopened: (HDFS-1312) Datanode storage directories fill unevenly

2011-01-04 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins reopened HDFS-1312:
---


Agreed, re-opening; let's have this issue track re-balancing disks within a DN.

 Datanode storage directories fill unevenly
 --

 Key: HDFS-1312
 URL: https://issues.apache.org/jira/browse/HDFS-1312
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Reporter: Travis Crawford

 Filing this issue in response to "full disk woes" on hdfs-user.
 Datanodes fill their storage directories unevenly, leading to situations 
 where certain disks are full while others are significantly less used. Users 
 at many different sites have experienced this issue, and HDFS administrators 
 are taking steps like:
 - Manually rebalancing blocks in storage directories
 - Decommissioning nodes & later re-adding them
 There's a tradeoff between making use of all available spindles and filling 
 disks at roughly the same rate. Possible solutions include:
 - Weighting less-used disks heavier when placing new blocks on the datanode. 
 In write-heavy environments this will still make use of all spindles, 
 equalizing disk use over time.
 - Rebalancing blocks locally. This would help equalize disk use as disks are 
 added/replaced in older cluster nodes.
 Datanodes should actively manage their local disk so operator intervention is 
 not needed.
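 As a rough illustration of the weighting idea above (a hypothetical sketch, 
 not code from any patch): pick a volume with probability proportional to its 
 available space, so emptier disks fill faster.
{code}
import java.util.List;
import java.util.Random;

class WeightedVolumePicker {
  interface Volume { long getAvailable(); }   // stand-in for FSVolume

  private final Random rand = new Random();

  Volume choose(List<Volume> volumes) {
    long totalFree = 0;
    for (Volume v : volumes) totalFree += v.getAvailable();
    long target = (long) (rand.nextDouble() * totalFree);
    for (Volume v : volumes) {
      target -= v.getAvailable();
      if (target < 0) return v;               // landed in this volume's share
    }
    return volumes.get(volumes.size() - 1);   // rounding fallback
  }
}
{code}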




[jira] Updated: (HDFS-1312) Re-balance disks within a Datanode

2011-01-04 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-1312:
--

Issue Type: New Feature  (was: Bug)
   Summary: Re-balance disks within a Datanode  (was: Datanode storage 
directories fill unevenly)

 Re-balance disks within a Datanode
 --

 Key: HDFS-1312
 URL: https://issues.apache.org/jira/browse/HDFS-1312
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node
Reporter: Travis Crawford

 Filing this issue in response to "full disk woes" on hdfs-user.
 Datanodes fill their storage directories unevenly, leading to situations 
 where certain disks are full while others are significantly less used. Users 
 at many different sites have experienced this issue, and HDFS administrators 
 are taking steps like:
 - Manually rebalancing blocks in storage directories
 - Decommissioning nodes & later re-adding them
 There's a tradeoff between making use of all available spindles and filling 
 disks at roughly the same rate. Possible solutions include:
 - Weighting less-used disks heavier when placing new blocks on the datanode. 
 In write-heavy environments this will still make use of all spindles, 
 equalizing disk use over time.
 - Rebalancing blocks locally. This would help equalize disk use as disks are 
 added/replaced in older cluster nodes.
 Datanodes should actively manage their local disk so operator intervention is 
 not needed.




[jira] Updated: (HDFS-1557) Separate Storage from FSImage

2011-01-04 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-1557:
-

Status: Patch Available  (was: Open)

 Separate Storage from FSImage
 -

 Key: HDFS-1557
 URL: https://issues.apache.org/jira/browse/HDFS-1557
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 0.21.0
Reporter: Ivan Kelly
 Fix For: 0.22.0, 0.23.0

 Attachments: HDFS-1557-branch-0.22.diff, HDFS-1557-trunk.diff, 
 HDFS-1557-trunk.diff, HDFS-1557.diff


 FSImage currently derives from Storage and FSEditLog has to call methods 
 directly on FSImage to access the filesystem. This JIRA is to separate the 
 Storage class out into NNStorage so that FSEditLog is less dependent on 
 FSImage. From this point, the other parts of the circular dependency should 
 be easy to fix.
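 To make the intended direction concrete, here is a simplified sketch of the 
 dependency change (everything beyond the NNStorage name is illustrative):
{code}
// Before: FSImage IS-A Storage, and FSEditLog reaches through FSImage.
// After (sketch): storage handling lives in its own class, and both
// FSImage and FSEditLog depend on it rather than on each other.
class NNStorage /* extends Storage */ {
  java.io.File getEditsDir() { return null; /* locate an edits directory */ }
}

class FSEditLog {
  private final NNStorage storage;       // no FSImage reference needed
  FSEditLog(NNStorage storage) { this.storage = storage; }
}

class FSImage {
  private final NNStorage storage;       // delegates instead of inheriting
  FSImage(NNStorage storage) { this.storage = storage; }
}
{code}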




[jira] Updated: (HDFS-1557) Separate Storage from FSImage

2011-01-04 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-1557:
-

Status: Open  (was: Patch Available)

Cancelling the patch and resubmitting to make Hudson run on it.

 Separate Storage from FSImage
 -

 Key: HDFS-1557
 URL: https://issues.apache.org/jira/browse/HDFS-1557
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 0.21.0
Reporter: Ivan Kelly
 Fix For: 0.22.0, 0.23.0

 Attachments: HDFS-1557-branch-0.22.diff, HDFS-1557-trunk.diff, 
 HDFS-1557-trunk.diff, HDFS-1557.diff


 FSImage currently derives from Storage and FSEditLog has to call methods 
 directly on FSImage to access the filesystem. This JIRA is to separate the 
 Storage class out into NNStorage so that FSEditLog is less dependent on 
 FSImage. From this point, the other parts of the circular dependency should 
 be easy to fix.




[jira] Updated: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2011-01-04 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-1539:
---

   Resolution: Fixed
Fix Version/s: 0.23.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I just committed this.

 prevent data loss when a cluster suffers a power loss
 -

 Key: HDFS-1539
 URL: https://issues.apache.org/jira/browse/HDFS-1539
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, hdfs client, name-node
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.23.0

 Attachments: syncOnClose1.txt, syncOnClose2.txt


 We have seen an instance where an external outage caused many datanodes to 
 reboot at around the same time. This resulted in many corrupted blocks. 
 These were recently written blocks; the current implementation of the HDFS 
 Datanode does not sync the data of a block file when the block is closed.
 1. Have a cluster-wide config setting that causes the datanode to sync a 
 block file when a block is finalized.
 2. Introduce a new parameter to FileSystem.create() to trigger the new 
 behaviour, i.e. cause the datanode to sync a block-file when it is finalized.
 3. Implement the FSDataOutputStream.hsync() to cause all data written to the 
 specified file to be written to stable storage.




[jira] Commented: (HDFS-1564) Make dfs.datanode.du.reserved configurable per volume

2011-01-04 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977473#action_12977473
 ] 

Konstantin Shvachko commented on HDFS-1564:
---

dfs.datanode.du.reserved is per volume. Here is the description from 
hdfs-default.xml:
{code}
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much space 
free for non dfs use.
  </description>
</property>
{code}
Also, if you look in the code, the property is used in FSVolume, which 
corresponds to one volume.
Or am I missing what this is about?

 Make dfs.datanode.du.reserved configurable per volume
 -

 Key: HDFS-1564
 URL: https://issues.apache.org/jira/browse/HDFS-1564
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node
Reporter: Aaron T. Myers
Priority: Minor

 In clusters with DNs which have heterogeneous data dir volumes, it would be 
 nice if dfs.datanode.du.reserved could be configured per-volume.




[jira] Commented: (HDFS-1564) Make dfs.datanode.du.reserved configurable per volume

2011-01-04 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977488#action_12977488
 ] 

Aaron T. Myers commented on HDFS-1564:
--

My understanding is that the requester would like the ability to configure this 
*independently* per volume. e.g. configure 10GB reserved space for volume /fs1 
and 20GB reserved space for volume /fs2.
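Purely as a hypothetical illustration of that request (no such syntax exists 
today; as noted above, dfs.datanode.du.reserved takes a single value applied 
to every volume), a per-volume override might look like:
{code}
<!-- Hypothetical syntax, for illustration only. -->
<property>
  <name>dfs.datanode.du.reserved./fs1</name>
  <value>10737418240</value>  <!-- 10 GB -->
</property>
<property>
  <name>dfs.datanode.du.reserved./fs2</name>
  <value>21474836480</value>  <!-- 20 GB -->
</property>
{code}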

 Make dfs.datanode.du.reserved configurable per volume
 -

 Key: HDFS-1564
 URL: https://issues.apache.org/jira/browse/HDFS-1564
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node
Reporter: Aaron T. Myers
Priority: Minor

 In clusters with DNs which have heterogeneous data dir volumes, it would be 
 nice if dfs.datanode.du.reserved could be configured per-volume.




[jira] Updated: (HDFS-1542) Deadlock in Configuration.writeXml when serialized form is larger than one DFS block

2011-01-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1542:
--

Attachment: hdfs-1542.txt

Here's the patch with just the unit test, so we can make sure it doesn't 
regress on the common side.

 Deadlock in Configuration.writeXml when serialized form is larger than one 
 DFS block
 

 Key: HDFS-1542
 URL: https://issues.apache.org/jira/browse/HDFS-1542
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.22.0, 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Attachments: deadlock.txt, hdfs-1542.txt, hdfs-1542.txt, 
 hdfs-1542.txt, hdfs-1542.txt, hdfs1542_cdh3b3.txt, Test.java


 Configuration.writeXml holds a lock on itself and then writes the XML to an 
 output stream, during which DFSOutputStream will try to get a lock on 
 ackQueue/dataQueue. Meanwhile, the DataStreamer thread will call functions 
 like conf.getInt() and deadlock against the other thread, since it could be 
 the same conf object.
 This causes a deterministic deadlock whenever the serialized form is larger 
 than block size.
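 The inversion reduces to a standalone two-lock demo (this is a schematic, 
 not the HDFS code; t1 mimics writeXml then a stream write, t2 mimics the 
 DataStreamer calling conf.getInt()):
{code}
public class DeadlockDemo {
  static final Object conf = new Object();   // stands in for Configuration
  static final Object queue = new Object();  // stands in for ackQueue/dataQueue

  public static void main(String[] args) {
    Thread t1 = new Thread(new Runnable() {
      public void run() {
        synchronized (conf) {        // writeXml is synchronized on conf
          pause();
          synchronized (queue) {     // the stream write needs the queue
            System.out.println("t1 done");
          }
        }
      }
    });
    Thread t2 = new Thread(new Runnable() {
      public void run() {
        synchronized (queue) {       // the streamer owns the queue
          pause();
          synchronized (conf) {      // conf.getInt() locks conf
            System.out.println("t2 done");
          }
        }
      }
    });
    t1.start();
    t2.start();  // hangs: same two monitors, opposite acquisition order
  }

  static void pause() {
    try { Thread.sleep(100); } catch (InterruptedException ignored) { }
  }
}
{code}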




[jira] Updated: (HDFS-1542) Deadlock in Configuration.writeXml when serialized form is larger than one DFS block

2011-01-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1542:
--

Status: Patch Available  (was: Open)

 Deadlock in Configuration.writeXml when serialized form is larger than one 
 DFS block
 

 Key: HDFS-1542
 URL: https://issues.apache.org/jira/browse/HDFS-1542
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.22.0, 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Attachments: deadlock.txt, hdfs-1542.txt, hdfs-1542.txt, 
 hdfs-1542.txt, hdfs-1542.txt, hdfs1542_cdh3b3.txt, Test.java


 Configuration.writeXml holds a lock on itself and then writes the XML to an 
 output stream, during which DFSOutputStream will try to get a lock on 
 ackQueue/dataQueue. Meanwhile, the DataStreamer thread will call functions 
 like conf.getInt() and deadlock against the other thread, since it could be 
 the same conf object.
 This causes a deterministic deadlock whenever the serialized form is larger 
 than block size.




[jira] Commented: (HDFS-827) Additional unit tests for FSDataset

2011-01-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977517#action_12977517
 ] 

Todd Lipcon commented on HDFS-827:
--

I suppose so - I never remember to look in src/test/unit, though. What's the 
purpose of the distinction there? We have some tests there that take much 
longer than a second (e.g. TestBlockRecovery), and many tests in src/test/hdfs 
that are near instant.

To be clear, I understand the distinction between unit and functional test, but 
not how it actually makes a difference in our build :)

 Additional unit tests for FSDataset
 ---

 Key: HDFS-827
 URL: https://issues.apache.org/jira/browse/HDFS-827
 Project: Hadoop HDFS
  Issue Type: Test
  Components: data-node, test
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-827.txt


 FSDataset doesn't currently have a unit test that tests it in isolation from 
 the DN or a cluster. A test specifically for this class will be helpful for 
 developing HDFS-788




[jira] Resolved: (HDFS-110) Failed to execute fsck with -move option

2011-01-04 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang resolved HDFS-110.


Resolution: Won't Fix

 Failed to execute fsck with -move option
 

 Key: HDFS-110
 URL: https://issues.apache.org/jira/browse/HDFS-110
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Hairong Kuang
Assignee: Hairong Kuang

 I received the following error when running fsck with the -move option. The dfs 
 was started by one user while the fsck was run by a different user that does 
 not have write access to the hadoop dfs data directory.
 - moving to /lost+found: /data.txt
 java.io.FileNotFoundException: 
 hadoop-dfs-data-dir/tmp/client-8234960199756230677 (Permission denied)
 at java.io.FileOutputStream.open(Native Method)
 at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
 at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
 at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:546)
 at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:99)
 at org.apache.hadoop.dfs.DFSck.lostFoundMove(DFSck.java:222)
 at org.apache.hadoop.dfs.DFSck.check(DFSck.java:178)
 at org.apache.hadoop.dfs.DFSck.check(DFSck.java:124)
 at org.apache.hadoop.dfs.DFSck.fsck(DFSck.java:112)
 at org.apache.hadoop.dfs.DFSck.main(DFSck.java:433)
 Failed to move /data.txt to /lost+found: 
 hadoop-dfs-data-dir/tmp/client-8234960199756230677 (Permission denied)




[jira] Commented: (HDFS-918) Use single Selector and small thread pool to replace many instances of BlockSender for reads

2011-01-04 Thread Jay Booth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977544#action_12977544
 ] 

Jay Booth commented on HDFS-918:


Hey all, sorry for the slow response, been swamped with the new year and all.

RE: unit tests, at one point it was passing all tests; not sure if the tests 
changed or this changed, but I can take a look at it.

RE: 0.23, I can look at forward porting this again, but a lot of changes have 
gone in since then.

@stack, were you testing the pooling-only patch or the full multiplexing 
patch?

Only pooling would be much simpler to forward port, although I do think that 
the full multiplexing patch is pretty worthwhile.  Aside from the 
small-but-significant performance gain, it was IMO much better factoring to 
have the DN-side logic all encapsulated in a Connection object which has 
sendPacket() repeatedly called, rather than a giant procedural loop that goes 
down and back up through several classes.  The architecture also made keepalive 
pretty straightforward: just throw that connection back into a listening pool 
when done, and make corresponding changes on the client side.  But, I guess that 
logic's been revised now anyways, so it'd be a significant piece of work to 
bring it all back up to date.

 Use single Selector and small thread pool to replace many instances of 
 BlockSender for reads
 

 Key: HDFS-918
 URL: https://issues.apache.org/jira/browse/HDFS-918
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Jay Booth
Assignee: Jay Booth
 Fix For: 0.22.0

 Attachments: hbase-hdfs-benchmarks.ods, hdfs-918-20100201.patch, 
 hdfs-918-20100203.patch, hdfs-918-20100211.patch, hdfs-918-20100228.patch, 
 hdfs-918-20100309.patch, hdfs-918-branch20-append.patch, 
 hdfs-918-branch20.2.patch, hdfs-918-pool.patch, hdfs-918-TRUNK.patch, 
 hdfs-multiplex.patch


 Currently, on read requests, the DataXceiver server allocates a new thread 
 per request, which must allocate its own buffers and leads to 
 higher-than-optimal CPU and memory usage by the sending threads.  If we had a 
 single selector and a small threadpool to multiplex request packets, we could 
 theoretically achieve higher performance while taking up fewer resources and 
 leaving more CPU on datanodes available for mapred, hbase or whatever.  This 
 can be done without changing any wire protocols.
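 A rough sketch of the shape being described: one selector loop drives many 
 Connection objects, each sending one packet per writable event. All names 
 here are illustrative, not from the attached patches.
{code}
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

class MultiplexedSender implements Runnable {
  interface Connection {
    // Send one packet; return false once the whole block has been sent.
    boolean sendPacket(SocketChannel ch) throws IOException;
  }

  private final Selector selector;
  MultiplexedSender(Selector selector) { this.selector = selector; }

  public void run() {
    try {
      while (selector.isOpen()) {
        selector.select();
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
          SelectionKey key = it.next();
          it.remove();
          if (!key.isValid() || !key.isWritable()) continue;
          Connection conn = (Connection) key.attachment();
          SocketChannel ch = (SocketChannel) key.channel();
          if (!conn.sendPacket(ch)) {
            key.cancel();   // or re-arm the key here to get keepalive
            ch.close();
          }
        }
      }
    } catch (IOException e) {
      e.printStackTrace();  // sketch: real code would handle per-connection
    }
  }
}
{code}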




[jira] Commented: (HDFS-1552) Remove java5 dependencies from build

2011-01-04 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977545#action_12977545
 ] 

Konstantin Shvachko commented on HDFS-1552:
---

The Release 0.21.0 line has somehow been removed from CHANGES.txt.
{code}
-Release 0.21.0 - 2010-08-13
+HDFS-1552. Remove java5 dependencies from build. (cos) 
{code}


 Remove java5 dependencies from build
 

 Key: HDFS-1552
 URL: https://issues.apache.org/jira/browse/HDFS-1552
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Affects Versions: 0.21.1
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik
 Fix For: 0.21.1

 Attachments: HDFS-1552.patch


 As the first short-term step, let's remove the JDK5 dependency from build(s)




[jira] Updated: (HDFS-1536) Improve HDFS WebUI

2011-01-04 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1536:


Status: Patch Available  (was: Open)

 Improve HDFS WebUI
 --

 Key: HDFS-1536
 URL: https://issues.apache.org/jira/browse/HDFS-1536
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Fix For: 0.23.0

 Attachments: missingBlocksWebUI.patch, missingBlocksWebUI1.patch


 1. Make the missing blocks count accurate;
 2. Make the under-replicated blocks count exclude missing blocks.




[jira] Updated: (HDFS-1541) Not marking datanodes dead When namenode in safemode

2011-01-04 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1541:


Attachment: deadnodescheck.patch

This patch makes sure that the NameNode does not check for dead nodes before 
the under-replication queue is populated.
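In other words, the heartbeat monitor's dead-node scan gets guarded roughly 
like this (a sketch of the idea, not the patch itself; all names are 
stand-ins):
{code}
class HeartbeatMonitorSketch {
  interface Namesystem { boolean isInSafeMode(); }
  interface DatanodeDescriptor { long lastUpdate(); }

  Namesystem namesystem;
  java.util.List<DatanodeDescriptor> heartbeats =
      new java.util.ArrayList<DatanodeDescriptor>();
  long heartbeatExpireIntervalMs = 10 * 60 * 1000;

  void heartbeatCheck() {
    // While in safemode, i.e. before the under-replication queue is
    // populated from block reports, skip marking datanodes dead so that
    // slow heartbeat processing cannot evict healthy nodes.
    if (namesystem.isInSafeMode()) {
      return;
    }
    for (DatanodeDescriptor node : heartbeats) {
      long sinceLastHeartbeat = System.currentTimeMillis() - node.lastUpdate();
      if (sinceLastHeartbeat > heartbeatExpireIntervalMs) {
        removeDatanode(node);   // mark dead and evict from the maps
      }
    }
  }

  void removeDatanode(DatanodeDescriptor node) { /* evict */ }
}
{code}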

 Not marking datanodes dead When namenode in safemode
 

 Key: HDFS-1541
 URL: https://issues.apache.org/jira/browse/HDFS-1541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.23.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Fix For: 0.23.0

 Attachments: deadnodescheck.patch


 In a big cluster, when the namenode starts up, it takes a long time for the 
 namenode to process block reports from all datanodes. Because heartbeat 
 processing gets delayed, some datanodes are erroneously marked as dead, and 
 later on they have to register again, thus wasting time.
 It would speed up startup if the checking of dead nodes were disabled 
 while the namenode is in safemode.




[jira] Updated: (HDFS-1541) Not marking datanodes dead When namenode in safemode

2011-01-04 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1541:


Status: Patch Available  (was: Open)

 Not marking datanodes dead When namenode in safemode
 

 Key: HDFS-1541
 URL: https://issues.apache.org/jira/browse/HDFS-1541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.23.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Fix For: 0.23.0

 Attachments: deadnodescheck.patch


 In a big cluster, when the namenode starts up, it takes a long time for the 
 namenode to process block reports from all datanodes. Because heartbeat 
 processing gets delayed, some datanodes are erroneously marked as dead, and 
 later on they have to register again, thus wasting time.
 It would speed up startup if the checking of dead nodes were disabled 
 while the namenode is in safemode.




[jira] Updated: (HDFS-1561) BackupNode listens on default host

2011-01-04 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-1561:
--

Status: Open  (was: Patch Available)

 BackupNode listens on default host
 --

 Key: HDFS-1561
 URL: https://issues.apache.org/jira/browse/HDFS-1561
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.21.0
Reporter: Konstantin Shvachko
 Fix For: 0.22.0

 Attachments: BNAddress.patch


 Currently BackupNode uses DNS to find its default host name, and then starts 
 its RPC server listening on that address, ignoring the address specified in 
 the configuration. Therefore, there is no way to start BackupNode on a 
 particular IP or host address. BackupNode should use the address specified in 
 the configuration instead.
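 Schematically, the fix is to bind to the configured address instead of a 
 DNS-derived default (a sketch, not the attached patch; the config key is 
 shown only as an example):
{code}
import java.net.InetAddress;
import java.net.InetSocketAddress;

class BackupNodeBindSketch {
  // Current (buggy) behavior: ask DNS for the default host name and
  // listen there, ignoring whatever the configuration says.
  InetSocketAddress defaultHostAddress(int port) throws Exception {
    String host = InetAddress.getLocalHost().getHostName();
    return new InetSocketAddress(host, port);
  }

  // Intended behavior: honor the configured "host:port" value (e.g. from
  // dfs.backup.address) so the BackupNode can bind a specific IP or host.
  InetSocketAddress configuredAddress(String confValue) {
    String[] hostPort = confValue.split(":");
    return new InetSocketAddress(hostPort[0], Integer.parseInt(hostPort[1]));
  }
}
{code}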




[jira] Updated: (HDFS-1561) BackupNode listens on default host

2011-01-04 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-1561:
--

Hadoop Flags: [Reviewed]
  Status: Patch Available  (was: Open)

 BackupNode listens on default host
 --

 Key: HDFS-1561
 URL: https://issues.apache.org/jira/browse/HDFS-1561
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.21.0
Reporter: Konstantin Shvachko
 Fix For: 0.22.0

 Attachments: BNAddress.patch


 Currently BackupNode uses DNS to find its default host name, and then starts 
 its RPC server listening on that address, ignoring the address specified in 
 the configuration. Therefore, there is no way to start BackupNode on a 
 particular IP or host address. BackupNode should use the address specified in 
 the configuration instead.




[jira] Commented: (HDFS-1541) Not marking datanodes dead When namenode in safemode

2011-01-04 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977554#action_12977554
 ] 

dhruba borthakur commented on HDFS-1541:


+1, code looks good. I think we do not need a unit test for this one.


 Not marking datanodes dead When namenode in safemode
 

 Key: HDFS-1541
 URL: https://issues.apache.org/jira/browse/HDFS-1541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.23.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Fix For: 0.23.0

 Attachments: deadnodescheck.patch


 In a big cluster, when the namenode starts up, it takes a long time for the 
 namenode to process block reports from all datanodes. Because heartbeat 
 processing gets delayed, some datanodes are erroneously marked as dead, and 
 later on they have to register again, thus wasting time.
 It would speed up startup if the checking of dead nodes were disabled 
 while the namenode is in safemode.




[jira] Commented: (HDFS-1536) Improve HDFS WebUI

2011-01-04 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977557#action_12977557
 ] 

dhruba borthakur commented on HDFS-1536:


+1

 Improve HDFS WebUI
 --

 Key: HDFS-1536
 URL: https://issues.apache.org/jira/browse/HDFS-1536
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Fix For: 0.23.0

 Attachments: missingBlocksWebUI.patch, missingBlocksWebUI1.patch


 1. Make the missing blocks count accurate;
 2. Make the under-replicated blocks count exclude missing blocks.




[jira] Updated: (HDFS-1463) accessTime updates should not occur in safeMode

2011-01-04 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-1463:
---

Status: Open  (was: Patch Available)

 accessTime updates should not occur in safeMode
 ---

 Key: HDFS-1463
 URL: https://issues.apache.org/jira/browse/HDFS-1463
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: accessTimeSafeMode.txt, accessTimeSafeMode.txt


 FSNamesystem.getBlockLocations sometimes needs to update the accessTime of 
 files. If the namenode is in safemode, this call should fail.
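 Schematically, the guard could look like this (a sketch of the idea, not the 
 attached patch; every name is a stand-in):
{code}
class AccessTimeSketch {
  boolean inSafeMode;

  Object getBlockLocations(String src, boolean updateAccessTime)
      throws java.io.IOException {
    if (updateAccessTime) {
      if (inSafeMode) {
        // Mutations, even "just" an accessTime update, must fail in safemode.
        throw new java.io.IOException(
            "Cannot set accessTime for " + src + ": namenode is in safemode");
      }
      setAccessTime(src, System.currentTimeMillis());
    }
    return locateBlocks(src);   // plain reads are still served
  }

  void setAccessTime(String src, long now) { /* update the inode */ }
  Object locateBlocks(String src) { return null; /* look up blocks */ }
}
{code}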




[jira] Commented: (HDFS-1552) Remove java5 dependencies from build

2011-01-04 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977613#action_12977613
 ] 

Konstantin Boudnik commented on HDFS-1552:
--

Oops, I missed that among 9 backports, I guess ;( Will fix the CHANGES.txt and 
commit the fix in a minute. Thanks for catching this.

 Remove java5 dependencies from build
 

 Key: HDFS-1552
 URL: https://issues.apache.org/jira/browse/HDFS-1552
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Affects Versions: 0.21.1
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik
 Fix For: 0.21.1

 Attachments: HDFS-1552.patch


 As the first short-term step, let's remove the JDK5 dependency from build(s)




[jira] Commented: (HDFS-1564) Make dfs.datanode.du.reserved configurable per volume

2011-01-04 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977649#action_12977649
 ] 

Konstantin Shvachko commented on HDFS-1564:
---

The main discussion was in HADOOP-1463 with subsequent improvements in 
HADOOP-2816, HADOOP-2549.
We used to have {{dfs.datanode.du.pct}}, which defined a percent of reserved 
space per volume. This was originally intended for heterogeneous systems, but 
caused controversy or was not understood well, and was removed by HADOOP-4430.
I don't see any other way to address the different volumes issue but to 
reintroduce {{dfs.datanode.du.pct}}. If this is what the requester wants, let 
him specify the exact meaning of the parameter and its relation to the existing 
ones.
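For clarity, here is roughly how the two knobs could compose on one volume, 
assuming (one possible reading) that pct is the fraction of capacity to 
reserve; pinning down that exact semantics is the part the requester would 
need to specify:
{code}
class ReservedSpaceSketch {
  static long usable(long capacity, long duReservedBytes, double duPct) {
    long pctReserve = (long) (capacity * duPct);          // dfs.datanode.du.pct
    long reserve = Math.max(duReservedBytes, pctReserve); // vs. du.reserved
    return Math.max(0L, capacity - reserve);
  }
}
{code}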

 Make dfs.datanode.du.reserved configurable per volume
 -

 Key: HDFS-1564
 URL: https://issues.apache.org/jira/browse/HDFS-1564
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node
Reporter: Aaron T. Myers
Priority: Minor

 In clusters with DNs which have heterogeneous data dir volumes, it would be 
 nice if dfs.datanode.du.reserved could be configured per-volume.
