[jira] Updated: (HDFS-1104) Fsck triggers full GC on NameNode

2010-04-29 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-1104:
-

Hadoop Flags: [Reviewed]

+1 patch looks good.  Thanks, Hairong.

> Fsck triggers full GC on NameNode
> -
>
> Key: HDFS-1104
> URL: https://issues.apache.org/jira/browse/HDFS-1104
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: fsckATime.patch, fsckATime1.patch, fsckATime2.patch
>
>
> A NameNode at one of our clusters fell into full GC while fsck was being performed. 
> Digging into the problem shows that it is caused by how the NameNode handles the 
> access time of a file.
> Fsck calls open on every file in the checked directory to get the file's 
> block locations. Each open changes the file's access time and therefore leads to 
> writing a transaction entry to the edit log. The current code optimizes open 
> so that it returns without syncing the edit log to disk. It 
> happened that in our cluster no other jobs were running while fsck was 
> performed, so no edit log sync was ever called and all open transactions were 
> kept in memory. When the edit log buffer got full, it automatically doubled 
> its space by allocating a new buffer. The full GC happened when no contiguous 
> space could be found for the new, bigger buffer.
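For readers unfamiliar with the failure mode, here is a minimal sketch of the allocation pattern described above, using a hypothetical class and sizes rather than the actual FSEditLog buffer code:
{code}
// Minimal sketch, not the actual FSEditLog code: a buffer that doubles on
// overflow.  Every doubling needs one new contiguous byte[] roughly twice the
// old size, so after many unsynced "open" transactions the next doubling can
// force a full GC while the JVM looks for that much contiguous heap.
class GrowableEditBuffer {
  private byte[] buf = new byte[512 * 1024];      // hypothetical initial size
  private int count = 0;

  void write(byte[] txn) {
    if (count + txn.length > buf.length) {
      int newLength = Math.max(buf.length * 2, count + txn.length);
      byte[] bigger = new byte[newLength];        // the costly contiguous allocation
      System.arraycopy(buf, 0, bigger, 0, count);
      buf = bigger;
    }
    System.arraycopy(txn, 0, buf, count, txn.length);
    count += txn.length;
  }
}
{code}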

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-609) Create a file with the append flag does not work in HDFS

2010-04-29 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862544#action_12862544
 ] 

Todd Lipcon commented on HDFS-609:
--

I disagree - I don't think these are addressed in trunk.

#1) The APPEND flag seems to track through to startFileInternal in 
FSNamesystem, which, as Hairong mentioned, just converts the INode but does not 
properly pass back a LocatedBlock for the last block or convert it to 
under-construction status.
#2) There still don't seem to be any checks that prevent a user from passing 
blockSize or replication when CreateFlag.APPEND is specified.
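A minimal sketch of the kind of check #2 is asking for, with hypothetical method and parameter names (the real validation would live somewhere along the DFSClient/FSNamesystem create path):
{code}
// Hypothetical validation, not existing HDFS code: reject replication and
// blockSize overrides when the caller asked for APPEND, since appending
// cannot change the layout of an existing file.
import java.io.IOException;
import java.util.EnumSet;
import org.apache.hadoop.fs.CreateFlag;

class CreateFlagValidator {
  static void checkAppendArguments(EnumSet<CreateFlag> flags,
                                   short replication, long blockSize,
                                   short defaultReplication, long defaultBlockSize)
      throws IOException {
    if (flags.contains(CreateFlag.APPEND)
        && (replication != defaultReplication || blockSize != defaultBlockSize)) {
      throw new IOException(
          "replication and blockSize cannot be specified with CreateFlag.APPEND");
    }
  }
}
{code}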

> Create a file with the append flag does not work in HDFS
> 
>
> Key: HDFS-609
> URL: https://issues.apache.org/jira/browse/HDFS-609
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Hairong Kuang
>Priority: Blocker
> Fix For: 0.21.0
>
>
> HADOOP-5438 introduced a create API with flags. There are a couple of issues 
> when the flag is set to be APPEND.
> 1. The APPEND flag does not work in HDFS. Append is not as simple as changing 
> a FileINode to be a FileINodeUnderConstruction. It also need to reopen the 
> last block for applend if last block is not full and handle crc when the last 
> crc chunk is not full.
> 2. The API is not well thought. It has parameters like replication factor and 
> blockSize. Those parameters do not make any sense if APPEND flag is set. But 
> they give an application user a wrong impression that append could change a 
> file's block size and replication factor.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1123) Need HDFS Protocol Specification

2010-04-29 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862539#action_12862539
 ] 

Todd Lipcon commented on HDFS-1123:
---

Absolutely agree, we should document semantics. I guess my suggestion is that 
we do the two tasks separately - in the short term do a quick brushup of what 
we've got now, and in parallel start working on documentation of the semantics, 
etc.

> Need HDFS Protocol Specification
> 
>
> Key: HDFS-1123
> URL: https://issues.apache.org/jira/browse/HDFS-1123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: bc Wong
>
> It'd be great to document (in a spec, not in the code) the HDFS wire protocol:
> * The layout of the different request and reply messages.
> * The semantics of the various calls.
> * The semantics of the various fields.
> For example, I stumbled upon the goldmine of comments around 
> DataNode.java:1150. It looks correct, but the version number of 9 doesn't 
> inspire confidence that it's up-to-date. (It's also a random place to put 
> such an important comment.)
> Having a formal spec is a big step forward for compatibility. It also 
> highlights design decisions and helps with protocol evolution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1123) Need HDFS Protocol Specification

2010-04-29 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862538#action_12862538
 ] 

bc Wong commented on HDFS-1123:
---

Doesn't it still apply after moving to Avro? Avroization makes the layout 
documentation easier, but it doesn't describe the semantics of the protocol, which 
is the interesting part.

> Need HDFS Protocol Specification
> 
>
> Key: HDFS-1123
> URL: https://issues.apache.org/jira/browse/HDFS-1123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: bc Wong
>
> It'd be great to document (in a spec, not in the code) the HDFS wire protocol:
> * The layout of the different request and reply messages.
> * The semantics of the various calls.
> * The semantics of the various fields.
> For example, I stumbled upon the goldmine of comments around 
> DataNode.java:1150. It looks correct, but the version number of 9 doesn't 
> inspire confidence that it's up-to-date. (It's also a random place to put 
> such an important comment.)
> Having a formal spec is a big step forward for compatibility. It also 
> highlights design decisions and helps with protocol evolution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1123) Need HDFS Protocol Specification

2010-04-29 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862535#action_12862535
 ] 

Todd Lipcon commented on HDFS-1123:
---

I agree that we should do a better job of documenting the current protocol, but 
we shouldn't spend *too* much time on it, since everyone is in agreement that 
we'd like to move to Avro this year. A quick pass to update the comments is 
probably worth doing, but a formal spec may be overkill for a protocol we plan 
to deprecate imminently.

> Need HDFS Protocol Specification
> 
>
> Key: HDFS-1123
> URL: https://issues.apache.org/jira/browse/HDFS-1123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: bc Wong
>
> It'd be great to document (in a spec, not in the code) the HDFS wire protocol:
> * The layout of the different request and reply messages.
> * The semantics of the various calls.
> * The semantics of the various fields.
> For example, I stumbled upon the goldmine of comments around 
> DataNode.java:1150. It looks correct, but the version number of 9 doesn't 
> inspire confidence that it's up-to-date. (It's also a random place to put 
> such an important comment.)
> Having a formal spec is a big step forward for compatibility. It also 
> highlights design decisions and helps with protocol evolution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1122) client block verification may result in blocks in DataBlockScanner prematurely

2010-04-29 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1122:
---

Attachment: hdfs-1122-for-0.20.txt

patch that works on 0.20


> client block verification may result in blocks in DataBlockScanner prematurely
> --
>
> Key: HDFS-1122
> URL: https://issues.apache.org/jira/browse/HDFS-1122
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: sam rash
>Assignee: sam rash
> Attachments: hdfs-1122-for-0.20.txt
>
>
> Found that when the DN uses client verification of a block that is open for 
> writing, it will add the block to the DataBlockScanner prematurely. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1114) Reducing NameNode memory usage by an alternate hash table

2010-04-29 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862509#action_12862509
 ] 

Konstantin Shvachko commented on HDFS-1114:
---

Do you have an estimate on how much space this will save in NN's memory 
footprint?

> Reducing NameNode memory usage by an alternate hash table
> -
>
> Key: HDFS-1114
> URL: https://issues.apache.org/jira/browse/HDFS-1114
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> NameNode uses a java.util.HashMap to store BlockInfo objects.  When there are 
> many blocks in HDFS, this map uses a lot of memory in the NameNode.  We may 
> optimize the memory usage by a light weight hash table implementation.
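One plausible direction for such a light-weight table, sketched here purely as an illustration (this is not the HDFS-1114 patch): a chained hash set in which each stored element carries its own next reference, so there is no per-entry HashMap.Entry object and no boxed key per block.
{code}
// Assumed illustration only, not the HDFS-1114 patch.
class LightWeightBlockSet {
  interface Element {
    long getBlockId();
    Element getNext();
    void setNext(Element next);
  }

  private final Element[] buckets;

  LightWeightBlockSet(int capacity) {
    buckets = new Element[capacity];
  }

  private int index(long blockId) {
    return (int) ((blockId ^ (blockId >>> 32)) & 0x7fffffff) % buckets.length;
  }

  void put(Element e) {
    int i = index(e.getBlockId());
    e.setNext(buckets[i]);       // insert at the head of the chain
    buckets[i] = e;
  }

  Element get(long blockId) {
    for (Element e = buckets[index(blockId)]; e != null; e = e.getNext()) {
      if (e.getBlockId() == blockId) {
        return e;
      }
    }
    return null;
  }
}
{code}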

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1110) Namenode heap optimization - reuse objects for commonly used file names

2010-04-29 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-1110:
--

Attachment: hdfs-1110.2.patch

bq. What are the names of these 24 files? Do they fall under the proposed 
default pattern? How big is the noise if we use the default pattern?
Of the 24, 22 are part-* files.

bq. we need to optimize only for the top ten (or so) file names, which will 
give us 5% saving in the meta-data memory footprint
I do not think the top 10 will save 5% of the meta-data memory footprint. See the 
results posted below.

I had a bug in my previous calculation that made the savings seem too good to 
be true. With 47 million files optimized to use the dictionary, the saving of 
10 bytes gives 470MB and not 4.7GB :-) Also, I did not account for the byte[] 
overhead of 24 bytes.

Anyway, the new patch includes a tool, NamespaceDedupe. You can run it on an 
fsimage to see the frequency of occurrence and the savings in heap size. Dhruba, 
you can run this on images from your production cluster to see how the savings 
compare with what I have posted below.

23 names are used by 3343781 between 10 and 360461 times. Saved space 
114962311
468 names are used by 12944154 between 1 and 10 times. Saved space 
448255164
4335 names are used by 10522601 between 1000 and 1 times. Saved space 
391364352
40031 names are used by 10654372 between 100 and 1000 times. Saved space 
382273386
403974 names are used by 10722689 between 10 and 100 times. Saved space 354416484
Total saved space 1691271697


> Namenode heap optimization - reuse objects for commonly used file names
> ---
>
> Key: HDFS-1110
> URL: https://issues.apache.org/jira/browse/HDFS-1110
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.22.0
>
> Attachments: hdfs-1110.2.patch, hdfs-1110.patch
>
>
> There are a lot of common file names used in HDFS, mainly created by 
> mapreduce, such as file names starting with "part". Reusing byte[] 
> corresponding to these recurring file names will save significant heap space 
> used for storing the file names in millions of INodeFile objects.
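A minimal sketch of the byte[] reuse being proposed, with a hypothetical helper name (this is not the attached patch):
{code}
// Hypothetical name-interning helper, not the hdfs-1110 patch: map a file name
// to one canonical byte[] so that millions of INodeFile objects holding
// "part-00000"-style names share a single array instead of each keeping a copy.
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;

class FileNameInterner {
  private final ConcurrentHashMap<String, byte[]> dictionary =
      new ConcurrentHashMap<String, byte[]>();

  byte[] intern(String name) {
    byte[] cached = dictionary.get(name);
    if (cached == null) {
      byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
      cached = dictionary.putIfAbsent(name, bytes);
      if (cached == null) {
        cached = bytes;
      }
    }
    return cached;
  }
}
{code}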

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1123) Need HDFS Protocol Specification

2010-04-29 Thread bc Wong (JIRA)
Need HDFS Protocol Specification


 Key: HDFS-1123
 URL: https://issues.apache.org/jira/browse/HDFS-1123
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: bc Wong


It'd be great to document (in a spec, not in the code) the HDFS wire protocol:
* The layout of the different request and reply messages.
* The semantics of the various calls.
* The semantics of the various fields.

For example, I stumbled upon the goldmine of comments around 
DataNode.java:1150. It looks correct, but the version number of 9 doesn't 
inspire confidence that it's up-to-date. (It's also a random place to put such 
an important comment.)

Having a formal spec is a big step forward for compatibility. It also 
highlights design decisions and helps with protocol evolution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1107) Turn on append by default.

2010-04-29 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-1107:
--

Attachment: appendOn.patch

This patch turns append on by default, but there is still a way to turn it off. 
The next, more radical step is to remove all checks in the code for whether append is 
supported. I'll file another jira for that; it can be done later in 0.22.

> Turn on append by default.
> --
>
> Key: HDFS-1107
> URL: https://issues.apache.org/jira/browse/HDFS-1107
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Konstantin Shvachko
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: appendOn.patch
>
>
> hdfs-default.xml still has the old default value {{dfs.support.append = 
> false}}. It should be changed to {{true}}, or removed from the default 
> configuration and treated as {{true}} if not found.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1107) Turn on append by default.

2010-04-29 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-1107:
--

  Status: Patch Available  (was: Open)
Assignee: Konstantin Shvachko

> Turn on append by default.
> --
>
> Key: HDFS-1107
> URL: https://issues.apache.org/jira/browse/HDFS-1107
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: appendOn.patch
>
>
> hdfs-default.xml still has the old default value {{dfs.support.append = 
> false}}. It should be changed to {{true}}, or removed from the default 
> configuration and treated as {{true}} if not found.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1118) DFSOutputStream socket leak when cannot connect to DataNode

2010-04-29 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HDFS-1118:
-

Status: Patch Available  (was: Open)

> DFSOutputStream socket leak when cannot connect to DataNode
> ---
>
> Key: HDFS-1118
> URL: https://issues.apache.org/jira/browse/HDFS-1118
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.2, 0.20.1
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HDFS-1118.1.patch, HDFS-1118.2.patch
>
>
> The offending code is in {{DFSOutputStream.nextBlockOutputStream}}.
> This function retries several times to call {{createBlockOutputStream}}. Each 
> time it fails, it leaves a {{Socket}} object in {{DFSOutputStream.s}}.
> That object is never closed, but is overwritten the next time 
> {{createBlockOutputStream}} is called.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1118) DFSOutputStream socket leak when cannot connect to DataNode

2010-04-29 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HDFS-1118:
-

Attachment: HDFS-1118.2.patch

Moved the cleanup to finally section.
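The shape of that change, as a rough sketch with made-up surrounding code (the actual patch touches DFSOutputStream.nextBlockOutputStream):
{code}
// Illustrative shape of the fix only, not the HDFS-1118 patch itself: any
// socket opened for a failed connection attempt is closed in a finally block
// before the next retry, instead of being overwritten and leaked.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class RetryingConnector {
  Socket connectWithRetries(InetSocketAddress[] targets) throws IOException {
    IOException last = null;
    for (InetSocketAddress target : targets) {
      Socket s = new Socket();
      boolean success = false;
      try {
        s.connect(target, 60 * 1000);  // may throw; the socket still has to be closed
        success = true;
        return s;
      } catch (IOException e) {
        last = e;
      } finally {
        if (!success) {
          s.close();                   // the cleanup that now lives in finally
        }
      }
    }
    throw last != null ? last : new IOException("could not connect to any target");
  }
}
{code}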

> DFSOutputStream socket leak when cannot connect to DataNode
> ---
>
> Key: HDFS-1118
> URL: https://issues.apache.org/jira/browse/HDFS-1118
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.1, 0.20.2
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HDFS-1118.1.patch, HDFS-1118.2.patch
>
>
> The offending code is in {{DFSOutputStream.nextBlockOutputStream}}.
> This function retries several times to call {{createBlockOutputStream}}. Each 
> time it fails, it leaves a {{Socket}} object in {{DFSOutputStream.s}}.
> That object is never closed, but is overwritten the next time 
> {{createBlockOutputStream}} is called.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1118) DFSOutputStream socket leak when cannot connect to DataNode

2010-04-29 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HDFS-1118:
-

Status: Open  (was: Patch Available)

> DFSOutputStream socket leak when cannot connect to DataNode
> ---
>
> Key: HDFS-1118
> URL: https://issues.apache.org/jira/browse/HDFS-1118
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.2, 0.20.1
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HDFS-1118.1.patch, HDFS-1118.2.patch
>
>
> The offending code is in {{DFSOutputStream.nextBlockOutputStream}}.
> This function retries several times to call {{createBlockOutputStream}}. Each 
> time it fails, it leaves a {{Socket}} object in {{DFSOutputStream.s}}.
> That object is never closed, but is overwritten the next time 
> {{createBlockOutputStream}} is called.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1105) Balancer improvement

2010-04-29 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862470#action_12862470
 ] 

Hairong Kuang commented on HDFS-1105:
-

Thanks, Dmytro, for uploading a new patch. I really like the changes you made! 
Here are more review comments:
# The major contribution of the patch is that it enforces the max time for each 
iteration, including the waiting time for moves to complete. I prefer the 
structure of dispatchBlockMove to be
{code} {
  long startTime = Util.now();
  // start threads to schedule & dispatch block moves; pass startTime to each
  // thread, as you do in your patch
  waitForMoveCompletion(startTime); // pass startTime as well; return when the
                                    // max iteration time is reached
}{code}
This way, you do not need to introduce a new heuristic for 
waitForMoveCompletion to quit, as you do in your patch.
# I prefer PendingBlockMove#closeSocket() to call sock.close() instead of 
closing only its input stream. I understand that the final section of 
receiveResponse() closes the socket; however, it is nice to release all of its 
resources in one shot in PendingBlockMove#closeSocket() as well. receiveResponse() 
should catch EOFException before catching IOException to avoid printing two log 
messages for one exception (see the sketch after these comments). The log message 
for EOFException should simply say EOFException, because sometimes it may not be 
caused by PendingBlockMove#closeSocket().

Other minor comments:
# remove unused imports;
# MAX_NUM_CONCURRENT_MOVE should not drop the modifier "final";
# keep all option parsing & balancer initialization in one method, "init";
# replace timeToStr with your new time format and call timeToStr(timeLeft) in 
Balancer#run();
# it is not user friendly to print an exception stack trace on the screen.
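A small sketch of the catch ordering referred to in comment 2 (illustrative only, not the Balancer patch):
{code}
// EOFException is a subclass of IOException, so it has to be caught first;
// otherwise the IOException handler would log the same failure a second time.
// The EOF message stays terse since it often just means the socket was closed.
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class ResponseReader {
  private static final Log LOG = LogFactory.getLog(ResponseReader.class);

  void receiveResponse(DataInputStream in) {
    try {
      in.readShort();                                   // read the reply status
    } catch (EOFException e) {
      LOG.info("EOFException while reading response");  // possibly a deliberately closed socket
    } catch (IOException e) {
      LOG.warn("Error receiving response", e);
    }
  }
}
{code}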

> Balancer improvement
> 
>
> Key: HDFS-1105
> URL: https://issues.apache.org/jira/browse/HDFS-1105
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Dmytro Molkov
>Assignee: Dmytro Molkov
> Attachments: HDFS-1105.2.patch, HDFS-1105.3.patch, HDFS-1105.patch
>
>
> We were seeing some weird issues with the balancer in our cluster:
> 1) it can get stuck during an iteration, and only restarting it helps
> 2) the iterations are highly inefficient: with a 20-minute iteration it moves 
> 7K blocks a minute for the first 6 minutes and hundreds of blocks in the next 
> 14 minutes
> 3) it can hit the namenode and the network pretty hard
> A few improvements we came up with as a result, making the balancer more 
> deterministic in terms of iteration running time, improving the efficiency, 
> and making the load configurable:
> Make many of the constants configurable command-line parameters: iteration 
> length, and the number of blocks to move in parallel to a given node and in the 
> cluster overall.
> Terminate transfers that are still in progress after the iteration is over.
> Previously, the iteration time was the time window in which the balancer was 
> scheduling the moves, and then it would wait for the moves to finish 
> indefinitely. Each scheduling task can run up to the iteration time or even 
> longer, which means that if you have too many of them and they are long, your 
> actual iterations are longer than 20 minutes. Now each scheduling task knows 
> the time the iteration started and schedules moves only if it has not run out 
> of time, so tasks that start after the iteration is over will not schedule 
> any moves.
> The number of move threads and dispatch threads is configurable, so that 
> depending on the load of the cluster you can run it slower.
> I will attach a patch; please let me know what you think and what can be done 
> better.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1107) Turn on append by default.

2010-04-29 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862466#action_12862466
 ] 

Eli Collins commented on HDFS-1107:
---

+1

> Turn on append by default.
> --
>
> Key: HDFS-1107
> URL: https://issues.apache.org/jira/browse/HDFS-1107
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Konstantin Shvachko
>Priority: Blocker
> Fix For: 0.21.0
>
>
> hdfs-default.xml still has the old default value {{dfs.support.append = 
> false}}. It should be changed to {{true}}, or removed from the default 
> configuration and treated as {{true}} if not found.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1107) Turn on append by default.

2010-04-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862465#action_12862465
 ] 

Jakob Homan commented on HDFS-1107:
---

+1

> Turn on append by default.
> --
>
> Key: HDFS-1107
> URL: https://issues.apache.org/jira/browse/HDFS-1107
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Konstantin Shvachko
>Priority: Blocker
> Fix For: 0.21.0
>
>
> hdfs-default.xml still has the old default value {{dfs.support.append = 
> false}}. It should be changed to {{true}}, or removed from the default 
> configuration and treated as {{true}} if not found.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1107) Turn on append by default.

2010-04-29 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862458#action_12862458
 ] 

Konstantin Shvachko commented on HDFS-1107:
---

I am going to remove {{dfs.support.append}} from {{hdfs-default.xml}} and 
change the default value of this variable to true in the code, if there are no 
other suggestions. 
This should not be treated as an incompatible change, as I cannot imagine programs 
that strictly rely on append not being supported and would fail if it 
suddenly is.

> Turn on append by default.
> --
>
> Key: HDFS-1107
> URL: https://issues.apache.org/jira/browse/HDFS-1107
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Konstantin Shvachko
>Priority: Blocker
> Fix For: 0.21.0
>
>
> hdfs-default.xml still has the old default value {{dfs.support.append = 
> false}}. It should be changed to {{true}}, or removed from the default 
> configuration and treated as {{true}} if not found.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-995) Replace usage of FileStatus#isDir()

2010-04-29 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-995:
-

       Issue Type: Bug  (was: Improvement)
    Fix Version/s: 0.21.0
Affects Version/s: 0.20.3, 0.21.0  (was: 0.22.0)
         Priority: Blocker  (was: Major)

> Replace usage of FileStatus#isDir()
> ---
>
> Key: HDFS-995
> URL: https://issues.apache.org/jira/browse/HDFS-995
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.20.3, 0.21.0
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 0.21.0, 0.22.0
>
> Attachments: hdfs-995-1.patch
>
>
> HADOOP-6585 is going to deprecate FileStatus#isDir(). This jira is for 
> replacing all uses of isDir() in HDFS with checks of isDirectory(), isFile(), 
> or isSymlink() as needed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1107) Turn on append by default.

2010-04-29 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-1107:
--

Priority: Blocker  (was: Major)

I think it should be fixed for 0.21.

> Turn on append by default.
> --
>
> Key: HDFS-1107
> URL: https://issues.apache.org/jira/browse/HDFS-1107
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Konstantin Shvachko
>Priority: Blocker
> Fix For: 0.21.0
>
>
> hdfs-default.xml still has the old default value {{dfs.support.append = 
> false}}. It should be changed to {{true}}, or removed from the default 
> configuration and treated as {{true}} if not found.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-829) hdfsJniHelper.c: #include <error.h> is not portable

2010-04-29 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-829:
-

           Status: Patch Available  (was: Open)
Affects Version/s: 0.21.0, 0.22.0
    Fix Version/s: 0.21.0, 0.22.0

+1  Looks good to me for 21.

> hdfsJniHelper.c: #include <error.h> is not portable
> ---
>
> Key: HDFS-829
> URL: https://issues.apache.org/jira/browse/HDFS-829
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Allen Wittenauer
> Fix For: 0.21.0, 0.22.0
>
> Attachments: HDFS-632.patch, hdfs-829.patch
>
>
> hdfsJniHelper.c includes <error.h>, but this appears to be unnecessary, since 
> even under Linux none of the routines that it prototypes are used.  Worse 
> yet, error.h doesn't appear to be a standard header file, so this breaks on 
> Mac OS X and Solaris and prevents libhdfs from being built.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-829) hdfsJniHelper.c: #include <error.h> is not portable

2010-04-29 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-829:
-

Status: Open  (was: Patch Available)

> hdfsJniHelper.c: #include <error.h> is not portable
> ---
>
> Key: HDFS-829
> URL: https://issues.apache.org/jira/browse/HDFS-829
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Allen Wittenauer
> Attachments: HDFS-632.patch, hdfs-829.patch
>
>
> hdfsJniHelper.c includes <error.h>, but this appears to be unnecessary, since 
> even under Linux none of the routines that it prototypes are used.  Worse 
> yet, error.h doesn't appear to be a standard header file, so this breaks on 
> Mac OS X and Solaris and prevents libhdfs from being built.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-808) Implement something like PAR2 support?

2010-04-29 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HDFS-808.
---

Resolution: Duplicate

> Implement something like PAR2 support?
> --
>
> Key: HDFS-808
> URL: https://issues.apache.org/jira/browse/HDFS-808
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Allen Wittenauer
>Priority: Minor
>
> We really need an Idea issue type, because I'm not sure if this is really 
> viable. :)  Just sort of thinking "out loud".
> I was thinking about how file recovery works on services like Usenet to fix 
> data corruption when chunks of files are missing.  I wonder how hard it would 
> be to implement something like PAR2 [ http://en.wikipedia.org/wiki/Parchive ] 
> automatically for large files.  We'd have the advantage of being able to do 
> it in binary of course and could likely hide the details within HDFS itself.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-941) Datanode xceiver protocol should allow reuse of a connection

2010-04-29 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862444#action_12862444
 ] 

Eli Collins commented on HDFS-941:
--

bq. hadoop fs -put of a 1g file from n clients in parallel. I suspect this will 
improve; socket reuse should limit slow start, but it's good to check.

I meant fs -get here, since we're caching sockets on reads and not writes. I think 
the DFSInputStream currently creates a new socket for each block it fetches. 

> Datanode xceiver protocol should allow reuse of a connection
> 
>
> Key: HDFS-941
> URL: https://issues.apache.org/jira/browse/HDFS-941
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, hdfs client
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: bc Wong
> Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, 
> HDFS-941-3.patch
>
>
> Right now each connection into the datanode xceiver only processes one 
> operation.
> In the case that an operation leaves the stream in a well-defined state (eg a 
> client reads to the end of a block successfully) the same connection could be 
> reused for a second operation. This should improve random read performance 
> significantly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-829) hdfsJniHelper.c: #include <error.h> is not portable

2010-04-29 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862443#action_12862443
 ] 

Allen Wittenauer commented on HDFS-829:
---

It looks like libhdfs is in the hdfs tree in trunk.  So this can get committed 
now, right?  Can we get this in prior to the 0.21 cut over?

> hdfsJniHelper.c: #include <error.h> is not portable
> ---
>
> Key: HDFS-829
> URL: https://issues.apache.org/jira/browse/HDFS-829
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Allen Wittenauer
> Attachments: HDFS-632.patch, hdfs-829.patch
>
>
> hdfsJniHelper.c includes <error.h>, but this appears to be unnecessary, since 
> even under Linux none of the routines that it prototypes are used.  Worse 
> yet, error.h doesn't appear to be a standard header file, so this breaks on 
> Mac OS X and Solaris and prevents libhdfs from being built.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-941) Datanode xceiver protocol should allow reuse of a connection

2010-04-29 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862439#action_12862439
 ] 

Eli Collins commented on HDFS-941:
--

Hey bc,

Nice change!  

Do you have any results from a non-random workload? Please collect:
# before/after TestDFSIO runs so we can see if sequential throughput is affected
# hadoop fs -put of a 1g file from n clients in parallel. I suspect this will 
improve; socket reuse should limit slow start, but it's good to check.

How did you choose DEFAULT_CACHE_SIZE?

In the exception handler in sendReadResult, can we be more specific about when 
it's OK not to be able to send the result, and throw an exception in the cases 
when it's not OK rather than swallowing all IOExceptions?

In DataXceiver#opReadBlock you throw an IOException in a try block that catches 
IOException. I think that should LOG.error and close the output stream. You can 
also chain the following if statements that check stat. 

How about asserting sock != null in putCachedSocket? Seems like this should 
never happen if the code is correct and it's easy to ignore logs.

File a jira for ERROR_CHECKSUM?

Please add a comment to the head of ReaderSocketCache explaining why we cache 
BlockReader socket pairs, as opposed to just caching sockets (because we don't 
multiplex BlockReaders over a single socket between hosts).

Nits:
* Nice comment in the BlockReader header; please define "packet" as well. Is 
the RPC specification in DataNode outdated? If so, fix it or file a jira instead 
of warning readers that it may be outdated. 
* Maybe a better name for DN_KEEPALIVE_TIMEOUT, since there is no explicit 
keepalive? TRANSFER_TIMEOUT?
* Would rename workDone to something specific like opsProcessed, or make it a 
boolean.
* Add an "a" in "with checksum".
* if statements need braces, e.g. BlockReader#read.

Thanks,
Eli

> Datanode xceiver protocol should allow reuse of a connection
> 
>
> Key: HDFS-941
> URL: https://issues.apache.org/jira/browse/HDFS-941
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, hdfs client
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: bc Wong
> Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, 
> HDFS-941-3.patch
>
>
> Right now each connection into the datanode xceiver only processes one 
> operation.
> In the case that an operation leaves the stream in a well-defined state (eg a 
> client reads to the end of a block successfully) the same connection could be 
> reused for a second operation. This should improve random read performance 
> significantly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-760) "fs -put" fails if dfs.umask is set to 63

2010-04-29 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HDFS-760.
--

Resolution: Fixed

This was fixed by HADOOP-6521.

> "fs -put" fails if dfs.umask is set to 63
> -
>
> Key: HDFS-760
> URL: https://issues.apache.org/jira/browse/HDFS-760
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Tsz Wo (Nicholas), SZE
> Fix For: 0.21.0, 0.22.0
>
>
> Add the following to hdfs-site.xml:
> {noformat}
>   <property>
>     <name>dfs.umask</name>
>     <value>63</value>
>   </property>
> {noformat}
> Then run "hadoop fs -put"
> {noformat}
> -bash-3.1$ ./bin/hadoop fs -put README.txt r.txt
> 09/11/09 23:09:07 WARN conf.Configuration: mapred.task.id is deprecated. 
> Instead, use mapreduce.task.attempt.id
> put: 63
> Usage: java FsShell [-put <localsrc> ... <dst>]
> -bash-3.1$
> {noformat}
> Observed the above behavior in 0.21.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1110) Namenode heap optimization - reuse objects for commonly used file names

2010-04-29 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862431#action_12862431
 ] 

Konstantin Shvachko commented on HDFS-1110:
---

bq. File names used > 10 times  24

What are the names of these 24 files? Do they fall under the proposed default 
pattern? How big is the noise if we use the default pattern?

On the one hand, I see the point of providing a generic approach for people to 
specify their own patterns.
But I also agree with Dhruba that we need to optimize only for the top ten (or 
so) file names, which will give us a 5% saving in the meta-data memory footprint. 
The rest should be ignored; it would be a waste of resources to optimize for the 
rest. Your approach 2 would be a move in this direction.

So maybe it would be useful to have the tool Jakob mentions (OIV-based), so that 
admins could run it offline on the image and get the top N frequently used names, 
with an estimate of how much space this saves. Then they would be able to formulate 
the regexp. Otherwise, it is going to be a painful guessing game.
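As a rough sketch of what such an offline tool could report, assuming it is handed the file names from an fsimage walk (this is illustrative, not the NamespaceDedupe tool attached to this issue):
{code}
// Illustrative frequency counter: given file names pulled out of an fsimage
// walk, report the most common names and the bytes that interning them could save.
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NameFrequency {
  static void reportTopN(Iterable<String> fileNames, int n) {
    Map<String, Integer> counts = new HashMap<String, Integer>();
    for (String name : fileNames) {
      Integer c = counts.get(name);
      counts.put(name, c == null ? 1 : c + 1);
    }
    List<Map.Entry<String, Integer>> sorted =
        new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
    Collections.sort(sorted, new Comparator<Map.Entry<String, Integer>>() {
      public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
        return b.getValue() - a.getValue();          // most frequent first
      }
    });
    for (int i = 0; i < n && i < sorted.size(); i++) {
      Map.Entry<String, Integer> e = sorted.get(i);
      // each duplicate occurrence beyond the first could share one byte[]
      long savedBytes = (long) (e.getValue() - 1) * e.getKey().length();
      System.out.println(e.getKey() + "\t" + e.getValue() + "\t~" + savedBytes + " bytes");
    }
  }
}
{code}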

> Namenode heap optimization - reuse objects for commonly used file names
> ---
>
> Key: HDFS-1110
> URL: https://issues.apache.org/jira/browse/HDFS-1110
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.22.0
>
> Attachments: hdfs-1110.patch
>
>
> There are a lot of common file names used in HDFS, mainly created by 
> mapreduce, such as file names starting with "part". Reusing byte[] 
> corresponding to these recurring file names will save significant heap space 
> used for storing the file names in millions of INodeFile objects.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-760) "fs -put" fails if dfs.umask is set to 63

2010-04-29 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HDFS-760:
---

Priority: Major  (was: Blocker)

Downgrading from blocker for 0.21. Looks like this is a corner case which has a 
workaround (use octal).
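For reference, the octal workaround would look something like this in hdfs-site.xml (63 decimal is 077 octal):
{noformat}
  <property>
    <name>dfs.umask</name>
    <value>077</value>
  </property>
{noformat}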

> "fs -put" fails if dfs.umask is set to 63
> -
>
> Key: HDFS-760
> URL: https://issues.apache.org/jira/browse/HDFS-760
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Tsz Wo (Nicholas), SZE
> Fix For: 0.21.0, 0.22.0
>
>
> Add the following to hdfs-site.xml:
> {noformat}
>   <property>
>     <name>dfs.umask</name>
>     <value>63</value>
>   </property>
> {noformat}
> Then run "hadoop fs -put"
> {noformat}
> -bash-3.1$ ./bin/hadoop fs -put README.txt r.txt
> 09/11/09 23:09:07 WARN conf.Configuration: mapred.task.id is deprecated. 
> Instead, use mapreduce.task.attempt.id
> put: 63
> Usage: java FsShell [-put <localsrc> ... <dst>]
> -bash-3.1$
> {noformat}
> Observed the above behavior in 0.21.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-708) A stress-test tool for HDFS.

2010-04-29 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862412#action_12862412
 ] 

Konstantin Shvachko commented on HDFS-708:
--

With respect to 14, I found the following solution.
{code}
public DataGenerator(FileSystem fs, Path fn) throws IOException {
  if(!(fs instanceof DistributedFileSystem)) {
this.fileId = -1L;
return;
  }
  DFSDataInputStream in = null;
  try {
in = (DFSDataInputStream) ((DistributedFileSystem)fs).open(fn);
this.fileId = in.getCurrentBlock().getBlockId();
  } finally {
if(in != null) in.close();
  }
}
{code}
Right after creating a file for writing, you can get the id of the first block of 
the file and store it in {{DataGenerator.fileId}}, a new field. This id does not 
change across renames, and can be reliably used as a file-specific mix-in 
for the hash in data generation and verification. The data value of a file at a 
specific offset is then calculated as {{hash(fileId, offset)}}.
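A minimal sketch of such a mix-in hash, assumed for illustration rather than taken from slive.patch:
{code}
// Assumed illustration, not slive.patch code: derive the byte expected at a
// given offset of a file from (fileId, offset), so a verifier that knows the
// file's first block id can recompute and check every byte even after renames.
class DataHasher {
  static byte valueAt(long fileId, long offset) {
    long h = fileId * 31 + offset;
    h ^= (h >>> 33);
    h *= 0xff51afd7ed558ccdL;   // murmur-style bit-mixing constant
    h ^= (h >>> 33);
    return (byte) h;
  }
}
{code}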

> A stress-test tool for HDFS.
> 
>
> Key: HDFS-708
> URL: https://issues.apache.org/jira/browse/HDFS-708
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Joshua Harlow
> Fix For: 0.22.0
>
> Attachments: slive.patch, SLiveTest.pdf
>
>
> It would be good to have a tool for automatic stress testing HDFS, which 
> would provide IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-776) Fix exception handling in Balancer

2010-04-29 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HDFS-776:
---

Priority: Critical  (was: Blocker)

> Fix exception handling in Balancer
> --
>
> Key: HDFS-776
> URL: https://issues.apache.org/jira/browse/HDFS-776
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer
>Reporter: Owen O'Malley
>Priority: Critical
> Fix For: 0.21.0
>
>
> The Balancer's AccessKeyUpdater handles exceptions badly. In particular:
> 1. Catching Exception too low. The wrapper around setKeys should only catch 
> IOException.
> 2. InterruptedException is ignored. It should be caught at the top level and 
> exit run.
> 3. Throwable is not caught. It should be caught at the top level and kill the 
> Balancer server process.
> {code}
>   class AccessKeyUpdater implements Runnable {
> public void run() {
>   while (shouldRun) {
> try {
>   accessTokenHandler.setKeys(namenode.getAccessKeys());
> } catch (Exception e) {
>   LOG.error(StringUtils.stringifyException(e));
> }
> try {
>   Thread.sleep(keyUpdaterInterval);
> } catch (InterruptedException ie) {
> }
>   }
> }
>   }
> {code}
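A sketch of a run loop with those three points addressed (illustrative only, not a committed patch; it assumes the same shouldRun, accessTokenHandler, namenode, keyUpdaterInterval, and LOG members as the snippet above):
{code}
class AccessKeyUpdater implements Runnable {
  public void run() {
    try {
      while (shouldRun) {
        try {
          accessTokenHandler.setKeys(namenode.getAccessKeys());
        } catch (IOException e) {              // 1. catch only IOException around setKeys
          LOG.error(StringUtils.stringifyException(e));
        }
        Thread.sleep(keyUpdaterInterval);      // 2. InterruptedException now escapes the loop
      }
    } catch (InterruptedException ie) {
      LOG.info("AccessKeyUpdater interrupted; exiting");
    } catch (Throwable t) {                    // 3. anything else is fatal to the Balancer
      LOG.fatal("AccessKeyUpdater received a fatal error", t);
      Runtime.getRuntime().exit(-1);
    }
  }
}
{code}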

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-708) A stress-test tool for HDFS.

2010-04-29 Thread Joshua Harlow (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua Harlow updated HDFS-708:
---

Attachment: (was: slive.patch)

> A stress-test tool for HDFS.
> 
>
> Key: HDFS-708
> URL: https://issues.apache.org/jira/browse/HDFS-708
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Joshua Harlow
> Fix For: 0.22.0
>
> Attachments: slive.patch, SLiveTest.pdf
>
>
> It would be good to have a tool for automatic stress testing HDFS, which 
> would provide IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-708) A stress-test tool for HDFS.

2010-04-29 Thread Joshua Harlow (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua Harlow updated HDFS-708:
---

Attachment: slive.patch

Updated for code comments.

> A stress-test tool for HDFS.
> 
>
> Key: HDFS-708
> URL: https://issues.apache.org/jira/browse/HDFS-708
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Joshua Harlow
> Fix For: 0.22.0
>
> Attachments: slive.patch, slive.patch, SLiveTest.pdf
>
>
> It would be good to have a tool for automatic stress testing HDFS, which 
> would provide IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-708) A stress-test tool for HDFS.

2010-04-29 Thread Joshua Harlow (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862396#action_12862396
 ] 

Joshua Harlow commented on HDFS-708:


1. Done
2. Done
3. These methods have meanings for null (mainly for default existence checks 
when merging), and for the random seed null means no seed, which is a valid case. 
Duration in milliseconds can return an int, though. The point is that null 
has a meaning when the default value for a config option is set to a null 
object, which it is in a couple of cases.
4 & 5. Done (we are now measuring only the time around readByte and write())
6. Done
7. Done
8. Done
9 & 10. Done
11. Done and most classes made package private
15. Will add some tests.

> A stress-test tool for HDFS.
> 
>
> Key: HDFS-708
> URL: https://issues.apache.org/jira/browse/HDFS-708
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Joshua Harlow
> Fix For: 0.22.0
>
> Attachments: slive.patch, SLiveTest.pdf
>
>
> It would be good to have a tool for automatic stress testing HDFS, which 
> would provide IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1079) HDFS implementation should throw exceptions defined in AbstractFileSystem

2010-04-29 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862395#action_12862395
 ] 

Suresh Srinivas commented on HDFS-1079:
---

Eli, given that we need a test checking that the right exceptions are thrown from 
FileContext for the various file systems, I have created HADOOP-6736 to capture 
that effort. The test is fairly involved and is better done in a separate 
jira.

Yes, we should throw HadoopIllegalArgumentException where currently 
IllegalArgumentExceptions are thrown.

> HDFS implementation should throw exceptions defined in AbstractFileSystem
> -
>
> Key: HDFS-1079
> URL: https://issues.apache.org/jira/browse/HDFS-1079
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.22.0
>
> Attachments: HDFS-1079.1.patch, HDFS-1079.patch, HDFS-1079.patch
>
>
> HDFS implementation Hdfs.java should throw exceptions as defined in 
> AbstractFileSystem. To facilitate this, ClientProtocol should be changed to 
> throw specific exceptions, as defined in AbstractFileSystem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-801) Add SureLogic annotations' jar into Ivy and Eclipse configs

2010-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862394#action_12862394
 ] 

Hadoop QA commented on HDFS-801:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12428002/hdfs_3.1.0.patch
  against trunk revision 939091.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/334/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/334/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/334/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/334/console

This message is automatically generated.

> Add SureLogic annotations' jar into Ivy and Eclipse configs
> ---
>
> Key: HDFS-801
> URL: https://issues.apache.org/jira/browse/HDFS-801
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: build, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Edwin Chan
> Attachments: hdfs_3.1.0.patch, hdfs_3.1.0.patch
>
>
> In order to use SureLogic analysis tools and allow their concurrency analysis 
> annotations in HDFS code the annotations library has to be automatically 
> pulled from a Maven repo. Also, it has to be added to Eclipse .classpath 
> template.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1104) Fsck triggers full GC on NameNode

2010-04-29 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1104:


Status: Patch Available  (was: Open)

> Fsck triggers full GC on NameNode
> -
>
> Key: HDFS-1104
> URL: https://issues.apache.org/jira/browse/HDFS-1104
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: fsckATime.patch, fsckATime1.patch, fsckATime2.patch
>
>
> A NameNode at one of our clusters fell into full GC while fsck was being performed. 
> Digging into the problem shows that it is caused by how the NameNode handles the 
> access time of a file.
> Fsck calls open on every file in the checked directory to get the file's 
> block locations. Each open changes the file's access time and therefore leads to 
> writing a transaction entry to the edit log. The current code optimizes open 
> so that it returns without syncing the edit log to disk. It 
> happened that in our cluster no other jobs were running while fsck was 
> performed, so no edit log sync was ever called and all open transactions were 
> kept in memory. When the edit log buffer got full, it automatically doubled 
> its space by allocating a new buffer. The full GC happened when no contiguous 
> space could be found for the new, bigger buffer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1104) Fsck triggers full GC on NameNode

2010-04-29 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1104:


Status: Open  (was: Patch Available)

> Fsck triggers full GC on NameNode
> -
>
> Key: HDFS-1104
> URL: https://issues.apache.org/jira/browse/HDFS-1104
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: fsckATime.patch, fsckATime1.patch, fsckATime2.patch
>
>
> A NameNode at one of our clusters fell into full GC while fsck was being performed. 
> Digging into the problem shows that it is caused by how the NameNode handles the 
> access time of a file.
> Fsck calls open on every file in the checked directory to get the file's 
> block locations. Each open changes the file's access time and therefore leads to 
> writing a transaction entry to the edit log. The current code optimizes open 
> so that it returns without syncing the edit log to disk. It 
> happened that in our cluster no other jobs were running while fsck was 
> performed, so no edit log sync was ever called and all open transactions were 
> kept in memory. When the edit log buffer got full, it automatically doubled 
> its space by allocating a new buffer. The full GC happened when no contiguous 
> space could be found for the new, bigger buffer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1089) Remove uses of FileContext#isFile, isDirectory and exists

2010-04-29 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-1089:
--

Status: Patch Available  (was: Open)

Kick hudson. HADOOP-6678 should have made it into the common jar by now.

> Remove uses of FileContext#isFile, isDirectory and exists
> -
>
> Key: HDFS-1089
> URL: https://issues.apache.org/jira/browse/HDFS-1089
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.21.0
>
> Attachments: hdfs-1089-1.patch
>
>
> Here's an HDFS jira for the second part of HADOOP-6678: removing uses of 
> FileContext#isFile, isDirectory and exists.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1089) Remove uses of FileContext#isFile, isDirectory and exists

2010-04-29 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-1089:
--

Status: Open  (was: Patch Available)

> Remove uses of FileContext#isFile, isDirectory and exists
> -
>
> Key: HDFS-1089
> URL: https://issues.apache.org/jira/browse/HDFS-1089
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.21.0
>
> Attachments: hdfs-1089-1.patch
>
>
> Here's an HDFS jira for the second part of HADOOP-6678: removing uses of 
> FileContext#isFile, isDirectory and exists.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1007) HFTP needs to be updated to use delegation tokens

2010-04-29 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-1007:
--

Attachment: 1007-bugfix.patch

Bugfix for handling null tokens (for Y20S)

> HFTP needs to be updated to use delegation tokens
> -
>
> Key: HDFS-1007
> URL: https://issues.apache.org/jira/browse/HDFS-1007
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.22.0
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.22.0
>
> Attachments: 1007-bugfix.patch, distcp-hftp-2.1.1.patch, 
> distcp-hftp.1.patch, distcp-hftp.2.1.patch, distcp-hftp.2.patch, 
> distcp-hftp.patch, HDFS-1007-BP20-fix-1.patch, HDFS-1007-BP20-fix-2.patch, 
> HDFS-1007-BP20-fix-3.patch, HDFS-1007-BP20.patch
>
>
> HFTPFileSystem should be updated to use the delegation tokens so that it can 
> talk to the secure namenodes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-04-29 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-1001:
--

Status: Patch Available  (was: Open)

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-04-29 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-1001:
--

Status: Open  (was: Patch Available)

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-04-29 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862370#action_12862370
 ] 

Eli Collins commented on HDFS-1001:
---

+1 Nice change. 

Nits:
* In DataNode, "DataNode always expects" should read "always checks", since the 
response is optional. 
* Would rename readCasually to something like readBytesCheckEOS (see the 
sketch below for the optional-read idea).
* In DataXceiver, "from client" should read "from the client". 
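For reference, a rough sketch of the optional-read idea behind that rename; the 
method body below is an assumption for illustration, not the patch itself:
{code}
// Hypothetical sketch: fill dst from the stream, but report an orderly EOF
// instead of throwing, since the client may simply close the connection
// without sending the trailing CHECKSUM_OK.
static boolean readBytesCheckEOS(java.io.InputStream in, byte[] dst)
    throws java.io.IOException {
  int off = 0;
  while (off < dst.length) {
    int n = in.read(dst, off, dst.length - off);
    if (n < 0) {
      return true;   // end of stream before (or while) reading the response
    }
    off += n;
  }
  return false;      // dst fully read; the caller can now check its contents
}
{code}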



> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file

2010-04-29 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1057:
---

Attachment: conurrent-reader-patch-1.txt

based on hadoop root dir


> Concurrent readers hit ChecksumExceptions if following a writer to very end 
> of file
> ---
>
> Key: HDFS-1057
> URL: https://issues.apache.org/jira/browse/HDFS-1057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Blocker
> Attachments: conurrent-reader-patch-1.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before 
> calling flush(). Therefore, if there is a concurrent reader, it's possible to 
> race here - the reader will see the new length while those bytes are still in 
> the buffers of BlockReceiver. Thus the client will potentially see checksum 
> errors or EOFs. Additionally, the last checksum chunk of the file is made 
> accessible to readers even though it is not stable.
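To make the race concrete, a minimal sketch of the safer ordering the 
description implies (all names here are simplified stand-ins, not 
BlockReceiver's real fields or signatures):
{code}
// Hypothetical sketch of the write path. Publishing the new length only after
// flush() means a concurrent reader never sees a length that covers bytes
// still sitting in the writer's buffers.
class ReceiverSketch {
  // Minimal stand-in for the replica bookkeeping object readers consult.
  static class ReplicaInfoStub {
    private volatile long bytesOnDisk;
    long getBytesOnDisk() { return bytesOnDisk; }
    void setBytesOnDisk(long n) { bytesOnDisk = n; }
  }

  void receivePacket(byte[] data, java.io.OutputStream blockOut,
                     ReplicaInfoStub replica) throws java.io.IOException {
    blockOut.write(data);
    blockOut.flush();   // make the bytes visible on disk first
    replica.setBytesOnDisk(replica.getBytesOnDisk() + data.length);
  }
}
{code}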

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file

2010-04-29 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1057:
---

Attachment: (was: conurrent-reader-patch-1.txt)

> Concurrent readers hit ChecksumExceptions if following a writer to very end 
> of file
> ---
>
> Key: HDFS-1057
> URL: https://issues.apache.org/jira/browse/HDFS-1057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Blocker
> Attachments: conurrent-reader-patch-1.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before 
> calling flush(). Therefore, if there is a concurrent reader, it's possible to 
> race here - the reader will see the new length while those bytes are still in 
> the buffers of BlockReceiver. Thus the client will potentially see checksum 
> errors or EOFs. Additionally, the last checksum chunk of the file is made 
> accessible to readers even though it is not stable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1104) Fsck triggers full GC on NameNode

2010-04-29 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1104:


Attachment: fsckATime2.patch

This patch addressed Nicholas's review comments:
> The unit test may not work since there is a FSNamesystem.accessTimePrecision.
Changed the default precision in the test.
>  NameNode.getBlockLocationsNoATime(..) does not check permission.
Whoops, it was in the first patch but was accidentally removed from the second 
patch.
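
For readers following along, a rough, self-contained sketch of the shape such a 
no-atime lookup might take (every name below is an assumption for illustration, 
not the actual patch):
{code}
// Hypothetical sketch: the fsck-oriented lookup still checks permissions but
// skips the access-time update, so it never queues an edit-log transaction.
class AtimeSketch {
  static final class LocatedBlocksStub { }

  LocatedBlocksStub getBlockLocationsNoATime(String src, long offset, long length)
      throws java.io.IOException {
    checkReadPermission(src);                        // permission check stays
    return getBlockLocations(src, offset, length, false /* no atime update */);
  }

  LocatedBlocksStub getBlockLocations(String src, long offset, long length,
                                      boolean updateAccessTime)
      throws java.io.IOException {
    if (updateAccessTime) {
      logAccessTimeEdit(src);                        // the path fsck now avoids
    }
    return new LocatedBlocksStub();
  }

  void checkReadPermission(String src) throws java.io.IOException { }
  void logAccessTimeEdit(String src) { }
}
{code}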



> Fsck triggers full GC on NameNode
> -
>
> Key: HDFS-1104
> URL: https://issues.apache.org/jira/browse/HDFS-1104
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: fsckATime.patch, fsckATime1.patch, fsckATime2.patch
>
>
> A NameNode at one of our clusters fell into a full GC while fsck was running. 
> Digging into the problem showed that it was caused by how the NameNode 
> handles the access time of a file.
> Fsck calls open on every file in the checked directory to get the file's 
> block locations. Each open changes the file's access time and then leads to 
> writing a transaction entry to the edit log. The current code optimizes open 
> so that it returns without synchronizing the edit log to disk. It happened 
> that in our cluster no other jobs were running while fsck was performed. No 
> edit log sync was ever called, so all open transactions were kept in memory. 
> When the edit log buffer got full, it automatically doubled its space by 
> allocating a new buffer. The full GC happened when no contiguous space could 
> be found for the new, bigger buffer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file

2010-04-29 Thread sam rash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862359#action_12862359
 ] 

sam rash commented on HDFS-1057:


Oops, I thought I had passed the right options to git diff. Will update in a 
bit.

In the meantime,

patch -p3 < patch.txt

will work.


> Concurrent readers hit ChecksumExceptions if following a writer to very end 
> of file
> ---
>
> Key: HDFS-1057
> URL: https://issues.apache.org/jira/browse/HDFS-1057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Blocker
> Attachments: conurrent-reader-patch-1.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before 
> calling flush(). Therefore, if there is a concurrent reader, it's possible to 
> race here - the reader will see the new length while those bytes are still in 
> the buffers of BlockReceiver. Thus the client will potentially see checksum 
> errors or EOFs. Additionally, the last checksum chunk of the file is made 
> accessible to readers even though it is not stable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file

2010-04-29 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862353#action_12862353
 ] 

Jean-Daniel Cryans commented on HDFS-1057:
--

Sam, can you base the patch on hadoop's root folder? It's kinda hard to apply 
as is.

> Concurrent readers hit ChecksumExceptions if following a writer to very end 
> of file
> ---
>
> Key: HDFS-1057
> URL: https://issues.apache.org/jira/browse/HDFS-1057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Blocker
> Attachments: conurrent-reader-patch-1.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before 
> calling flush(). Therefore, if there is a concurrent reader, it's possible to 
> race here - the reader will see the new length while those bytes are still in 
> the buffers of BlockReceiver. Thus the client will potentially see checksum 
> errors or EOFs. Additionally, the last checksum chunk of the file is made 
> accessible to readers even though it is not stable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file

2010-04-29 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1057:
---

Attachment: conurrent-reader-patch-1.txt

0.20 test + patch

> Concurrent readers hit ChecksumExceptions if following a writer to very end 
> of file
> ---
>
> Key: HDFS-1057
> URL: https://issues.apache.org/jira/browse/HDFS-1057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Blocker
> Attachments: conurrent-reader-patch-1.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before 
> calling flush(). Therefore, if there is a concurrent reader, it's possible to 
> race here - the reader will see the new length while those bytes are still in 
> the buffers of BlockReceiver. Thus the client will potentially see checksum 
> errors or EOFs. Additionally, the last checksum chunk of the file is made 
> accessible to readers even though it is not stable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1122) client block verification may result in blocks in DataBlockScanner prematurely

2010-04-29 Thread sam rash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862348#action_12862348
 ] 

sam rash commented on HDFS-1122:


This results in these log messages:

2010-04-21 13:06:30,951 WARN 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Adding an already 
existing block blk_6423942125821562308_117574
2010-04-21 12:59:47,054 WARN 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Adding an already 
existing block blk_-1890060265487773738_117566
2010-04-21 12:56:26,831 WARN 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Adding an already 
existing block blk_-8254097362836825914_117561
2010-04-21 12:53:03,386 WARN 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Adding an already 
existing block blk_8946894423251690136_117557
2010-04-21 12:49:43,148 WARN 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Adding an already 
existing block blk_-5467425469535997066_117553
2010-04-21 12:46:22,613 WARN 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Adding an already 
existing block blk_-3020378094937646676_117549

and it's possible the block scanner could mark the blocks corrupt since they 
are being written to. I have a test + 0.20 patch I will upload shortly (the 
crux of the patch is that client verifications can only update the 
DataBlockScanner, not add new blocks).
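
A rough sketch of that distinction (hypothetical names only, not the actual 
patch):
{code}
// Hypothetical sketch: a client-reported verification only refreshes the scan
// time of a block the scanner already tracks; it never adds a new entry, so a
// block that is still being written cannot slip into the scanner early.
class BlockScannerSketch {
  private final java.util.Map<Long, Long> lastScanTime =
      new java.util.HashMap<Long, Long>();

  /** Called when the DataNode itself finalizes a block. */
  synchronized void addBlock(long blockId) {
    lastScanTime.put(blockId, System.currentTimeMillis());
  }

  /** Called when a client reports a successful checksum verification. */
  synchronized void verifiedByClient(long blockId) {
    if (lastScanTime.containsKey(blockId)) {
      lastScanTime.put(blockId, System.currentTimeMillis());  // update only
    }
    // Unknown (still-being-written) blocks are deliberately ignored here.
  }
}
{code}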

> client block verification may result in blocks in DataBlockScanner prematurely
> --
>
> Key: HDFS-1122
> URL: https://issues.apache.org/jira/browse/HDFS-1122
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: sam rash
>Assignee: sam rash
>
> Found that when the DN uses client verification of a block that is open for 
> writing, it will add the block to the DataBlockScanner prematurely. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1122) client block verification may result in blocks in DataBlockScanner prematurely

2010-04-29 Thread sam rash (JIRA)
client block verification may result in blocks in DataBlockScanner prematurely
--

 Key: HDFS-1122
 URL: https://issues.apache.org/jira/browse/HDFS-1122
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: sam rash
Assignee: sam rash


Found that when the DN uses client verification of a block that is open for 
writing, it will add the block to the DataBlockScanner prematurely. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-941) Datanode xceiver protocol should allow reuse of a connection

2010-04-29 Thread bc Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bc Wong updated HDFS-941:
-

Attachment: HDFS-941-3.patch

Oops. The previous patch was in the reverse direction.

> Datanode xceiver protocol should allow reuse of a connection
> 
>
> Key: HDFS-941
> URL: https://issues.apache.org/jira/browse/HDFS-941
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, hdfs client
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: bc Wong
> Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, 
> HDFS-941-3.patch
>
>
> Right now each connection into the datanode xceiver only processes one 
> operation.
> In the case that an operation leaves the stream in a well-defined state (eg a 
> client reads to the end of a block successfully) the same connection could be 
> reused for a second operation. This should improve random read performance 
> significantly.
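As a rough illustration of the reuse idea, a hypothetical client-side cache 
(not the patch; class and method names here are made up):
{code}
// Hypothetical sketch: after an operation that ends with the stream in a
// well-defined state, the client parks the socket keyed by datanode address
// instead of closing it, and the next operation to that datanode reuses it.
class SocketCacheSketch {
  private final java.util.Map<java.net.InetSocketAddress, java.net.Socket> cache =
      new java.util.HashMap<java.net.InetSocketAddress, java.net.Socket>();

  synchronized java.net.Socket getOrConnect(java.net.InetSocketAddress dn)
      throws java.io.IOException {
    java.net.Socket s = cache.remove(dn);
    if (s != null && s.isConnected() && !s.isClosed()) {
      return s;                            // reuse a parked connection
    }
    java.net.Socket fresh = new java.net.Socket();  // stale entry is dropped
    fresh.connect(dn);
    return fresh;
  }

  /** Only called when the previous op ended cleanly (e.g. read to end of block). */
  synchronized void release(java.net.InetSocketAddress dn, java.net.Socket s)
      throws java.io.IOException {
    java.net.Socket evicted = cache.put(dn, s);
    if (evicted != null) {
      evicted.close();                     // keep at most one parked socket per DN
    }
  }
}
{code}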

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-165) NPE in datanode.handshake()

2010-04-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862338#action_12862338
 ] 

Jakob Homan commented on HDFS-165:
--

bq. I will merge [this patch] into the lifecycle patch rather than split out 
(as I have done here) 
Steve, I read this to mean that this patch is no longer necessary and can be 
closed as WontFix. Does this sound good to you?


> NPE in datanode.handshake()
> ---
>
> Key: HDFS-165
> URL: https://issues.apache.org/jira/browse/HDFS-165
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-165.patch
>
>
> It appears possible to raise an NPE in DataNode.handshake() if the startup 
> protocol gets interrupted or fails in some manner

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-04-29 Thread bc Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bc Wong updated HDFS-1001:
--

Attachment: HDFS-1001-3.patch

Oops. The previous patch was in the reverse direction.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1079) HDFS implementation should throw exceptions defined in AbstractFileSystem

2010-04-29 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862332#action_12862332
 ] 

Eli Collins commented on HDFS-1079:
---

* What tests cover all the new throws of InvalidPathException?
* Should the rest of h.hdfs.* be converted to HadoopIllegalArgumentException as 
well?

> HDFS implementation should throw exceptions defined in AbstractFileSystem
> -
>
> Key: HDFS-1079
> URL: https://issues.apache.org/jira/browse/HDFS-1079
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.22.0
>
> Attachments: HDFS-1079.1.patch, HDFS-1079.patch, HDFS-1079.patch
>
>
> HDFS implementation Hdfs.java should throw exceptions as defined in 
> AbstractFileSystem. To facilitate this, ClientProtocol should be changed to 
> throw specific exceptions, as defined in AbstractFileSystem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1079) HDFS implementation should throw exceptions defined in AbstractFileSystem

2010-04-29 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-1079:
--

Attachment: HDFS-1079.1.patch

New patch addresses the comments except the following:
# FSDirectory - out of sync with trunk
#* Declare more concrete exceptions beyond IOExceptions
#** renameTo - not changing deprecated methods
#** unprotectedRenameTo - not changing deprecated methods
# FSNamesystem
#* Remove IO exception declaration - not thrown
#** concat - checkPathAccess called from this throws IOException
#** setTimes - method throws IOException


> HDFS implementation should throw exceptions defined in AbstractFileSystem
> -
>
> Key: HDFS-1079
> URL: https://issues.apache.org/jira/browse/HDFS-1079
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.22.0
>
> Attachments: HDFS-1079.1.patch, HDFS-1079.patch, HDFS-1079.patch
>
>
> HDFS implementation Hdfs.java should throw exceptions as defined in 
> AbstractFileSystem. To facilitate this, ClientProtocol should be changed to 
> throw specific exceptions, as defined in AbstractFileSystem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-708) A stress-test tool for HDFS.

2010-04-29 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-708:
-

Status: Open  (was: Patch Available)

Canceling patch pending review updates.

> A stress-test tool for HDFS.
> 
>
> Key: HDFS-708
> URL: https://issues.apache.org/jira/browse/HDFS-708
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Joshua Harlow
> Fix For: 0.22.0
>
> Attachments: slive.patch, SLiveTest.pdf
>
>
> It would be good to have a tool for automatic stress testing HDFS, which 
> would provide IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-708) A stress-test tool for HDFS.

2010-04-29 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862287#action_12862287
 ] 

Konstantin Shvachko commented on HDFS-708:
--

Sorry, (4) was a bit vague. I meant that the start and end times should be 
taken right around the actual HDFS action. E.g., for a write it would be:
{code}
get_start_time;
outputStream.write();
get_elapsed_time;
{code}
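
In Java terms, that measurement might look roughly like this (a sketch only; 
the stream is assumed to be an already-opened HDFS output stream):
{code}
// Rough Java equivalent of the pseudo-code above: the timer brackets only the
// HDFS call itself, not argument setup or result bookkeeping.
static long timedWrite(java.io.OutputStream out, byte[] buffer)
    throws java.io.IOException {
  long start = System.nanoTime();
  out.write(buffer, 0, buffer.length);            // the actual HDFS action
  return (System.nanoTime() - start) / 1000000L;  // elapsed milliseconds
}
{code}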

15. Forgot to mention that SLive should have a test. It can be simple. It can 
call slive on local MR and local FS with some reasonable parameters, which 
trigger most of the code paths. An alternative is to start Mini clusters and 
run slive on them. The important thing is that it should not take a long time 
to run.

> A stress-test tool for HDFS.
> 
>
> Key: HDFS-708
> URL: https://issues.apache.org/jira/browse/HDFS-708
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Joshua Harlow
> Fix For: 0.22.0
>
> Attachments: slive.patch, SLiveTest.pdf
>
>
> It would be good to have a tool for automatic stress testing HDFS, which 
> would provide IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-801) Add SureLogic annotations' jar into Ivy and Eclipse configs

2010-04-29 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-801:
-

Status: Patch Available  (was: Open)

Re-submitting to Hudson before commit, since it's been a while since the last 
run.

> Add SureLogic annotations' jar into Ivy and Eclipse configs
> ---
>
> Key: HDFS-801
> URL: https://issues.apache.org/jira/browse/HDFS-801
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: build, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Edwin Chan
> Attachments: hdfs_3.1.0.patch, hdfs_3.1.0.patch
>
>
> In order to use SureLogic analysis tools and allow their concurrency analysis 
> annotations in HDFS code the annotations library has to be automatically 
> pulled from a Maven repo. Also, it has to be added to Eclipse .classpath 
> template.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-801) Add SureLogic annotations' jar into Ivy and Eclipse configs

2010-04-29 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-801:
-

Status: Open  (was: Patch Available)

> Add SureLogic annotations' jar into Ivy and Eclipse configs
> ---
>
> Key: HDFS-801
> URL: https://issues.apache.org/jira/browse/HDFS-801
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: build, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Edwin Chan
> Attachments: hdfs_3.1.0.patch, hdfs_3.1.0.patch
>
>
> In order to use SureLogic analysis tools and allow their concurrency analysis 
> annotations in HDFS code the annotations library has to be automatically 
> pulled from a Maven repo. Also, it has to be added to Eclipse .classpath 
> template.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1079) HDFS implementation should throw exceptions defined in AbstractFileSystem

2010-04-29 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862237#action_12862237
 ] 

Sanjay Radia commented on HDFS-1079:


* DFSClient:
The methods should declare only IOException; the actual exceptions are 
declared in ClientProtocol. This will make it easier to keep the exception 
declarations up to date.


* FSDirectory - out of sync with trunk
** Declare more concrete exceptions beyond IOExceptions
*** renameTo -
*** unprotectedRenameTo
*** getPreferredBlockSize 
*** addSymlink
* FSNamesystem
** Remove IO exception declaration - not thrown
*** unprotectedConcat 
*** getBlockLocations* 
*** createLocatedBlock 
*** concat
*** setTimes
** Throw more detailed exceptions
*** startFile* 
*** appendFile 
*** delete* 


* File a Jira to clean up IOException - a better exception can be thrown
** removeBlock, removeLastBlock
** getCurrentUser
** plus a few others.

> HDFS implementation should throw exceptions defined in AbstractFileSystem
> -
>
> Key: HDFS-1079
> URL: https://issues.apache.org/jira/browse/HDFS-1079
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.22.0
>
> Attachments: HDFS-1079.patch, HDFS-1079.patch
>
>
> HDFS implementation Hdfs.java should throw exceptions as defined in 
> AbstractFileSystem. To facilitate this, ClientProtocol should be changed to 
> throw specific exceptions, as defined in AbstractFileSystem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1113) Allow users with write access to a directory to change ownership of its subdirectories/files

2010-04-29 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862197#action_12862197
 ] 

Owen O'Malley commented on HDFS-1113:
-

Making the file owner into a user-settable string is a huge cost to security. 
Making users scan the audit log from the beginning of time to find the creator 
of a file isn't a great answer.

Isn't the motivation really that you want to control access to the file? It 
seems like ACLs really answer your request (and many additional ones).

> Allow users with write access to a directory to change ownership of its 
> subdirectories/files
> 
>
> Key: HDFS-1113
> URL: https://issues.apache.org/jira/browse/HDFS-1113
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
> Environment: All
>Reporter: Milind Bhandarkar
>Assignee: Sanjay Radia
>
> owner and group of a file/directory, and namespace/diskspace quota for a 
> directory are mutable attributes. If I have writable access to a directory, 
> say /team/MyTeam, and if there are subdirectories underneath, such as 
> /team/MyTeam/TeamMember1, /team/MyTeam/TeamMember2, then I should be able to 
> chown, chgrp, setQuota, clrQuota on TeamMemeber{1|2} subdirectories. 
> Currently in HDFS (and in Posix), it requires me to be a superuser to perform 
> these operations.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1051) Umbrella Jira for Scaling the HDFS Name Service

2010-04-29 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862148#action_12862148
 ] 

Jeff Hammerbacher commented on HDFS-1051:
-

Another piece of research germane to this JIRA: "Haceph: Scalable Metadata 
Management for Hadoop using Ceph" from UCSC 
(http://www.soe.ucsc.edu/~carlosm/Papers/eestolan-nsdi10-abstract.pdf)

> Umbrella Jira for Scaling the HDFS Name Service
> ---
>
> Key: HDFS-1051
> URL: https://issues.apache.org/jira/browse/HDFS-1051
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.22.0
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
>
> The HDFS Name service currently uses a single Namenode which limits its 
> scalability. This is a master jira to track sub-jiras to address this problem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1121) Allow HDFS client to measure distribution of blocks across devices for a specific DataNode

2010-04-29 Thread Jeff Hammerbacher (JIRA)
Allow HDFS client to measure distribution of blocks across devices for a 
specific DataNode
--

 Key: HDFS-1121
 URL: https://issues.apache.org/jira/browse/HDFS-1121
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Reporter: Jeff Hammerbacher


As discussed on the mailing list, it would be useful if the DfsClient could 
measure the distribution of blocks across devices for an individual DataNode.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1120) Make DataNode's block-to-device placement policy pluggable

2010-04-29 Thread Jeff Hammerbacher (JIRA)
Make DataNode's block-to-device placement policy pluggable
--

 Key: HDFS-1120
 URL: https://issues.apache.org/jira/browse/HDFS-1120
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Jeff Hammerbacher


As discussed on the mailing list, as the number of disk drives per server 
increases, it would be useful to allow the DataNode's policy for new block 
placement to grow in sophistication from the current round-robin strategy.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1119) Refactor BlocksMap with GettableSet

2010-04-29 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-1119:
-

Attachment: h1119_20100429.patch

h1119_20100429.patch:
- Added GettableSet interface.
- Added GettableSetByHashMap, a GettableSet implementation using 
java.util.HashMap.
- Used GettableSet in BlocksMap.
- Also removed unused getLoadFactor() methods.

> Refactor BlocksMap with GettableSet
> ---
>
> Key: HDFS-1119
> URL: https://issues.apache.org/jira/browse/HDFS-1119
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
> Attachments: h1119_20100429.patch
>
>
> The data structure required in BlocksMap is a GettableSet.  See also [this 
> comment|https://issues.apache.org/jira/browse/HDFS-1114?focusedCommentId=12862118&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12862118].

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1119) Refactor BlocksMap with GettableSet

2010-04-29 Thread Tsz Wo (Nicholas), SZE (JIRA)
Refactor BlocksMap with GettableSet
---

 Key: HDFS-1119
 URL: https://issues.apache.org/jira/browse/HDFS-1119
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Reporter: Tsz Wo (Nicholas), SZE


The data structure required in BlocksMap is a GettableSet.  See also [this 
comment|https://issues.apache.org/jira/browse/HDFS-1114?focusedCommentId=12862118&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12862118].

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1114) Reducing NameNode memory usage by an alternate hash table

2010-04-29 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862118#action_12862118
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1114:
--

The data structure we need in BlocksMap is a GettableSet.
{code}
/**
 * A set which supports the get operation.
 * @param <E> The type of the elements.
 */
public interface GettableSet<E> extends Iterable<E> {
  /**
   * @return the size of this set.
   */
  int size();

  /**
   * @return true if the given element equals to a stored element.
   * Otherwise, return false.
   */
  boolean contains(Object element);

  /**
   * @return the stored element if there is any.  Otherwise, return null.
   */
  E get(Object element);

  /**
   * Add the given element to this set.
   * @return the previous stored element if there is any.
   * Otherwise, return null.
   */
  E add(E element);

  /**
   * Remove the element from the set.
   * @return the stored element if there is any.  Otherwise, return null.
   */
  E remove(Object element);
}
{code}
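
For illustration, one straightforward way to back this interface with a 
java.util.HashMap (a sketch only, compiled against the interface above; the 
actual GettableSetByHashMap in the patch may differ):
{code}
// Sketch of a HashMap-backed implementation: the map maps each element to
// itself so that get() can return the stored instance rather than the probe.
public class GettableSetByHashMap<E> implements GettableSet<E> {
  private final java.util.HashMap<Object, E> map =
      new java.util.HashMap<Object, E>();

  public int size() { return map.size(); }

  public boolean contains(Object element) { return map.containsKey(element); }

  public E get(Object element) { return map.get(element); }

  public E add(E element) { return map.put(element, element); }

  public E remove(Object element) { return map.remove(element); }

  public java.util.Iterator<E> iterator() { return map.values().iterator(); }
}
{code}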

> Reducing NameNode memory usage by an alternate hash table
> -
>
> Key: HDFS-1114
> URL: https://issues.apache.org/jira/browse/HDFS-1114
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> NameNode uses a java.util.HashMap to store BlockInfo objects.  When there are 
> many blocks in HDFS, this map uses a lot of memory in the NameNode.  We may 
> optimize the memory usage by a light weight hash table implementation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.