Joanna Harpell wrote:
A few questions about NDFS:
- Can a single NDFS file have multiple concurrent writers?
No. This is not supported.
- Can the block size be changed in a running NDFS file system?
Yes, that is the intent.
- Are there any plans to localize Map tasks to block-resident nodes?
Yes. I think MapReduce & NDFS are now getting stable enough that we can
begin work on performance enhancements like this. Other optimizations
I'd like to get to soon:
- one copy of the reduce output should be written to the local node if
  possible;
- reduce tasks should start as soon as the first map task completes,
  copying and sorting map output in parallel with the remaining map
  tasks;
- the job tracker should assign new tasks to the nodes that are working
  on the fewest tasks.
Each of these should make a significant performance improvement.
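To give a rough idea of the locality piece, the job tracker could do
something like the following when a tasktracker asks for work. This is
just a sketch; LocalityAwareAssigner, MapTask, getBlockHosts() and the
other names are made up for illustration, not the actual MapReduce
classes:

// Sketch only -- hypothetical names, not the real job tracker code.
import java.util.List;

class LocalityAwareAssigner {

  /** Prefer a map task whose input block already lives on the requesting node. */
  static MapTask assignTask(String trackerHost, List<MapTask> pending) {
    for (int i = 0; i < pending.size(); i++) {
      MapTask task = pending.get(i);
      // A replica of this task's input block is on the tracker's host,
      // so the map can read its input locally instead of over the network.
      if (task.getBlockHosts().contains(trackerHost)) {
        return pending.remove(i);
      }
    }
    // No pending task has local data: hand out any remaining task (remote read).
    return pending.isEmpty() ? null : pending.remove(0);
  }

  /** Hypothetical task descriptor: just enough state for this example. */
  static class MapTask {
    private final List<String> blockHosts;  // hosts holding replicas of the input block
    MapTask(List<String> blockHosts) { this.blockHosts = blockHosts; }
    List<String> getBlockHosts() { return blockHosts; }
  }
}

If no pending task has a local block we fall back to remote reads, so
the worst case is no worse than what we do today.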
- Is there any reason that big files (>>10TB?) wouldn't work?
Not that I can think of.
- Is there any reason that big blocks (1GB?) wouldn't work?
Not that I can think of.
The total number of blocks must not grow too large, though, since the
name -> blockId* mapping is kept in RAM on the namenode.
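To put rough numbers on it: a 10TB file split into 1GB blocks is only
about 10,000 entries in that table, while the same file split into, say,
32MB blocks is over 300,000. So bigger blocks actually help here; it's
the block count, not the file size, that the namenode has to worry
about.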
Doug