Joanna Harpell wrote:
 A few questions about NDFS:

- Can a single NDFS file have multiple concurrent writers?

No.  This is not supported.

- Can the block size be changed in a running NDFS file system?

Yes, that is the intent.

- Are there any plans to localize Map tasks to block-resident nodes?

Yes. I think MapReduce & NDFS are now getting stable enough that we can begin work on performance enhancements like this. Other optimizations I'd like to get to soon:

- one copy of reduce output should be written to the local node if possible;
- reduce tasks should start as soon as the first map task is complete, copying and sorting map output in parallel with the remainder of the map tasks;
- the job tracker should assign new tasks to nodes that are working on the fewest tasks.

Each of these should make a significant performance improvement.
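A minimal sketch of the last two scheduling ideas combined, block-locality first, then least-loaded tie-breaking. This is illustrative only; the class and method names are hypothetical, not actual NDFS/MapReduce code:

```java
import java.util.*;

// Hypothetical scheduler sketch: prefer a node that already holds the
// task's input block; among equally-local nodes, pick the one running
// the fewest tasks. Names are illustrative, not NDFS code.
public class LocalityScheduler {
    private final Map<String, Integer> runningTasks = new HashMap<>();

    public LocalityScheduler(Collection<String> nodes) {
        for (String n : nodes) runningTasks.put(n, 0);
    }

    /** Pick a node for a task whose input block lives on blockHosts. */
    public String assign(Set<String> blockHosts) {
        String best = null;
        for (String node : runningTasks.keySet()) {
            if (best == null || better(node, best, blockHosts)) {
                best = node;
            }
        }
        runningTasks.merge(best, 1, Integer::sum); // record the assignment
        return best;
    }

    private boolean better(String a, String b, Set<String> hosts) {
        boolean aLocal = hosts.contains(a), bLocal = hosts.contains(b);
        if (aLocal != bLocal) return aLocal;               // locality first
        return runningTasks.get(a) < runningTasks.get(b);  // then least-loaded
    }
}
```

A real job tracker would also have to handle task failure and re-replication, but the core preference order (local block, then lightest load) is just a two-key comparison like the one above.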

- Is there any reason that big files (>>10TB?) wouldn't work?

Not that I can think of.

- Is there any reason that big blocks (1GB?) wouldn't work?

Not that I can think of.

The total number of blocks must not grow too large, since the name -> blockId* mapping is kept in RAM on the namenode.
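A quick back-of-envelope check of that constraint: it is the block count, not the file size, that the namenode's RAM limits, which is exactly why larger blocks help for very large files. A sketch of the arithmetic (the figures are illustrative, not measured NDFS numbers):

```java
// Block-count arithmetic for the namenode RAM constraint: a 10TB file
// yields far fewer map entries with 1GB blocks than with 64MB blocks.
public class BlockCountEstimate {
    static long blocks(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    public static void main(String[] args) {
        long tenTB = 10L << 40;  // 10 TB
        long mb64  = 64L << 20;  // 64 MB block size
        long gb1   = 1L  << 30;  // 1 GB block size
        System.out.println(blocks(tenTB, mb64)); // 163840 blocks
        System.out.println(blocks(tenTB, gb1));  // 10240 blocks
    }
}
```

So a 10TB file needs 163,840 entries at 64MB blocks but only 10,240 at 1GB blocks; across many such files, that difference determines whether the in-RAM mapping stays manageable.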

Doug


_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers