On Wed, Jul 15, 2009 at 19:31, Tomislav Poljak<[email protected]> wrote:
> Hi,
> I'm trying to merge (using nutch-1.0 mergesegs) about 1.2MM pages on one
> machine contained in 10 segments, using:
>
> bin/nutch mergesegs crawl/merge_seg -dir crawl/segments
>
> ,but there is not enough space on 500G disk to complete this merge task
> (getting java.io.IOException: No space left on device in hadoop.log)
>
> Shouldn't 500G be enough disk space for this merge? Is this a bug? If
> this is not a bug, how much disk space is required for this merge?
>

A lot :)

Try deleting your hadoop temporary folders. If that doesn't help, you can
try merging the segment parts one by one. For example, move the content/
directories out of your segments and try merging again. If that succeeds,
you can merge the contents later and move the resulting content/ into your
merge_seg dir.

> Tomislav
>

--
Doğacan Güney
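
A rough sketch of the "move your content/ directories" step, in case it helps. The function name stash_content and the paths crawl/content_stash and crawl/merge_content are made up for illustration and are not part of Nutch. I am also assuming that mergesegs will accept segments that hold only content/, which I have not verified here, so treat the commented commands as a starting point rather than a recipe.

```shell
#!/bin/sh
# Sketch only: move each segment's content/ directory into a parallel
# stash tree so the first merge has less data to copy.
# stash_content and all paths below are hypothetical names.

# stash_content SEGMENTS_DIR STASH_DIR
# Moves SEGMENTS_DIR/<segment>/content to STASH_DIR/<segment>/content
# and leaves the other segment parts where they are.
stash_content() {
    segments_dir=$1
    stash_dir=$2
    for seg in "$segments_dir"/*/; do
        seg=${seg%/}                        # drop the trailing slash
        [ -d "$seg/content" ] || continue   # skip segments without content/
        name=${seg##*/}                     # segment directory name
        mkdir -p "$stash_dir/$name"
        mv "$seg/content" "$stash_dir/$name/content"
    done
}

# Possible usage, run from the Nutch install directory (assumed layout):
#   df -h .                                   # check free space first
#   stash_content crawl/segments crawl/content_stash
#   bin/nutch mergesegs crawl/merge_seg -dir crawl/segments
#   bin/nutch mergesegs crawl/merge_content -dir crawl/content_stash
# Afterwards, move the merged content/ into the segment under crawl/merge_seg.
```

Keeping the stash on the same filesystem makes the mv a rename, so it should not need extra disk space by itself.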
