On Wed, Jul 15, 2009 at 19:31, Tomislav Poljak<[email protected]> wrote:
> Hi,
> I'm trying to merge (using nutch-1.0 mergesegs) about 1.2MM pages on one
> machine contained in 10 segments, using:
>
> bin/nutch mergesegs crawl/merge_seg -dir crawl/segments
>
> but there is not enough space on the 500G disk to complete this merge task
> (getting java.io.IOException: No space left on device in hadoop.log)
>
> Shouldn't 500G be enough disk space for this merge? Is this a bug? If
> this is not a bug, how much disk space is required for this merge?
>

A lot :)

Try deleting your hadoop temporary folders first. If that doesn't help, you
can try merging the segment parts separately. For example, move the content/
directories out of your segments and run the merge again. If that succeeds,
you can merge the content/ directories later and move the resulting content/
into your merge_seg dir.
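
A rough sketch of the "move your content/ directories" step, in case it
helps. It assumes the usual layout where every segment under crawl/segments
has its own content/ subdirectory. The function name stash_content and the
crawl/content_stash directory are made up for illustration, so adjust them
to your setup and try it on a copy first.

```shell
#!/bin/sh
# Sketch only: set each segment's content/ directory aside so that
# mergesegs has less data to copy. stash_content and the stash path
# are hypothetical names, not part of Nutch.

# stash_content SEGMENTS_DIR STASH_DIR
# Moves SEGMENTS_DIR/<segment>/content to STASH_DIR/<segment>/content.
# STASH_DIR should live outside SEGMENTS_DIR.
stash_content() {
    segments_dir=$1
    stash_dir=$2
    for seg in "$segments_dir"/*/; do
        seg=${seg%/}                        # drop the trailing slash
        [ -d "$seg/content" ] || continue   # skip segments without content/
        name=$(basename "$seg")
        mkdir -p "$stash_dir/$name"
        mv "$seg/content" "$stash_dir/$name/content"
    done
}

# Intended use (not run here):
#   stash_content crawl/segments crawl/content_stash
#   bin/nutch mergesegs crawl/merge_seg -dir crawl/segments
```

Because the stash keeps one directory per segment, it should be possible to
run mergesegs on it later in the same way and then move the merged content/
into the segment under crawl/merge_seg. I have not verified that on your
data.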

> Tomislav
>
>



-- 
Doğacan Güney
