> > Hello List,
> > I noticed that if I run a mergesegs on a single segment, the resulting
> > segment is bigger than the original. Is it a feature or a bug?
> > This behaviour occurred mainly in the directory parse-data. All other
> > dirs are more or less equal (+-3MB).
> >
> > Org
> > Parse-data: 550 MB
> >
> > After mergesegs
> > Parse-data: 1.5 GB
> >

> This looks wrong. Could you try it on a smaller segment, dump the
> "before" and "after" to a text file and see what's different?
> 
Hmm, it looks like it has something to do with DMOZ-parsed fetchlists.
Unfortunately I cannot reconstruct the complete process. On smaller segments
the merge went well. I will let you know if I find out more.
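
For anyone who wants to narrow this down on their own segments, here is a
minimal sketch of a per-directory size comparison between the original and the
merged segment. It uses only the Python standard library and does not call
Nutch at all; the function names dir_size and compare_segments and the two
command-line paths are hypothetical, chosen for illustration.

```python
import os
import sys


def dir_size(path):
    """Return the total size in bytes of all files under path (0 if missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def compare_segments(before, after):
    """Map each top-level subdirectory name to (bytes_before, bytes_after)."""
    names = set()
    for seg in (before, after):
        for entry in os.listdir(seg):
            if os.path.isdir(os.path.join(seg, entry)):
                names.add(entry)
    result = {}
    for name in sorted(names):
        # A subdirectory present on only one side is reported as 0 on the other.
        result[name] = (
            dir_size(os.path.join(before, name)),
            dir_size(os.path.join(after, name)),
        )
    return result


# Usage (paths are placeholders): python compare_segs.py <original> <merged>
if __name__ == "__main__" and len(sys.argv) == 3:
    for name, (b, a) in compare_segments(sys.argv[1], sys.argv[2]).items():
        print(f"{name:20s} {b:>15d} {a:>15d} {a - b:>+15d}")
```

This only shows which directory grew and by how much. To see why, the segment
contents would still have to be dumped to text and diffed, as suggested above.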

Greets 
Matthias


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
