Emmanuel JOKE wrote:
> Hi Guys,
> 
> I've read an article which explains that we can now use Hadoop's
> native library to compress our crawled data.
> 
> I'm just wondering how we can compress a crawldb, and all the other
> data that is already saved on disk.
> Could you please help me?

You can use the *Merger tools to re-write the data, e.g. CrawlDbMerger 
for the crawldb, giving just a single db as the input argument. The 
merge job writes its output using your current Hadoop compression 
settings, so the re-written data should come out compressed.
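
Something like this, roughly (untested; the property names are the
Hadoop 0.x ones and the crawl/crawldb paths are just examples, so
adjust both to your setup):

  # 1. Enable compressed job output, e.g. in conf/hadoop-site.xml:
  #      mapred.output.compress          = true
  #      mapred.output.compression.type  = BLOCK
  #      mapred.output.compression.codec = org.apache.hadoop.io.compress.GzipCodec
  #    (with the native lib loaded, GzipCodec uses native zlib)

  # 2. Re-write the crawldb with the merger, passing the existing db
  #    as the only input:
  bin/nutch mergedb crawl/crawldb_new crawl/crawldb

  # 3. Once you've verified the new db, swap it in:
  mv crawl/crawldb crawl/crawldb.old
  mv crawl/crawldb_new crawl/crawldb

The same trick should work for segments (bin/nutch mergesegs) and the
linkdb (bin/nutch mergelinkdb).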


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

