Hi

I wonder how I can extract the info that the language-identifier plugin
produces. Ideally, I would like the info to appear when I dump the data
from the segments with the following command:

bin/nutch readseg -dump crawl/segments/... output -nofetch -nogenerate
-noparse -noparsedata -parsetext

something like
URL:: http://...
Language:: en

Is it possible to get Nutch to output it like that? If not, is it possible to
get the info in some other way? As it is now, I can't seem to find the info
anywhere. I've done the invert links and index steps, but I have no idea
where my language info is stored or how to extract it.
