On Wed, 2007-06-20 at 14:33 +0530, karan thakral wrote:
> Hi,
>
> I need to write plugins for extracting the info from meta tags in HTML
> documents. The HTML documents have meta tags such as Title, Publisher,
> Creator, and date.
>
> Are there already built-in plugins available with the Nutch distribution, or
> will I have to write the plugin myself?

Have a look at
$NUTCH_HOME/src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HtmlParser.java

There is a line:
HTMLMetaProcessor.getMetaTags(metaTags, root, base);

HTH

Regards
-- 
Thorsten Scherler thorsten.at.apache.org
Open Source Java consulting, training and solutions

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
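
For readers who want a feel for what such a meta-tag pass might look like before opening HtmlParser.java, here is a minimal sketch in plain JDK Java. It is not Nutch code and does not use HTMLMetaProcessor; the class name MetaTagSketch and the methods collectMeta and extractMeta are hypothetical names chosen for illustration. It also assumes the input is well-formed XHTML, which real-world pages often are not, so a tolerant HTML parser would be needed in practice.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only: not part of Nutch.
public class MetaTagSketch {

    // Recursively walks the DOM and records name/content pairs of <meta> elements.
    static void collectMeta(Node node, Map<String, String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "meta".equalsIgnoreCase(node.getNodeName())) {
            Element meta = (Element) node;
            String name = meta.getAttribute("name");       // "" when the attribute is absent
            String content = meta.getAttribute("content");
            if (!name.isEmpty()) {
                // Lower-case the key so "Publisher" and "publisher" are treated alike.
                out.put(name.toLowerCase(Locale.ROOT), content);
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectMeta(children.item(i), out);
        }
    }

    // Assumption: the input is well-formed XHTML, so the JDK XML parser can read it.
    static Map<String, String> extractMeta(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            Map<String, String> meta = new LinkedHashMap<>();
            collectMeta(doc.getDocumentElement(), meta);
            return meta;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse input as XHTML", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><head>"
                + "<meta name=\"Title\" content=\"Sample page\"/>"
                + "<meta name=\"Publisher\" content=\"Example Press\"/>"
                + "<meta name=\"Creator\" content=\"Jane Doe\"/>"
                + "<meta name=\"date\" content=\"2007-06-20\"/>"
                + "</head><body><p>Hello</p></body></html>";
        // Prints one line per meta tag found, in document order.
        extractMeta(page).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

In a Nutch plugin the DOM would come from the HTML parser rather than from a string, and the extracted values would presumably be attached to the parse metadata; how exactly to do that depends on the Nutch version, so check the source file named above.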
