[ 
http://issues.apache.org/jira/browse/NUTCH-62?page=comments#action_12312825 ] 

Jack Tang commented on NUTCH-62:
--------------------------------

The attachment contains MetaDataParser and a config file. The parser looks up
HTML META tags and stores their name-value pairs in the metaData map, so that
you can then index the information in the index-more plugin.
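
For readers without the attachment, the idea of collecting META name-value
pairs into a map can be sketched roughly as below. This is not the attached
MetaDataParser and does not use any Nutch API: the class name MetaTagSketch,
the method extractMeta, and the regex-based matching are all assumptions made
for illustration, using only the Java standard library. A real parser would
more likely walk the parsed DOM, and this sketch will miss unquoted attribute
values or a ">" inside a value.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the MetaDataParser from the attachment.
public class MetaTagSketch {
    // Matches a whole <meta ...> tag, case-insensitively.
    private static final Pattern META_TAG =
        Pattern.compile("<meta\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    // Matches attr="value" or attr='value' inside a tag.
    private static final Pattern ATTR =
        Pattern.compile("([a-zA-Z-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Returns the name -> content pairs of all META tags that carry both attributes.
    public static Map<String, String> extractMeta(String html) {
        Map<String, String> metaData = new LinkedHashMap<>();
        Matcher tag = META_TAG.matcher(html);
        while (tag.find()) {
            String name = null;
            String content = null;
            Matcher attr = ATTR.matcher(tag.group(1));
            while (attr.find()) {
                String key = attr.group(1).toLowerCase(Locale.ROOT);
                // Group 3 holds a double-quoted value, group 4 a single-quoted one.
                String value = attr.group(3) != null ? attr.group(3) : attr.group(4);
                if (key.equals("name")) {
                    name = value;
                } else if (key.equals("content")) {
                    content = value;
                }
            }
            // Tags without both name and content (e.g. charset, http-equiv) are skipped.
            if (name != null && content != null) {
                metaData.put(name.toLowerCase(Locale.ROOT), content);
            }
        }
        return metaData;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<META NAME=\"keywords\" CONTENT=\"nutch, search\">"
            + "<meta name='author' content='Jack Tang'>"
            + "</head><body></body></html>";
        System.out.println(extractMeta(html));
    }
}
```

The resulting map plays the role of the metaData map described above: an
indexing step could iterate over its entries and add each one as a field.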

> Add html META tag information into metaData in index-more plugin
> ----------------------------------------------------------------
>
>          Key: NUTCH-62
>          URL: http://issues.apache.org/jira/browse/NUTCH-62
>      Project: Nutch
>         Type: Improvement
>   Components: indexer
>     Reporter: Jack Tang
>     Priority: Trivial
>
> Currently (version dev-0.7), only some metadata from the HTTP response, such
> as type, date, and content-length, is available in the index-more plugin. We
> cannot index or store the metadata in the HTML header (the <META> tags, to
> be exact).

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers