On 3/2/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
> Dennis Kubes wrote:
> > Believe it or not I don't think that meta tags are currently stored.
> > I looked through the html parsing code and didn't see anywhere that it
> > could be storing it except in html filters.  I see that meta tags are
> > parsed and passed to the html filters but I didn't see any default
> > filter that was storing them.
> >
> > If there isn't a reason why we shouldn't be storing meta tags, if we
> > aren't currently storing them (I could be missing where this is
> > happening :) ), and this is something that people want then I can
> > create an html filter that will store the meta-tags in the Parse
> > MetaData.

Yes, please! That would be nice. Maybe we could split it into metatag-parse,
metatag-index, and metatag-query? That way, those who want this can turn it
on as a plugin.
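
To make the parse-side idea concrete, here is a rough, self-contained sketch of
what such a filter could do: pull name/content pairs out of <meta> tags and keep
them in a metadata map. Everything in it is an assumption for illustration. The
class name MetaTagSketch, the "metatag." key prefix, and the regex-based
extraction are all hypothetical, and a plain Map stands in for the Parse
MetaData. A real plugin would presumably implement Nutch's HtmlParseFilter
extension point and use the meta tags that are already parsed and passed to it.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for an HTML filter; this is NOT the Nutch API.
final class MetaTagSketch {
    // Matches a whole <meta ...> tag, case-insensitively.
    private static final Pattern META_TAG =
        Pattern.compile("<meta\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // Matches one quoted attribute: key="value" or key='value'.
    private static final Pattern ATTRIBUTE =
        Pattern.compile("([a-zA-Z-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Returns name -> content for every <meta name=... content=...> tag.
    // Keys are lower-cased and prefixed with "metatag." (an assumed convention).
    static Map<String, String> extractMetaTags(String html) {
        Map<String, String> metadata = new LinkedHashMap<>();
        Matcher tag = META_TAG.matcher(html);
        while (tag.find()) {
            String name = null;
            String content = null;
            Matcher attr = ATTRIBUTE.matcher(tag.group(1));
            while (attr.find()) {
                String key = attr.group(1).toLowerCase(Locale.ROOT);
                String value = attr.group(3) != null ? attr.group(3) : attr.group(4);
                if (key.equals("name")) {
                    name = value;
                } else if (key.equals("content")) {
                    content = value;
                }
            }
            // Tags without both attributes (e.g. http-equiv) are skipped.
            if (name != null && content != null) {
                metadata.put("metatag." + name.toLowerCase(Locale.ROOT), content);
            }
        }
        return metadata;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<meta name=\"keywords\" content=\"nutch, search\">"
            + "<meta name='Description' content='A crawler'>"
            + "</head><body></body></html>";
        extractMetaTags(html).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

A separate metatag-index piece could then copy only the keys it cares about
into the index, so the stored metadata is only paid for by those who enable the
plugin.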

> The reason is simple - space. Storing additional data consumes space,
> and if someone just occasionally needs this info from one or two pages
> it's less costly to re-parse the page again.

Oh, I see. Now I understand. But I wonder what the MetaData parser is
really doing. Is it used anywhere in the crawl-index life cycle at all?
Just wondering...




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
