Hi Chad,

The link was only a configuration example.
A more detailed example:

http://www.misite.com/videos/.*=videos (rule A)

If the fetched URL matches rule A, the plugin indexes a field named 'category' with the value 'videos'. Later you can search on the category field to filter your searches.

I will post the plugin here on the list, in a new thread. I don't know of another way to share it with you.

Regards,
Ernesto.

[EMAIL PROTECTED] wrote:
> couldn't get the link to work but yes if you could share that would be
> great.
>
> Chad Savage
>
> Ernesto De Santis wrote:
>> I did a url-category-indexer.
>>
>> It works with a .properties file that maps URLs written as regexps to
>> categories.
>> example:
>>
>> http://www.misite.com/videos/.*=videos
>>
>> If it seems useful, I can share it.
>>
>> Maybe it would be better to configure it in an .xml file.
>>
>> Regards,
>> Ernesto.
>>
>> Stefan Neufeind wrote:
>>> Alvaro Cabrerizo wrote:
>>>
>>>> Have you included a node to describe your new searcher filter into
>>>> plugin.xml?
>>>>
>>>> 2006/10/11, xu nutch <[EMAIL PROTECTED]>:
>>>>
>>>>> I have a question about myplugin for indexfilter and queryfilter.
>>>>> Can u Help me !
>>>>> -------------------------------------
>>>>> MoreIndexingFilter.java in add
>>>>> doc.add(new Field("category", "test", false, true, false));
>>>>> -------------------------------------
>>>>>
>>>>> --------------------------------------
>>>>>
>>>>> package org.apache.nutch.searcher.more;
>>>>>
>>>>> import org.apache.nutch.searcher.RawFieldQueryFilter;
>>>>>
>>>>> /** Handles "category:" query clauses, causing them to search the
>>>>> field indexed by
>>>>> * BasicIndexingFilter.
>>>>> */
>>>>> public class CategoryQueryFilter extends RawFieldQueryFilter {
>>>>>   public CategoryQueryFilter() {
>>>>>     super("category");
>>>>>   }
>>>>> }
>>>>> -----------------------------------------------
>>>>> -----------------------------------------------
>>>>>
>>>>> <property>
>>>>>   <name>plugin.includes</name>
>>>>>   <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html)|index-(basic|more)|query-(basic|site|url|more)</value>
>>>>>   <description>Regular expression naming plugin directory names to
>>>>>   include. Any plugin not matching this expression is excluded.
>>>>>   In any case you need at least include the nutch-extensionpoints
>>>>>   plugin. By
>>>>>   default Nutch includes crawling just HTML and plain text via HTTP,
>>>>>   and basic indexing and search plugins.
>>>>>   </description>
>>>>> </property>
>>>>> -----------------------------------------------
>>>>>
>>>>> When I use Luke to query "category:test" it is OK,
>>>>> but when I use the Tomcat website to query "category:test",
>>>>> no results are returned.
>>>>>
>>>
>>> In case you get the search working:
>>> How do you plan to categorize URLs/sites? I'm looking for a solution
>>> there, since I didn't yet manage to implement something
>>> URL-prefix-filter based to map categories to URLs or so.
>>>
>>> Regards,
>>> Stefan

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
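
For readers of the archive, the URL-to-category rule described at the top of this message (rule A) can be sketched roughly as below. This is not Ernesto's plugin, which is not included in this thread; the class name UrlCategoryRules and the method categoryFor are hypothetical, and the sketch uses only the standard Java library, not the Nutch API. It also assumes each rule line is split at its last '=' character, because java.util.Properties would treat the ':' in "http:" as a key separator unless it is escaped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical helper (not the plugin from this thread): maps URL regexps to categories. */
public class UrlCategoryRules {

    /** One "<url-regexp>=<category>" rule. */
    private static final class Rule {
        final Pattern pattern;
        final String category;

        Rule(Pattern pattern, String category) {
            this.pattern = pattern;
            this.category = category;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Parses rule lines; blank lines, '#' comments and malformed lines are skipped. */
    public UrlCategoryRules(Iterable<String> lines) {
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // Assumption: split at the LAST '=', so '=' or ':' inside the URL regexp is harmless.
            int eq = line.lastIndexOf('=');
            if (eq <= 0 || eq == line.length() - 1) {
                continue;
            }
            rules.add(new Rule(Pattern.compile(line.substring(0, eq)), line.substring(eq + 1)));
        }
    }

    /** Returns the category of the first rule whose regexp matches the whole URL, if any. */
    public Optional<String> categoryFor(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).matches()) {
                return Optional.of(rule.category);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        UrlCategoryRules rules = new UrlCategoryRules(
                Arrays.asList("http://www.misite.com/videos/.*=videos"));
        System.out.println(rules.categoryFor("http://www.misite.com/videos/clip1.html"));
        System.out.println(rules.categoryFor("http://www.misite.com/news/today.html"));
    }
}
```

In an indexing filter, the returned category would presumably take the place of the hard-coded "test" value in the doc.add(new Field("category", ...)) call quoted earlier in the thread. Rules are tried in file order and the first match wins, which is a design choice of this sketch and not something stated in the thread.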
