Hi Jeff,

Off the top of my head, the best way to do that might be to inject the
filters as a single CSV blob, then write (or modify) an indexing filter that
splits the blob and indexes each value separately in a "multiValued" Solr field.
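
To make the idea concrete, here is a rough sketch of just the splitting
logic, written in Python for brevity. A real Nutch indexing filter would be a
Java plugin, and none of this is Nutch or Solr API. The function names
(split_groups, seed_line, solr_doc) and the "group" tag and field names are
made up for illustration.

```python
def split_groups(blob):
    """Split a comma-separated blob into clean, de-duplicated values."""
    values = []
    for part in blob.split(","):
        # Drop surrounding whitespace and any stray double quotes.
        value = part.strip().strip('"')
        if value and value not in values:
            values.append(value)
    return values


def seed_line(url, groups, tag="group"):
    """Build one tab-delimited seed line carrying all groups as one CSV value."""
    return f"{url}\t{tag}={','.join(groups)}"


def solr_doc(url, blob, field="group"):
    """Hypothetical document shape: one list entry per group value,
    which is what a multiValued field would receive."""
    return {"id": url, field: split_groups(blob)}


if __name__ == "__main__":
    line = seed_line("http://www.domain1.com", ["filter1", "filter2"])
    url, meta = line.split("\t")
    blob = meta.split("=", 1)[1]
    print(solr_doc(url, blob))
```

With each value stored separately, a filter query on a single value (for
example fq=group:filter1) should match every page that carries that group,
assuming the field is declared multiValued in the Solr schema.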

On Thu, Apr 2, 2015 at 3:07 PM, Jeff Cocking <[email protected]> wrote:

> Jonathan et al
>
> Thank you for the reply.  I used this approach and it is working with one
> minor issue. It is the "one to many" requirement for each group. The intent
> is to use a filter query within solr on the group data element.  I have
> tried the following:
>
> group="filter1,filter2"
> group="filter1","filter2"
> group=filter1,filter2
> group=filter1 filter2
> group=filter1   group=filter2
>
> Each of these choices creates a single value assigned to group. Do you
> have any suggestions on how to format the seed.txt file to support the "one
> to many" option, i.e. so that each filter value can be used as a filter
> query element within Solr?
>
>
> For those who find this thread searching for a similar solution, here is
> how to implement urlmeta:
>
> 1. Turn on the plugin by adding urlmeta in the plugin.includes property
> within nutch-site.xml. urlmeta is a standalone item within plugin value:
>  ....|index-(basic|anchor|metadata)|urlmeta|indexer-solr|....
> 2. Add the urlmeta.tags property to the nutch-site.xml file. Add the
> keywords you want to use as values.
> <property>
>   <name>urlmeta.tags</name>
>   <value>group1,group2</value>
> </property>
> 3. In your seed.txt file add the tag values for the URLs as needed. Make
> sure they are tab delimited (<TAB> below stands for a literal tab character).
>    http://www.domain1.com<TAB>group1=foo<TAB>group2=bar
>    http://www.domain2.com<TAB>group1=faa<TAB>group2=bur
>
>
>
> On Thu, Apr 2, 2015 at 9:36 AM, Jonathan Cooper-Ellis <
> [email protected]> wrote:
>
> > Hey Jeff,
> >
> > Check out the urlmeta plugin. You can inject metadata in with your seed
> > list and propagate it to outlinks.
> >
> > On Thu, Apr 2, 2015 at 10:09 AM, Jeff Cocking <[email protected]>
> > wrote:
> >
> > > Environment:  Nutch 1.9, Solr 5.0
> > >
> > > I am trying to define a group (category) of websites. Each website will
> > > have one or more assigned groups. The assignment is known before the
> > > creation of the seed.txt file.  All pages within the website should
> > > inherit the assigned group(s). The assigned group(s) need to be passed
> > > to Solr for faceted search.
> > >
> > > For example:
> > > www.site1.com group1, group2, group3
> > > All pages within www.site1.com inherit group1, group2, group3
> > >
> > > www.site2.com group2, group4, group5
> > > All pages within www.site2.com inherit group2, group4, group5
> > >
> > > Thoughts on ways to accomplish this?
> > >
> > > Thank you in advance.
> > >
> > > jeff
> > >
> >
> >
> >
> > --
> > Jonathan Cooper-Ellis
> > Field Enablement Engineer
> > <http://www.cloudera.com>
> >
>



-- 
Jonathan Cooper-Ellis
Field Enablement Engineer
<http://www.cloudera.com>
