Yes. If a value is changed in the seed.txt file, will the new value be used when the page is re-crawled/fetched?
Sorry for being vague.

jeff

On Fri, Apr 3, 2015 at 12:19 PM, Jonathan Cooper-Ellis <[email protected]> wrote:

> If a value is changed in seed.txt?
>
> On Fri, Apr 3, 2015 at 12:44 PM, Jeff Cocking <[email protected]> wrote:
>
> > I figured I might have to inject a CSV blob and manually explode it in a
> > custom filter.
> >
> > As a second question, if a value is changed, will the new value propagate
> > with the normal fetching cycle?
> >
> > thank you.
> >
> > jeff
> >
> > On Thu, Apr 2, 2015 at 5:06 PM, Jonathan Cooper-Ellis <[email protected]> wrote:
> >
> > > Hi Jeff,
> > >
> > > Off the top of my head, the best way to do that might be to inject the
> > > filters as a CSV blob, and write (or modify) an indexing filter to split
> > > up the blob and index the pieces as separate values in a "multiValued"
> > > field in Solr.
> > >
> > > On Thu, Apr 2, 2015 at 3:07 PM, Jeff Cocking <[email protected]> wrote:
> > >
> > > > Jonathan et al
> > > >
> > > > Thank you for the reply. I used this approach and it is working with
> > > > one minor issue: the "one to many" requirement for each group. The
> > > > intent is to use a filter query within Solr on the group data element.
> > > > I have tried the following:
> > > >
> > > > group="filter1,filter2"
> > > > group="filter1","filter2"
> > > > group=filter1,filter2
> > > > group=filter1 filter2
> > > > group=filter1 group=filter2
> > > >
> > > > Each of these choices creates a single value assigned to group. Do you
> > > > have any suggestions on how to format the seed.txt file to support the
> > > > "one to many" option, i.e. so that each filter value can be used as a
> > > > filter query element within Solr?
> > > >
> > > >
> > > > For those who find this thread searching for a similar solution, here
> > > > is how to implement urlmeta:
> > > >
> > > > 1. Turn on the plugin by adding urlmeta to the plugin.includes property
> > > > within nutch-site.xml. urlmeta is a standalone item within the
> > > > plugin.includes value:
> > > > ....|index-(basic|anchor|metadata)|urlmeta|indexer-solr|....
> > > > 2. Add the urlmeta.tags property to the nutch-site.xml file. Add the
> > > > keywords you want to use as values.
> > > > <property>
> > > >   <name>urlmeta.tags</name>
> > > >   <value>group1,group2</value>
> > > > </property>
> > > > 3. In your seed.txt file add the tag values for the URLs as needed.
> > > > Make sure they are tab delimited (\t below stands for a literal tab
> > > > character).
> > > > http://www.domain1.com\tgroup1=foo\tgroup2=bar
> > > > http://www.domain2.com\tgroup1=faa\tgroup2=bur
> > > >
> > > > On Thu, Apr 2, 2015 at 9:36 AM, Jonathan Cooper-Ellis <[email protected]> wrote:
> > > >
> > > > > Hey Jeff,
> > > > >
> > > > > Check out the urlmeta plugin. You can inject metadata in with your
> > > > > seed list and propagate it to outlinks.
> > > > >
> > > > > On Thu, Apr 2, 2015 at 10:09 AM, Jeff Cocking <[email protected]> wrote:
> > > > >
> > > > > > Environment: Nutch 1.9, Solr 5.0
> > > > > >
> > > > > > I am trying to define a group (category) of websites. Each website
> > > > > > will have one or more assigned groups (1 to many). The assignment
> > > > > > is known before the creation of the seed.txt file. All pages within
> > > > > > the website should inherit the assigned group(s). The assigned
> > > > > > group(s) need to be passed to Solr for faceted search.
> > > > > >
> > > > > > For example:
> > > > > > www.site1.com group1, group2, group3
> > > > > > All pages within www.site1.com inherit group1, group2, group3
> > > > > >
> > > > > > www.site2.com group2, group4, group5
> > > > > > All pages within www.site2.com inherit group2, group4, group5
> > > > > >
> > > > > > Thoughts on ways to accomplish this?
> > > > > >
> > > > > > Thank you in advance.
> > > > > >
> > > > > > jeff
> > > > >
> > > > > --
> > > > > Jonathan Cooper-Ellis
> > > > > Field Enablement Engineer
> > > > > <http://www.cloudera.com>
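
For anyone landing on this thread later: the CSV-blob idea discussed above can be sketched roughly as follows. This is only an illustration in plain Python, not Nutch code. The helper names seed_line and split_blob are made up for the example, and it assumes a single tag called "group" (which would also need to be listed in urlmeta.tags). In a real setup the splitting step would presumably live in a Java indexing filter that adds each value to a multiValued Solr field.

```python
def seed_line(url, metadata):
    """Build one tab-delimited seed.txt line: the URL, then key=value pairs.

    Hypothetical helper: each key's list of values is packed into one
    comma-separated blob, since the injector keeps one value per key.
    """
    fields = [url]
    for key, values in metadata.items():
        fields.append(f"{key}={','.join(values)}")
    return "\t".join(fields)


def split_blob(blob):
    """Split a comma-separated blob into trimmed, non-empty values.

    This mirrors what a custom indexing filter would do before adding
    each value separately to a multiValued field.
    """
    return [value.strip() for value in blob.split(",") if value.strip()]


# Write side: one seed line carrying two group values in a single blob.
line = seed_line("http://www.domain1.com", {"group": ["filter1", "filter2"]})

# Read side: recover the metadata and explode the blob again.
url, *pairs = line.split("\t")
meta = dict(pair.split("=", 1) for pair in pairs)
groups = split_blob(meta["group"])
print(groups)
```

Packing the values into one blob keeps the seed file in the plain key=value form the injector already accepts, so the only custom piece is the split at indexing time.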

