Re: Listing Priority
Hi,

Check out the new RegexpBoostProcessor, which does exactly this based on a config file:
https://lucene.apache.org/solr/4_2_0/solr-core/org/apache/solr/update/processor/RegexpBoostProcessor.html

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 24 Apr 2013, at 00:22, Furkan KAMACI furkankam...@gmail.com wrote:

> Let's assume that I have written an update processor that extracts the domain and checks it against my predefined list. What should I do at indexing time and at query (select) time?
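As a rough illustration of the idea behind a regex-driven boost (not the processor's actual code), the Python sketch below matches a URL's host name against a list of patterns and multiplies the boosts of every pattern that matches. The rule list, the boost values, and the function name `url_boost` are assumptions made up for this example; the real processor reads its rules from the config file described in the Javadoc linked above.

```python
import re
from urllib.parse import urlparse

# Hypothetical rules: (regex applied to the host name, boost factor).
# Both the patterns and the numbers are illustrative assumptions.
BOOST_RULES = [
    (re.compile(r"\.edu$"), 2.0),
    (re.compile(r"\.edu\.az$"), 2.0),
    (re.compile(r"\.co\.uk$"), 1.5),
]


def url_boost(url, rules=BOOST_RULES, default=1.0):
    """Return the product of the boosts of all rules that match the URL's host."""
    host = (urlparse(url).hostname or "").lower()
    boost = default
    for pattern, factor in rules:
        if pattern.search(host):
            boost *= factor
    return boost


if __name__ == "__main__":
    for u in ("http://www.mit.edu/research", "http://example.com/"):
        print(u, url_boost(u))
```

The computed value would be stored in a numeric field at index time, so that no pattern matching is needed when the query runs.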
Re: Listing Priority
Let's assume that I have written an update processor that extracts the domain and checks it against my predefined list. What should I do at indexing time and at query (select) time?

2013/4/15 Alexandre Rafalovitch arafa...@gmail.com

> You may find the work and code contributions by Jan Høydahl quite relevant. See the presentation from 2 years ago:
> http://www.slideshare.net/lucenerevolution/jan-hoydahl-improving-solrs-update-chain-eurocon2011
>
> One of the things he/they contributed is the URLClassify Update Processor, which might be quite relevant:
> https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/URLClassifyProcessor.html
Listing Priority
I have crawled some web pages and indexed them in Solr. When I list my results via Solr, I want pages whose URL (my schema includes a field for the URL) ends with .edu, .edu.az or .co.uk to be given higher priority. What is an efficient way to do this in Solr?
RE: Listing Priority
You can use boost queries to boost documents that match some query, e.g. suffix:co.uk, but you'll need to have URL suffixes indexed. Nutch knows about URL suffixes but does not index them. You would need to add a custom indexing filter or patch an existing filter to add a suffix field. URLUtil has methods that return the URL suffix for a given URL.

http://wiki.apache.org/solr/FunctionQuery#query

-Original message-
From: Furkan KAMACI furkankam...@gmail.com
Sent: Sun 14-Apr-2013 22:59
To: solr-user@lucene.apache.org
Subject: Listing Priority

> I have crawled some web pages and indexed them in Solr. When I list my results via Solr, I want pages whose URL (my schema includes a field for the URL) ends with .edu, .edu.az or .co.uk to be given higher priority. What is an efficient way to do this in Solr?
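A minimal sketch of both halves of this suggestion, in Python for brevity. It assumes a `suffix` field has been added to the index as described above and that the request uses a dismax-style parser that accepts the `bq` parameter. The helper names, the boost weights, and the host and core name in the printed URL are hypothetical, and `url_suffix` is only a stand-in for what Nutch's URLUtil would provide, not its implementation.

```python
from urllib.parse import urlencode, urlparse

# Suffixes taken from the original question; the weights are illustrative assumptions.
SUFFIX_BOOSTS = {"edu": 5, "edu.az": 5, "co.uk": 3}


def url_suffix(url, known=SUFFIX_BOOSTS):
    """Return the known suffix that the URL's host ends with, or None."""
    host = (urlparse(url).hostname or "").lower()
    # Try longer suffixes first so the most specific known suffix wins.
    for suffix in sorted(known, key=len, reverse=True):
        if host == suffix or host.endswith("." + suffix):
            return suffix
    return None


def build_select_params(user_query, suffix_field="suffix", boosts=SUFFIX_BOOSTS):
    """Build request parameters with one boost query (bq) per suffix."""
    params = [("q", user_query), ("defType", "edismax")]
    for suffix, weight in boosts.items():
        # Quote the value so the dots in the suffix are treated as literal text.
        params.append(("bq", f'{suffix_field}:"{suffix}"^{weight}'))
    return params


if __name__ == "__main__":
    print(url_suffix("http://www.bbc.co.uk/news"))
    # Host, port, and core name are placeholders.
    print("http://localhost:8983/solr/collection1/select?"
          + urlencode(build_select_params("physics")))
```

Keeping the suffix in its own field means the boost weights can be tuned per request without reindexing.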
Re: Listing Priority
You may find the work and code contributions by Jan Høydahl quite relevant. See the presentation from 2 years ago:
http://www.slideshare.net/lucenerevolution/jan-hoydahl-improving-solrs-update-chain-eurocon2011

One of the things he/they contributed is the URLClassify Update Processor, which might be quite relevant:
https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/URLClassifyProcessor.html

Regards,
Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Sun, Apr 14, 2013 at 4:59 PM, Furkan KAMACI furkankam...@gmail.com wrote:

> I have crawled some web pages and indexed them in Solr. When I list my results via Solr, I want pages whose URL (my schema includes a field for the URL) ends with .edu, .edu.az or .co.uk to be given higher priority. What is an efficient way to do this in Solr?