Re: Listing Priority

2013-04-24 Thread Jan Høydahl
Hi,

Check out the new RegexpBoostProcessor, which does exactly this based on a config file:
https://lucene.apache.org/solr/4_2_0/solr-core/org/apache/solr/update/processor/RegexpBoostProcessor.html
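
To make the idea concrete, here is a rough Python sketch of regex-based boosting. It is not the processor's code or its config file format; the rule list, the boost values, and the function name boost_for_host are all made up for illustration. Multiplying the boosts of every matching rule is an assumption modelled on how the processor's Javadoc describes multiple matches, so check the Javadoc before relying on it.

```python
import re

# Hypothetical rules: (regular expression, boost factor). The patterns cover
# the suffixes from the original question; the boost values are invented.
BOOST_RULES = [
    (re.compile(r"\.edu(\.az)?$"), 2.0),
    (re.compile(r"\.co\.uk$"), 1.5),
]


def boost_for_host(host, rules=BOOST_RULES):
    """Return the product of the boosts of all rules that match the host."""
    boost = 1.0
    for pattern, factor in rules:
        if pattern.search(host.lower()):
            boost *= factor
    return boost


if __name__ == "__main__":
    for host in ("www.mit.edu", "physics.bsu.edu.az", "www.bbc.co.uk", "example.com"):
        print(host, boost_for_host(host))
```

The resulting number would be stored in a boost field at index time and used in scoring at query time.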

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 24 Apr 2013 at 00:22, Furkan KAMACI furkankam...@gmail.com wrote:

 Let's assume that I have written an update processor and extracted the
 domain and checked it with my predefined list. What should I do at indexing
 process and select?
 
 
 



Re: Listing Priority

2013-04-23 Thread Furkan KAMACI
Let's assume that I have written an update processor that extracts the
domain and checks it against my predefined list. What should I then do at
indexing time and at query (select) time?
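
For reference, one possible index-time half of this, sketched in Python rather than as a real Solr update processor. The field names "url" and "url_suffix" and both function names are hypothetical; only the three suffixes come from the original question. The idea is to store the matched suffix in its own field so that a query can boost on it later.

```python
from urllib.parse import urlparse

# Suffixes taken from the original question in this thread.
PREFERRED_SUFFIXES = ("edu", "edu.az", "co.uk")


def preferred_suffix(url, suffixes=PREFERRED_SUFFIXES):
    """Return the listed suffix that the URL's host ends with, or None."""
    host = (urlparse(url).hostname or "").lower()
    # Check longer suffixes first in case one listed suffix ends with another.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if host == suffix or host.endswith("." + suffix):
            return suffix
    return None


def add_suffix_field(doc, url_field="url", suffix_field="url_suffix"):
    """Return a copy of the document with the suffix field set when it matches."""
    out = dict(doc)
    suffix = preferred_suffix(out.get(url_field, ""))
    if suffix is not None:
        out[suffix_field] = suffix
    return out


if __name__ == "__main__":
    print(add_suffix_field({"id": "1", "url": "http://physics.bsu.edu.az/staff"}))
    print(add_suffix_field({"id": "2", "url": "http://example.com/"}))
```

Documents without a listed suffix are left unchanged, so they simply never match a boost on that field.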


2013/4/15 Alexandre Rafalovitch arafa...@gmail.com

 You may find the work and code contributions by Jan Høydahl quite
 relevant. See the presentation from 2 years ago:

 http://www.slideshare.net/lucenerevolution/jan-hoydahl-improving-solrs-update-chain-eurocon2011

 One of the things he/they contributed is URLClassify Update Processor,
 it might be quite relevant.

 https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/URLClassifyProcessor.html




Listing Priority

2013-04-14 Thread Furkan KAMACI
I have crawled some internet pages and indexed them in Solr.

When I list my results via Solr, I want pages whose URL (my schema
includes a field for the URL) ends with .edu, .edu.az or .co.uk to be
given higher priority.

How can I do this efficiently in Solr?


RE: Listing Priority

2013-04-14 Thread Markus Jelsma
You can use boost queries to boost documents that match some query, e.g.
suffix:co.uk, but you'll need to have URL suffixes indexed. Nutch knows about
URL suffixes but does not index them. You would need to add a custom indexing
filter or patch an existing filter to add a suffix field. URLUtil has methods
that return the URL suffix for a given URL.

http://wiki.apache.org/solr/FunctionQuery#query
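
As an illustration of the boost-query approach, the Python sketch below builds the query string for a select request. It assumes a string field named suffix has already been indexed and that the request uses the (e)dismax query parser, which is what accepts the bq parameter. The boost values and the helper name build_select_query are made up for this example.

```python
from urllib.parse import urlencode


def build_select_query(user_query, suffix_boosts, suffix_field="suffix"):
    """Build a URL-encoded query string with one bq clause per suffix."""
    params = [("q", user_query), ("defType", "edismax")]
    for suffix, boost in suffix_boosts.items():
        # Each clause looks like suffix:co.uk^1.5
        params.append(("bq", f"{suffix_field}:{suffix}^{boost}"))
    return urlencode(params)


if __name__ == "__main__":
    # Hypothetical boost values for two of the suffixes from the question.
    print(build_select_query("solr", {"edu": 2.0, "co.uk": 1.5}))
```

The resulting string would be appended to the select handler URL of the core being queried. Documents that match a bq clause score higher, but documents that match none are still returned.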

 
 
-Original message-
 From: Furkan KAMACI furkankam...@gmail.com
 Sent: Sun 14-Apr-2013 22:59
 To: solr-user@lucene.apache.org
 Subject: Listing Priority
 
 I have crawled some internet pages and indexed them at Solr.
 
 When I list my results via Solr I want that: if a page has a URL(my schema
 includes a field for URL) that ends with .edu, .edu.az or .co.uk I will
 give more priority to them.
 
 How can I do it in a more efficient way at Solr?
 


Re: Listing Priority

2013-04-14 Thread Alexandre Rafalovitch
You may find the work and code contributions by Jan Høydahl quite
relevant. See the presentation from 2 years ago:
http://www.slideshare.net/lucenerevolution/jan-hoydahl-improving-solrs-update-chain-eurocon2011

One of the things he/they contributed is the URLClassify Update Processor,
which might be quite relevant:
https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/URLClassifyProcessor.html

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Sun, Apr 14, 2013 at 4:59 PM, Furkan KAMACI furkankam...@gmail.com wrote:
 I have crawled some internet pages and indexed them at Solr.

 When I list my results via Solr I want that: if a page has a URL(my schema
 includes a field for URL) that ends with .edu, .edu.az or .co.uk I will
 give more priority to them.

 How can I do it in a more efficient way at Solr?