RE: page rank

2007-06-22 Thread David Xiao
I have a few more questions based on your kind replies to my first question.

1. My Solr instance has already indexed hundreds of thousands of documents, so how 
can I update these documents to add the new field "numberField"?

2. At runtime, my application might want to update the value of "numberField" very 
frequently. How can I achieve that via Solr? Will performance suffer if many 
documents need to be updated?

3. Even though I have checked the wiki page below for FunctionQuery, the following 
quoted passage is still not clear to me:
"
> In terms of score which RequestHandler are you planning to use?
> If using dismax you can define a boost function:
> recip(rord(numberField),1,1000,1000)
"
With it, how do I make Solr take this numberField into consideration (as a kind of 
popularity factor)? 
Could you give me an example, please?
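
For illustration, the quoted boost function can be worked through numerically.
The sketch below assumes the definitions given on the FunctionQuery wiki page:
recip(x,m,a,b) = a/(m*x+b), and rord as the reverse ordinal of the indexed
value (the largest value gets 1). The document names and numberField values
are made up, and this plain-Python approximation is not Solr's implementation.

```python
def recip(x, m, a, b):
    """Reciprocal function, assumed to follow the wiki definition a / (m*x + b)."""
    return a / (m * x + b)


def rord(values):
    """Map each distinct value to its reverse ordinal: the largest value gets 1."""
    distinct = sorted(set(values), reverse=True)
    return {value: position + 1 for position, value in enumerate(distinct)}


# Hypothetical numberField values for four documents (not from the thread).
number_field = {"doc_a": 5, "doc_b": 120, "doc_c": 42, "doc_d": 120}

ranks = rord(number_field.values())
boosts = {
    doc: recip(ranks[value], 1, 1000, 1000)
    for doc, value in number_field.items()
}

# Print documents from largest boost to smallest.
for doc, boost in sorted(boosts.items(), key=lambda item: -item[1]):
    print(doc, round(boost, 6))
```

Under these assumptions a larger numberField gives a smaller rord and therefore
a slightly larger boost, which Solr then combines with the text relevance score.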


Best Regards,
David




-Original Message-
From: Nick Jenkin [mailto:[EMAIL PROTECTED] 
Sent: Thursday, June 21, 2007 6:30 AM
To: solr-user@lucene.apache.org
Subject: Re: page rank

Also, if you are using the standard request handler, you can use the "_val_" hack:

foo:"bar" _val_:"recip(rord(numberField),1,1000,1000)"

You can find more info about this here:
http://wiki.apache.org/solr/FunctionQuery
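
As a rough sketch of how a client might send the query above, the Python
snippet below URL-encodes it into a q parameter. The base URL
(http://localhost:8983/solr/select) and the helper name build_boosted_query
are assumptions for illustration only; adjust them to your own installation.

```python
from urllib.parse import urlencode

# Assumed address of a local Solr select handler (not from the thread).
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_boosted_query(user_query, boost_function):
    """Append a _val_ clause to the user query and URL-encode it as 'q'."""
    q = f'{user_query} _val_:"{boost_function}"'
    return SOLR_SELECT_URL + "?" + urlencode({"q": q})


url = build_boosted_query('foo:"bar"', "recip(rord(numberField),1,1000,1000)")
print(url)
```

The encoding step matters because the query contains quotes, spaces, commas
and parentheses that cannot appear literally in a URL.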

-Nick

On 6/21/07, Daniel Alheiros <[EMAIL PROTECTED]> wrote:
> Hi David.
>
> Yes you can.
>
> Just define a field as a slong type field:
>
> 
>
> It can be used to sort (&sort=numberField desc) or to boost your score (it
> will depend on the RequestHandler you are going to use).
>
> In terms of score which RequestHandler are you planning to use?
> If using dismax you can define a boost function:
> recip(rord(numberField),1,1000,1000)
>
> I hope it helps.
>
> Regards,
> Daniel Alheiros
>
> On 20/6/07 16:47, "David Xiao" <[EMAIL PROTECTED]> wrote:
>
> > Hello folks,
> >
> >
> >
> > I am using Solr to index web content. Is it possible to tell Solr about
> > rank information for that content?
> >
> > For example, I assign each content item an integer.
> >
> >
> >
> > I would like Solr to take this number into consideration when it generates
> > search results (larger number, higher priority).
> >
> >
> >
> > Best Regards,
> >
> > David
> >



page rank

2007-06-20 Thread David Xiao
Hello folks,

 

I am using Solr to index web content. Is it possible to tell Solr about 
rank information for that content?

For example, I assign each content item an integer.

 

I would like Solr to take this number into consideration when it generates 
search results (larger number, higher priority).

 

Best Regards,

David



distributed search

2007-06-03 Thread David Xiao
Hello all,

 

Does Solr support distributed search? For example, installing Solr 
instances on different servers and having them load balanced.

Any suggestions or experience with distributed search in Solr would be 
appreciated.

 

Regards,

David



Crawler for solr

2007-05-11 Thread David Xiao
Hello,

 

I am using a crawler to index and search some intranet web pages that require 
authorization. I wrote my own crawler for this need. But as the 
requirements evolve, I also need a crawler for external web pages (on the 
internet), so I am looking for a generic crawler that can integrate with 
Solr.

 

The crawler should be easy to configure and able to customize its XML output 
according to schema.xml.

Does anyone have a good suggestion?

 

Regards,

David



RE: Solr concurrent commit not updated

2007-05-11 Thread David Xiao
I have kept the id field unique.
Actually, I found that the problem is due to the following Python code:

P = subprocess.Popen(arguments, )
It seems that when the program ends, the sub-process started by that call has 
not finished yet. I guess that is why the statistics show "commit but not adddoc".

Has anyone had a similar issue?
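
One possible way to avoid this, sketched below, is to block until the child
process exits, either by calling wait() on the Popen object or by using
subprocess.run, which waits by default. The helper name run_and_wait and the
stand-in command are assumptions for illustration; in the setup described
above the arguments would be the post.sh invocation.

```python
import subprocess
import sys


def run_and_wait(arguments):
    """Run a command, block until it exits, and raise if it returns non-zero."""
    completed = subprocess.run(
        arguments,
        check=True,           # raise CalledProcessError on a failing exit code
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str rather than bytes
    )
    return completed.stdout


# Stand-in command so the sketch runs anywhere; replace with the real post call.
arguments = [sys.executable, "-c", "print('posted')"]
output = run_and_wait(arguments)
print(output.strip())
```

With the original Popen call, the equivalent change would be to keep the
returned object and call its wait() method before the program exits.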





-Original Message-
From: James liu [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 11, 2007 11:32 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr concurrent commit not updated

You should know that id is a unique number.

2007/5/11, David Xiao <[EMAIL PROTECTED]>:
>
> Hello all,
>
>
>
> I have tested using post.sh in the example directory to add XML documents
> to Solr. It works when I add them one by one.
>
> But when I have a lot of .xml files to post (say about 500-1000 files)
> and I write a shell script to call post.sh for each one, I find that those XML
> files are not searchable after posting.
>
>
>
> The Solr admin page / statistics records the commits,
> but numDocs is not updated.
>
> So why does posting one XML file with post.sh work fine, while calling
> post.sh 500 times, with one XML file each time, behaves differently?
>
>
>
> Regards,
>
> David
>
>


-- 
regards
jl



Solr concurrent commit not updated

2007-05-10 Thread David Xiao
Hello all,

 

I have tested using post.sh in the example directory to add XML documents to 
Solr. It works when I add them one by one.

But when I have a lot of .xml files to post (say about 500-1000 files) and 
I write a shell script to call post.sh for each one, I find that those XML files are 
not searchable after posting.

 

The Solr admin page / statistics records the commits, 
but numDocs is not updated.

So why does posting one XML file with post.sh work fine, while calling 
post.sh 500 times, with one XML file each time, behaves differently?

 

Regards,

David