Re: Tagging

2007-02-13 Thread Yonik Seeley
On 2/13/07, Erik Hatcher <[EMAIL PROTECTED]> wrote: On Feb 13, 2007, at 9:01 PM, Yonik Seeley wrote: >> And yeah, Peter is a solr4lib kinda guy, doing some way cool stuff >> with Lucene and Solr already: > search/? >> search=raw&pageNumber=1&index=peelbib&field

Re: Tagging

2007-02-13 Thread Mike Klaas
On 2/13/07, Erik Hatcher <[EMAIL PROTECTED]> wrote: Sorry if I'm sending things mangled somehow - and if anyone has suggestions on correcting I'm all ears. Unfortunately, no. There is some precedent for putting angle brackets around URLs in e-mails: this mechanism was documented in Tim Ber

Re: Help with tuning solr

2007-02-13 Thread Mike Klaas
On 2/13/07, Ian Meyer <[EMAIL PROTECTED]> wrote: On 2/13/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: > Yes, sorting by fields does take up memory (the fieldcache). > 256M is pretty small for a 5M doc index. > If you have any more memory slots, spring for some more memory (a > little over $100 for

Re: Help with tuning solr

2007-02-13 Thread Ian Meyer
On 2/13/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: Yes, sorting by fields does take up memory (the fieldcache). 256M is pretty small for a 5M doc index. If you have any more memory slots, spring for some more memory (a little over $100 for 1GB). Yeah, I'll see if I can give solr a bit more.

Re: Tagging

2007-02-13 Thread Erik Hatcher
On Feb 13, 2007, at 9:23 PM, Yonik Seeley wrote: On 2/13/07, Erik Hatcher <[EMAIL PROTECTED]> wrote: On Feb 13, 2007, at 9:01 PM, Yonik Seeley wrote: >> And yeah, Peter is a solr4lib kinda guy, doing some way cool stuff >> with Lucene and Solr already: > sea

Re: Help with tuning solr

2007-02-13 Thread Yonik Seeley
Yes, sorting by fields does take up memory (the fieldcache). 256M is pretty small for a 5M doc index. If you have any more memory slots, spring for some more memory (a little over $100 for 1GB). Lucene also likes to have free memory left over available for OS cache - otherwise searches start to

Re: Tagging

2007-02-13 Thread Yonik Seeley
On 2/13/07, Erik Hatcher <[EMAIL PROTECTED]> wrote: On Feb 13, 2007, at 9:01 PM, Yonik Seeley wrote: >> And yeah, Peter is a solr4lib kinda guy, doing some way cool stuff >> with Lucene and Solr already: > search/? >> search=raw&pageNumber=1&index=peelbib&field

Re: Tagging

2007-02-13 Thread Erik Hatcher
On Feb 13, 2007, at 9:01 PM, Yonik Seeley wrote: And yeah, Peter is a solr4lib kinda guy, doing some way cool stuff with Lucene and Solr already: FYI, your mailer is alway

Re: Tagging

2007-02-13 Thread Yonik Seeley
On 2/13/07, Erik Hatcher <[EMAIL PROTECTED]> wrote: There is also the possibility of keeping tags with the original documents and having them individually updated without having to resend the original full text as well: But it does require havin

Re: Tagging

2007-02-13 Thread Erik Hatcher
There is also the possibility of keeping tags with the original documents and having them individually updated without having to resend the original full text as well: And yeah, Peter is a solr4lib kinda guy, doing some way cool stuff with

Help with tuning solr

2007-02-13 Thread Ian Meyer
All, I'm having some performance issues with solr. I will give some background on our setup and implementation of solr. I'm completely open to reworking everything if the way we are currently doing things is not optimal. I'll try to be as verbose as I can in explaining all of this, but feel free

Re: non-relative scoring

2007-02-13 Thread Walter Underwood
You can declare the top result to be 100% and scale from there. "Percent relevant" is not a concept that really holds together. What does it mean to be 100% relevant? I'm not even sure what "twice as relevant" means. A tf.idf engine, like Lucene, might not have a maximum score. What if a document
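The "declare the top result to be 100%" approach can be sketched as below. This is only an illustration of scaling raw scores against the best hit; the function name `scale_to_top` is made up for this example and is not part of Solr or Lucene.

```python
def scale_to_top(scores):
    """Express each raw score as a percentage of the highest score.

    Hypothetical helper: the resulting numbers say how a hit compares
    with the best hit for the same query, nothing more.
    """
    if not scores:
        return []
    top = max(scores)
    if top <= 0:
        # No positive score to scale against, so report zeros.
        return [0.0 for _ in scores]
    return [100.0 * s / top for s in scores]
```

Because the top hit is always 100% by construction, values from different queries are not comparable with each other.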

non-relative scoring

2007-02-13 Thread solr
Is it possible to generate a non-relative score for each result, from solr? I would like to be able to generate a web page that shows the first 3 results' scores as 87%, 73%, and 72%. If the range of solr document match scores were between 0 and 1, it would be easy. But I never know what my MaxS

Re: Tagging

2007-02-13 Thread Yonik Seeley
On 2/13/07, Binkley, Peter <[EMAIL PROTECTED]> wrote: I still wonder if there's a good way of storing the tags outside the Lucene index and using them via facets whose bitsets are manipulated directly rather than being populated from the index. In my project, reindexing a document whenever a use

Re: question about synonyms

2007-02-13 Thread Yonik Seeley
On 2/13/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : To be clear, no clean way to do *expansion* as opposed to reduction at : query time, when the alternatives are of different lengths. Reduction at query time doesn't work either ... when query parser sees the string: my best buy ..

Re: convert custom facets to Solr facets...

2007-02-13 Thread Yonik Seeley
On 2/12/07, Erik Hatcher <[EMAIL PROTECTED]> wrote: On Feb 12, 2007, at 9:10 PM, Gmail Account wrote: > This would be great! I can't help with the solution but I am very > interested in using it if one of you guys can figure it out. > > I can't wait to see if this works out. And just for the re

Re: question about synonyms

2007-02-13 Thread Chris Hostetter
: To be clear, no clean way to do *expansion* as opposed to reduction at : query time, when the alternatives are of different lengths. Reduction at query time doesn't work either ... when query parser sees the string: my best buy ...it analyzes each white space separated string separately
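A toy model of the point above, assuming (as the message says) that the query parser hands each whitespace-separated chunk to the analyzer on its own. Nothing here is Solr code; `SYNONYM_GROUP`, `analyze_whole` and `analyze_per_chunk` are invented names, and the matching logic is far simpler than a real synonym filter.

```python
# Toy synonym group taken from the synonyms.txt line quoted in this thread.
SYNONYM_GROUP = ["bestbuy", "bb", "best buy"]


def expand(text):
    """Return the whole group if text is one of its entries, else [text]."""
    return list(SYNONYM_GROUP) if text in SYNONYM_GROUP else [text]


def analyze_whole(text):
    """Analyze the full token stream, so two-word entries can match."""
    tokens = text.split()
    out = []
    i = 0
    while i < len(tokens):
        pair = tokens[i:i + 2]
        if len(pair) == 2 and " ".join(pair) in SYNONYM_GROUP:
            out.append(expand(" ".join(pair)))
            i += 2
        else:
            out.append(expand(tokens[i]))
            i += 1
    return out


def analyze_per_chunk(text):
    """Analyze each whitespace-separated chunk in isolation."""
    out = []
    for chunk in text.split():
        out.extend(analyze_whole(chunk))
    return out
```

In this model `analyze_whole("my best buy")` finds the two-word entry, while `analyze_per_chunk("my best buy")` only ever sees "best" and "buy" one at a time and so never matches it.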

Re: Incremental replication...

2007-02-13 Thread Bill Au
FYI, additional information on replication is available in the Solr TWiki: http://wiki.apache.org/solr/CollectionDistribution Bill On 2/13/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote: On 2/13/07, escher2k <[EMAIL PROTECTED]> wrote: > ...At least from looking at the snapshooter script, i

Re: Question re snapinstaller

2007-02-13 Thread Yonik Seeley
On 2/13/07, Bill Au <[EMAIL PROTECTED]> wrote: Solr snapshots are created using hard links. The file is not deleted as long as there is at least one link to it. Or a process that holds it open. It would work even if there were no links in the filesystem because the IndexReader would still be hol

Re: question about synonyms

2007-02-13 Thread Yonik Seeley
On 2/13/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : I am using the synonyms only at query time. : Below is the field analysis. FYI: I think what yonik meant was the section of your schema.xml that defines the fieldtype. : It seems like the culprit is the space in the phrase "best buy" in :

Re: Question re snapinstaller

2007-02-13 Thread Bill Au
Solr snapshots are created using hard links. The file is not deleted as long as there is at least one link to it. Bill On 2/13/07, Mike Klaas <[EMAIL PROTECTED]> wrote: On 2/13/07, Ken Krugler <[EMAIL PROTECTED]> wrote: > >A Lucene's IndexReader opens all index files it needs when it is instant

Re: question about synonyms

2007-02-13 Thread Chris Hostetter
: I am using the synonyms only at query time. : Below is the field analysis. FYI: I think what yonik meant was the section of your schema.xml that defines the fieldtype. : It seems like the culprit is the space in the phrase "best buy" in : synonyms.txt. because of some limitations in the way Ana

Re: question about highlighting

2007-02-13 Thread Chris Hostetter
: This is part of the response: what's the rest of the response? highlighting info comes in a separate block, after the section. (for the record, "hl=on" should work fine too) -Hoss

Re: Gentoo: problem with xml-apis.jar/Apache Tomcat Native Library

2007-02-13 Thread Chris Hostetter
Solr isn't really doing anything particularly special when this RuntimeException occurs, the line is purely... static final XPathFactory xpathFactory = XPathFactory.newInstance(); According to the 1.5 javadocs for this method... Get a new XPathFactory instance using the default object mod

Re: question about synonyms

2007-02-13 Thread nick19701
Yonik Seeley wrote: > > Are you using the synonyms at index time, query time, or both? > Did you reindex if you made changes to an "index" analyzer? > It would help if you post the fieldtype for the field you are searching. > I am using the synonyms only at query time. Below is the field analy

Re: Incremental replication...

2007-02-13 Thread Bertrand Delacretaz
On 2/13/07, escher2k <[EMAIL PROTECTED]> wrote: ...At least from looking at the snapshooter script, it doesn't seem to be doing anything specific... The snapshooter script only makes an "instant snapshot" of the index directory using cp -lr. This does not involve any copying of index data. The
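As a rough Python analogue of a hard-link snapshot (the real snapshooter is a shell script using cp -lr, per the message above), the sketch below links every file of one directory into a new directory without copying any data. The function name `link_snapshot` is hypothetical, and the sketch assumes a flat directory on a filesystem that supports hard links.

```python
import os


def link_snapshot(index_dir, snapshot_dir):
    """Create snapshot_dir holding hard links to each regular file in index_dir.

    No file contents are copied: both names point at the same data on disk.
    Returns the sorted list of file names that were linked.
    """
    os.makedirs(snapshot_dir)
    linked = []
    for name in sorted(os.listdir(index_dir)):
        src = os.path.join(index_dir, name)
        if os.path.isfile(src):
            os.link(src, os.path.join(snapshot_dir, name))
            linked.append(name)
    return linked
```

Removing a file from the original directory afterwards leaves the snapshot's copy of the name, and therefore the data, in place.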

RE: Incremental replication...

2007-02-13 Thread escher2k
Graham Stead-2 wrote: > > We have used replication for a few weeks now and it generally works well. > > I believe you'll find that commit operations cause only new segments to be > transferred, whereas optimize operations cause the entire index to be > transferred. Therefore, the amount of data

RE: Incremental replication...

2007-02-13 Thread Graham Stead
We have used replication for a few weeks now and it generally works well. I believe you'll find that commit operations cause only new segments to be transferred, whereas optimize operations cause the entire index to be transferred. Therefore, the amount of data transferred really depends on how fr
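The commit-versus-optimize difference can be pictured with a small set difference, assuming the transfer step skips files the slave already has. The message does not say how the transfer is done, so this is only a model; `files_to_transfer` and the file names in the usage note are invented for illustration.

```python
def files_to_transfer(master_files, slave_files):
    """Return the files present on the master but missing on the slave.

    Hypothetical model: index files are never modified in place, so a file
    name already on the slave is assumed to need no transfer.
    """
    return sorted(set(master_files) - set(slave_files))
```

With made-up names, a commit that adds `seg3` to `seg1, seg2` yields just `["seg3"]`, whereas an optimize that rewrites everything into a single new `seg4` yields `["seg4"]`, which is the entire index.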

Incremental replication...

2007-02-13 Thread escher2k
I was wondering if the scripts provided in Solr do incremental replication. Looking at the script for snapshooter, it seems like the whole index directory is copied over. Is that correct? If so, isn't performance a problem over the long run? Thanks for the clarification in advance (I hope I am w

Re: question about synonyms

2007-02-13 Thread Yonik Seeley
On 2/13/07, nick19701 <[EMAIL PROTECTED]> wrote: Hi, I put this line in my synonyms.txt bestbuy,bb,best buy I expect that when bb is searched, all results including "bestbuy", "bb" or "best buy" will be returned. But in my test I only got back the results which include "bestbuy" or "best buy".

Re: Question re snapinstaller

2007-02-13 Thread Mike Klaas
On 2/13/07, Ken Krugler <[EMAIL PROTECTED]> wrote: >A Lucene's IndexReader opens all index files it needs when it is instantiated. >Changes to a Lucene index via IndexWriter never change an existing >file... new files are always created. >Put the two together and it allows an IndexWriter (or any
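The "open files survive deletion" behaviour described above can be tried out with a few lines of Python. This sketch assumes a POSIX-style filesystem (on Windows, removing an open file normally fails), and `read_after_delete` is a made-up name; the open file handle merely stands in for an IndexReader.

```python
import os
import tempfile


def read_after_delete(payload):
    """Write payload to a temp file, delete it while it is held open,
    then read it back through the still-open handle.

    Returns (path_still_exists, bytes_read). POSIX-only sketch.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    reader = open(path, "rb")  # stands in for a reader holding the file open
    try:
        os.remove(path)  # the name is gone from the directory
        still_listed = os.path.exists(path)
        return still_listed, reader.read()
    finally:
        reader.close()
```

The data stays readable until the last open handle is closed, which is why swapping index directories underneath a running reader does not break it on such systems.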

Re: Question re snapinstaller

2007-02-13 Thread Ken Krugler
On 2/13/07, Ken Krugler <[EMAIL PROTECTED]> wrote: Hi all, In looking at the snapinstaller script, it seems to do the following: 1. Copy a new index directory from the master to the slave's Solr data directory, giving it a name "index.tmp". 2. Delete the current index directory ("index"). 3.

question about synonyms

2007-02-13 Thread nick19701
Hi, I put this line in my synonyms.txt bestbuy,bb,best buy I expect that when bb is searched, all results including "bestbuy", "bb" or "best buy" will be returned. But in my test I only got back the results which include "bestbuy" or "best buy". The results which include "bb" are not returned.

Re: question about highlighting

2007-02-13 Thread nick19701
Hi, Andre, I tried hl=true. But it still doesn't work. Here is my request: select?indent=on&version=2.2&q=pageContent%3Adell&start=0&rows=10&fl=pageContent&qt=standard&wt=standard&explainOther=&hl=true&hl.fl=pageContent This is part of the response: standard 10 0 pageContent on pageContent t
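For anyone assembling such a request programmatically, here is a minimal sketch that builds the query string with the hl and hl.fl parameters shown above. The base URL used in the usage note and the function name `build_highlight_query` are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, query, highlight_field):
    """Build a select URL that asks for highlighting on one field.

    Only parameters that appear in the request quoted above are used;
    base_url is whatever your own server exposes.
    """
    params = [
        ("q", query),
        ("start", 0),
        ("rows", 10),
        ("fl", highlight_field),
        ("hl", "true"),
        ("hl.fl", highlight_field),
    ]
    return base_url + "?" + urlencode(params)
```

For example, `build_highlight_query("http://localhost:8983/solr/select", "pageContent:dell", "pageContent")` percent-encodes the colon in the query as `%3A`, matching the hand-written request.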

RE: Tagging

2007-02-13 Thread Binkley, Peter
I still wonder if there's a good way of storing the tags outside the Lucene index and using them via facets whose bitsets are manipulated directly rather than being populated from the index. In my project, reindexing a document whenever a user adds a tag is very very bad, since we're indexing pote

Re: question about highlighting

2007-02-13 Thread Andre Halama
nick19701 wrote: Hi Nick, > select?indent=on&version=2.2&q=dell&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl=on&hl.fl=pageContent try hl=true... Hth, Andre -- Andre Halama hbz, Gruppe Portale, Projekt vascoda (TB1), So

question about highlighting

2007-02-13 Thread nick19701
I can't locate any concrete examples of using highlighting. After checking out the following wiki, http://wiki.apache.org/solr/HighlightingParameters I sent my solr server the following request: select?indent=on&version=2.2&q=dell&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOthe

Re: Question re snapinstaller

2007-02-13 Thread Yonik Seeley
On 2/13/07, Ken Krugler <[EMAIL PROTECTED]> wrote: Hi all, In looking at the snapinstaller script, it seems to do the following: 1. Copy a new index directory from the master to the slave's Solr data directory, giving it a name "index.tmp". 2. Delete the current index directory ("index"). 3.

Question re snapinstaller

2007-02-13 Thread Ken Krugler
Hi all, In looking at the snapinstaller script, it seems to do the following: 1. Copy a new index directory from the master to the slave's Solr data directory, giving it a name "index.tmp". 2. Delete the current index directory ("index"). 3. Rename the temp index directory to be "index". Th
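The three steps listed above can be paraphrased in Python as follows. This is not the snapinstaller script itself (which is a shell script); it only mirrors the steps as the poster describes them, and `install_snapshot` is a hypothetical name.

```python
import os
import shutil


def install_snapshot(snapshot_dir, data_dir):
    """Mirror the three described steps inside data_dir.

    1. Copy the new index into data_dir as "index.tmp".
    2. Delete the current "index" directory, if any.
    3. Rename "index.tmp" to "index".
    Returns the path of the live index directory.
    """
    tmp = os.path.join(data_dir, "index.tmp")
    live = os.path.join(data_dir, "index")
    shutil.copytree(snapshot_dir, tmp)  # step 1
    if os.path.isdir(live):
        shutil.rmtree(live)  # step 2
    os.rename(tmp, live)  # step 3
    return live
```

Written this way, there is a short window between steps 2 and 3 in which no directory named "index" exists.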