unABLE to match

2007-03-27 Thread Shridhar Venkatraman
Solr Solr Admin (GENIE) ShridharVAIO:8084 cwd=C:\Program Files\netbeans-5.5\enterprise3\apache-tomcat-5.5.17\bin SolrHome=c:\Documents and Settings\Shridhar\Desktop\Public\Sana\KN\Genie\GenieConf/ Field Analysis *Field name* *Field value (Index)* verbo

Reposting unABLE to match

2007-03-27 Thread Shridhar Venkatraman
Solr Solr Admin (GENIE) ShridharVAIO:8084 cwd=C:\Program Files\netbeans-5.5\enterprise3\apache-tomcat-5.5.17\bin SolrHome=c:\Documents and Settings\Shridhar\Desktop\Public\Sana\KN\Genie\GenieConf/ Field Analysis *Field name* *Field value (Index)* verbo

Re: Reposting unABLE to match

2007-03-27 Thread Bertrand Delacretaz
On 3/27/07, Shridhar Venkatraman <[EMAIL PROTECTED]> wrote: ...Reposting unABLE to match No need to repost if your message made it to the list. If it hasn't been answered yet, it either means that no one knows the answer or that no one has had the time to answer yet. We're all volunteers here.

Re: Reposting unABLE to match

2007-03-27 Thread Maarten . De . Vilder
What exactly is the problem? It seems like you end up with the same term text in both the query and index analyzers ... you should have found a match... Shridhar Venkatraman <[EMAIL PROTECTED]> 27/03/2007 14:08 Please respond to solr-user@lucene.apache.org To solr-user@lucene.apache.org cc Subj

Re: Reposting unABLE to match

2007-03-27 Thread Shridhar Venkatraman
Sorry for the repeated postings; I was reposting only because my email text, which explained the problem, disappeared. This is what was at the head of the email and did not get posted previously: Hi, Sorry for these multiple postings... My email text did not get posted along with the

How to make the search default use AND instead of OR?

2007-03-27 Thread Thierry Collogne
Hello, I have a small question. When I do a search and enter 2 words, separated with a space (for example small business), the query is done like small OR business. So I get results containing small, business or small and business. In our case I would like only the results that contain small AND

Re: How to make the search default use AND instead of OR?

2007-03-27 Thread thomas arni
You can configure that in the "schema.xml" file: Thierry Collogne wrote: Hello, I have a small question. When I do a search and enter 2 words, separated with a space (for example small business), the query is done like small OR business. So I get results containing small, business or smal

Re: How to make the search default use AND instead of OR?

2007-03-27 Thread Thierry Collogne
Thanks. Apparently I overlooked it. On 27/03/07, thomas arni <[EMAIL PROTECTED]> wrote: You can configure that in the "schema.xml" file: Thierry Collogne wrote: > Hello, > > I have a small question. When I do a search and enter 2 words, > separated with > a space (for example small business
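As a side note to this exchange, the difference between the two behaviours can also be made explicit in the query string itself, using the AND operator of the Lucene query syntax. The following is a minimal Python sketch under stated assumptions: the helper name `require_all_terms` is hypothetical, and it does no escaping of special characters or handling of quoted phrases. It is not the schema.xml setting referred to above, only a client-side illustration.

```python
def require_all_terms(user_input):
    """Join whitespace-separated words with an explicit AND operator.

    Hypothetical helper for illustration only: no escaping, no phrase support.
    """
    terms = user_input.split()
    return " AND ".join(terms)


# "small business" would otherwise be parsed as small OR business
query = require_all_terms("small business")
print(query)
```

Changing the default operator in the schema, as suggested in the thread, avoids rewriting every query on the client side.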

SolrSearchGenerator for Cocoon (2.1)

2007-03-27 Thread mirko
Hi, I looked at the SolrSearchGenerator (this is the part which is of interest to me), but I could not get it to work for Cocoon 2.1 yet. It seems that there is no getParameters method for the org.apache.cocoon.environment interface: http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/environ

Re: Reposting unABLE to match

2007-03-27 Thread Maarten . De . Vilder
The only thing I can think of is the fact that in the index analysis the term-type is "word" and in the query analysis the term-type is "alphanumeric". You should be getting a match if that doesn't matter ... you get exactly the same term texts ... Shridhar Venkatraman <[EMAIL PROTECTED]> 27

Re: Reposting unABLE to match

2007-03-27 Thread Shridhar Venkatraman
The phrase "unABLE TO CONNECT" does not match in my system. However, any combination of case is ok as long as the first letter 'U' is in uppercase. Bad -> uNABLE, unABLE, unaBLE Good -> Unable, UNable, UNAble... Any ideas?

Re: How to make the search default use AND instead of OR?

2007-03-27 Thread Walter Underwood
I don't recommend defaulting to AND. This will increase the number of failed searches (no hits) for your users. If one word is misspelled in a multi-word AND query, you'll get no results. Since about 10% of queries are misspelled and about half of queries are multi-word, that will immediately incre
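The two figures quoted in this message can be combined into a back-of-the-envelope estimate. The sketch below is only that: it assumes misspelling is independent of query length and that every misspelled multi-word query returns no hits under AND, neither of which the message states. The function name is hypothetical.

```python
def added_no_hit_rate(misspelled_rate, multi_word_rate):
    """Share of all queries that are both misspelled and multi-word.

    Assumes the two properties are independent (an assumption, not from the thread).
    """
    return misspelled_rate * multi_word_rate


# Figures from the message: about 10% misspelled, about half multi-word
extra = added_no_hit_rate(0.10, 0.50)
print(f"{extra:.0%} of all queries")
```

Treat the result as a rough upper bound on the extra failures, since some misspelled words may still happen to match indexed terms.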

Re: How to make the search default use AND instead of OR?

2007-03-27 Thread Mike Klaas
On 3/27/07, Walter Underwood <[EMAIL PROTECTED]> wrote: I don't recommend defaulting to AND. This will increase the number of failed searches (no hits) for your users. If one word is misspelled in a multi-word AND query, you'll get no results. Since about 10% of queries are misspelled and about h

Re: How to make the search default use AND instead of OR?

2007-03-27 Thread Walter Underwood
On 3/27/07 10:57 AM, "Mike Klaas" <[EMAIL PROTECTED]> wrote: > I agree with your point above, but I fear "AND: bad! OR: good!" > becoming dogma--often AND+spellcheck is the better option. AND-with-spell-suggestion is better, but the spelling suggestion needs to be really, really good. That is rea

Filter query doesn't always work...

2007-03-27 Thread escher2k
I have a strange problem, and I don't seem to see any issue with the data. I am filtering on a field called reviews_positive_6_mos. The field is declared as an integer. If I specify - (a) fq=reviews_positive_6mos%3A[*+TO+*] => 36033 records are retrieved. (b) fq=reviews_positive_6mos%3A[*+TO+100]

Re: Filter query doesn't always work...

2007-03-27 Thread mirko
Hi, you might want to use the sint (sortable integer) fieldtype instead. If you use the integer fieldtype I guess the range queries are treated as string prefixes (like in [Ab TO Ch]). You can find some documentation about it in the example schema.xml: http://svn.apache.org/viewvc/lucene/solr/t
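The guess above, that range bounds on a plain integer field are compared as strings, can be illustrated outside Solr. The following Python sketch is not Solr code: the helper names are hypothetical, and the zero-padding in `sortable` is only a stand-in for the idea of a sortable encoding, not the actual sint format.

```python
def in_string_range(value, upper):
    """Compare as text, the way a lexicographically ordered term index would."""
    return str(value) <= str(upper)


def in_numeric_range(value, upper):
    """Compare as numbers, which is what the user expects."""
    return value <= upper


def sortable(value, width=10):
    """Hypothetical sortable encoding: zero-pad so text order matches
    numeric order (non-negative integers only)."""
    return str(value).zfill(width)


values = [5, 36, 100, 250, 1000]
string_hits = [v for v in values if in_string_range(v, 100)]
numeric_hits = [v for v in values if in_numeric_range(v, 100)]
padded_hits = [v for v in values if sortable(v) <= sortable(100)]

print(string_hits)   # text comparison drops 5 and 36
print(numeric_hits)
print(padded_hits)
```

Under text comparison "36" sorts after "100", so an upper bound of 100 excludes it; an encoding whose text order follows numeric order restores the expected result.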

Re: SolrSearchGenerator for Cocoon (2.1)

2007-03-27 Thread Thorsten Scherler
On Tue, 2007-03-27 at 10:53 -0400, [EMAIL PROTECTED] wrote: > Hi, > > I looked at the SolrSearchGenerator (this is the part which is of interest to > me), but I could not get it to work for Cocoon 2.1 yet. > > It seems that there is no getParameters method for the > org.apache.cocoon.environment

Re: Reposting unABLE to match

2007-03-27 Thread Yonik Seeley
On 3/27/07, Shridhar Venkatraman <[EMAIL PROTECTED]> wrote: The phrase "unABLE TO CONNECT" does not match in my system. However, any combination of case is ok as long as the first letter 'U' is in uppercase. Bad -> uNABLE, unABLE, unaBLE Good -> Unable, UNable, UNAble... Any i

Re: maximum index size

2007-03-27 Thread Mike Klaas
On 3/27/07, Kevin Osborn <[EMAIL PROTECTED]> wrote: I know there are a bunch of variables here (RAM, number of fields, hits, etc.), but I am trying to get a sense of how big an index, in terms of number of documents, Solr can reasonably handle. I have heard indexes of 3-4 million documents ru

Re: maximum index size

2007-03-27 Thread Kevin Osborn
- Original Message From: Mike Klaas <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Tuesday, March 27, 2007 3:20:40 PM Subject: Re: maximum index size If you are going to store a document for each customer then some field must indicate to which customer the document instance b

Re: storing results

2007-03-27 Thread Mike Klaas
On 3/27/07, Joan Codina <[EMAIL PROTECTED]> wrote: I'm using solr, to build a search engine, and it works great!! Thanks for the job, guys! Glad it is working for you. but... I need to build a searcher that must allow performing a "search process" for a collection of documents. And this se

Re: maximum index size

2007-03-27 Thread Mike Klaas
On 3/27/07, Kevin Osborn <[EMAIL PROTECTED]> wrote: > If you are going to store a document for each customer then some field > must indicate to which customer the document instance belongs. In > that case, why not index a single copy of each document, with a field > containing a list of custome

Re: storing results

2007-03-27 Thread Joan Codina
I would like to store the results of a query somehow. Then after the user analyzes some of the documents (and he/she can take some days to do it), the user can try to make a query refinement over the previous result, getting a subset of it. To do so I need to store the results as a filter, w

Re: maximum index size

2007-03-27 Thread Venkatesh Seetharam
I've 50 million documents each about 10K in size and I've 4 index partitions each consisting of 12.5 million documents. Each index partition is about 80GB. A search typically takes about 3-5 seconds. Single word searches are faster than multi-word searches. I'm still working on finding the ideal i

RE: maximum index size

2007-03-27 Thread Andre Basse
>I've 50 million documents each about 10K in size and I've 4 index partitions each consisting of 12.5 million documents. Each index partition is about 80GB. A search typically takes about 3-5 seconds. Single word searches are faster than multi-word searches. I'm still working on finding the ideal i

Re: storing results

2007-03-27 Thread Erik Hatcher
Joan, What you're after is something custom at a layer above Solr, not something that really fits as something built into Solr. For example, I've implemented "saved searches" in both Collex <www.nines.org/collex> and Flare (it's a proof-of-concept in Flare, saved only in Rails session scope)

Re: Filter query doesn't always work...

2007-03-27 Thread escher2k
Thanks a lot Mirko. That seems to work. mirko-9 wrote: > > Hi, > > you might want to use the sint (sortable integer) fieldtype instead. If > you use > the integer fieldtype I guess the range queries are treated as string > prefixes > (like in [Ab TO Ch]). > > You can find some documentatio

Re: How to make the search default use AND instead of OR?

2007-03-27 Thread Chris Hostetter
: A more nuanced answer would be to finesse the MM parameter so shorter : multi-word queries behave as AND, and longer queries allow more : flexibility (this could probably be achieved by using a high : percentage setting, but I'd have to double check how the rounding is : done). you are correct

Re: How to make the search default use AND instead of OR?

2007-03-27 Thread Chris Hostetter
: I disabled MM. It was giving me too many no-hits on real : user queries. I think it makes the engine somewhat mysterious : to users. if you have some examples you can share i'd love to hear about them ... it seemed like a good idea when i wrote it, and it seems to work well in the instances whe

Re: storing results

2007-03-27 Thread Chris Hostetter
: To do so I need to store the results as a filter, with a given name, so : the user can use it later on. But I need to store this in disk, as I can : not trust on the cache or the web session. : The user should then indicate that the query that is doing now has a : filter (a previous query) and

Re: Reposting unABLE to match

2007-03-27 Thread Chris Hostetter
: Sorry for these multiple postings... : My email text did not get posted along with the attachment, don't know why ? : Here it is again. in general: don't use attachments, paste text directly into the body of your email, that may have had something to do with your problem. -Hoss

Document boost not as expected...

2007-03-27 Thread escher2k
I am implementing a document boost at indexing time for the documents. I read a posting that seemed to indicate that omitNorms=false is needed to retain the document boosting for retrieval. After I did that, it looks like I am not able to get back the boost I originally put in. Instead, I get 1.

Re: Document boost not as expected...

2007-03-27 Thread Mike Klaas
On 3/27/07, escher2k <[EMAIL PROTECTED]> wrote: I am implementing a document boost at indexing time for the documents. I read a posting that seemed to indicate that omitNorms=false is needed to retain the document boosting for retrieval. After I did that, it looks like I am not able to get bac

Re: Document boost not as expected...

2007-03-27 Thread Chris Hostetter
Ditto everything Mike said, but i'm also curious what Similarity changes you made ... without knowing what that code looks like, all bets are off in terms of anyone being able to help you understand the scores you are seeing. : I am not quite sure how the score changed from 1.33 to 1.25. I am not

Re: How to make the search default use AND instead of OR?

2007-03-27 Thread Mike Klaas
On 3/27/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: if you want strict "AND" style matching with dismax, just use "100%" for mm. MinShouldMatch is one of the few things about DisMax that is extremely well documented... http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-sho
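The suggestion quoted above (use "100%" for mm with dismax) could look roughly like this from a client. This is a sketch under stated assumptions: the host, port and `/solr/select` path are placeholders, and selecting the handler with `qt=dismax` is an assumption about how the deployment is configured. Only the `mm` parameter and its "100%" value come from the thread.

```python
from urllib.parse import urlencode


def dismax_url(base_url, user_query, mm="100%"):
    """Build a query URL asking that all optional clauses match.

    base_url and the qt=dismax handler name are assumptions for illustration.
    """
    params = {"q": user_query, "qt": "dismax", "mm": mm}
    return base_url + "?" + urlencode(params)


# Hypothetical local instance; adjust to the real deployment
url = dismax_url("http://localhost:8983/solr/select", "small business")
print(url)
```

Note that the percent sign has to be URL-encoded as `%25`, which `urlencode` handles.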

Re: maximum index size

2007-03-27 Thread Venkatesh Seetharam
Hi Andre, Comments are inline. What hardware are you running? 4 Dual-proc 64 GB blades for each searcher and a broker that merges results on 64 bit SUSE Linux running JDK 1.6 with 8GB Heap. Do you use collection distribution? Nope. I use Hadoop to index the documents. Thanks, Venkatesh On