Re: spellchecking multiple fields?

2008-07-15 Thread Shalin Shekhar Mangar
One way would be to create a copyField containing both the fields and use it as the dictionary's source. If you do want to keep separate dictionaries for both the fields then I guess we can introduce per-dictionary overridable parameters like the per-field overridden facet parameters. That would b

spellchecking multiple fields?

2008-07-15 Thread Ryan McKinley
I have a use case where I want to spellcheck the input query across multiple fields: Did you mean: location = washington vs Did you mean: person = washington The current parameter / response structure for the spellcheck component does not support this kind of thing. Any thoughts on how/i

Re: Slow deleteById request

2008-07-15 Thread Renaud Delbru
Hi, I think the reason was indeed maxPendingDeletes, which was configured to 1000. After updating to a Solr nightly build with Lucene 2.4, the issue seems to have disappeared. Thanks for your advice. -- Renaud Delbru Mike Klaas wrote: On 1-Jul-08, at 10:44 PM, Chris Hostetter wrote:

Re: solr synonyms behaviour

2008-07-15 Thread swarag
Yonik Seeley wrote: > > On Tue, Jul 15, 2008 at 2:27 PM, swarag <[EMAIL PROTECTED]> > wrote: >> To my understanding, this means I am using synonyms at index time and NOT >> query time. And yet, I am still having these problems with synonyms. > > Can you give a specific example? Use debugQuery=

Re: 2 IDs in schema.xml

2008-07-15 Thread Shalin Shekhar Mangar
Multiple uniqueKeys are not supported. You must use only one field as the uniqueKey. On Tue, Jul 15, 2008 at 11:52 PM, dudes dudes <[EMAIL PROTECTED]> wrote: > > Hi > > With some strange reason hotmail doesn't send any XML tags through. I have > attached a file with all the necessary xml tags the

Re: FileBasedSpellChecker behavior?

2008-07-15 Thread Shalin Shekhar Mangar
Also see https://issues.apache.org/jira/browse/SOLR-622 On Wed, Jul 16, 2008 at 2:25 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > On Tue, Jul 15, 2008 at 4:19 PM, Grant Ingersoll <[EMAIL PROTECTED]> > wrote: > > agreed, but there is a problem in Solr, AIUI, with regards to when the > > readers a

Re: FileBasedSpellChecker behavior?

2008-07-15 Thread Yonik Seeley
On Tue, Jul 15, 2008 at 4:19 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > agreed, but there is a problem in Solr, AIUI, with regards to when the > readers are available and when inform() gets called. The workaround is to > have a warming query, I believe. Right... see https://issues.apache.or

Re: FileBasedSpellChecker behavior?

2008-07-15 Thread Grant Ingersoll
On Jul 15, 2008, at 3:49 PM, Ryan McKinley wrote: Hi- I'm messing with spellchecking and running into behavior that seems peculiar. We have an index with many words including: "swim" and "slim" If I search for "slim", it returns "swim" as an option -- likewise, if I search for "slim" it

FileBasedSpellChecker behavior?

2008-07-15 Thread Ryan McKinley
Hi- I'm messing with spellchecking and running into behavior that seems peculiar. We have an index with many words including: "swim" and "slim" If I search for "slim", it returns "swim" as an option -- likewise, if I search for "slim" it returns "swim" why does it check words that are in

RE: Wiki for 1.3

2008-07-15 Thread sundar shankar
THANKS!!! > Date: Tue, 15 Jul 2008 11:38:06 -0700> From: [EMAIL PROTECTED]> To: > solr-user@lucene.apache.org> Subject: RE: Wiki for 1.3> > > : Thanks. Do we > expect the same some time soon. I agree that the user > : community have shed > light in with a lot of examples. Just wanna know if > :

Re: solr synonyms behaviour

2008-07-15 Thread Yonik Seeley
On Tue, Jul 15, 2008 at 2:27 PM, swarag <[EMAIL PROTECTED]> wrote: > To my understanding, this means I am using synonyms at index time and NOT > query time. And yet, I am still having these problems with synonyms. Can you give a specific example? Use debugQuery=true to see what the resulting quer

Re: Solr stops responding

2008-07-15 Thread Fuad Efendi
Sorry for bunch of short self-replies, just trying to analyse... CPU may get overloaded by constantly running GC trying to defragment&optimize memory, in a loop (constant queue of requests); response time will be few minutes (in best cases) and contain 500... so that sometimes we can't see

RE: Wiki for 1.3

2008-07-15 Thread Chris Hostetter
: Thanks. Do we expect the same some time soon. I agree that the user : community have shed light in with a lot of examples. Just wanna know if : there was more that could be done. I am looking at the java docs of the : same too and that helps to some extent. But have felt the wiki was very :

Re: solr synonyms behaviour

2008-07-15 Thread swarag
matt connolly wrote: > > You won't have the multiple word problem if you use synonyms at index time > instead of query time. > > > swarag wrote: >> >> Here is a basic example of some synonyms in my synonyms.txt: >> club=>club,bar,night cabaret >> bar=>bar,club >> >> As you can see, a search

2 IDs in schema.xml

2008-07-15 Thread dudes dudes
Hi With some strange reason hotmail doesn't send any XML tags through. I have attached a file with all the necessary xml tags there , thanks :) I have a rare situation and I'm not too sure how to resolve it. I have defined 2 fields.. one is call userID and the other one is called companyID in

Re: Solr stops responding

2008-07-15 Thread Fuad Efendi
Just as a sample, SolrCore contains blocks like } catch (Throwable e) { SolrException.logOnce(log,null,e); } And SolrServlet: } catch (Throwable e) { SolrException.log(log,e); sendErr(500, SolrException.toStr(e), request, response); } What will happen with OutOfMemoryError? I

Re: Duplicate content

2008-07-15 Thread Ryan McKinley
On Jul 15, 2008, at 10:31 AM, Fuad Efendi wrote: Thanks Ryan, Is really unique if we allow duplicates? I had similar problem... if you allowDups, then uniqueKey may not be unique... however, it is still used as the key for many items. Quoting Ryan McKinley <[EMAIL PROTECTED]>: O

Re: solr synonyms behaviour

2008-07-15 Thread matt connolly
You won't have the multiple word problem if you use synonyms at index time instead of query time. swarag wrote: > > Here is a basic example of some synonyms in my synonyms.txt: > club=>club,bar,night cabaret > bar=>bar,club > > As you can see, a search for 'bar' will return any documents with

Re: Filter by Type increases search results.

2008-07-15 Thread Yonik Seeley
On Tue, Jul 15, 2008 at 11:10 AM, Norberto Meijome <[EMAIL PROTECTED]> wrote: > On Tue, 15 Jul 2008 18:07:43 +0530 > "Preetam Rao" <[EMAIL PROTECTED]> wrote: > >> When I say filter, I meant q=fish&fq=type:idea > > btw, this *seems* to only work for me with standard search handler. dismax > and fq:

Re: solr synonyms behaviour

2008-07-15 Thread swarag
matt connolly wrote: > > > swarag wrote: >> >> Knowing that Lucene struggles with multi-word query-time synonyms, my >> question is, does this also affect index-time synonyms? What other >> alternatives do we have if we require there to be multiple word synonyms? >> > > No, the multiple word p

Re: Solr stops responding

2008-07-15 Thread Fuad Efendi
I suspect that SolrException is used to catch ALL exceptions in order to show "500 OutOfMemory" in HTML/XML/JSON etc., so that JVM simply hangs... weird HTTP understanding... Quoting Fuad Efendi <[EMAIL PROTECTED]>: Following lines are strange, looks like SOLR deals with OOM and rethrows

Re: WordDelimiterFilter splits at non-ASCII chars

2008-07-15 Thread Yonik Seeley
On Tue, Jul 15, 2008 at 10:29 AM, Stefan Oestreicher <[EMAIL PROTECTED]> wrote: > as I understand the WordDelimiterFilter should split on case changes, word > delimiters and changes from character to digit, but it should not > differentiate between ASCII and multibyte chars. It does however. The wo

Re: WordDelimiterFilter splits at non-ASCII chars

2008-07-15 Thread Shalin Shekhar Mangar
Hi Stefan, I wrote a test case for the problem you described but it is working fine. I used the following definition: What configuration are you using? If it is different, please share it so that I can test with it. On Tue, Jul 15, 2008 at 7:59 PM, Stefan Oestreicher < [EMAIL PROTECTED]> wrote

Re: Solr stops responding

2008-07-15 Thread Fuad Efendi
Following lines are strange, looks like SOLR deals with OOM and rethrows own exception (so that in some cases JVM simply hangs instead of exit): Apr 4, 2008 1:20:53 PM org.apache.solr.common.SolrException log SEVERE: java.lang.OutOfMemoryError: Java heap space This is full Thread Dump a

Best way to return ExternalFileField in the results

2008-07-15 Thread climbingrose
Hi all, I've been trying to return a field of type ExternalFileField in the search result. Upon examining XMLWriter class, it seems like Solr can't do this out of the box. Therefore, I've tried to hack Solr to enable this behaviour. The goal is to call to ExternalFileField.getValueSource(SchemaFie

Re: Solr stops responding

2008-07-15 Thread Noble Paul നോബിള്‍ नोब्ळ्
Can we collect more information? It would be nice to know what the threads are doing when it hangs. If you are using *nix, issue kill -3 <pid>; it will print out the stack traces of all the threads in the VM. That may tell us the state of each thread, which could help us suggest something. On Tue

Re: Solr stops responding

2008-07-15 Thread Doug Steigerwald
We haven't seen an OutOfMemoryError. The load on the server doesn't go up either (hovers around 1-2). We're on Java 1.6.0_03-b05. 4x3.8GHz Xeons, 8GB RAM. Doug On Jul 15, 2008, at 11:29 AM, Fuad Efendi wrote: I constantly have the same problem; sometimes I have OutOfMemoryError in logs

Re: Duplicate content

2008-07-15 Thread Fuad Efendi
Thanks Ryan, Is really unique if we allow duplicates? I had similar problem... Quoting Ryan McKinley <[EMAIL PROTECTED]>: On Jul 15, 2008, at 2:45 AM, Sunil wrote: Hi All, I want to change the duplicate content behavior in solr. What I want to do is: 1) I don't want duplicate content. 2

Re: Solr stops responding

2008-07-15 Thread Fuad Efendi
I constantly have the same problem; sometimes I have OutOfMemoryError in logs, sometimes not. Not-predictable. I minimized all caches, it still happens even with 8192M. CPU usage is 375%-400% (two double-core Opterons), SUN Java 5. Moved to BEA JRockit 5 yesterday, looks 30 times faster (25%

RE: Wiki for 1.3

2008-07-15 Thread sundar shankar
Thanks. Do we expect the same some time soon. I agree that the user community have shed light in with a lot of examples. Just wanna know if there was more that could be done. I am looking at the java docs of the same too and that helps to some extent. But have felt the wiki was very very useful

Re: Solr stops responding

2008-07-15 Thread Jarek Zgoda
Doug Steigerwald pisze: > We're running Solr with basically the example Solr setup with Jetty > (6.1.3). We package our Solr install by using 'ant example' and > replacing configs/etc. Whenever Solr stops responding, there are no > messages in the logs, nothing. Requests just time out. > > We

Re: solr:sorting on what type is faster

2008-07-15 Thread Shalin Shekhar Mangar
If a sort is not specified then documents are returned in decreasing order of their score. You can get more details on the scoring at http://lucene.apache.org/java/docs/scoring.html On Tue, Jul 15, 2008 at 6:03 PM, sumantht <[EMAIL PROTECTED]> wrote: > > hi, > in databases, sorting based on text

Solr stops responding

2008-07-15 Thread Doug Steigerwald
Since we pushed Solr out to production a few weeks ago, we've seen a few issues with Solr not responding to requests (searches or admin pages). There doesn't seem to be any reason for it from what we can tell. We haven't seen it in QA or development. We're running Solr with basically the

Re: Filter by Type increases search results.

2008-07-15 Thread Norberto Meijome
On Tue, 15 Jul 2008 18:07:43 +0530 "Preetam Rao" <[EMAIL PROTECTED]> wrote: > When I say filter, I meant q=fish&fq=type:idea btw, this *seems* to only work for me with the standard search handler. dismax and fq: don't seem to get along nicely... but maybe it is just late and I'm not testing it pro

Re: Duplicate content

2008-07-15 Thread Ryan McKinley
On Jul 15, 2008, at 2:45 AM, Sunil wrote: Hi All, I want to change the duplicate content behavior in solr. What I want to do is: 1) I don't want duplicate content. 2) I don't want to overwrite old content with new one. Means, if I add duplicate content in solr and the content already exis

WordDelimiterFilter splits at non-ASCII chars

2008-07-15 Thread Stefan Oestreicher
Hi, as I understand the WordDelimiterFilter should split on case changes, word delimiters and changes from character to digit, but it should not differentiate between ASCII and multibyte chars. It does however. The word "hälse" (german plural of "neck") gets split into "h", "ä" and "lse", which un

Re: which type of fields are to be compressed

2008-07-15 Thread Erick Erickson
Compression is only relevant for the original text, not the indexed part, so in terms of searching it's irrelevant. Where it is relevant is when you *fetch* the document (e.g. doc = hits.doc(32)): that is when the decompression work is done (for stored documents). Depending upon your app, this may or may not

RE: Duplicate content

2008-07-15 Thread Sunil
Thanks guys. -Original Message- From: Norberto Meijome [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 15, 2008 2:35 PM To: solr-user@lucene.apache.org Subject: Re: Duplicate content On Tue, 15 Jul 2008 10:48:14 +0200 Jarek Zgoda <[EMAIL PROTECTED]> wrote: > >> 2) I don't want to overwri

Re: Filter by Type increases search results.

2008-07-15 Thread matt connolly
Of course - it's so obvious now. Thanks! -- View this message in context: http://www.nabble.com/Filter-by-Type-increases-search-results.-tp18462188p18464457.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Filter by Type increases search results.

2008-07-15 Thread Preetam Rao
Hi Matt, When I say filter, I mean q=fish&fq=type:idea. What you are trying is a boolean OR: defaultsearchfield:fish OR type:idea. It's not a filter, it's an OR, so you will get a union of the results... -- Preetam On Tue, Jul 15, 2008 at 5:37 PM, matt connolly <[EMAIL PROTECTED]>
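
The distinction drawn in this reply can be illustrated with a minimal Python sketch of the set semantics. The document IDs and function names are invented for illustration, and nothing here calls Solr; it only models fq as an intersection with the q results and the two-clause query (under a default OR operator) as a union.

```python
# Hypothetical document IDs, invented for illustration only.
matches_fish = {"story-1", "idea-1"}           # docs matching q=fish
type_is_idea = {"idea-1", "idea-2", "idea-3"}  # docs where type is "idea"


def filtered(query_hits, filter_hits):
    """Model q=fish&fq=type:idea: the filter can only narrow the q results."""
    return query_hits & filter_hits


def either_clause(query_hits, other_hits):
    """Model q=fish type:idea with a default OR operator: a union of both clauses."""
    return query_hits | other_hits


print(sorted(filtered(matches_fish, type_is_idea)))
print(sorted(either_clause(matches_fish, type_is_idea)))
```

Under these assumptions the filtered result can never be larger than the plain q=fish result, while the OR form can grow to include every document of the given type.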

solr:sorting on what type is faster

2008-07-15 Thread sumantht
Hi, in databases, sorting based on text fields is faster and preferable, if I am not wrong. Similarly, which type of fields should be chosen for sorting in Solr? How are ties broken? Sorry for mistakes, if any. Thank you -- View this message in context: http://www.nabble.com/solr%3Asorting

which type of fields are to be compressed

2008-07-15 Thread sumantht
Hi, is it preferable to compress each and every field? If not, why? How exactly does it help? -- View this message in context: http://www.nabble.com/which-type-of-fields-are-to-be-compressed-tp18464056p18464056.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Filter by Type increases search results.

2008-07-15 Thread matt connolly
Yes, the same, except for the filter. For example: http://localhost:8983/solr/select?q=fish returns: etc (followed by 2 docs) http://localhost:8983/solr/select?q=fish+type:idea returns: . (followed by 9 docs) -Matt Preetam Rao wrote: > > Hi Matt, > > Other than applying o

Re: Filter by Type increases search results.

2008-07-15 Thread Preetam Rao
Hi Matt, Other than applying one more fq, does everything else remain the same between the two queries, like q and all other parameters? My understanding is that fq is an intersection with the set of results returned from q, so the filtered results should always be a subset of the results returned from q. So if one uses ju

Re: Dismax request handler and sub phrase matches... suggestion for another handler..

2008-07-15 Thread Preetam Rao
I agree. If we do decide to implement another kind of request handler, it should be through the StandardRequestHandler defType parameter, which selects the registered QParser that generates the appropriate queries for Lucene. Preetam On Tue, Jul 15, 2008 at 3:59 PM, Erik Hatcher <[EMAIL PR

Filter by Type increases search results.

2008-07-15 Thread matt connolly
I'm using Solr with a Drupal site, and one of the fields in the schema is "type". In my example development site, searching for the word "fish" returns 2 documents, one type='story', and the other type='idea'. If I filter by type:idea then I get 9 results, the correct first result, followed by 8

Re: Dismax request handler and sub phrase matches... suggestion for another handler..

2008-07-15 Thread Erik Hatcher
On Jul 15, 2008, at 4:45 AM, Preetam Rao wrote: What are your thoughts on having one more request handler like dismax, but which uses a sub-phrase query instead of dismax query ? It'd be better to just implement a QParser(Plugin) such that the StandardRequestHandler can use it (&defType=di

RE: Solr searching issue..

2008-07-15 Thread dudes dudes
thanks ! I think I fixed the issue and it's doing good :) > From: [EMAIL PROTECTED] > To: solr-user@lucene.apache.org > Subject: RE: Solr searching issue.. > Date: Mon, 14 Jul 2008 20:12:00 + > > Copy field dest="text". I am not sure if u can copy int

Re: solr synonyms behaviour

2008-07-15 Thread matt connolly
swarag wrote: > > Knowing that Lucene struggles with multi-word query-time synonyms, my > question is, does this also affect index-time synonyms? What other > alternatives do we have if we require there to be multiple word synonyms? > No, the multiple word problem doesn't happen with index synon

Re: solr synonyms behaviour

2008-07-15 Thread Guillaume Smet
Chris, On Sat, Jan 26, 2008 at 2:30 AM, Chris Hostetter <[EMAIL PROTECTED]> wrote: > : I have the synonym filter only at query time coz i can't re-index data (or > : portion of data) everytime i add a synonym and a couple of other reasons. > > Use cases like yours will *never* work as a query time

Re: Duplicate content

2008-07-15 Thread Norberto Meijome
On Tue, 15 Jul 2008 10:48:14 +0200 Jarek Zgoda <[EMAIL PROTECTED]> wrote: > >> 2) I don't want to overwrite old content with new one. > >> > >> Means, if I add duplicate content in solr and the content already > >> exists, the old content should not be overwritten. > > > > before inserting a n

Re: Duplicate content

2008-07-15 Thread Jarek Zgoda
Norberto Meijome pisze: >> 2) I don't want to overwrite old content with new one. >> >> Means, if I add duplicate content in solr and the content already >> exists, the old content should not be overwritten. > > before inserting a new document, query the index - if you get a result back, > then

Dismax request handler and sub phrase matches... suggestion for another handler..

2008-07-15 Thread Preetam Rao
Hi, Apologies if you are receiving it second time...having tough time with mail server.. I take a user entered query as it is and run it with dismax query handler. The documents fields have been filled from structured data, where different fields have different attributes like number of beds, num

Re: Duplicate content

2008-07-15 Thread Norberto Meijome
On Tue, 15 Jul 2008 13:15:41 +0530 "Sunil" <[EMAIL PROTECTED]> wrote: > 1) I don't want duplicate content. SOLR uses the field you define as the unique field to determine whether a document should be replaced or added. The rest of the fields are in your hands. You could devise a setup whereby the

Re: Duplicate content

2008-07-15 Thread Noble Paul നോബിള്‍ नोब्ळ्
You must do a check before adding documents On Tue, Jul 15, 2008 at 1:15 PM, Sunil <[EMAIL PROTECTED]> wrote: > Hi All, > > I want to change the duplicate content behavior in solr. What I want to > do is: > > 1) I don't want duplicate content. > 2) I don't want to overwrite old content with new on

Duplicate content

2008-07-15 Thread Sunil
Hi All, I want to change the duplicate content behavior in solr. What I want to do is: 1) I don't want duplicate content. 2) I don't want to overwrite old content with new one. Means, if I add duplicate content in solr and the content already exists, the old content should not be overwritten.