Well, I think this is slightly too categorical - a range query on a substring can be thought of as a simple range query. So, for example, the following query:

"lucene 1*"

becomes behind the scenes:

"lucene (10|11|12|13|14|1abcd)"

The issue there is that it is a string range, but it is a range query - it just has to be indexed in a clever way.

So, Marcin, you still have quite a few options besides the strict boolean query model:

1. Have a special tokenizer chain which creates one token out of these groups (e.g. "some text prefix_1") and search for "some text prefix_*" [and do some post-filtering if necessary].
2. Another version, using a regex: /some text (1|2|3...)/ - you get the idea.
3. Construct the Lucene multi-term range query automatically, in your qparser, to produce a phrase query "lucene (10|11|12|13|14)".
4. Use payloads to index your integer at the position of "some text" and then retrieve only "some text" where the payload is in range x-y. An example is here, look at getPayloadQuery():
   https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/src/test/org/adsabs/lucene/BenchmarkAuthorSearch.java
   But this is a more complex situation, and if you google, you will find a better description.
5. Use a qparser that is able to handle nested search and analysis at the same time, e.g. your query is: field:"some text" NEAR1 field:[0 TO 10]. I know about a parser that can handle this and I invite others to check it out (yeah, JIRA tickets need reviewers ;-)):
   https://issues.apache.org/jira/browse/LUCENE-5014

There might be others I forgot, but it is certainly doable; but as Jack points out, you may want to stop for a moment to reflect whether it is necessary.

HTH,

roman

On Tue, Jul 16, 2013 at 8:35 AM, Jack Krupansky <j...@basetechnology.com> wrote:

> Sorry, but you are basically misusing Solr (and multivalued fields),
> trying to take a "shortcut" to avoid a proper data model.
>
> To properly use Solr, you need to put each of these multivalued field
> values in a separate Solr document, with a "text" field and a "value"
> field.
> Then, you can query:
>
> text:"some text" AND value:[min-value TO max-value]
>
> Exactly how you should restructure your data model is dependent on all of
> your other requirements.
>
> You may be able to simply flatten your data.
>
> You may be able to use a simple join operation.
>
> Or, maybe you need to do a multi-step query operation if your data is
> sufficiently complex.
>
> If you want to keep your multivalued field in its current form for display
> purposes or keyword search, or exact match search, fine, but your stated
> goal is inconsistent with the semantics of Solr and Lucene.
>
> To be crystal clear, there is no such thing as "a range query on a
> substring" in Solr or Lucene.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Marcin Rzewucki
> Sent: Tuesday, July 16, 2013 5:13 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Range query on a substring.
>
> By multivalued I meant an array of values. For example:
> <arr name="myfield">
>   <str>text1 (X)</str>
>   <str>text2 (Y)</str>
> </arr>
>
> I'd like to avoid splitting it as you propose. I have a 2.3mn collection
> with pretty large records (a few hundred fields and more per record).
> Duplicating them would impact performance.
>
> Regards.
>
> On 16 July 2013 10:26, Oleg Burlaca <oburl...@gmail.com> wrote:
>
>> Ah, you mean something like this:
>> record:
>> Id=10, text = "this is a text N1 (X), another text N2 (Y), text N3 (Z)"
>> Id=11, text = "this is a text N1 (W), another text N2 (Q), third text (M)"
>>
>> and you need to search for: "text N1" and X < B ?
>> How big is the core? The first thing that comes to my mind, again, at
>> indexing level, split the text into pieces and index it in solr like this:
>>
>> record_id | text    | value
>> 10        | text N1 | X
>> 10        | text N2 | Y
>> 10        | text N3 | Z
>>
>> does it help?
>>
>> On Tue, Jul 16, 2013 at 10:51 AM, Marcin Rzewucki <mrzewu...@gmail.com> wrote:
>>
>> > Hi Oleg,
>> > It's a multivalued field and it won't be easier to query when I split
>> > this field into text and numbers. I may get wrong results.
>> >
>> > Regards.
>> >
>> > On 16 July 2013 09:35, Oleg Burlaca <oburl...@gmail.com> wrote:
>> >
>> > > IMHO the number(s) should be extracted and stored in separate columns
>> > > in SOLR at indexing time.
>> > >
>> > > --
>> > > Oleg
>> > >
>> > > On Tue, Jul 16, 2013 at 10:12 AM, Marcin Rzewucki <mrzewu...@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I have a problem (wonder if it is possible to solve it at all) with
>> > > > the following query. There are documents with a field which contains
>> > > > a text and a number in brackets, e.g.
>> > > >
>> > > > myfield: this is a text (number)
>> > > >
>> > > > There might be some other documents with the same text but a
>> > > > different number in brackets.
>> > > > I'd like to find documents with the given text, say "this is a text",
>> > > > and "number" between A and B. Is it possible in Solr? Any ideas?
>> > > >
>> > > > Kind regards.
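
P.S. A minimal sketch of option 1 from the reply at the top of this thread (one token per "text (number)" group, then a string range over those tokens). It is plain Python that only imitates a sorted term dictionary, not Solr or Lucene code. The zero-padding width PAD, the "_" separator and the function names are all assumptions made for illustration; the padding is what makes string order agree with numeric order, and it assumes non-negative integers that fit in PAD digits.

```python
import bisect
import re

PAD = 10  # hypothetical fixed width; must cover the largest expected number


def to_token(value):
    """Turn 'this is a text (42)' into 'this is a text_0000000042'."""
    match = re.fullmatch(r"\s*(.*?)\s*\((\d+)\)\s*", value)
    if match is None:
        raise ValueError(f"no '(number)' suffix in {value!r}")
    text, number = match.groups()
    return f"{text}_{int(number):0{PAD}d}"


def build_index(docs):
    """docs maps doc id -> list of field values; returns sorted tokens and ids."""
    pairs = sorted(
        (to_token(value), doc_id)
        for doc_id, values in docs.items()
        for value in values
    )
    return [token for token, _ in pairs], [doc_id for _, doc_id in pairs]


def range_search(index, text, low, high):
    """Doc ids that have `text` with a number in [low, high], inclusive."""
    tokens, doc_ids = index
    lo_key = f"{text}_{low:0{PAD}d}"
    hi_key = f"{text}_{high:0{PAD}d}"
    start = bisect.bisect_left(tokens, lo_key)
    end = bisect.bisect_right(tokens, hi_key)
    hits = set()
    for token, doc_id in zip(tokens[start:end], doc_ids[start:end]):
        # Post-filter: keep only tokens with exactly this text and PAD digits.
        if len(token) == len(lo_key) and token.startswith(text + "_"):
            hits.add(doc_id)
    return sorted(hits)


# Made-up documents in the shape discussed in the thread.
docs = {
    10: ["text N1 (5)", "text N2 (70)"],
    11: ["text N1 (42)", "text N2 (7)"],
}
index = build_index(docs)
print(range_search(index, "text N1", 1, 10))  # [10]
```

Without the padding, "42" sorts before "5" as a string, which is the "string range" issue mentioned at the top. The post-filter corresponds to the "[and do some post-filtering if necessary]" remark in option 1.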