Hi,

This is the typical TextField with ...

<fieldType name="text123" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
SRK

On Thursday, November 24, 2016 1:38 AM, Reth RM <reth.ik...@gmail.com> wrote:

what is the fieldType of those records?

On Tue, Nov 22, 2016 at 4:18 AM, Sandeep Khanzode <sandeep_khanz...@yahoo.com.invalid> wrote:

Hi Erick,

I gave this a try. These are my results.

There is a record with "John D. Smith", and another named "John Doe".
1.] {!complexphrase inOrder=true}name:"John D.*" ... does not fetch any results.
2.] {!complexphrase inOrder=true}name:"John D*" ... fetches both results.

Second observation: There is a record with "John D Smith"
1.] {!complexphrase inOrder=true}name:"John*" ... does not fetch any results.
2.] {!complexphrase inOrder=true}name:"John D*" ... fetches that record.
3.] {!complexphrase inOrder=true}name:"John D S*" ... fetches that record.

SRK

On Sunday, November 13, 2016 7:43 AM, Erick Erickson <erickerick...@gmail.com> wrote:

Right, for that kind of use case you want complexPhraseQueryParser, see:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

Best,
Erick

On Sat, Nov 12, 2016 at 9:39 AM, Sandeep Khanzode <sandeep_khanz...@yahoo.com> wrote:
> Thanks, Erick.
>
> I am actually not trying to use the String field (prefer a TextField here).
> But, in my comparisons with TextField, it seems that something like phrase
> matching with whitespace and wildcard (like, 'my do*' or say, 'my dog*', or
> say, 'my dog has*') can only be accomplished with a string type field,
> especially because, with a WhitespaceTokenizer in TextField, the space will
> be lost, and all tokens will be individually considered. Am I missing
> something?
>
> SRK
>
> On Friday, November 11, 2016 10:05 PM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
> You have to query text and string fields differently, that's just the
> way it works. The problem is getting the query string through the
> parser as a _single_ token or as multiple tokens.
>
> Let's say you have a string field with the "a b" example.
> You have a single token
> a b that starts at offset 0.
>
> But with a text field, you have two tokens,
> a at position 0
> b at position 1
>
> But when the query parser sees "a b" (without quotes) it splits it
> into two tokens, and only the text field has both tokens so the string
> field won't match.
>
> OTOH, when the query parser sees "a\ b" it passes this through as a
> single token, which only matches the string field as there's no
> _single_ token "a b" in the text field.
>
> But a more interesting question is why you want to search this way.
> String fields are intended for keywords, machine-generated IDs and the
> like. They're pretty useless for searching anything except
> 1> exact tokens
> 2> prefixes
>
> While if you have "my dog has fleas" in a string field, you _can_
> search "*dog*" and get a hit but the performance is poor when you get
> a large corpus. Performance for "my*" will be pretty good though.
>
> All in all, this sounds like an XY problem. What's the use-case you're
> trying to solve?
>
> Best,
> Erick
>
> On Thu, Nov 10, 2016 at 10:11 PM, Sandeep Khanzode
> <sandeep_khanz...@yahoo.com.invalid> wrote:
>> Hi Erick, Reth,
>>
>> The 'a\ b*' as well as the q.op=AND approach worked (successfully) only
>> for StrField for me.
>>
>> Any attempt at creating a 'a\ b*' for a TextField does not match any
>> documents. The parsedQuery in debug mode does show 'field:a b*'. I am sure
>> there are documents that should match.
>> Another (maybe unrelated) observation is if I have 'field:a\ b', then the
>> parsedQuery is field:a field:b. Which does not match as expected (matches
>> individually).
>>
>> Can you please provide an example that I can use in Solr Query dashboard?
>> That will be helpful.
>>
>> I have also seen that wildcard queries work irrespective of field type
>> i.e. StrField as well as TextField. That makes sense because a
>> WhitespaceTokenizer only creates word boundaries when we do not use an
>> EdgeNGramFilter.
>> If I am not wrong, that is.
>> SRK
>>
>> On Friday, November 11, 2016 5:00 AM, Erick Erickson
>> <erickerick...@gmail.com> wrote:
>>
>> You can escape the space with a backslash as 'a\ b*'
>>
>> Best,
>> Erick
>>
>> On Thu, Nov 10, 2016 at 2:37 PM, Reth RM <reth.ik...@gmail.com> wrote:
>>> I don't think you can do wildcard on StrField. For text field, if your
>>> query is "category:(test m*)" the parsed query will be "category:test
>>> OR category:m*"
>>> You can add q.op=AND to make an AND between those terms.
>>>
>>> For phrase type wild card query support, as per docs, it
>>> is ComplexPhraseQueryParser that supports it. (I haven't tested it
>>> myself)
>>>
>>> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser
>>>
>>> On Thu, Nov 10, 2016 at 11:40 AM, Sandeep Khanzode <
>>> sandeep_khanz...@yahoo.com.invalid> wrote:
>>>
>>>> Hi,
>>>> How does a search like abc* work in StrField. Since the entire thing is
>>>> stored as a single token, is it a type of a trie structure that allows
>>>> such wildcard matching?
>>>> How can searches with space like 'a b*' be executed for text fields
>>>> (tokenized on whitespace)? If we specify this type of query, it is
>>>> broken down into two queries with field:a and field:b*. I would like
>>>> them to be contiguous, sort of, like a phrase search with wild card.
>>>> SRK
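
For readers following the thread, the string-versus-text behaviour Erick describes can be sketched with a toy Python model. This is not Solr or Lucene code: the function names (index_string_field, index_text_field, parse_query, matches) are invented for illustration, the text field is assumed to be tokenized on whitespace only, all terms are required (roughly q.op=AND), and token positions are recorded but ignored when matching, which is why the unescaped query cannot enforce contiguity.

```python
def index_string_field(value):
    """String-type field (assumed model): the whole value is one token."""
    return [(0, value)]


def index_text_field(value):
    """Text field (assumed whitespace tokenizer): one token per word, with positions."""
    return list(enumerate(value.split()))


def parse_query(query):
    """Split on unescaped spaces; a backslash-escaped space stays inside the term."""
    terms, current, i = [], "", 0
    while i < len(query):
        ch = query[i]
        if ch == "\\" and i + 1 < len(query):
            current += query[i + 1]  # keep the escaped character literally
            i += 2
        elif ch == " ":
            if current:
                terms.append(current)
            current = ""
            i += 1
        else:
            current += ch
            i += 1
    if current:
        terms.append(current)
    return terms


def term_matches(term, token):
    """A trailing '*' is treated as a prefix wildcard; otherwise exact match."""
    if term.endswith("*"):
        return token.startswith(term[:-1])
    return term == token


def matches(index, query):
    """Every parsed term must match some indexed token (positions are ignored)."""
    tokens = [tok for _, tok in index]
    return all(any(term_matches(t, tok) for tok in tokens)
               for t in parse_query(query))


if __name__ == "__main__":
    string_field = index_string_field("a bc")
    text_field = index_text_field("a bc")
    for q in ["a b*", r"a\ b*"]:
        print(q, parse_query(q),
              "string:", matches(string_field, q),
              "text:", matches(text_field, q))
```

In this model the unescaped query 'a b*' becomes two terms and only the text field matches, while the escaped 'a\ b*' stays one term and only the string field matches, which lines up with what was reported above for StrField and TextField. Anything involving phrase order or real analyzers is outside what this sketch covers.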