It looks like the debug result you are showing me is for Rod's, not
Rod’s. But in answer to your question, this is why I think "Rod’s finds
fields Rod's and Rod’s that are now in the index as rod's":

The analysis page shows that Rod’s gets stored in the index as:

rod's rods rod s

Field Value (Index): Rod’s
Analyse Fieldname / FieldType: _text_
Schema Browser: <https://centos1:8985/solr/#/rat_11/schema?field=_text_>

Verbose Output (the WT and SF stages report no keyword column):

Stage  text   raw_bytes                 start  end  positionLength  type  termFrequency  position  keyword
-----  -----  ------------------------  -----  ---  --------------  ----  -------------  --------  -------
WT     Rod’s  [52 6f 64 e2 80 99 73]    0      5    1               word  1              1

SF     Rod’s  [52 6f 64 e2 80 99 73]    0      5    1               word  1              1

WDGF   Rod’s  [52 6f 64 e2 80 99 73]    0      5    2               word  1              1         false
       Rods   [52 6f 64 73]             0      5    2               word  1              1         false
       Rod    [52 6f 64]                0      3    1               word  1              1         false
       s      [73]                      4      5    1               word  1              2         false

FGF    Rod’s  [52 6f 64 e2 80 99 73]    0      5    2               word  1              1         false
       Rods   [52 6f 64 73]             0      5    2               word  1              1         false
       Rod    [52 6f 64]                0      3    1               word  1              1         false
       s      [73]                      4      5    1               word  1              2         false

PRF    Rod’s  [52 6f 64 e2 80 99 73]    0      5    2               word  1              1         false
       Rods   [52 6f 64 73]             0      5    2               word  1              1         false
       Rod    [52 6f 64]                0      3    1               word  1              1         false
       s      [73]                      4      5    1               word  1              2         false

PRF    Rod's  [52 6f 64 27 73]          0      5    2               word  1              1         false
       Rods   [52 6f 64 73]             0      5    2               word  1              1         false
       Rod    [52 6f 64]                0      3    1               word  1              1         false
       s      [73]                      4      5    1               word  1              2         false

PRF    Rod's  [52 6f 64 27 73]          0      5    2               word  1              1         false
       Rods   [52 6f 64 73]             0      5    2               word  1              1         false
       Rod    [52 6f 64]                0      3    1               word  1              1         false
       s      [73]                      4      5    1               word  1              2         false

PRF    Rod's  [52 6f 64 27 73]          0      5    2               word  1              1         false
       Rods   [52 6f 64 73]             0      5    2               word  1              1         false
       Rod    [52 6f 64]                0      3    1               word  1              1         false
       s      [73]                      4      5    1               word  1              2         false

LCF    rod's  [72 6f 64 27 73]          0      5    2               word  1              1         false
       rods   [72 6f 64 73]             0      5    2               word  1              1         false
       rod    [72 6f 64]                0      3    1               word  1              1         false
       s      [73]                      4      5    1               word  1              2         false

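
The raw_bytes values in the analysis output are what distinguish the two apostrophes: the typographic one (U+2019) is three bytes in UTF-8, the plain one (U+0027) is a single byte. A minimal Python sketch that reproduces those byte strings; the helper name `utf8_hex` is made up for illustration and is not part of Solr:

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated hex,
    in the same style as the raw_bytes column of the analysis page."""
    return text.encode("utf-8").hex(" ")  # bytes.hex(sep) needs Python 3.8+


curly = "Rod\u2019s"  # U+2019 RIGHT SINGLE QUOTATION MARK
straight = "Rod's"    # U+0027 APOSTROPHE

print(utf8_hex(curly))     # 52 6f 64 e2 80 99 73
print(utf8_hex(straight))  # 52 6f 64 27 73
```

The `e2 80 99` sequence disappears at the second PRF stage, which is where the ’ to ' replacement takes effect.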
This is what we were trying to achieve with the filter
<filter class="solr.PatternReplaceFilterFactory" pattern="’" replacement="'"/>

The problem is that when we use the wildcard query *Rod’s* we get no hits.

"responseHeader":{
  "status":0, "QTime":2,
  "params":{ "q":"*Rod’s*", "debugQuery":"on", "_":"1582315262594"}},
"response":{"numFound":0,"start":0,"docs":[] },
"debug":{
  "rawquerystring":"*Rod’s*",
  "querystring":"*Rod’s*",
  "parsedquery":"_text_:*rod’s*",
  "parsedquery_toString":"_text_:*rod’s*",
  "explain":{},
  "QParser":"LuceneQParser",
  ...

On 2/21/2020 11:52 AM, Erick Erickson wrote:
Why do you say “…that are now in the index as rod’s”? You have
WordDelimiterGraphFilterFactory, which breaks things up. When I put your field
definition in the schema and use the analysis page, it turns “rod’s” into the
following 4 tokens:
rod’s
rods
rod
s
And querying on field:"*Rod’s*" works just fine. I’m using 8.x, and when I add
"&debug=query" to the URL, I see:
{
"responseHeader": {
"status": 0, "QTime": 10, "params": {
"q": "eoe:\"*Rod's*\"", "debug": "query"
}
}, "response": {
"numFound": 1, "start": 0, "docs": [
{
"id": "1", "eoe": "Rod's", "_version_": 1659176849231577088
}
]
}, "debug": {
"rawquerystring": "eoe:\"*Rod's*\"",
"querystring": "eoe:\"*Rod's*\"",
"parsedquery": "SynonymQuery(Synonym(eoe:*rod's* eoe:rod))",
"parsedquery_toString": "Synonym(eoe:*rod's* eoe:rod)",
"QParser": "LuceneQParser"
}
}
What do you see?
Best,
Erick
On Feb 21, 2020, at 12:57 PM, Mike Phillips <m.phill...@prosperodigital.com>
wrote:
Rod’s finds fields Rod's and Rod’s that are now in the index as rod's
but *Rod’s* finds nothing because the index now only contains rod's
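
The mismatch described here can be sketched in a few lines of Python. This is only a simulation under stated assumptions, not Solr's actual code path: the indexed terms are copied from the LCF row of the analysis output, and the wildcard term is assumed to be lowercased but not passed through the PatternReplaceFilterFactory, which is what the reported parsedquery `_text_:*rod’s*` suggests. The names `INDEXED_TERMS`, `wildcard_term` and `matching_terms` are hypothetical.

```python
from fnmatch import fnmatchcase

# Terms that end up in the index for "Rod’s" (LCF output on the analysis page).
INDEXED_TERMS = ["rod's", "rods", "rod", "s"]


def wildcard_term(raw: str, normalize_apostrophe: bool) -> str:
    """Simulate what happens to a wildcard term before it is matched.

    Assumption: only lowercasing is applied to wildcard terms. Passing
    normalize_apostrophe=True additionally mimics the index-time
    replacement of U+2019 with a plain apostrophe.
    """
    term = raw.lower()
    if normalize_apostrophe:
        term = term.replace("\u2019", "'")
    return term


def matching_terms(pattern: str) -> list:
    """Return the indexed terms matched by a glob-style pattern."""
    return [t for t in INDEXED_TERMS if fnmatchcase(t, pattern)]


# Lowercased only: the curly apostrophe never matches an indexed term.
print(matching_terms(wildcard_term("*Rod\u2019s*", normalize_apostrophe=False)))  # []

# With the apostrophe normalized first, the term rod's is found.
print(matching_terms(wildcard_term("*Rod\u2019s*", normalize_apostrophe=True)))   # ["rod's"]
```

If that assumption holds, one possible workaround is to replace ’ with ' in the query string on the client before sending a wildcard query.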