Sorry that my brain has turned to mush... the issue you are hitting is due
to a known, undocumented limit in the whitespace tokenizer:

https://issues.apache.org/jira/browse/LUCENE-5785
"White space tokenizer has undocumented limit of 256 characters per token"

If you look at the parsed query you will see that two query terms were
generated. This is because the whitespace tokenizer simply splits long
tokens every 256 characters. So your length filter never sees a term
longer than 256 characters, and max="300" can never take effect.

There is a note on the Jira (evidently by me!) that you can use the pattern
tokenizer as a workaround. But... if your term is a string anyway, you
could just use the keyword tokenizer.
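
For what it's worth, here is a minimal sketch of the keyword tokenizer
variant. It assumes each portal_package value is a single opaque string
with no whitespace that needs splitting; the fieldType name
"text_keyword" is made up for illustration, and the length filter
settings are copied from your config. Untested against your schema.

```xml
<!-- Hypothetical fieldType name; the keyword tokenizer emits the whole
     field value as one token, so the length filter sees the full term. -->
<fieldType name="text_keyword" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Alternative from the Jira note, if values must still be split
         on whitespace:
         <tokenizer class="solr.PatternTokenizerFactory" pattern="\s+"/> -->
    <filter class="solr.LengthFilterFactory" min="1" max="300"/>
  </analyzer>
</fieldType>
```

With no type attribute on the analyzer, the same chain runs at both
index and query time, as Shawn suggested. You would need to point the
portal_package field at the new type and reindex.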


-- Jack Krupansky

On Fri, May 15, 2015 at 4:06 PM, Charles Sanders <csand...@redhat.com>
wrote:

> Shawn,
> Thanks a bunch for working with me on this.
>
> I have deleted all records from my index, stopped Solr, made the schema
> changes as requested, and started Solr. Then I inserted the one test
> record and searched. I still see the same results. No, portal_package is
> not the unique key; uri is, and it is a string field.
>
> <field name="portal_package" type="text_std" indexed="true" stored="true"
> multiValued="true"/>
>
> <fieldType name="text_std" class="solr.TextField"
> positionIncrementGap="100">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.LengthFilterFactory" min="1" max="300" />
> </fieldType>
>
> {
> "documentKind": "test",
> "uri": "test300",
> "id": "test300",
> "portal_package":"12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
> }
>
>
> {
> "responseHeader": {
> "status": 0,
> "QTime": 47,
> "params": {
> "spellcheck": "true",
> "enableElevation": "false",
> "df": "allText",
> "echoParams": "all",
> "spellcheck.maxCollations": "5",
> "spellcheck.dictionary": "andreasAutoComplete",
> "spellcheck.count": "5",
> "spellcheck.collate": "true",
> "spellcheck.onlyMorePopular": "true",
> "rows": "10",
> "indent": "true",
> "q":
> "portal_package:12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890",
> "_": "1431719989047",
> "debug": "query",
> "wt": "json"
> }
> },
> "response": {
> "numFound": 1,
> "start": 0,
> "docs": [
> {
> "documentKind": "test",
> "uri": "test300",
> "id": "test300",
> "portal_package": [
>
> "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
> ],
> "_version_": 1501267024421060600,
> "timestamp": "2015-05-15T19:56:43.247Z",
> "language": "en"
> }
> ]
> },
> "debug": {
> "rawquerystring":
> "portal_package:12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890",
> "querystring":
> "portal_package:12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890",
> "parsedquery":
> "portal_package:1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
> portal_package:7890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890",
> "parsedquery_toString":
> "portal_package:1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
> portal_package:7890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890",
> "QParser": "LuceneQParser"
> }
> }
>
>
>
>
>
> ----- Original Message -----
>
> From: "Shawn Heisey" <apa...@elyograg.org>
> To: solr-user@lucene.apache.org
> Sent: Friday, May 15, 2015 3:29:19 PM
> Subject: Re: Problem with solr.LengthFilterFactory
>
> On 5/15/2015 1:23 PM, Shawn Heisey wrote:
> > Then I looked back at your fieldType definition and noticed that you
> > are only defining an index analyzer. Remove the 'type="index"' part of
> > the analyzer config so it happens at both index and query time,
> > reindex, then try again.
>
> The reindex may be very important here. I would actually completely
> delete your data directory and restart Solr before reindexing, to be
> sure you don't have old records from any previous reindexes.
>
> http://wiki.apache.org/solr/HowToReindex
>
> I think this next part is unlikely, but I'm going to ask it anyway: Is
> the portal_package field your schema uniqueKey? If it is, that might be
> an additional source of problems. Using a solr.TextField for a
> uniqueKey field causes Solr to behave in unexpected ways.
>
> Thanks,
> Shawn
>
>
>
