Ah, SOW may be a red herring then.

The fact that file_type is defined as Keyword is the key.
KeywordTokenizerFactory is expressly about _not_ breaking up the input
in any way.
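
To make that concrete, here's a toy model in Python of what I think is going on. It is emphatically _not_ Solr's parser code; the function names are made up for illustration, and the only thing it tries to capture is "split on whitespace before analysis or not" combined with "tokenizer splits or not".

```python
def keyword_tokenize(text):
    # Stand-in for KeywordTokenizerFactory: the whole input is one token.
    return [text]


def whitespace_tokenize(text):
    # Stand-in for WhitespaceTokenizerFactory: one token per whitespace-separated chunk.
    return text.split()


def parse_clause(field, text, tokenize, sow):
    # With sow=True the (hypothetical) parser splits on whitespace first and
    # analyzes each chunk on its own; with sow=False the field's tokenizer
    # sees the whole string at once.
    chunks = text.split() if sow else [text]
    terms = []
    for chunk in chunks:
        terms.extend(tokenize(chunk))
    return [f"{field}:{term}" for term in terms]


# Keyword field, sow=false: a single term containing a space.
print(parse_clause("file_type", "jpg jpeg", keyword_tokenize, sow=False))
# Whitespace-tokenized field, sow=false: two separate terms.
print(parse_clause("file_type", "jpg jpeg", whitespace_tokenize, sow=False))
```

The first call yields one term, "file_type:jpg jpeg", which lines up with the parsed_filter_queries output you posted.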

You're right, you could reformulate your query. I think, though, that
you'd be better off changing the tokenizer to something like
WhitespaceTokenizerFactory (and re-indexing, of course).
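
If you do go the reformulation route, a small client-side helper along these lines would keep the explicit ORs in one place. This is only a sketch: or_filter is a made-up name, and it assumes the values need no query-syntax escaping.

```python
from urllib.parse import quote


def or_filter(field, values):
    # Join the values with an explicit OR so the parser sees separate
    # clauses regardless of the sow setting or the field's tokenizer.
    return f"{field}:({' OR '.join(values)})"


fq = or_filter("file_type", ["jpg", "jpeg"])
# Percent-encode spaces for the URL; ':' and parentheses are left readable.
encoded_fq = quote(fq, safe=":()")
print(fq)
print(encoded_fq)
```

That produces the same shape as the fq=file_type:(jpg%20OR%20jpeg) form you reported as working.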

Given the way you're using the field, that seems to express your
intent more "naturally". If you reformulate your query you'll
have to remember why forever, whereas if you change the field type
your future self (and whoever has to maintain it next) won't have to
remember this tricky bit.

Up to you of course,
Erick

On Wed, Dec 27, 2017 at 5:46 PM, Nawab Zada Asad Iqbal <khi...@gmail.com> wrote:
> Thanks Erick for pushing me into the right direction.
>
> So sow=false, but I think that is the default behavior, so I didn't
> expect it to cause any strange outcome. However, the reason folder_id is
> being treated differently from the others is the schema definition:
> folder_id is a long, while file_type is defined as Keyword (the keyword
> tokenizer doesn't seem to split on spaces).
> So I guess my solution is to:
> 1) either sow=true at query time or
> 2) write the query differently for file_type OR
> 3) modify the definition of file_type field in the schema.
>
> I guess (2) is a safe option.
>
> Thanks
> Nawab
>
> On Wed, Dec 27, 2017 at 5:17 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> OK, that's definitely weird. A separate fq clause like
>> fq={!q.op=OR}file_type:(jpg jpeg)
>>
>> should _not_ parse in to:
>> file_type:jpg jpeg
>>
>> Hmmm, any possibility that Split On Whitespace (SOW) is somehow being
>> set to false? Why in the world it would only show up like this is a
>> mystery, just askin'.
>>
>> It's probably worth building it up a bit and playing around with
>> reordering the fq clauses just to see if it's some weird interaction
>> there. That's not a cure, but data to add to an (eventual I'd expect)
>> JIRA.
>>
>> For instance, if you move the folder_id part after the file_type, is
>> it different? If you remove that bit altogether, does the problem
>> still persist? What about the user_id part of the middle clause? Does
>> removing that make a difference?
>>
>> If you do raise a JIRA, you need to include:
>> 1> the raw query. Please don't edit at all (security policies allowing).
>> 2> the debug=query output (full)
>> 3> your request handler from solrconfig.xml
>> 4> your field definitions and associated types.
>>
>> Because it worked for me just fine in the simple case, so some
>> non-obvious combination of things is causing this. Since neither of us
>> knows _what_, include everything ;)
>>
>> Best,
>> Erick
>>
>> On Wed, Dec 27, 2017 at 4:45 PM, Nawab Zada Asad Iqbal <khi...@gmail.com>
>> wrote:
>> > Thanks Erick. Yes, some similar queries are also working for me.
>> >
>> > "file_type:(jpg%20OR%20jpeg)" and "{!q.op=OR}file_type:(jpg OR jpeg)" are
>> > translated into  the following which is correct.
>> >
>> >    - "file_type:jpg file_type:jpeg"
>> >
>> > While "{!q.op=OR}file_type:(jpg jpeg)" is translated into
>> > "file_type:jpg jpeg"
>> >
>> >
>> > Here is the complete list of my filter queries. You can see that the
>> > second query is translated very differently from the third, though I am
>> > not sure whether the second query is parsed correctly either.
>> >
>> >
>> >
>> >    - filter_queries: [
>> >       - "id:file_258470818866",
>> >       - "{!q.op=OR}folder_id:(23329074268 12033480380 36928119693
>> >       25894325891 25982100517 25895234569 25894295930 39367823449
>> >       40634891514 41056556633 42045264481 41307354636 14370419636
>> >       14370432839 24723808252 24723839431) user_id:(642129292)",
>> >       - "{!q.op=OR}file_type:(jpg jpeg)"
>> >       ],
>> >    - parsed_filter_queries: [
>> >       - "id:file_258470818866",
>> >       - "(IndexOrDocValuesQuery(folder_id:[23329074268 TO 23329074268])
>> >       IndexOrDocValuesQuery(folder_id:[12033480380 TO 12033480380])
>> >       IndexOrDocValuesQuery(folder_id:[36928119693 TO 36928119693])
>> >       IndexOrDocValuesQuery(folder_id:[25894325891 TO 25894325891])
>> >       IndexOrDocValuesQuery(folder_id:[25982100517 TO 25982100517])
>> >       IndexOrDocValuesQuery(folder_id:[25895234569 TO 25895234569])
>> >       IndexOrDocValuesQuery(folder_id:[25894295930 TO 25894295930])
>> >       IndexOrDocValuesQuery(folder_id:[39367823449 TO 39367823449])
>> >       IndexOrDocValuesQuery(folder_id:[40634891514 TO 40634891514])
>> >       IndexOrDocValuesQuery(folder_id:[41056556633 TO 41056556633])
>> >       IndexOrDocValuesQuery(folder_id:[42045264481 TO 42045264481])
>> >       IndexOrDocValuesQuery(folder_id:[41307354636 TO 41307354636])
>> >       IndexOrDocValuesQuery(folder_id:[14370419636 TO 14370419636])
>> >       IndexOrDocValuesQuery(folder_id:[14370432839 TO 14370432839])
>> >       IndexOrDocValuesQuery(folder_id:[24723808252 TO 24723808252])
>> >       IndexOrDocValuesQuery(folder_id:[24723839431 TO 24723839431]))
>> >       IndexOrDocValuesQuery(user_id:[642129292 TO 642129292])",
>> >       - "file_type:jpg jpeg"
>> >       ]
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Dec 27, 2017 at 4:27 PM, Erick Erickson <erickerick...@gmail.com>
>> > wrote:
>> >
>> >> 1> similar queries work for me just fine with the techproducts example
>> >> 2> that's not what I wanted, you just reiterated the _input_.
>> >> I asked for the results when adding &debug=query to the string so you
>> >> can see the parsed query.
>> >> You should see something similar to:
>> >>
>> >> "parsed_filter_queries":["file_type:jpg file_type:jpeg"]}
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Wed, Dec 27, 2017 at 3:59 PM, Nawab Zada Asad Iqbal <khi...@gmail.com>
>> >> wrote:
>> >> > 1. input: fq={!q.op=OR}file_type:(jpg%20jpeg)  (fails, no results)
>> >> >
>> >> >    - fq: [
>> >> >       - "id:file_258470818866",
>> >> >       - "{!q.op=OR}file_type:(jpg jpeg)"
>> >> >       ],
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > 2. input: fq={!q.op=OR}file_type:(jpg%20OR%20jpeg) (This works)
>> >> >
>> >> >
>> >> >    - fq: [
>> >> >       - "id:file_258470818866",
>> >> >       - "{!q.op=OR}file_type:(jpg OR jpeg)"
>> >> >       ],
>> >> >
>> >> >
>> >> > 3. input: &fq=file_type:(jpg%20OR%20jpeg) (This also works)
>> >> >
>> >> >
>> >> >    - fq: [
>> >> >       - "id:file_258470818866",
>> >> >       - "file_type:(jpg OR jpeg)"
>> >> >       ],
>> >> >
>> >> >
>> >> >
>> >> > PS: I am using 7.0.0 (including almost all the updates from 7.0.1).
>> >> >
>> >> > Regards
>> >> > Nawab
>> >> > On Wed, Dec 27, 2017 at 3:54 PM, Erick Erickson <erickerick...@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> What does adding &debug=query show in the two cases?
>> >> >>
>> >> >> Best,
>> >> >> Erick
>> >> >>
>> >> >> On Wed, Dec 27, 2017 at 3:40 PM, Nawab Zada Asad Iqbal <khi...@gmail.com>
>> >> >> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > Are the following two queries equivalent?
>> >> >> >
>> >> >> > In my understanding, I can either specify the operator once in the
>> >> >> > {} local parameter syntax (example 1) or interleave OR between the
>> >> >> > different clauses (example 2). But I am getting my result in the
>> >> >> > second case only. What am I doing wrong?
>> >> >> >
>> >> >> > This was working fine in Solr 4 but not in Solr 7.
>> >> >> >
>> >> >> >
>> >> >> > 1:
>> >> >> > .../solr/filesearch/select?fq=id:258470818866&fq={!q.op=OR}file_type:(jpg%20jpeg)
>> >> >> > --> Returns nothing.
>> >> >> >
>> >> >> >
>> >> >> > 2:
>> >> >> > .../solr/filesearch/select?fq=id:258470818866&fq={!q.op=OR}file_type:(jpg%20OR%20jpeg)
>> >> >> > --> This returns the required document.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Thanks
>> >> >> > Nawab
>> >> >>
>> >>
>>
