fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
Hello,

I've attempted to figure this out from reading the documentation but
without much luck.  I looked for a comprehensive query syntax specification
(e.g., with BNF and a list of operator semantics) but I'm unable to find
such a document (does such a thing exist? or is the syntax too much of a
moving target?)

I'm using 4.6.1, if that makes a difference, though upgrading is an option
if it is necessary to make this work.

I've got a multiValued field color, which describes the colors of items in
the database.  Items can have zero or more colors.  What I want is to be
able to filter out all hits that contain colors not within a constraining
list, i.e., something like

NOT (color NOT IN (red,yellow,green)).

So the following would be passed by the filter:
(no value for 'color')
color: red
color: red, color: green

whereas these would be excluded:
color: red, color: blue
color: magenta


Nothing I've come up with so far, e.g. -(-color: red -color: green),
seems to work.

I've also looked into using a function query but it seems to lack operators
for dealing with string multivalued fields.

Ideas?

Thanks,
Bill


Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
Not just that.  I'm looking for things which match either red or yellow or
green, but do NOT match ANY other color.  I can probably drop the
requirement related to having no color.


On Sat, Sep 27, 2014 at 3:28 PM, Yonik Seeley yo...@heliosearch.com wrote:

 You're looking for things that either match red, yellow, or green, or
 have no color:

 color:(red yellow green) OR (*:* -color:*)

 -Yonik
 http://heliosearch.org - native code faceting, facet functions,
 sub-facets, off-heap data



Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
Sorry, color is multivalued, so a given record might be both blue and red.
I don't want those to show up in the results.


Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
Hmm, that won't work since color is free-form.

Is there a way to invoke (via fq) a user-defined function (hopefully
defined as part of the fq syntax, but alternatively, written in Java) and
have it applied to the resultset?

On Sat, Sep 27, 2014 at 3:41 PM, Yonik Seeley yo...@heliosearch.com wrote:

 I think the only way currently (out of the box) is to enumerate the
 other possible colors to exclude them.

 color:(red yellow green)  -color:(blue cyan xxx)

 -Yonik



Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
OK, let me try phrasing it better.

How do I exclude from search, any result which contains any value for
multivalued field 'color' which is not within a given constraint set
(e.g., red, green, yellow, burnt sienna), given that I do not know what
any of the other possible values of 'color' are?

In pseudocode:

for all x in result.color
    if x not in (red, green, yellow, burnt sienna)
        filter out result

I don't see how range queries would work, since I have no control over the
possible values of 'color'; e.g., there could be a valid color 'lemon
yellow' between 'green' and 'red', and I don't want a result which has
(color: red, color: lemon yellow).
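
The requirement in the pseudocode above amounts to a subset test: a result passes only if its set of color values is contained in the constraint set. A minimal Python sketch of that predicate follows. The function name passes_filter and the sample documents are illustrative only and are not part of Solr.

```python
def passes_filter(colors, allowed):
    """Return True when every value of the multivalued field is in `allowed`."""
    return set(colors) <= set(allowed)


# Constraint set taken from the message above.
ALLOWED = {"red", "green", "yellow", "burnt sienna"}

# Hypothetical documents, each represented by its list of color values.
examples = [
    [],                        # no value for 'color'
    ["red"],
    ["red", "green"],
    ["red", "blue"],           # contains a value outside the set
    ["magenta"],
    ["red", "lemon yellow"],
]

results = [passes_filter(colors, ALLOWED) for colors in examples]
for colors, ok in zip(examples, results):
    print(colors, "passes" if ok else "filtered out")
```

Note that a document with no color at all passes, since the empty set is a subset of any set.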

On Sat, Sep 27, 2014 at 4:02 PM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 On Sat, Sep 27, 2014 at 11:36 PM, White, Bill bwh...@ptfs.com wrote:

  but do NOT match ANY other color.


 Bill, I'm missing the whole picture; it would be worth rephrasing the problem
 in one sentence.
 But regarding the quote above, you can try using exclusive ranges:

 https://lucene.apache.org/core/4_6_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Range_Searches
 fq=-color:({* TO green} {green TO red} {red TO *})
 just don't forget to build ranges alphabetically

 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
 mkhlud...@griddynamics.com



Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
Thanks!

On Sat, Sep 27, 2014 at 4:18 PM, Yonik Seeley yo...@heliosearch.com wrote:

 https://wiki.apache.org/solr/SolrPlugins#QParserPlugin

 -Yonik



Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
Hmm.  If I understand correctly, this builds a set out of open intervals
(exclusive ranges).  That's a great idea!

It doesn't seem to work for me, though;  fq=-color:({* TO red} {red TO *})
is giving me results with color=burnt sienna

The field is defined as
<field name="color" type="string" indexed="true" stored="true" multiValued="true" />

On Sat, Sep 27, 2014 at 4:43 PM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 Indeed!
 The exclusive range {green TO red} matches "lemon yellow";
 hence, the negation suppresses it from appearing:
 fq=-color:{green TO red}
 Then you need to suppress, e.g., black and white as well:
 fq=-color:({* TO green} {green TO red} {red TO *})

  I have no control over the
  possible values of 'color',

 You don't need to control the possible values; you are just suppressing any
 value other than the given green and red.
 Note that both green and red pass the negation of the disjunction of
 exclusive ranges, because an exclusive range does not match its own endpoints.
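
One way to convince yourself that the negated exclusive ranges behave like the subset test is to simulate them outside Solr. The sketch below is plain Python, not Solr code. It assumes that terms of a string field compare by code point, as Python strings do, and that the comparison is case-sensitive. All function names are hypothetical.

```python
def in_open_range(value, low, high):
    """Exclusive range {low TO high}; None stands in for the '*' bound."""
    above_low = low is None or value > low
    below_high = high is None or value < high
    return above_low and below_high


def open_ranges(allowed):
    """Build {* TO a}, {a TO b}, ..., {z TO *} from the sorted allowed values."""
    bounds = [None] + sorted(set(allowed)) + [None]
    return list(zip(bounds, bounds[1:]))


def passes_negated_ranges(colors, allowed):
    """Simulate -color:(range1 OR range2 ...): reject if ANY value hits ANY range."""
    ranges = open_ranges(allowed)
    return not any(in_open_range(v, lo, hi) for v in colors for lo, hi in ranges)


allowed = ["green", "red"]
print(passes_negated_ranges(["red", "green"], allowed))         # endpoints are never inside an open range
print(passes_negated_ranges(["red", "lemon yellow"], allowed))  # 'lemon yellow' falls in {green TO red}
print(passes_negated_ranges(["black"], allowed))                # 'black' falls in {* TO green}
print(passes_negated_ranges([], allowed))                       # no values, nothing to reject
```

Because every possible string other than an endpoint falls into exactly one of the open ranges, rejecting any hit leaves only documents whose values are all endpoints, i.e., all in the allowed list.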




Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
It worked for me once I changed to

-color:({* TO red} OR {red TO *})

I'm not sure why the OR is needed; maybe it's my version (4.6.1)?

On Sat, Sep 27, 2014 at 5:22 PM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 Hm. Try converting it to a main query, q=-color:({* TO red} {red TO *}), and
 check the explanation from debugQuery=true.

 I tried it on my own index:

   "q": "*:*",
   "facet.field": "swatchColors_string_mv",
   "fq": "-swatchColors_string_mv:({* TO RED} {RED TO *})",

 I got the following facets:

   "facet_fields": {
     "swatchColors_string_mv": [
       "RED",
       122,
       "BLACK",
       0,
       "BLUE",
       0,
       "BROWN",
       0,
       "GREEN",
       0,

 So it works for me, at least...




Re: fq syntax for requiring all multiValued field values to be within a list?

2014-09-27 Thread White, Bill
Yes, that was it, thank you!

On Sat, Sep 27, 2014 at 5:28 PM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 http://wiki.apache.org/solr/SchemaXml#Default_query_parser_operator ?
 Once again, debugQuery=true explains exactly what's going on with q.
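
Since the bare-space form depends on the default query operator configured for the parser (Mikhail's link above), writing the OR explicitly, as Bill did, avoids that dependency. A hypothetical helper that builds such an fq string from an arbitrary allowed list could look like the sketch below. The helper names are made up, and quoting range endpoints that contain spaces is an assumption about the query parser that is worth verifying with debugQuery=true on your version.

```python
def quote_endpoint(term):
    """Quote a range endpoint unless it is purely alphanumeric (assumed safe)."""
    if term.isalnum():
        return term
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_exclusion_fq(field, allowed):
    """Build -field:({* TO a} OR {a TO b} OR ... OR {z TO *}) for the allowed values."""
    values = sorted(set(allowed))  # ranges must be built in sorted order
    if not values:
        raise ValueError("allowed list must not be empty")
    bounds = ["*"] + [quote_endpoint(v) for v in values] + ["*"]
    ranges = ["{%s TO %s}" % (lo, hi) for lo, hi in zip(bounds, bounds[1:])]
    return "-%s:(%s)" % (field, " OR ".join(ranges))


print(build_exclusion_fq("color", ["red"]))
print(build_exclusion_fq("color", ["red", "green", "burnt sienna"]))
```

For a single allowed value of red, this produces the same filter that worked in the thread, -color:({* TO red} OR {red TO *}).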
