RE: prefix query help

2016-12-08 Thread Kris Musshorn
I think this will work. I'll try it tomorrow and let you know.
Thanks for the help Eric and Shawn
Kris

-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com] 
Sent: Thursday, December 8, 2016 2:43 PM
To: solr-user@lucene.apache.org
Subject: Re: prefix query help

It’s hard to tell how _exact_ to be here, but if you’re indexing those strings 
and your queries are literally always YYYY-MM, then truncate the actual data 
into that format, or use analysis techniques, so that only the YYYY-MM piece 
of the incoming string is indexed.

But given what you’ve got so far, and using the prefix examples I provided 
below, your two queries would be:

   q={!prefix f=metatag.date v='2016-06'}

and

   q=({!prefix f=metatag.date v='2016-06'} OR {!prefix f=metatag.date v='2014-04'})

Does that work for you?

q=metatag.date:(2016-06* OR 2014-04*) really should work as you’ve got it, 
but you said that sort of thing wasn’t working (debug output would help suss 
that issue out).

If you indexed those strings more cleanly as YYYY-MM to accommodate the types 
of query you’ve shown, then you could do q=metatag.date:(2016-06 OR 2014-04), 
or q={!terms f=metatag.date}2016-06,2014-04

Erik







Re: prefix query help

2016-12-08 Thread Erik Hatcher
It’s hard to tell how _exact_ to be here, but if you’re indexing those strings 
and your queries are literally always YYYY-MM, then truncate the actual data 
into that format, or use analysis techniques, so that only the YYYY-MM piece 
of the incoming string is indexed.
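
As an illustration of the truncation idea, here is a minimal Python sketch that reduces an incoming value to its year-and-month part before indexing. The function name year_month and the regular expression are assumptions made for this example, not part of Solr or Nutch. Values that do not start with a four-digit year and a two-digit month come back as None, so they can be skipped or handled separately.

```python
import re

# Four digits, a hyphen, two digits, anchored at the start of the value.
YEAR_MONTH = re.compile(r"^(\d{4}-\d{2})")


def year_month(value):
    """Return the leading year-month part of value, or None if it has none."""
    match = YEAR_MONTH.match(value.strip())
    return match.group(1) if match else None


for raw in ["2016-06-15T14:51:04Z", "2016-06-15", "NONE-NN-NN", "2016-07-06"]:
    print(raw, "->", year_month(raw))
```

With values truncated like this, an exact-match query on the month is enough and no prefix or wildcard is needed.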

But given what you’ve got so far, and using the prefix examples I provided 
below, your two queries would be:

   q={!prefix f=metatag.date v='2016-06'}

and

   q=({!prefix f=metatag.date v='2016-06'} OR {!prefix f=metatag.date v='2014-04'})

Does that work for you?

q=metatag.date:(2016-06* OR 2014-04*) really should work as you’ve got it, 
but you said that sort of thing wasn’t working (debug output would help suss 
that issue out).

If you indexed those strings more cleanly as YYYY-MM to accommodate the types 
of query you’ve shown, then you could do q=metatag.date:(2016-06 OR 2014-04), 
or q={!terms f=metatag.date}2016-06,2014-04

Erik
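
For a variable number of months, the OR'ed prefix form can be generated rather than typed by hand. The following is a minimal Python sketch, assuming the field name metatag.date from this thread. The helper name chained_prefix_query is made up for the example, and the prefixes are assumed to contain no quote characters.

```python
from urllib.parse import urlencode


def chained_prefix_query(field, prefixes):
    """Build one {!prefix} clause per prefix and OR them together."""
    clauses = ["{{!prefix f={} v='{}'}}".format(field, p) for p in prefixes]
    # The leading paren keeps the whole string under the lucene query parser.
    return "(" + " OR ".join(clauses) + ")"


q = chained_prefix_query("metatag.date", ["2016-06", "2014-04"])
print(q)
# URL-encode the value before putting it on a request line.
print(urlencode({"q": q}))
```

Note the plain ASCII single quotes around each value; typographic quotes pasted from an email client would not be read as quotes by the parser.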




> On Dec 8, 2016, at 11:34 AM, KRIS MUSSHORN <mussho...@comcast.net> wrote:
> 
> Yes, I did attach rather than paste, sorry. 
>   
> OK, here's an actual, truncated example of the metatag.date field contents in 
> Solr. 
> NONE-NN-NN is the default setting. 
>   
> doc 1
> "metatag.date": [
>   "2016-06-15T14:51:04Z",
>   "2016-06-15T14:51:04Z"
> ]
>
> doc 2
> "metatag.date": [
>   "2016-06-15"
> ]
>
> doc 3
> "metatag.date": [
>   "NONE-NN-NN"
> ]
>
> doc 4
> "metatag.date": [
>   "yyyy-mm-dd"
> ]
>
> doc 5
> "metatag.date": [
>   "2016-07-06"
> ]
>
> doc 6
> "metatag.date": [
>   "2014-04-15T14:51:06Z",
>   "2014-04-15T14:51:06Z"
> ]
>   
> q=2016-06 should return doc 2 and 1 
> q=2016-06 OR 2014-04 should return docs 1, 2 and 6 
>   
> Yes, I know it's wonky, but it's what I have to deal with until the content is 
> cleaned up. 
> I can't use a date type; that would make my life too easy. 
>   
> TIA again 
> Kris 
> 



Re: prefix query help

2016-12-08 Thread KRIS MUSSHORN
Yes, I did attach rather than paste, sorry. 
  
OK, here's an actual, truncated example of the metatag.date field contents in 
Solr. 
NONE-NN-NN is the default setting. 
  
doc 1
"metatag.date": [
  "2016-06-15T14:51:04Z",
  "2016-06-15T14:51:04Z"
]

doc 2
"metatag.date": [
  "2016-06-15"
]

doc 3
"metatag.date": [
  "NONE-NN-NN"
]

doc 4
"metatag.date": [
  "yyyy-mm-dd"
]

doc 5
"metatag.date": [
  "2016-07-06"
]

doc 6
"metatag.date": [
  "2014-04-15T14:51:06Z",
  "2014-04-15T14:51:06Z"
]
  
q=2016-06 should return doc 2 and 1 
q=2016-06 OR 2014-04 should return docs 1, 2 and 6 
  
Yes, I know it's wonky, but it's what I have to deal with until the content is 
cleaned up. 
I can't use a date type; that would make my life too easy. 
  
TIA again 
Kris 
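
The expected results can be checked outside Solr with a small Python sketch that applies plain string-prefix matching to the six sample documents. It only models the intended behaviour (roughly what a prefix query against an untokenized field would do) and does not talk to Solr. The helper name matching_docs is hypothetical, and the doc 4 value stands in for the unfilled template date.

```python
# Sample field values, keyed by document number.
docs = {
    1: ["2016-06-15T14:51:04Z", "2016-06-15T14:51:04Z"],
    2: ["2016-06-15"],
    3: ["NONE-NN-NN"],
    4: ["yyyy-mm-dd"],  # unfilled template value
    5: ["2016-07-06"],
    6: ["2014-04-15T14:51:06Z", "2014-04-15T14:51:06Z"],
}


def matching_docs(prefixes):
    """Return the ids of docs with any value starting with any prefix."""
    wanted = tuple(prefixes)
    return sorted(
        doc_id
        for doc_id, values in docs.items()
        if any(value.startswith(wanted) for value in values)
    )


print(matching_docs(["2016-06"]))             # [1, 2]
print(matching_docs(["2016-06", "2014-04"]))  # [1, 2, 6]
```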

- Original Message -----

From: "Erik Hatcher" <erik.hatc...@gmail.com> 
To: solr-user@lucene.apache.org 
Sent: Thursday, December 8, 2016 12:36:26 PM 
Subject: Re: prefix query help 

Kris - 

To chain multiple prefix queries together: 

    q=({!prefix f=field1 v='prefix1'} {!prefix f=field2 v='prefix2'}) 

The leading paren is needed to ensure it’s being parsed with the lucene qparser 
(be sure not to have defType set, or a variant would be needed) and that allows 
multiple {!…} expressions to be parsed.  The outside-the-curlys value for the 
prefix shouldn’t be attempted with multiples, so the `v` is the way to go, 
either inline or $referenced. 

If you do have defType set, say to edismax, then do something like this 
instead: 
    q={!lucene v=$prefixed_queries} 
    prefixed_queries={!prefix f=field1 v='prefix1'} {!prefix f=field2 v='prefix2'} 
       // I don’t think parens are needed with prefixed_queries, but maybe.   

debug=query (or debug=true) is your friend - see how things are parsed.  I 
presume that in your example that didn’t work, the dash didn’t behave as you 
expected?  Or… not sure.  What’s the parsed_query output in debug on that one? 

Erik 

p.s. did you really just send a Word doc to the list that could have been 
inlined in text?  :)   







Re: prefix query help

2016-12-08 Thread Shawn Heisey
On 12/8/2016 10:02 AM, KRIS MUSSHORN wrote:
>
> Here is how I have the field defined... see attachment.

You're using a tokenized field type.

For the kinds of queries you asked about here, you want to use StrField,
not TextField -- this type cannot have an analysis chain and indexes to
one token that is completely unchanged from the input.  Note that if you
also want to do other kinds of queries (like 2016), then StrField would
break those.  You do have to reindex after making this change.

Since your input data is consistently YYYY-MM-DD, you should consider
using the solr.DateRangeField class instead of a string or text type. 
This allows queries like "2016" or "2016-07" to work as you would expect
them to.

https://cwiki.apache.org/confluence/display/solr/Working+with+Dates#WorkingwithDates-DateRangeFormatting

Thanks,
Shawn
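
To illustrate the difference between a tokenized and an untokenized field, here is a rough Python sketch. The tokenize function is only a stand-in that splits on non-alphanumeric characters; it is an assumption for illustration and not the actual analysis chain of any Solr field type, which the admin analysis screen would show. It suggests one plausible way a date such as 2016-05-01 can end up matching pieces of a query for 2016-01.

```python
import re


def tokenize(value):
    """Rough stand-in for a tokenizer that splits on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9A-Za-z]+", value) if token]


value = "2016-05-01"
tokens = tokenize(value)
print(tokens)  # ['2016', '05', '01']

# Untokenized (string-like) field: the whole value is the single term,
# so a prefix test behaves like a plain starts-with check.
print(value.startswith("2016-01"))  # False

# Tokenized field: no single term starts with "2016-01", but if the query
# is also split apart, the pieces "2016" and "01" each match a term.
print(any(token.startswith("2016") for token in tokens))  # True
print(any(token.startswith("01") for token in tokens))    # True
```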



Re: prefix query help

2016-12-08 Thread Erik Hatcher
Kris -

To chain multiple prefix queries together:

    q=({!prefix f=field1 v='prefix1'} {!prefix f=field2 v='prefix2'})

The leading paren is needed to ensure it’s being parsed with the lucene qparser 
(be sure not to have defType set, or a variant would be needed) and that allows 
multiple {!…} expressions to be parsed.  The outside-the-curlys value for the 
prefix shouldn’t be attempted with multiples, so the `v` is the way to go, 
either inline or $referenced.

If you do have defType set, say to edismax, then do something like this instead:
    q={!lucene v=$prefixed_queries}
    prefixed_queries={!prefix f=field1 v='prefix1'} {!prefix f=field2 v='prefix2'}
       // I don’t think parens are needed with prefixed_queries, but maybe.

debug=query (or debug=true) is your friend - see how things are parsed.  I 
presume that in your example that didn’t work, the dash didn’t behave as you 
expected?  Or… not sure.  What’s the parsed_query output in debug on that one?

Erik

p.s. did you really just send a Word doc to the list that could have been 
inlined in text?  :)  
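
A sketch of the request parameters for the defType=edismax case, written in Python. It assumes the referenced parameter is named prefixed_queries and is dereferenced as $prefixed_queries, uses the metatag.date field and the month values from the original question, and adds debug=query so the parsed query can be inspected. The helper name edismax_prefix_params is invented for the example.

```python
from urllib.parse import urlencode


def edismax_prefix_params(field, prefixes):
    """Return request parameters that route q through the lucene parser."""
    clauses = " ".join(
        "{{!prefix f={} v='{}'}}".format(field, p) for p in prefixes
    )
    return {
        "defType": "edismax",
        # Local params on q select the lucene parser despite defType.
        "q": "{!lucene v=$prefixed_queries}",
        # Assumed parameter name; it only has to match the $reference above.
        "prefixed_queries": clauses,
        "debug": "query",
    }


params = edismax_prefix_params("metatag.date", ["2016-01", "2016-07", "2016-10"])
print(urlencode(params))
```

Whether the space-separated clauses are combined as OR depends on the default operator in effect, so writing OR between the clauses explicitly may be the safer choice.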



> On Dec 8, 2016, at 7:18 AM, KRIS MUSSHORN  wrote:
> 
> I'm indexing data from Nutch into SOLR 5.4.1. 
> I've got a date metatag that I have to store as text type because the data 
> stinks. 
> It's stored in SOLR as field metatag.date. 
> At the source the dates are formatted (when they are entered correctly) as 
> YYYY-MM-DD 
>   
> q=metatag.date:2016-01* does not produce the correct results and returns 
> undesirable matches, 2016-05-01 etc. as an example. 
> q={!prefix f=metatag.date}2016-01 gives me exactly what I want for one 
> month/year. 
>   
> My question is how do I chain n prefix queries together? 
> i.e. 
> I want all docs where metatag.date prefix is 2016-01 or 2016-07 or 2016-10 
>   
> TIA, 
> Kris 
>   



Re: prefix query help

2016-12-08 Thread KRIS MUSSHORN

Here is how I have the field defined... see attachment. 
  
  
- Original Message -

From: "Erick Erickson" <erickerick...@gmail.com> 
To: "solr-user" <solr-user@lucene.apache.org> 
Sent: Thursday, December 8, 2016 10:44:08 AM 
Subject: Re: prefix query help 

You'd probably be better off indexing it as a "string" type given your 
expectations. Depending on the analysis chain (do take a look at 
admin/analysis for the field in question) the tokenization can be tricky 
to get right. 

Best, 
Erick 




field name.docx
Description: MS-Word 2007 document


Re: prefix query help

2016-12-08 Thread Erick Erickson
You'd probably be better off indexing it as a "string" type given your
expectations. Depending on the analysis chain (do take a look at
admin/analysis for the field in question) the tokenization can be tricky
to get right.

Best,
Erick

On Thu, Dec 8, 2016 at 7:18 AM, KRIS MUSSHORN  wrote:
> I'm indexing data from Nutch into SOLR 5.4.1.
> I've got a date metatag that I have to store as text type because the data 
> stinks.
> It's stored in SOLR as field metatag.date.
> At the source the dates are formatted (when they are entered correctly) as 
> YYYY-MM-DD
>
> q=metatag.date:2016-01* does not produce the correct results and returns 
> undesirable matches, 2016-05-01 etc. as an example.
> q={!prefix f=metatag.date}2016-01 gives me exactly what I want for one 
> month/year.
>
> My question is how do I chain n prefix queries together?
> i.e.
> I want all docs where metatag.date prefix is 2016-01 or 2016-07 or 2016-10
>
> TIA,
> Kris
>