RE: prefix query help
I think this will work. I'll try it tomorrow and let you know.

Thanks for the help, Erik and Shawn.

Kris

-----Original Message-----
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: Thursday, December 8, 2016 2:43 PM
To: solr-user@lucene.apache.org
Subject: Re: prefix query help
Re: prefix query help
It’s hard to tell how _exact_ to be here, but if you’re indexing those strings and your queries are literally always YYYY-MM, then truncate the actual data into that format, or use analysis techniques to index only the YYYY-MM piece of the incoming string.

But given what you’ve got so far, and using the prefix examples I provided below, your two queries would be this:

    q={!prefix f=metatag.date v='2016-06'}

and

    q=({!prefix f=metatag.date v='2016-06'} OR {!prefix f=metatag.date v='2014-04'})

Does that work for you?

It really should work to do this:

    q=metatag.date:(2016-06* OR 2014-04*)

as you’ve got it, but you said that sort of thing wasn’t working (debug output would help suss that issue out).

If you did index those strings more cleanly as YYYY-MM to accommodate the types of query you’ve shown, then you could do

    q=metatag.date:(2016-06 OR 2014-04)

or

    q={!terms f=metatag.date}2016-06,2014-04

Erik

> On Dec 8, 2016, at 11:34 AM, KRIS MUSSHORN <mussho...@comcast.net> wrote:
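The chained prefix query and the {!terms} variant from Erik's reply can be assembled programmatically. The following is only a sketch: the helper names `chained_prefix_query` and `terms_query` are made up for illustration, and it assumes the prefixes contain no single quotes or other characters that would need escaping inside the local params.

```python
from urllib.parse import urlencode


def chained_prefix_query(field, prefixes):
    """Build '({!prefix ...} OR {!prefix ...})' for the lucene query parser.

    Hypothetical helper; assumes the prefixes need no escaping.
    """
    clauses = ["{!prefix f=%s v='%s'}" % (field, p) for p in prefixes]
    # The leading paren keeps the whole string in the lucene qparser,
    # so that several {!...} expressions can be parsed.
    return "(" + " OR ".join(clauses) + ")"


def terms_query(field, values):
    """Build a {!terms} query; only useful if the field holds exact YYYY-MM values."""
    return "{!terms f=%s}%s" % (field, ",".join(values))


if __name__ == "__main__":
    q = chained_prefix_query("metatag.date", ["2016-06", "2014-04"])
    print(q)
    # URL-encode before sending, since braces, spaces and quotes are not URL-safe.
    print(urlencode({"q": q}))
    print(terms_query("metatag.date", ["2016-06", "2014-04"]))
```

Building the string in one place makes it easy to go from two prefixes to n without hand-editing the query.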
Re: prefix query help
Yes, I did attach rather than paste, sorry.

OK, here's an actual, truncated example of the metatag.date field contents in Solr.
NONE-NN-NN is the default setting.

doc 1
"metatag.date": [
  "2016-06-15T14:51:04Z",
  "2016-06-15T14:51:04Z"
]

doc 2
"metatag.date": [
  "2016-06-15"
]

doc 3
"metatag.date": [
  "NONE-NN-NN"
]

doc 4
"metatag.date": [
  "-mm-dd"
]

doc 5
"metatag.date": [
  "2016-07-06"
]

doc 6
"metatag.date": [
  "2014-04-15T14:51:06Z",
  "2014-04-15T14:51:06Z"
]

q=2016-06 should return docs 1 and 2
q=2016-06 OR 2014-04 should return docs 1, 2 and 6

Yes, I know it's wonky, but it's what I have to deal with until the content is cleaned up.
I can't use a date type... that would make my life too easy.

TIA again,
Kris

----- Original Message -----
From: "Erik Hatcher" <erik.hatc...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Thursday, December 8, 2016 12:36:26 PM
Subject: Re: prefix query help
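To make the expected results concrete, here is a small model of the six sample documents with plain string prefix matching. It does not talk to Solr; it only mirrors what a prefix query against an untokenized field would be expected to return, and the function name `matching_docs` is invented for this sketch.

```python
# Sample metatag.date values, copied from the six documents in the message.
docs = {
    1: ["2016-06-15T14:51:04Z", "2016-06-15T14:51:04Z"],
    2: ["2016-06-15"],
    3: ["NONE-NN-NN"],
    4: ["-mm-dd"],
    5: ["2016-07-06"],
    6: ["2014-04-15T14:51:06Z", "2014-04-15T14:51:06Z"],
}


def matching_docs(prefixes):
    """Return ids of docs where any field value starts with any of the prefixes."""
    return sorted(
        doc_id
        for doc_id, values in docs.items()
        if any(value.startswith(prefix) for value in values for prefix in prefixes)
    )


if __name__ == "__main__":
    print(matching_docs(["2016-06"]))             # [1, 2]
    print(matching_docs(["2016-06", "2014-04"]))  # [1, 2, 6]
```

Comparing the ids a real query returns against this model is a quick way to tell whether the field's analysis is interfering with the prefix match.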
Re: prefix query help
On 12/8/2016 10:02 AM, KRIS MUSSHORN wrote:
> Here is how I have the field defined... see attachment.

You're using a tokenized field type.

For the kinds of queries you asked about here, you want to use StrField, not TextField -- this type cannot have an analysis chain and indexes to one token that is completely unchanged from the input. Note that if you also want to do other kinds of queries (like 2016), then StrField would break those. You do have to reindex after making this change.

Since your input data is consistently YYYY-MM-DD, you should consider using the solr.DateRangeField class instead of a string or text type. This allows queries like "2016" or "2016-07" to work as you would expect them to.

https://cwiki.apache.org/confluence/display/solr/Working+with+Dates#WorkingwithDates-DateRangeFormatting

Thanks,
Shawn
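Before moving to a date-based field type, it may help to check how much of the incoming data is actually date-shaped. The sketch below is an assumption-laden pre-check, not Solr's own date parsing: the regular expression only tests the shape YYYY-MM-DD with an optional THH:MM:SSZ suffix and does not validate calendar values, and `looks_like_date` and `year_month` are hypothetical helper names.

```python
import re

# Shape check only: four digits, two digits, two digits, optional time with Z.
# This is NOT Solr's date parser and accepts impossible dates such as 2016-13-45.
DATE_SHAPE = re.compile(r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}Z)?$")


def looks_like_date(value):
    """True if the value has the expected date shape."""
    return DATE_SHAPE.match(value) is not None


def year_month(value):
    """Return the YYYY-MM part of a date-shaped value, or None for junk values."""
    return value[:7] if looks_like_date(value) else None


if __name__ == "__main__":
    samples = ["2016-06-15T14:51:04Z", "2016-06-15", "NONE-NN-NN", "-mm-dd"]
    for value in samples:
        print(value, looks_like_date(value), year_month(value))
```

Values that fail the check (such as the NONE-NN-NN default) would need to be dropped or handled separately before indexing into a date type; the `year_month` output is the truncated form that would suit exact-match queries.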
Re: prefix query help
Kris -

To chain multiple prefix queries together:

    q=({!prefix f=field1 v='prefix1'} {!prefix f=field2 v='prefix2'})

The leading paren is needed to ensure it’s being parsed with the lucene qparser (be sure not to have defType set, or a variant would be needed), and that allows multiple {!…} expressions to be parsed. The outside-the-curlys value for the prefix shouldn’t be attempted with multiples, so the `v` is the way to go, either inline or $referenced.

If you do have defType set, say to edismax, then do something like this instead:

    q={!lucene v=$prefixed_queries}
    prefixed_queries={!prefix f=field1 v='prefix1'} {!prefix f=field2 v='prefix2'}
    // I don’t think parens are needed with prefixed_queries, but maybe.

debug=query (or debug=true) is your friend - see how things are parsed. I presume in your example that didn’t work that the dash didn’t work as you expected? or… not sure. What’s the parsed_query output in debug on that one?

Erik

p.s. did you really just send a Word doc to the list that could have been inlined in text? :)

> On Dec 8, 2016, at 7:18 AM, KRIS MUSSHORN wrote:
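For the case where defType is set to edismax, the request can be put together as a parameter dictionary. This is a sketch under assumptions: it reads the separate request parameter as being named `prefixed_queries` and dereferenced as `$prefixed_queries`, it relies on the lucene parser's default operator to OR the space-separated clauses, and the helper name `edismax_prefix_params` is invented here.

```python
from urllib.parse import urlencode


def edismax_prefix_params(field, prefixes):
    """Build request params that route chained prefix queries through {!lucene}.

    Hypothetical helper; assumes the prefixes need no escaping.
    """
    clauses = " ".join("{!prefix f=%s v='%s'}" % (field, p) for p in prefixes)
    return {
        "defType": "edismax",
        # q switches to the lucene parser and dereferences the parameter below.
        "q": "{!lucene v=$prefixed_queries}",
        "prefixed_queries": clauses,
        # Ask Solr to report how the query was parsed.
        "debug": "query",
    }


if __name__ == "__main__":
    params = edismax_prefix_params("metatag.date", ["2016-01", "2016-07", "2016-10"])
    # Query string to append to the select handler URL of your core.
    print(urlencode(params))
```

Keeping the prefix clauses in their own parameter leaves `q` short and lets the debug output show exactly how each clause was parsed.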
Re: prefix query help
Here is how I have the field defined... see attachment.

field name.docx
Description: MS-Word 2007 document

----- Original Message -----
From: "Erick Erickson" <erickerick...@gmail.com>
To: "solr-user" <solr-user@lucene.apache.org>
Sent: Thursday, December 8, 2016 10:44:08 AM
Subject: Re: prefix query help
Re: prefix query help
You'd probably be better off indexing it as a "string" type given your expectations. Depending on the analysis chain (do take a look at admin/analysis for the field in question) the tokenization can be tricky to get right.

Best,
Erick

On Thu, Dec 8, 2016 at 7:18 AM, KRIS MUSSHORN wrote:
> I'm indexing data from Nutch into Solr 5.4.1.
> I've got a date metatag that I have to store as text type because the data stinks.
> It's stored in Solr as field metatag.date.
> At the source the dates are formatted (when they are entered correctly) as YYYY-MM-DD.
>
> q=metatag.date:2016-01* does not produce the correct results and returns undesirable matches, 2016-05-01 etc. as an example.
> q={!prefix f=metatag.date}2016-01 gives me exactly what I want for one month/year.
>
> My question is how do I chain n prefix queries together?
> i.e.
> I want all docs where metatag.date prefix is 2016-01 or 2016-07 or 2016-10
>
> TIA,
> Kris