Mike - This is also what I have found. Search:parse has to actually return this "empty" query for the <empty> option to have any effect:

<cts:and-query qtextempty="1" xmlns:cts="http://marklogic.com/cts"/>

When it is passed punctuation text and "punctuation-insensitive" in options it returns:

<cts:word-query qtextref="cts:text" xmlns:cts="http://marklogic.com/cts">
  <cts:text>,</cts:text>
  <cts:option>punctuation-insensitive</cts:option>
</cts:word-query>

The same problem occurs with "whitespace-insensitive" in options and search:parse(" ",$options):

<cts:word-query qtextref="cts:text" xmlns:cts="http://marklogic.com/cts">
  <cts:text> </cts:text>
  <cts:option>whitespace-insensitive</cts:option>
</cts:word-query>

Both of these queries are unaffected by <empty apply="all-results"/> and return no results. I don't think this is desirable for any application. Ideally, I think the Search API would provide an option to behave like your parser, or search:parse would return empty queries for these scenarios. Stripping out punctuation from the input query is a decent workaround, but we have to be careful not to strip out characters that could be part of a constraint, phrase, custom grammar, etc., so the regex gets uglier.

-Will

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Michael Blakeley
Sent: Wednesday, February 01, 2012 10:56 AM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] element-query with punctuation insensitive and punctuation marks as cts:text

In cases like this it's worth looking at the query output.
The search:parse function produces this:

<cts:and-query strength="20" qtextjoin="" qtextgroup="( )" xmlns:cts="http://marklogic.com/cts">
  <cts:word-query qtextpre="&quot;" qtextref="cts:text" qtextpost="&quot;">
    <cts:text>metal</cts:text>
    <cts:option>case-insensitive</cts:option>
    <cts:option>unstemmed</cts:option>
    <cts:option>punctuation-insensitive</cts:option>
  </cts:word-query>
  <cts:and-query strength="20" qtextjoin="" qtextgroup="( )">
    <cts:word-query qtextref="cts:text">
      <cts:text>,</cts:text>
      <cts:option>case-insensitive</cts:option>
      <cts:option>unstemmed</cts:option>
      <cts:option>punctuation-insensitive</cts:option>
    </cts:word-query>
    <cts:word-query qtextpre="&quot;" qtextref="cts:text" qtextpost="&quot;">
      <cts:text>locker</cts:text>
      <cts:option>case-insensitive</cts:option>
      <cts:option>unstemmed</cts:option>
      <cts:option>punctuation-insensitive</cts:option>
    </cts:word-query>
  </cts:and-query>
</cts:and-query>

See the cts:text entry for ','? After some testing with 5.0-2, my guess is that since ',' is the only character in that punctuation-insensitive word-query, that word-query term ends up not matching anything. I think it should match *everything*, which would also cause problems if search:parse created that query.

But whether the existing behavior is a bug or not, the workaround should be simple: rewrite the input query so that it does not contain any punctuation. This might be suitable:

  replace($query, '[^\w\s]', ' ')

Or you might look into using https://github.com/mblakele/xqysp with search:resolve(). XQYSP ignores unexpected punctuation unless it is part of a quoted term.

-- Mike

On 1 Feb 2012, at 09:21 , Will Thompson wrote:

> Abhishek - I recently had a very similar issue with empty searches and
> punctuation, and the solution appeared to be adding
> <empty apply="all-results" /> to search options. However, after further
> testing, I am also getting empty results.
> For example,
>
> import module namespace search = "http://marklogic.com/appservices/search"
>   at "/MarkLogic/appservices/search/search.xqy";
>
> let $options :=
>   <options xmlns="http://marklogic.com/appservices/search">
>     <term>
>       <empty apply="all-results" />
>       <term-option>punctuation-insensitive</term-option>
>     </term>
>     <searchable-expression>//doc</searchable-expression>
>   </options>
> let $empty :=
>   <cts:word-query qtextref="cts:text" xmlns:cts="http://marklogic.com/cts">
>     <cts:text>;</cts:text>
>     <cts:option>punctuation-insensitive</cts:option>
>   </cts:word-query>
> return
>   search:resolve($empty,$options)
>
> This returns no results, and the value of @apply does not seem to have any
> effect. I think this is probably a bug.
>
> -Will
>
>
> From: general-boun...@developer.marklogic.com
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Abhishek53 S
> Sent: Wednesday, February 01, 2012 2:55 AM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] element-query with punctuation
> insensitive and punctuation marks as cts:text
>
>
> Hi Geert,
>
> Here is the sample query I used:
>
> import module namespace search = "http://marklogic.com/appservices/search"
>   at "/MarkLogic/appservices/search/search.xqy";
>
> let $parsed-query := search:parse('"metal" , "locker"',
>   <options xmlns="http://marklogic.com/appservices/search">
>     <search-option>unfiltered</search-option>
>     <term>
>       <empty apply="all-results" />
>       <term-option>case-insensitive</term-option>
>       <term-option>unstemmed</term-option>
>       <term-option>punctuation-insensitive</term-option>
>     </term>
>   </options>)
> let $query := cts:element-query(xs:QName("data"),cts:query($parsed-query))
> return
>   xdmp:estimate(cts:search(fn:doc(), $query))
>
>
> Thanks
> Abhishek Srivastav
> Tata Consultancy Services
> Cell:- +91-9883389968
> Mailto: abhishek5...@tcs.com
> Website: http://www.tcs.com
> ____________________________________________
> Experience certainty.
> IT Services
> Business Solutions
> Outsourcing
> ____________________________________________
>
>
> From: Abhishek53 S <abhishek5...@tcs.com>
> To: General MarkLogic Developer Discussion <general@developer.marklogic.com>
> Date: 02/01/2012 04:17 PM
> Subject: Re: [MarkLogic Dev General] element-query with punctuation
> insensitive and punctuation marks as cts:text
> Sent by: general-boun...@developer.marklogic.com
>
>
> Hi Geert,
>
> Thanks for your response. Currently I am not inclined to remove the
> word-query with punctuation marks from the main query (unless it turns out
> to be the last resort). I am using the search:parse function to parse the
> search term.
>
> I tried your third option but am still unable to get the expected result
> [count without punctuation (,) = count with punctuation (,) as
> punctuation-insensitive]. If I recall correctly, this term option controls
> whether results are returned when the term is empty, so I am not sure how
> it would help me in this case...
>
> Thanks for your help!
>
> Abhishek Srivastav
>
> From: Geert Josten <geert.jos...@dayon.nl>
> To: General MarkLogic Developer Discussion <general@developer.marklogic.com>
> Date: 02/01/2012 03:26 PM
> Subject: Re: [MarkLogic Dev General] element-query with punctuation
> insensitive and punctuation marks as cts:text
> Sent by: general-boun...@developer.marklogic.com
>
>
> Hi Abhishek,
>
> What is happening here is that you pass ',' as search term to a word-query
> with the 'punctuation-insensitive' option. That option effectively causes
> the comma character to be stripped out of the search term, leaving an empty
> search term.
> Running a cts:word-query with an empty search term returns nothing.
>
> I think you have a few options:
> 1. Don't tokenize the search string yourself (at least, if that is what
>    you are doing), and pass in 'metal,' or ', metal' as search term with
>    punctuation insensitive. That is effectively the same as searching for
>    'metal'.
> 2. Strip punctuation yourself before parsing it to a <cts:query> element
>    structure (or post-process the query element structure to filter out
>    punctuation-only queries).
> 3. Add <empty apply="all-results" /> to your search options (I'm
>    guessing you are using search:parse, so to the options you pass in there).
>
> Kind regards,
> Geert
>
> From: general-boun...@developer.marklogic.com
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Abhishek53 S
> Sent: Wednesday, February 1, 2012 10:30
> To: General MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] element-query with punctuation insensitive
> and punctuation marks as cts:text
>
>
> Hi Folks,
>
> I am not sure whether I am going wrong somewhere, so let me explain this
> issue of punctuation-insensitive search with punctuation marks as cts:text
> (element-query). When I execute the query below I do not get any count
> back, because the punctuation mark is not ignored during the search (even
> with punctuation-insensitive). The expected behavior of our application is
> always punctuation-insensitive. If I remove the word query with the
> punctuation mark, it starts returning a count based on the remaining search
> criteria. On the other hand, a word query with the punctuation-sensitive
> option behaves as if it were ignored in the search criteria.
>
>
> Please let me know how to make this element-query punctuation insensitive
> even if punctuation marks are present in the cts:text node of the
> word-query.
>
> xdmp:estimate(cts:search(fn:doc(),
>   cts:query(
>     <cts:element-query>
>       <cts:element xmlns="">data</cts:element>
>       <cts:and-query>
>         <cts:word-query>
>           <cts:text xml:lang="en">,</cts:text>
>           <cts:option>case-insensitive</cts:option>
>           <cts:option>punctuation-insensitive</cts:option>
>           <cts:option>unstemmed</cts:option>
>         </cts:word-query>
>         <cts:word-query>
>           <cts:text xml:lang="en">metal</cts:text>
>           <cts:option>case-insensitive</cts:option>
>           <cts:option>punctuation-insensitive</cts:option>
>           <cts:option>unstemmed</cts:option>
>         </cts:word-query>
>       </cts:and-query>
>     </cts:element-query>
>   )))
>
> Thanks & Regards
> Abhishek Srivastav
> =====-----=====-----=====
> Notice: The information contained in this e-mail
> message and/or attachments to it may contain
> confidential or privileged information. If you are
> not the intended recipient, any dissemination, use,
> review, distribution, printing or copying of the
> information contained in this e-mail message
> and/or attachments to it are strictly prohibited. If
> you have received this communication in error,
> please notify us by reply e-mail or telephone and
> immediately and permanently delete the message
> and any attachments.
> Thank you
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general
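
A note on the "post-process the query element structure to filter out punctuation-only queries" option mentioned in the thread: the following is only a minimal sketch of that idea, not something any poster supplied and not part of the Search API. The function names local:punctuation-only and local:prune are hypothetical, and the choice of \p{P}, \p{S} and \s as the "ignorable" character classes is an assumption that may need adjusting for a given grammar. It uses only standard XQuery functions and treats the parsed query as plain XML.

```xquery
declare namespace cts = "http://marklogic.com/cts";

(: Hypothetical helper, assumed for illustration only.
   True when a word query has at least one cts:text child and every
   cts:text child holds nothing but punctuation, symbols, or whitespace. :)
declare function local:punctuation-only(
  $q as element(cts:word-query)
) as xs:boolean
{
  fn:exists($q/cts:text) and
  (every $t in $q/cts:text
   satisfies fn:matches(fn:string($t), '^[\p{P}\p{S}\s]*$'))
};

(: Hypothetical helper, assumed for illustration only.
   Recursively copies a serialized query, dropping any word query
   that local:punctuation-only flags. All other nodes are kept. :)
declare function local:prune($n as node()) as node()*
{
  typeswitch ($n)
    case element(cts:word-query) return
      if (local:punctuation-only($n)) then () else $n
    case element() return
      element { fn:node-name($n) } {
        $n/@*,
        for $c in $n/node() return local:prune($c)
      }
    default return $n
};

(: Sample input modeled on the queries quoted in the thread. :)
local:prune(
  <cts:and-query>
    <cts:word-query>
      <cts:text>metal</cts:text>
      <cts:option>punctuation-insensitive</cts:option>
    </cts:word-query>
    <cts:word-query>
      <cts:text>,</cts:text>
      <cts:option>punctuation-insensitive</cts:option>
    </cts:word-query>
  </cts:and-query>
)
```

For the sample input, the comma-only word query is dropped and the 'metal' word query is kept. On a server, the pruned element could presumably be passed along the same way the thread already does, for example cts:query(local:prune(search:parse($qtext, $options))), where $qtext and $options stand for whatever the application supplies. One caveat: if every child is pruned, what remains is an and-query with no children, which, as far as I know, matches all documents, so this may or may not be the behavior an application wants for punctuation-only input.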