Thanks Danny, but I'm not sure I follow. Maybe my explanation wasn't the
best. I don't want to use dashes the way hyphens are used; I just want a
search for something like "Venue ― Motion to Transfer" to ignore the dash
when it is parsed. Instead, the dash is treated like a word and ends up in
the query:
cts:and-query(
  (cts:word-query("Venue",
     ("case-insensitive", "punctuation-insensitive", "lang=en"), 1),
   cts:word-query("―",
     ("case-insensitive", "punctuation-insensitive", "lang=en"), 1),
   cts:word-query("Motion",
     ("case-insensitive", "punctuation-insensitive", "lang=en"), 1),
   cts:word-query("to",
     ("case-insensitive", "punctuation-insensitive", "lang=en"), 1),
   cts:word-query("Transfer",
     ("case-insensitive", "punctuation-insensitive", "lang=en"), 1)),
  ())
-Will
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: Thursday, January 26, 2012 5:35 PM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] en/em dashes punctuation?
Hi Will,
One thing you can do is change your search grammar to use a joiner other than
the negative sign.
Here is the default grammar:
http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/xml/search-dev-guide/search-api.xml%2344520
-Danny
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Will Thompson
Sent: Thursday, January 26, 2012 4:34 PM
To: General MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] en/em dashes punctuation?
Our search autocomplete pulls from doc titles, some of which contain en or em
dashes. However, if the dash is "floating" (e.g. "Venue - Motion to
Transfer"), search:parse parses it into the query, even though
<term-option>punctuation-insensitive</term-option> is included in the <term>
section of the search options node. I thought it might simply be ignored when
the query is evaluated, but it is definitely limiting the results.
I can confirm they are punctuation:
cts:tokenize("hyphen-en-em-bar―")[. instance of cts:punctuation]
=> "- - - ―"
But is there an exception here (the same way hyphens are always parsed to
negate)? Do I just need to remove these from the query string before calling
search:parse? If there is a cleaner way, that would be great.
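
For reference, stripping them first might look roughly like the sketch below.
It is only an illustration under a few assumptions: the helper name
local:strip-floating-dashes is made up, the options node is a stand-in for
our real one, and "floating" is taken to mean a whitespace-delimited token
made up only of dash characters (Unicode category Pd, which covers the
hyphen, en dash, em dash and horizontal bar).

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Hypothetical helper: drop every whitespace-delimited token that consists
   only of dash punctuation, then rejoin the rest with single spaces. :)
declare function local:strip-floating-dashes($qtext as xs:string) as xs:string
{
  fn:string-join(
    fn:tokenize(fn:normalize-space($qtext), " ")
      [fn:not(fn:matches(., "^\p{Pd}+$"))],
    " ")
};

(: Placeholder options node; substitute the real search options. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <term>
      <term-option>punctuation-insensitive</term-option>
    </term>
  </options>
return
  search:parse(
    local:strip-floating-dashes("Venue ― Motion to Transfer"),
    $options)
```

Dashes attached to a word (a leading "-" used for negation, or a hyphenated
term) are left alone, since only tokens made entirely of dashes are removed.
A standalone dash inside a quoted phrase would be dropped as well, which I
assume is acceptable when the terms are punctuation-insensitive anyway.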
Best,
Will
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general