Mike Sokolov wrote:
Agreed; however it's not clear that trailing whitespace needs to be
preserved in order to be able to search for DITA tokens, as in the
original example. I guess it might depend on just what the tokens
consist of but a word- or phrase-search might be able to make use of the
implicit tokenization done by the indexer without the need for the
trailing whitespace.
EG: cts:attribute-word-search(..."topic/topic") ought to match
"topic/topic" and not match "mytopic/topic-foo", I think.
It's not just a question of what will work in a MarkLogic query, but
also of what consumers of the elements brought out of MarkLogic will
get. For example, the XSLT pattern for processing DITA content is:
<xsl:template match="*[contains(@class, ' topic/topic ')]">
If I retrieve elements from MarkLogic and hand them to an XSLT transform
(e.g., the DITA Open Toolkit), then the above match would fail for
generic topics, because the literal value of class= would be
"- topic/topic", not "- topic/topic ".
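To make the failure concrete, here is a minimal sketch in Python that
mimics the XPath contains() test with a plain substring check. The
helper name and the sample class= values are illustrative assumptions,
not output from MarkLogic or the Open Toolkit.

```python
def class_contains(class_value: str, token: str) -> bool:
    """Mimic XPath contains(@class, ' token ') with a substring test."""
    return f" {token} " in class_value


# Illustrative class= values (assumed, in the usual DITA form).
intact_generic = "- topic/topic "            # trailing space preserved
stripped_generic = "- topic/topic"           # trailing space lost
stripped_concept = "- topic/topic concept/concept"

print(class_contains(intact_generic, "topic/topic"))        # True
print(class_contains(stripped_generic, "topic/topic"))      # False
# A specialized element still matches on its ancestor token,
# but no longer matches on its own (last) token.
print(class_contains(stripped_concept, "topic/topic"))      # True
print(class_contains(stripped_concept, "concept/concept"))  # False
```

In other words, whichever token comes last in the value is the one
that stops matching once the trailing space is gone.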
Likewise, editors and other tools that expect the trailing space in
order to bind behavior to elements would fail.
So even in the best case it would be necessary to pass any extracted
elements through a filter that either removes the class= attributes
entirely (falling back on the schema- or DTD-defined defaults, assuming
the DTD or schema association is restored or maintained in the result)
or adds the missing trailing space to the literal class= values in
the instance.
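As a rough sketch of what such a filter might look like, here are both
options in Python using the standard library's ElementTree. The function
names are hypothetical, and the check that a DITA class= value starts
with "-" or "+" is my assumption about how to avoid touching unrelated
class attributes; a real filter would more likely live in XQuery or XSLT.

```python
import xml.etree.ElementTree as ET


def pad_class_values(root: ET.Element) -> ET.Element:
    """Option 2: add the missing trailing space to DITA-style class= values."""
    for elem in root.iter():
        value = elem.get("class")
        # Assumption: DITA class= values start with "-" or "+".
        if value and value.startswith(("-", "+")) and not value.endswith(" "):
            elem.set("class", value + " ")
    return root


def drop_class_values(root: ET.Element) -> ET.Element:
    """Option 1: remove class= so DTD or schema defaults can apply again."""
    for elem in root.iter():
        elem.attrib.pop("class", None)
    return root


# Hypothetical extracted fragment with one stripped and one intact value.
sample = '<topic class="- topic/topic"><title class="- topic/title ">T</title></topic>'

padded = pad_class_values(ET.fromstring(sample))
print(repr(padded.get("class")))  # '- topic/topic '

dropped = drop_class_values(ET.fromstring(sample))
print(dropped.get("class"))       # None
```

Note that the second option only helps if whatever consumes the result
actually parses with the DTD or schema, so that the defaults come back.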
Cheers,
Eliot
--
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 610.631.6770
www.reallysi.com
www.rsuitecms.com