Mike Sokolov wrote:
Agreed; however, it's not clear that trailing whitespace needs to be preserved in order to search for DITA tokens, as in the original example. I guess it might depend on just what the tokens consist of, but a word or phrase search might be able to make use of the implicit tokenization done by the indexer without the need for the trailing whitespace.

E.g., cts:attribute-word-search(..."topic/topic") ought to match "topic/topic" and not match "mytopic/topic-foo", I think.


It's not just a question of what will work from within a MarkLogic query, but also of what consumers of the elements extracted from MarkLogic will receive. For example, the XSLT pattern for processing DITA content is:

<xsl:template match="*[contains(@class, ' topic/topic ')]">

If I get content out of MarkLogic and hand it to an XSLT transform (e.g., the DITA Open Toolkit), then the above match would fail for generic topics, because the literal value of class= would be "- topic/topic" rather than "- topic/topic ".
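
To make the failure concrete, here is a minimal Python sketch that mimics the substring test performed by contains(@class, ' topic/topic '). Python is used purely for illustration, and has_dita_token is a hypothetical helper name, not part of any DITA tool:

```python
def has_dita_token(class_value: str, token: str) -> bool:
    """Mimic XPath contains(@class, ' token '): a plain substring test
    for the token with one space on each side."""
    return f" {token} " in class_value


# class= value as it is expected to appear, with the trailing space intact
intact = "- topic/topic "
# the same value after the trailing space has been lost
stripped = "- topic/topic"
# a specialized topic: the base token is followed by another token,
# so it still has a space after it even if the final space is lost
specialized_stripped = "- topic/topic concept/concept"

intact_matches = has_dita_token(intact, "topic/topic")
stripped_matches = has_dita_token(stripped, "topic/topic")
specialized_matches = has_dita_token(specialized_stripped, "topic/topic")
```

Only the last token in the attribute value is affected, which is why the base-type match fails for generic topics while a specialized topic still matches on its base type (but no longer on its most specific type).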

Likewise, editors and other tools that expect the trailing space in order to bind behavior to elements would fail.

So even in the best case, any element extraction would have to go through a filter that either (a) removes the class= attributes entirely, falling back on the schema- or DTD-defined defaults (assuming the DTD or schema association is maintained or restored in the result), or (b) adds the missing trailing space to the literal class= values in the instance.
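
As a rough sketch of the second kind of filter (one that adds the missing trailing space), something like the following could be applied to extracted content before handing it to downstream tools. It uses Python's standard xml.etree.ElementTree; the function name restore_class_padding and the sample document are assumptions for illustration only, and a real pipeline might just as well do this in XSLT or XQuery:

```python
import xml.etree.ElementTree as ET


def restore_class_padding(root: ET.Element) -> ET.Element:
    """Append a trailing space to every class= value that lacks one.

    Values that already end in a space are left alone, so running the
    filter twice has no further effect.
    """
    for elem in root.iter():  # iter() visits root and all descendants
        value = elem.get("class")
        if value is not None and not value.endswith(" "):
            elem.set("class", value + " ")
    return root


# Hypothetical extracted fragment whose class= values lost the trailing space
doc = ET.fromstring(
    '<topic class="- topic/topic">'
    '<title class="- topic/title">Example</title>'
    '<body>no class attribute here</body>'
    '</topic>'
)
restore_class_padding(doc)
```

After this, the root element's class= value is "- topic/topic " again, so a match such as contains(@class, ' topic/topic ') would succeed. Elements without a class= attribute are left untouched.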

Cheers,

Eliot

--
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 610.631.6770
www.reallysi.com
www.rsuitecms.com
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general