Benjamin Habegger created OAK-12195:
---------------------------------------
Summary: Add W3C DOM wrapper for JCR to enable rich XPath content
extraction
Key: OAK-12195
URL: https://issues.apache.org/jira/browse/OAK-12195
Project: Jackrabbit Oak
Issue Type: New Feature
Reporter: Benjamin Habegger
JCR has much in common with the W3C's widespread Document Object Model (DOM):
a tree of nodes with attributes/properties, namespaces, etc. Robust
implementations of the DOM exist in many programming languages, and in
particular in Java via the Apache Xalan project. The DOM was designed alongside
XML/HTML and is widely used in browsers, but it is not strongly tied to those
serialization formats.
Wrapping a JCR tree as a DOM tree would make it possible to leverage existing,
widely used and well-tested code at the mere cost of implementing a wrapper. In
particular, it would allow full-fledged, standard XPath expressions to be used
to fetch JCR content.
A particularly interesting application of such extraction would be to use
XPath to define the content that goes into fulltext search fields.
Currently, the way a fulltext field is filled is either too broad or not broad
enough. For example, aggregates pull in all the content within a JCR tree and
do not allow separating the content that is interesting for search from the
content that is not.
This ticket proposes introducing a DOM wrapper for a JCR tree, in particular
to allow content to be retrieved from a JCR tree using full-fledged XPath.
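
To illustrate the idea, here is a minimal sketch of how XPath could select the
content for a fulltext field once a JCR tree is exposed as a DOM. It is not the
proposed wrapper: the DOM is built by hand as a stand-in, and the class name,
the element names (page, title, body, footer) and the "text" attribute are all
hypothetical. Only the standard javax.xml.xpath and org.w3c.dom APIs are used.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: not part of Oak and not the proposed wrapper itself.
final class JcrDomXPathSketch {

    // Builds a plain in-memory DOM standing in for a wrapped JCR tree.
    // Assumption: JCR nodes map to elements and JCR properties to attributes.
    static Document buildStandInTree() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element page = doc.createElementNS(null, "page");
            doc.appendChild(page);
            addChild(doc, page, "title", "Release notes");
            addChild(doc, page, "body", "XPath over JCR");
            addChild(doc, page, "footer", "Copyright notice");
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static void addChild(Document doc, Element parent, String name, String text) {
        Element child = doc.createElementNS(null, name);
        child.setAttributeNS(null, "text", text);
        parent.appendChild(child);
    }

    // Evaluates an XPath expression and returns the values of the selected nodes.
    static List<String> select(Document doc, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Joins the selected values into a single string for a fulltext field.
    static String fulltext(Document doc, String expression) {
        return String.join(" ", select(doc, expression));
    }

    public static void main(String[] args) {
        // Index every child of the page except the footer.
        System.out.println(fulltext(buildStandInTree(), "/page/*[not(self::footer)]/@text"));
    }
}
```

The expression keeps the title and body text but leaves out the footer, which
is the kind of filtering that aggregates do not offer today. In a real
implementation, the hand-built document would be replaced by wrapper objects
backed by the JCR tree.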
--
This message was sent by Atlassian Jira
(v8.20.10#820010)