> I am working with a database of text documents. I have extracted some 
> structured metadata from these documents and added these values as document 
> properties

I'd advise not using separate properties fragments for this. Joining properties 
and document fragments is not free.

The separation between document and properties fragments is nice and clean, but 
usually overkill. What you can do is co-locate the source text and its 
structured metadata in the same document:

<my:envelope xmlns:my="…">
  <my:metadata>
    <deliverydate/>
  </my:metadata>
  <my:source>
    <original>text</original>
  </my:source>
</my:envelope>
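
To make the shape concrete, here is a minimal sketch in plain Python (standard-library ElementTree, not a MarkLogic API) that wraps a source text and one metadata value in such an envelope. The namespace URI, the function name wrap_in_envelope, and the sample dateTime value are assumptions for illustration only; the namespace in the example above is elided and is not filled in here.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen only so the sketch runs;
# the real one is elided in the message above.
MY_NS = "urn:example:envelope"
ET.register_namespace("my", MY_NS)


def wrap_in_envelope(original_text, delivery_date):
    """Return an envelope element holding metadata next to the untouched source."""
    envelope = ET.Element(f"{{{MY_NS}}}envelope")

    # Structured, canonical values live under my:metadata.
    metadata = ET.SubElement(envelope, f"{{{MY_NS}}}metadata")
    ET.SubElement(metadata, "deliverydate").text = delivery_date

    # The raw input is kept as-is under my:source.
    source = ET.SubElement(envelope, f"{{{MY_NS}}}source")
    ET.SubElement(source, "original").text = original_text
    return envelope


# Made-up sample values, serialized to show the resulting document.
envelope = wrap_in_envelope("text", "2016-11-14T11:37:00")
print(ET.tostring(envelope, encoding="unicode"))
```

The point of the layout is that the metadata and the source travel together in one document, so nothing has to be joined at query time.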

As you discover other interesting aspects of the raw input source, you can 
"promote" the structured, canonical forms into their own metadata elements 
without having to modify the source data. You can query either the metadata or 
the raw source or both. As before, you'd put a range index on the deliverydate 
element. But for queries that cross metadata and source data, such as "Find me 
all documents that mention the phrases X and Y, but not Z, and were delivered 
last month", MarkLogic doesn't have to join separate fragments to resolve 
them.
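
As a rough illustration of "promote, then query across both", here is a small in-memory sketch in plain Python. It only models the idea; it is not how MarkLogic evaluates queries (there you would rely on the range index and the search APIs). The namespace URI, the helper names make_envelope, promote and matches, and the sample phrases and dates are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical namespace URI; the real one is elided in the message above.
MY_NS = "urn:example:envelope"
NS = {"my": MY_NS}


def make_envelope(text):
    """Build an envelope with empty metadata and the raw source text."""
    envelope = ET.Element(f"{{{MY_NS}}}envelope")
    ET.SubElement(envelope, f"{{{MY_NS}}}metadata")
    source = ET.SubElement(envelope, f"{{{MY_NS}}}source")
    ET.SubElement(source, "original").text = text
    return envelope


def promote(envelope, name, value):
    """Add a canonical metadata element; the source element is not modified."""
    metadata = envelope.find("my:metadata", NS)
    ET.SubElement(metadata, name).text = value


def matches(envelope, must, must_not, start, end):
    """True if the source mentions every 'must' phrase, none of the
    'must_not' phrases, and deliverydate falls in [start, end)."""
    text = envelope.find("my:source/original", NS).text.lower()
    delivered = datetime.fromisoformat(
        envelope.find("my:metadata/deliverydate", NS).text
    )
    return (
        all(p.lower() in text for p in must)
        and not any(p.lower() in text for p in must_not)
        and start <= delivered < end
    )


# Made-up sample documents and dates.
docs = [
    make_envelope("Late fee waived and refund issued."),
    make_envelope("Late fee, refund, and chargeback all recorded."),
]
promote(docs[0], "deliverydate", "2016-10-20T09:00:00")
promote(docs[1], "deliverydate", "2016-10-21T09:00:00")

hits = [
    d for d in docs
    if matches(d, ["late fee", "refund"], ["chargeback"],
               datetime(2016, 10, 1), datetime(2016, 11, 1))
]
print(len(hits))
```

Both the phrase test and the date test read from the same document, which is the property that lets the envelope avoid a cross-fragment join.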

"Envelope" is a common data modeling pattern in MarkLogic that takes advantage 
of the flexible document data model. Envelopes allow you to manage the various 
aspects of the same logical thing (e.g. customer, article, trade) in the same 
physical document. Because queries and caches generally work at the document 
level, this is usually the most efficient approach as well.

Obviously, there is lots more to data modeling in MarkLogic. I'd encourage 
those interested to check out the MarkLogic University courses on this topic 
<https://mlu.marklogic.com/ondemand/index.xqy?q=Series%3A%22Data%20Modeling%22>.

Justin

--
Justin Makeig
Director, Product Management
MarkLogic
[email protected]

> On Nov 14, 2016, at 11:37 AM, Fox, David <[email protected]> wrote:
> 
> Thank you everyone for your responses. Geert was spot on with my misuse of 
> date, when I needed dateTime.
>  
> Also, great suggestions for how to combine searches across documents and 
> document properties. I like the suggestion to use 
> cts:properties-fragment-query together with search:resolve.
>  
> Thanks again,
> David
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at: 
> http://developer.marklogic.com/mailman/listinfo/general


