Hi Hans,

I think MLCP is a better starting point than XQSync for MarkLogic 6 and 
later. You can put the checksum logic in an MLCP transform. The transform 
function returns a map:map for each document you want written to the 
database; if it returns an empty sequence instead, the document is not 
written. Docs on transforms can be found here:

https://docs.marklogic.com/guide/mlcp/import#id_82518

Consider using xdmp:sha512 to keep the chance of hash collisions negligible, 
and storing the checksum inside the content so it does not have to be 
recalculated for the existing document, though SHA calculation is pretty fast.
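
A minimal sketch of such a transform, under a few assumptions: the module 
namespace, the function name and the "checksum" attribute are made up for 
illustration, the input documents are XML and do not already carry a 
"checksum" attribute, and the transform signature follows the example in the 
MLCP guide linked above. It has not been tested against a live server.

```xquery
xquery version "1.0-ml";

(: Hypothetical module namespace; pick your own. :)
module namespace cs = "http://example.com/mlcp/checksum";

declare function cs:transform(
  $content as map:map,
  $context as map:map
) as map:map*
{
  let $uri  := map:get($content, "uri")
  let $doc  := map:get($content, "value")
  let $root := $doc/*
  return
    (: Non-XML input (no root element): pass through unchanged. :)
    if (fn:empty($root)) then $content
    else
      (: Hash the serialized incoming document, before the attribute is added. :)
      let $new-sum := xdmp:sha512(xdmp:quote($root))
      (: Checksum stored on the existing document, if there is one. :)
      let $old-sum := fn:doc($uri)/*/@checksum/fn:string()
      return
        (: Same checksum: return the empty sequence so nothing is written. :)
        if ($old-sum = $new-sum) then ()
        else (
          (: New or changed: embed the checksum on the root and write it. :)
          map:put($content, "value",
            document {
              element { fn:node-name($root) } {
                $root/@*,
                attribute checksum { $new-sum },
                $root/node()
              }
            }),
          $content
        )
};
```

The module would be installed in the modules database of the app server MLCP 
connects to and referenced at import time with -transform_module and 
-transform_namespace (add -transform_function if the function is not named 
"transform"). Note that the hash depends on how xdmp:quote serializes the 
input, so it is worth verifying on a few documents that an unchanged 
re-import really produces the same checksum.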

Let us know if you need further help.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Hans Hübner 
<hans.hueb...@lambdawerk.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, June 28, 2016 at 6:45 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Bulk updates (xqsync vs. mlcp)

Hi,

we're planning to use MarkLogic to do regular bulk updates on a larger set of 
documents (~1 million).  Many of the documents will be unchanged from their 
previous version, and we'd like to avoid reinserting them as we want to be able 
to use the point-in-time query feature to track document changes over time.  
I've read an old thread in this forum that suggested calculating a checksum 
over each input document and then only writing it to the database if the 
previous version's checksum differs.  In that same thread, it was also 
suggested that xqsync could be used.

Now xqsync apparently was replaced by mlcp, and I can find an indication in the 
mlcp documentation that it avoids writing unchanged documents.

Can anyone suggest the best way to approach this?

Thanks!
Hans

_______________________________________________
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
