Hello, I have run into what seems like a straightforward task: making a bulk modification to XML documents in MarkLogic (such as adding a new element to every document in a collection). I looked at CPF first, but it appears to support only event-based processing (triggers) and has no facility for batch processing.
So I just wrote a simple XQuery as follows:

    for $x in collection()/A
    return (
      xdmp:node-delete($x/test),
      xdmp:node-insert-child($x, <test>test</test>)
    )

but it runs out of memory on a large collection ("Expanded tree cache full"). So it looks like the query above tries to fetch all documents into memory first, then iterate over them, whereas what I want is to perform the work on smaller chunks of data that fit into memory and, ideally, process several such chunks in parallel (think "map" without "reduce").

Is there another approach? I have been reading about CoRB, which seems to be just the thing I need, but I wonder whether I am missing another potential solution here. Also, while the CoRB description mentions that it can run updates on disk (not in memory), it does not mention parallelization, which will eventually be quite important for my use case.

Thanks,

Alexei Betin
Principal Architect, Big Data
P: (817) 928-1643 | Elevate.com<http://www.elevate.com>
4150 International Plaza, Suite 300
Fort Worth, TX 76109
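P.S. For reference, here is the kind of chunked approach I have in mind, as a rough sketch only. It assumes the URI lexicon is enabled and a Task Server is available to run spawned tasks; the batch size of 1000 and the use of cts:uris() over the whole database are arbitrary choices for illustration:

    (: collect the URIs once, then spawn one separate update task per batch :)
    let $uris := cts:uris()
    let $batch-size := 1000
    for $i in 1 to xs:integer(fn:ceiling(fn:count($uris) div $batch-size))
    let $batch := fn:subsequence($uris, ($i - 1) * $batch-size + 1, $batch-size)
    return
      xdmp:spawn-function(
        function() {
          for $uri in $batch
          let $x := fn:doc($uri)/A
          return (
            xdmp:node-delete($x/test),
            xdmp:node-insert-child($x, )
          )
        },
        <options xmlns="xdmp:eval"><update>true</update></options>
      )

Each spawned task is its own transaction touching only $batch-size documents, so no single transaction has to expand the whole collection, and the Task Server runs the tasks concurrently up to its configured thread count. I am not sure this is the idiomatic way, which is why I am asking about CoRB.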
_______________________________________________ General mailing list General@developer.marklogic.com http://developer.marklogic.com/mailman/listinfo/general