Attached is the script I've used in the past. YMMV. Note that it was intended to put a directory tree from the filesystem into ML; it would have to be modified to pull from one ML instance to another.
The MD5 and length properties are not built-in properties of ML. This script ensures they are written to every document; if they don't exist, it writes them the first time. That may or may not be optimal for your case.

Note that the MD5 and length are meaningless for an in-database document, because the serialized form is not stored. They are meaningful only on the assumption that the document previously existed in a serialized form. Thus, to compare 2 documents you have to serialize them and then compare their MD5s. But once the values are stored with the documents as properties, they can be queried without serializing or even fetching either document.

This script handles updates, inserts, and deletes to make the target match the source.

It is called like this:

    xmlsh put_sync directory uri

It assumes you have xmlsh and the MarkLogic extension module installed, and the MLCONNECT variable set.

<disclaimer>
Personal code with no assumed warranty and not affiliated with my employer; use as an example only, blah blah blah.

-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]
Phone: +1 650-287-2531
Cell: +1 812-630-7622
www.marklogic.com

This e-mail and any accompanying attachments are confidential. The information is intended solely for the use of the individual to whom it is addressed. Any review, disclosure, copying, distribution, or use of this e-mail communication by others is strictly prohibited. If you are not the intended recipient, please notify us immediately by returning this message to the sender and delete all copies. Thank you for your cooperation.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Murray, Gregory
Sent: Friday, February 10, 2012 11:05 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Syncing only documents that have changed

David,

Would you be willing to share your xmlsh script?
If you can't, at least that points me in the right direction -- I hadn't thought to use xmlsh. I also didn't know there was an MD5 property.

Thanks!
Greg

On Feb 10, 2012, at 10:20 AM, David Lee wrote:

> If you start outside ML you can connect to 2 different servers via XCC easily.
> I've implemented a "sync" script in xmlsh which does datetime and checksum
> comparisons to sync between filesystem and an ML server. It could easily be
> adapted to sync between 2 servers.
>
> If you are inside ML then one idea would be to expose an HTTP service on the
> other server to do what you ask.
>
> IMHO timestamps are not quite good enough unless your system clocks are
> synced. The xmlsh sync script uses an MD5 property stored with the document.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Murray, Gregory
> Sent: Friday, February 10, 2012 9:56 AM
> To: General MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] Syncing only documents that have changed
>
> I need to copy documents from one server (development) to another
> (production) but copy only documents that have changed, that is, each
> document on development that has a more recent last-modified property than
> the corresponding document on production.
>
> Does xqsync have an option for this? I'm not seeing one.
>
> If not, can Information Studio do this?
>
> If not, is it possible to run an XQuery query that connects to an XDBC server
> on a different machine? If so, I could easily take the last-modified property
> of the document in the database against which I run the query (development)
> and compare it against the same property of the corresponding document on the
> production machine. In the past I've used the <database> option of
> xdmp:eval() to grab documents from a different database on the same machine,
> but in this case I need to connect to a different machine altogether.
>
> Many thanks,
> Greg
>
> Gregory Murray
> Digital Library Application Developer
> Princeton Theological Seminary

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
put_sync.xsh
Description: put_sync.xsh
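For readers without the attachment, the comparison logic described in the message can be sketched roughly as follows. This is not the put_sync.xsh script and it does not talk to MarkLogic; it is a minimal Python illustration of comparing stored MD5 and length values to decide on inserts, updates, and deletes. The function names (fingerprint, scan_tree, plan_sync) and the shape of the target dictionary are assumptions made up for this example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# (MD5 hex digest, length in bytes) of a document's serialized form.
Fingerprint = Tuple[str, int]


def fingerprint(data: bytes) -> Fingerprint:
    """Compute the MD5 and length of a serialized document."""
    return hashlib.md5(data).hexdigest(), len(data)


def scan_tree(root: Path, base_uri: str) -> Dict[str, Fingerprint]:
    """Map every file under root to a target URI and its fingerprint."""
    result: Dict[str, Fingerprint] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            uri = base_uri.rstrip("/") + "/" + path.relative_to(root).as_posix()
            result[uri] = fingerprint(path.read_bytes())
    return result


def plan_sync(
    source: Dict[str, Fingerprint],
    target: Dict[str, Optional[Fingerprint]],
) -> Tuple[List[str], List[str], List[str]]:
    """Decide what to insert, update, and delete so the target matches the source.

    `target` is assumed to hold the fingerprint stored as properties on each
    target document, or None when the properties have not been written yet.
    Only fingerprints are compared; no target document content is needed.
    """
    inserts = sorted(u for u in source if u not in target)
    # A missing (None) fingerprint never equals a real one, so such documents
    # are rewritten, which also stores their properties for the first time.
    updates = sorted(u for u in source if u in target and source[u] != target[u])
    deletes = sorted(u for u in target if u not in source)
    return inserts, updates, deletes
```

In a real sync the target dictionary would be filled by querying the stored properties on the server, which is the reason for storing them: nothing has to be serialized or fetched to decide what changed.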
