David,

Thanks very much for the info and the script. Definitely points me in a useful direction.

Thanks,
Greg

On Feb 10, 2012, at 12:52 PM, David Lee wrote:

> Attached is the script I've used in the past.
> YMMV
>
> Note that this was intended to put a directory tree from the filesystem to
> ML. It would have to be modified to pull from one ML instance to another.
>
> The MD5 property and length are not built-in properties of ML. This script
> ensures they are written to every document, and if they don't exist it will
> write them the first time. That may or may not be optimal for your case.
> Note that the MD5 and length are meaningless on an in-database document
> because the serialized form is not stored. But they are meaningful assuming
> that the document existed prior in a serialized form.
> Thus to compare 2 documents you have to serialize them and then compare their
> MD5.
> But once stored with the documents as a property they can be queried without
> serializing or even fetching either document. This script handles updates,
> inserts, and deletes to make the target match the source.
>
> This is called like
>
> xmlsh put_sync directory uri
>
> It assumes you have xmlsh and the marklogic extension module installed, and
> the MLCONNECT variable set.
>
> <disclaimer>
> personal code with no assumed warranty and not affiliated with my employer
> use as an example only blah blah blah
>
> -----------------------------------------------------------------------------
> David Lee
> Lead Engineer
> MarkLogic Corporation
> [email protected]
> Phone: +1 650-287-2531
> Cell: +1 812-630-7622
> www.marklogic.com
>
> This e-mail and any accompanying attachments are confidential. The
> information is intended solely for the use of the individual to whom it is
> addressed. Any review, disclosure, copying, distribution, or use of this
> e-mail communication by others is strictly prohibited. If you are not the
> intended recipient, please notify us immediately by returning this message to
> the sender and delete all copies. Thank you for your cooperation.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Murray, Gregory
> Sent: Friday, February 10, 2012 11:05 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Syncing only documents that have changed
>
> David,
>
> Would you be willing to share your xmlsh script? If you can't, at least that
> points me in the right direction -- I hadn't thought to use xmlsh. I also
> didn't know there was an MD5 property.
>
> Thanks!
> Greg
>
>
> On Feb 10, 2012, at 10:20 AM, David Lee wrote:
>
>> If you start outside ML you can connect to 2 different servers via XCC
>> easily.
>> I've implemented a "sync" script in xmlsh which does datetime and checksum
>> comparisons to sync between the filesystem and an ML server. It could easily
>> be adapted to sync between 2 servers.
>>
>> If you are inside ML then one idea would be to expose an HTTP service on the
>> other server to do what you ask.
>>
>> IMHO timestamps are not quite good enough unless your system clocks are
>> synced. The xmlsh sync script uses an MD5 property stored with the document.
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Murray, Gregory
>> Sent: Friday, February 10, 2012 9:56 AM
>> To: General MarkLogic Developer Discussion
>> Subject: [MarkLogic Dev General] Syncing only documents that have changed
>>
>> I need to copy documents from one server (development) to another
>> (production) but copy only documents that have changed, that is, each
>> document on development that has a more recent last-modified property than
>> the corresponding document on production.
>>
>> Does xqsync have an option for this? I'm not seeing one.
>>
>> If not, can Information Studio do this?
>>
>> If not, is it possible to run an XQuery query that connects to an XDBC
>> server on a different machine? If so, I could easily take the last-modified
>> property of the document in the database against which I run the query
>> (development) and compare it against the same property of the corresponding
>> document on the production machine. In the past I've used the <database>
>> option of xdmp:eval() to grab documents from a different database on the
>> same machine, but in this case I need to connect to a different machine
>> altogether.
>>
>> Many thanks,
>> Greg
>>
>> Gregory Murray
>> Digital Library Application Developer
>> Princeton Theological Seminary
>>
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
>
> <put_sync.xsh>
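
For anyone reading this thread later: the approach David describes (store an MD5 and length with each document, then insert, update, or delete so the target matches the source) can be sketched roughly as below. This is Python rather than xmlsh, and it is not the attached put_sync.xsh. The names fingerprint and plan_sync are hypothetical, and plain dicts stand in for the source directory tree and for the MD5/length properties stored on the target server.

```python
import hashlib
from typing import Dict, List, Tuple

# (MD5 hex digest, length in bytes) of a document's serialized form.
Fingerprint = Tuple[str, int]


def fingerprint(serialized: bytes) -> Fingerprint:
    """Compute the MD5 and length of a serialized document."""
    return hashlib.md5(serialized).hexdigest(), len(serialized)


def plan_sync(
    source: Dict[str, bytes],
    target: Dict[str, Fingerprint],
) -> Dict[str, List[str]]:
    """Decide which URIs to insert, update, or delete on the target.

    source: URI -> serialized bytes (assumption: stands in for the
            directory tree on the filesystem).
    target: URI -> stored fingerprint (assumption: stands in for the
            MD5/length properties kept with each document).
    """
    plan: Dict[str, List[str]] = {"insert": [], "update": [], "delete": []}
    for uri, data in source.items():
        if uri not in target:
            plan["insert"].append(uri)       # missing on the target
        elif target[uri] != fingerprint(data):
            plan["update"].append(uri)       # content differs
    # Anything on the target that no longer exists in the source.
    plan["delete"] = [uri for uri in target if uri not in source]
    for uris in plan.values():
        uris.sort()
    return plan


if __name__ == "__main__":
    source = {"/a.xml": b"<a/>", "/b.xml": b"<b>new</b>", "/c.xml": b"<c/>"}
    target = {
        "/a.xml": fingerprint(b"<a/>"),
        "/b.xml": fingerprint(b"<b>old</b>"),
        "/d.xml": fingerprint(b"<d/>"),
    }
    print(plan_sync(source, target))
```

Only the source side is serialized and hashed here. The target side is compared using the stored fingerprints alone, which matches David's point that once the MD5 is stored as a property, nothing has to be fetched or serialized from the target to decide what changed. No clock comparison is involved, so unsynced system clocks do not matter.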
