Yes, I was using xdmp:md5.
When several nodes are passed in (as below):

let $newu := $new/node()[fn:empty(./*)]
return xdmp:md5($newu)

dc513ea4fbdaa7a14786ffdebc4ef64e
800618943025315f869e4e1f09471012
c81e728d9d4c2f636f067f89cc14862c
c236b1023a2f143fc9857752555cd93b
e4da3b7fbbce2345d7772b0674a318d5
4121d9675000fe5e09893c0482eb7f9b
86cc1a19037cc185546f489eb7075bcb
fafecd0ee5e82491d0b3dcc2b429473b
0d61f8370cad1d412f80b84d143e1257
8b5531a15cd75ba9bc8ed411eeaab897
875049874e2ccabc60f24adb0386d77e

it returns 11 separate 32-character hashes, one per node (sorry for my
wrong statement below), which makes them difficult to compare.
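
A possible way to end up with one comparable value, sketched under
assumptions: the 11 hashes most likely appear because xdmp:md5 is applied
once per node in $newu (MarkLogic's function mapping behaves this way when
a sequence is passed where a single item is expected). Serializing the
parent element with xdmp:quote and hashing that one string should yield a
single 32-character result. The $new element below is a made-up stand-in
for one mid element, not data from the real documents.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample standing in for one <mid> element. :)
let $new :=
  <mid><a>3</a><b>b333</b><c>c</c><d>d</d></mid>
(: xdmp:quote serializes the whole element to one string,
   so xdmp:md5 is called once and returns one hash. :)
return xdmp:md5(xdmp:quote($new))
```

Note that this hashes the serialized markup, so two elements that differ
only in whitespace or attribute order would produce different hashes.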

Regards,
Utsav Joshi
-----Original Message-----
From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Lee, David
Sent: Wednesday, June 23, 2010 10:05 AM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] xml comparsion

That is suspicious.
Did you use xdmp:md5?
The resulting string should always be exactly 32 characters.



-----Original Message-----
From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Joshi,
Utsav (LNG-CON)
Sent: Wednesday, June 23, 2010 9:44 AM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] xml comparsion


I tried using md5, but sometimes the md5 string (700+ characters) is too
long for 'eq' or '=' to compare, so I found this approach unreliable.

Regards,
Utsav Joshi
-----Original Message-----
From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Lee, David
Sent: Tuesday, June 22, 2010 8:23 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] xml comparsion

That's great!
(Nothing like avoiding quadratic behavior.)

To avoid deep-equal you could add an attribute holding an md5 of the
serialized form of the node.
The comparison then becomes a single value check instead of a deep-equal.
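
A minimal sketch of that idea, under stated assumptions: the attribute
name "md5" and the helper functions local:fingerprint and local:stamp are
invented for illustration, and the sample elements are made up. The hash
is computed over the element without its md5 attribute, so a stamped and
an unstamped copy of the same content produce the same value.

```xquery
xquery version "1.0-ml";

(: Hash of the serialized element, ignoring any existing md5 attribute.
   The attribute name "md5" is an assumption for this sketch. :)
declare function local:fingerprint($m as element()) as xs:string {
  xdmp:md5(xdmp:quote(
    element { fn:node-name($m) } {
      $m/@*[fn:not(fn:local-name(.) eq "md5")],
      $m/node()
    }
  ))
};

(: Copy of the element with the hash stored as an attribute. :)
declare function local:stamp($m as element()) as element() {
  element { fn:node-name($m) } {
    attribute md5 { local:fingerprint($m) },
    $m/@*[fn:not(fn:local-name(.) eq "md5")],
    $m/node()
  }
};

(: Hypothetical samples: the old element is stamped when stored. :)
let $old := local:stamp(<mid><a>3</a><b>b</b><c>c</c><d>d</d></mid>)
let $new := <mid><a>3</a><b>b333</b><c>c</c><d>d</d></mid>
(: One string comparison replaces deep-equal;
   the result is true when the content has changed. :)
return fn:string($old/@md5) ne local:fingerprint($new)
```

Unlike deep-equal, this compares serialized text, so differences in
whitespace or namespace prefixes count as changes.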
 

-----Original Message-----
From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Joshi,
Utsav (LNG-CON)
Sent: Tuesday, June 22, 2010 5:24 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] xml comparsion


Thanks a lot for your input.
My run time improved from 48 minutes to 6.84 seconds.

I still need to think about an alternative to deep-equal.

Regards,
Utsav Joshi
-----Original Message-----
From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert
Josten
Sent: Tuesday, June 22, 2010 2:51 AM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] xml comparsion

Hi Joshi,

Some observations at first glance. Don't loop over both old and new;
loop only over new, and for each one fetch the matching mid element from
old by its a value. That eliminates the quadratic (old x new) growth. You
can use cts functions to make sure the lookup of the matching mid in old
is resolved from indexes.
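
The single-loop idea above can be sketched as follows. This is not the
cts-based lookup described here: as a plainly swapped-in technique it
uses an in-memory map:map keyed on the value of a, which gives one pass
over old and one pass over new. The inline $old-doc and $new-doc are
made-up samples; in the real query they would presumably come from
doc("/documents/old.xml") and doc("/documents/new.xml").

```xquery
xquery version "1.0-ml";

(: Hypothetical samples standing in for the two stored documents. :)
let $old-doc :=
  <base>
    <mid><a>1</a><b>b</b></mid>
    <mid><a>2</a><b>b</b></mid>
  </base>
let $new-doc :=
  <base>
    <mid><a>1</a><b>b</b></mid>
    <mid><a>2</a><b>b222</b></mid>
  </base>

(: One pass over old: index every mid by the string value of its key. :)
let $by-key := map:map()
let $_ :=
  for $old in $old-doc/mid
  return map:put($by-key, fn:string($old/a), $old)

(: One pass over new: fetch the matching old mid directly by key. :)
for $new in $new-doc/mid
let $old := map:get($by-key, fn:string($new/a))
where fn:exists($old) and fn:not(fn:deep-equal($new, $old))
return <updated><old>{$old}</old><new>{$new}</new></updated>
```

The sketch assumes each a value occurs at most once in old; with
duplicate keys the later mid would overwrite the earlier one in the map.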

It might also help to declare mid as a fragment root. That will likely
create a lot of fragments in your database and can have side effects on
existing search code, but it avoids loading old and new into memory as
one big fragment each and parsing all of it just to reach the mid
elements.

Deep-equal is also rather expensive; if you can swap it for something
simpler, that might speed things up as well.

Kind regards,
Geert



drs. G.P.H. (Geert) Josten
Consultant

Daidalos BV
Hoekeindsehof 1-4
2665 JZ Bleiswijk

T +31 (0)10 850 1200
F +31 (0)10 850 1199

mailto:geert.jos...@daidalos.nl
http://www.daidalos.nl/

KvK 27164984


The information sent in or with this e-mail message originates from
Daidalos BV and is intended solely for the addressee. If you have
received this message unintentionally, we ask that you delete it.
No rights can be derived from this message.

> From: general-boun...@developer.marklogic.com
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of
> Joshi, Utsav (LNG-CON)
> Sent: Monday, June 21, 2010 22:17
> To: general@developer.marklogic.com
> Subject: [MarkLogic Dev General] xml comparsion
>
>
>
> I am comparing two versions of an XML document (old.xml and
> new.xml, shown below for reference) to check whether anything has changed.
>
> base/mid/a is the key I use to match elements between old.xml and new.xml.
>
>
>
> It takes more than a minute for 3,000 "mid" elements, and
> beyond 100,000 "mid" elements it takes forever.
>
> I want to reduce the execution time to a sub-second response.
> Can you please advise?
>
>
>
> old xml
>
> <base>
>  <mid>
>   <a>1</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
>  <mid>
>   <a>2</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
>  <mid>
>   <a>3</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
> </base>
>
>
>
> new xml
>
> <base>
> <top>xxx</top>
>  <mid>
>   <a>1</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
>  <mid>
>   <a>2</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
>  <mid>
>   <a>3</a>
>   <b>b333</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
> </base>
>
>
>
> xquery
>
> xdmp:query-trace(true()),
> for $old in doc("/documents/old.xml")/base/mid
> for $new in doc("/documents/new.xml")/base/mid
> return
>   if ($new/a = $old/a and not(deep-equal($new, $old))) then
>     <updated><old>{$old}</old><new>{$new}</new></updated>
>   else (),
> xdmp:query-meters()
>
>
>
>
>
> I have created an element range index on local name 'a'.
> Below is an excerpt from the log file:
>
>
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3:
> xdmp:eval("xdmp:query-trace(true()),&#13;&#10;for $old in
> doc(&quot;/docume...", (), <options
> xmlns="xdmp:eval"><isolation>different-transaction</isolation>
> </options>)
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Analyzing path
> for $new: fn:doc("/documents/new.xml")/base/mid
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 1 is
> searchable: fn:doc("/documents/new.xml")
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 2 is searchable: base
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 3 is searchable: mid
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Path is fully searchable.
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Gathering constraints.
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 1
> contributed 1 constraint: fn:doc("/documents/new.xml")
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 2 test
> contributed 1 constraint: base
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 3 test
> contributed 1 constraint: mid
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Executing search.
>
> 2010-06-17 14:51:27.298 Info: Docs: line 3: Selected 1
> fragment to filter
>
>
>
>
>
> cq output
>
>
>
> <updated><old><mid>
>
>                <a>3000</a>
>
>                <b>b</b>
>
>                <c>c</c>
>
>                <d>d</d>
>
>                <details>
>
>                        <a1>a1</a1>
>
>                        <b1>b1</b1>
>
>                        <c1>c1</c1>
>
>                        <d1>d1</d1>
>
>                </details>
>
>         </mid></old><new><mid>
>
>                <a>3000</a>
>
>                <b>b3000</b>
>
>                <c>c</c>
>
>                <d>d</d>
>
>                <details>
>
>                        <a1>a1</a1>
>
>                        <b1>b1</b1>
>
>                        <c1>c1</c1>
>
>                        <d1>d1</d1>
>
>                </details>
>
>         </mid></new></updated>
>
> <qm:query-meters
> xsi:schemaLocation="http://marklogic.com/xdmp/query-meters
> query-meters.xsd"
> xmlns:qm="http://marklogic.com/xdmp/query-meters"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>
>   <qm:elapsed-time>PT1M4.059S</qm:elapsed-time>
>
>   <qm:requests>0</qm:requests>
>
>   <qm:list-cache-hits>26978</qm:list-cache-hits>
>
>   <qm:list-cache-misses>4</qm:list-cache-misses>
>
>   <qm:in-memory-list-hits>0</qm:in-memory-list-hits>
>
>   <qm:expanded-tree-cache-hits>2996</qm:expanded-tree-cache-hits>
>
>   <qm:expanded-tree-cache-misses>2</qm:expanded-tree-cache-misses>
>
>   <qm:compressed-tree-cache-hits>0</qm:compressed-tree-cache-hits>
>
>   <qm:compressed-tree-cache-misses>2</qm:compressed-tree-cache-misses>
>
>
>   <qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
>
>   <qm:value-cache-hits>0</qm:value-cache-hits>
>
>   <qm:value-cache-misses>8985006</qm:value-cache-misses>
>
>   <qm:regexp-cache-hits>0</qm:regexp-cache-hits>
>
>   <qm:regexp-cache-misses>0</qm:regexp-cache-misses>
>
>   <qm:link-cache-hits>0</qm:link-cache-hits>
>
>   <qm:link-cache-misses>0</qm:link-cache-misses>
>
>   <qm:filter-hits>0</qm:filter-hits>
>
>   <qm:filter-misses>0</qm:filter-misses>
>
>   <qm:fragments-added>0</qm:fragments-added>
>
>   <qm:fragments-deleted>0</qm:fragments-deleted>
>
>   <qm:fs-program-cache-hits>0</qm:fs-program-cache-hits>
>
>   <qm:fs-program-cache-misses>0</qm:fs-program-cache-misses>
>
>   <qm:db-program-cache-hits>0</qm:db-program-cache-hits>
>
>   <qm:db-program-cache-misses>0</qm:db-program-cache-misses>
>
>
>   <qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
>
>   <qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
>
>   <qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
>
>   <qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
>
>   <qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
>
>
>   <qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
>
>   <qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
>
>
>   <qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>
>
>   <qm:fragments>
>
>     <qm:fragment>
>
>       <qm:root xmlns="">base</qm:root>
>
>       <qm:expanded-tree-cache-hits>2996</qm:expanded-tree-cache-hits>
>
>       <qm:expanded-tree-cache-misses>2</qm:expanded-tree-cache-misses>
>
>     </qm:fragment>
>
>   </qm:fragments>
>
>   <qm:documents>
>
>     <qm:document>
>
>       <qm:uri>/documents/new.xml</qm:uri>
>
>       <qm:expanded-tree-cache-hits>2996</qm:expanded-tree-cache-hits>
>
>       <qm:expanded-tree-cache-misses>1</qm:expanded-tree-cache-misses>
>
>     </qm:document>
>
>     <qm:document>
>
>       <qm:uri>/documents/old.xml</qm:uri>
>
>       <qm:expanded-tree-cache-hits>0</qm:expanded-tree-cache-hits>
>
>       <qm:expanded-tree-cache-misses>1</qm:expanded-tree-cache-misses>
>
>     </qm:document>
>
>   </qm:documents>
>
> </qm:query-meters>
>
>
>
>
_______________________________________________
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general