One tip:

Any time you can express $node//child as $node/exact/path/to/child you'll get 
better performance, because it saves MarkLogic from having to scan the full 
tree looking for the child.
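
A minimal sketch of the difference, using a made-up in-memory document. The element names firms and entry are assumptions for illustration only, not the real layout of the shortlist file:

```xquery
xquery version "1.0-ml";

(: Hypothetical stand-in for the shortlist document; the element names
   firms and entry are assumptions, not taken from the real file. :)
let $xml_doc := document {
  <firms>
    <entry>
      <firmname>Smith and Jones</firmname>
      <translation>Smith &amp; Jones LLP</translation>
    </entry>
    <entry>
      <firmname>Acme Legal</firmname>
      <translation>Acme Legal Group</translation>
    </entry>
  </firms>
}
return (
  (: descendant axis: every node under $xml_doc is a candidate :)
  fn:count($xml_doc//firmname),
  (: explicit path: only the firms/entry children are visited :)
  fn:count($xml_doc/firms/entry/firmname)
)
```

Both expressions select the same two firmname elements here; only the amount of tree the server has to consider differs.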

Then there are smaller things to try. If you're going to compare a node
against another node's value repeatedly, you can bind data($val) to a
variable once and compare against that instead, so the node is atomized just
once. Internal optimizations like this change between server versions, so I
tend to experiment.
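
Roughly what I mean, as a sketch with inline sample data. The variable $theOrigFirmnameData and the sample elements are made up for the example:

```xquery
xquery version "1.0-ml";

(: Sample data, invented for this example. :)
let $candidates := (
  <firmname>Acme Legal</firmname>,
  <firmname>Smith and Jones</firmname>,
  <firmname>Acme Legal</firmname>
)
let $theOrigFirmname := <originalFirmname>Acme Legal</originalFirmname>

(: Atomize the node once, outside the loop, and reuse the value. :)
let $theOrigFirmnameData := fn:data($theOrigFirmname)

for $c in $candidates
where $c = $theOrigFirmnameData
return fn:string($c)
```

This returns "Acme Legal" twice. Whether hoisting the data() call actually helps depends on the server version, so it is worth measuring.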

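And why ask for /text() if what you want is /string()? text() returns text
nodes that still have to be atomized before they can be compared, while
string() gives you the string value directly.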
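
A small illustration of how the two differ, assuming a translation element with mixed content. The b child is invented for the example:

```xquery
xquery version "1.0-ml";

(: Hypothetical mixed-content element; the b child is made up. :)
let $translation := <translation>Smith <b>&amp;</b> Jones LLP</translation>
return (
  (: text() returns the text node children: two separate nodes here :)
  fn:count($translation/text()),
  (: string() returns one xs:string covering all descendant text :)
  $translation/string()
)
```

This returns 2 followed by "Smith & Jones LLP". With simple content the two give the same comparison result, but a value comparison such as ne raises an error when text() hands back more than one node.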

The following line of code is presumably called a large number of times (once
per counsel or party element), so the ideas above could help.

> $xml_doc//firmname[.=$theOrigFirmname]/../translation/text()


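Maybe something like this, where $theOrigFirmnameData is
data($theOrigFirmname) and exact/path/parent stands for the real path to the
element that holds both firmname and translation:

$xml_doc/exact/path/parent[firmname = $theOrigFirmnameData]/translation/string()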

Also, have you tried using the profiler?
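
If you haven't, one way in is prof:eval, which evaluates a query string with profiling turned on. A minimal sketch follows; the query inside the string is a made-up placeholder, and I'm assuming profiling is allowed on the app server you run it against:

```xquery
xquery version "1.0-ml";

(: prof:eval returns a prof:report element together with the results of
   the evaluated query. The query text below is only a placeholder. :)
prof:eval('
  let $doc := document {
    <firms><entry><firmname>Acme Legal</firmname></entry></firms>
  }
  return fn:count($doc//firmname)
')
```

The report lists the expressions that were evaluated along with hit counts and timings, which should show whether the firmname lookup is where the time goes. The Profile tab in Query Console gives a similar view without any code changes.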

-jh-

> On May 24, 2016, at 2:53 AM, Kari Cowan <kco...@alm.com> wrote:
> 
> The file is used in a different application that I don’t have control over, 
> so I am just adjusting the data that’s in the file – to fix the firmname 
> (correcting some typos and inconsistencies they had and continue to have – 
> can’t really prevent that because the service pulls the data from various 
> public court records and every law clerk seems to have their own way of 
> entering the data).
>  
> When my script is doing: for $firms in $pacer_doc//(counsel|party) …
> Is there a better way than looping over the doc nodes in a for loop – maybe 
> some other function I am not aware of, or another FLWOR construct?
>  
>  
>  
> From: general-boun...@developer.marklogic.com On Behalf Of Geert Josten
> Sent: Monday, May 23, 2016 11:44 AM
> To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> Subject: Re: [MarkLogic Dev General] How to handle very large xml file to 
> prevent com.marklogic.xcc.exceptions.XQueryException: Time limit exceeded
>  
> Hi Kari,
>  
> 13 MB isn’t really that big, but it is big enough to perform less than 
> optimally and cause timeouts. You could just increase the timeout, but it is 
> probably a better idea to revise your strategy and consider breaking your 
> large file into record-like files (each containing just one firm, for 
> instance). You can then make much more use of the search capabilities of 
> MarkLogic.
>  
> Cheers,
> Geert
>  
> From: general-boun...@developer.marklogic.com on behalf of Kari Cowan 
> <kco...@alm.com>
> Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> Date: Monday, May 23, 2016 at 8:40 PM
> To: general@developer.marklogic.com
> Subject: [MarkLogic Dev General] How to handle very large xml file to prevent 
> com.marklogic.xcc.exceptions.XQueryException: Time limit exceeded
>  
> There must be a better way to do this.  My script works fine when it’s 
> loading a document that is not very large, but occasionally one of the docs 
> is massive (13 MB in one of my error cases), and when that happens, my 
> application gets an error like:
> com.marklogic.xcc.exceptions.XQueryException: Time limit exceeded
>  
> The script is basically getting a uri, reading it back and comparing the 
> ‘firmname’ nodes (there can be many in the same document), and if it differs 
> in the shortlist.xml, we change it to what that file says it should be.
>  
> The problem with my large file – there are over 72,000 law firms it’s 
> trying to compare.
>  
> This is my script – anyone have a suggestion of a better way to accomplish 
> what I am attempting?
>  
>  
>  
> xquery version "1.0-ml";
> declare namespace html = "http://www.w3.org/1999/xhtml";
>  
> declare variable $uri as xs:string external;
> let $uri := try { ($uri) } catch ($e) { "" }
> (: let $uri:="/olympus/pacer-xml/9739715_3:15-cv-01221" :)
>  
> let $xml_doc:=fn:doc("/olympus/data-utils/standard_firmnames_shortlist.xml")
>  
> for $this_uri in "$uri"
> let $doc := fn:doc($uri)
> let $pacer_doc:=$doc
>  
> for $firms in $pacer_doc//(counsel|party)
>   let $theOrigFirmname:= $firms/originalFirmname         
>   let $theFirmname:= $firms/firmname
>   let $translation:= $xml_doc//firmname[.=$theOrigFirmname]/../translation/text()
>  
>  
> for $firm in $pacer_doc
> return if( fn:exists($translation) and fn:exists($theFirmname) and ($translation ne $theFirmname ) ) then
> (
>   fn:concat("CHANGING FIRMNAME: ",$theFirmname, " TO STANDARD FIRMNAME TRANSLATION: ",$translation, " IN URI: " ,$uri),
>   xdmp:log(fn:concat("Olympotomus Changed Firmname: ",$theFirmname, " in URI: " ,$uri)),
>   xdmp:node-replace($theFirmname,<firmname>{$translation}</firmname>) 
>  )
> else (
>   fn:concat("...Evaluated and did not change Firmname: ",$theFirmname, " in URI: " ,$uri),
>   xdmp:log(fn:concat("Olympotomus Evaluated and did not change a Firmname: ",$theFirmname, " in URI: " ,$uri))
>   )
> ALM, an information and intelligence company, provides customers with 
> critical news, data, analysis, marketing solutions and events to successfully 
> manage the business of business. 
>  
> Customers use ALM solutions to discover new ideas and approaches for solving 
> business challenges, connect to the right professionals and peers to move 
> business forward, and compete to win through access to data, analytics and 
> insight. ALM serves a community of over six million business professionals 
> seeking to discover, connect and compete in highly complex industries. Learn 
> more at www.alm.com.
> 
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> Manage your subscription at: 
> http://developer.marklogic.com/mailman/listinfo/general