I can confirm that the frequent memory problems we encountered with version 4.1 of MarkLogic on a 32-bit Windows Server 2003 R2 machine have completely disappeared since we moved to a 64-bit Windows Server 2008 R2 platform, even though this particular version of Windows isn't officially supported (yet).
cheers,
Jakob.

On Wed, Dec 9, 2009 at 19:59, Lee, David <[email protected]> wrote:
> Thanks Mike. I understand other people (and other configurations) are having
> success with large directories. I'm simply reporting that *my system* is not
> successful.
> I did reset the memory parameters and it doesn't help much.
> I suspect that your statement is the main one: that a 64-bit machine and OS
> are needed to accommodate this type of usage.
>
> -----Original Message-----
> From: Michael Blakeley [mailto:[email protected]]
> Sent: Wednesday, December 09, 2009 1:42 PM
> To: Lee, David
> Cc: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Cannot delete directory with 1mil docs - XDMP-MEMORY
>
> David,
>
> Directories with millions of documents aren't necessarily a problem: I
> create them frequently. Last week I built a 20M-document database, and
> the largest directory contained 9.2M documents.
>
> I see the 32-bit kernel as more of a problem. A 32-bit kernel is limited
> to a 32-bit address space, and the server process only gets 3 GB of that
> address space, no matter how much RAM or swap you have. So why not
> install a 64-bit Linux? Your CPU is probably 64-bit capable, unless it
> pre-dates AMD's Opteron or Intel's EM64T technology.
>
> Also, Jason reminded me that you've done some past tuning of your
> database in-memory limits, to accommodate those giant fragmented
> documents. Now that you're loading smaller documents, you should reset
> those to the default values. There's a button for this toward the
> bottom of the database config screen; it's labeled "get default values".
> Returning to the default values might help you avoid the XDMP-MEMORY error.
>
> Getting back to the query in my last message, it is probably slow
> because it has to read-lock all the documents in the directory, even
> when the query is only deleting 1000 of them. You can get around this
> with some xdmp:eval() trickery (caution - sharp tools!).
> This version
> uses an outer read-only query to gather the URIs, and an inner update to
> delete them. So instead of needing millions of read locks and 1000 write
> locks, it only needs 1000 read locks and 1000 write locks.
>
> This is essentially a way to relax the query's ACID guarantees. Normally
> we guarantee that the documents that are present at the start of a
> transaction, and aren't affected by the transaction, will still be
> available at the end of the transaction. Hence the need to read-lock all
> of them. But by telling the update to run in a different transaction, we
> can relax this requirement and allow the xdmp:directory() portion to run
> in lockless (timestamped) mode. The assert on line 1 ensures that the
> xdmp:directory() part really does run in timestamped mode.
>
> let $assert :=
>   if (xdmp:request-timestamp()) then ()
>   else error((), 'NOTIMESTAMP', text { 'outer query is not read-only' })
> let $path := '/'
> let $map := map:map()
> let $list-uris :=
>   for $i in xdmp:directory($path, 'infinity')[1 to 1000]
>   return map:put($map, xdmp:node-uri($i), true())
> let $do := xdmp:eval('
>   declare variable $URIS as map:map external;
>   for $uri in map:keys($URIS)
>   return xdmp:document-delete($uri)
>   ',
>   (xs:QName('URIS'), $map),
>   <options xmlns="xdmp:eval">
>     <isolation>different-transaction</isolation>
>     <prevent-deadlocks>true</prevent-deadlocks>
>   </options>
> )
> return count(map:keys($map))
> , xdmp:elapsed-time()
>
> You could keep running that until it returns 0, and you could tinker
> with the '1 to 1000' range if you like.
>
> -- Mike
>
> On 2009-12-09 09:46, Lee, David wrote:
>> Thanks for the suggestion.
>> I am running 4.1-3, and I have plenty of swap space.
>>
>> I tried the bulk deletes, but they were taking about 1 minute per 1000
>> documents to delete ...
>> I gave up after a few hours.
>>
>> I've created a new DB and am starting the process of reloading now; about
>> 2/3 through, then I'll delete the old forest.
>>
>> I've come to the conclusion that, at least on my system, which is admittedly
>> not that powerful (32-bit Linux, 4 GB RAM, 2.8 GHz), ML doesn't handle
>> directories with > 1mil entries very well.
>> I try to add more than that and run into all sorts of memory problems.
>> I try to *delete* that directory and can't.
>>
>> It also doesn't handle individual files with > 1mil fragments that well, but
>> at least it handles them.
>> For my experimental case, I'm now trying a hybrid approach, which is to bulk
>> up 1000 "rows" per file and keep the # of files in a directory in the
>> 1000s, not millions ...
>>
>> -----Original Message-----
>> From: Michael Blakeley [mailto:[email protected]]
>> Sent: Wednesday, December 09, 2009 12:33 PM
>> To: General Mark Logic Developer Discussion
>> Cc: Lee, David
>> Subject: Re: [MarkLogic Dev General] Cannot delete directory with 1mil docs - XDMP-MEMORY
>>
>> The XDMP-MEMORY message does mean that the host couldn't allocate the
>> needed memory. In this case that was probably because the transaction
>> was too large to fit in memory. If you aren't already using 4.1-3, I'd
>> upgrade - just in case this is a known problem that has already been fixed.
>>
>> If 4.1-3 doesn't help, then I suppose you could increase the swap
>> space... but I don't think you'd like the performance. You might be able
>> to reduce the sizes of the group-level caches, but that might lead to
>> *CACHEFULL errors.
>>
>> So as Geert suggested, clearing the forest is probably the fastest
>> solution. Or if you don't mind spending more time on it, you could
>> delete in blocks of 1000 documents:
>>
>>   for $i in xdmp:directory($path, 'infinity')[1 to 1000]
>>   return xdmp:document-delete(xdmp:node-uri($i))
>>
>> You could automate this using xdmp:spawn(). You could also use
>> cts:uris() with a cts:directory-query(), if you have the uri lexicon
>> available.
>>
>> -- Mike
>>
>> On 2009-12-09 05:59, Lee, David wrote:
>>> My joys of success were premature.
>>> I ran into memory problems trying to load the full set of documents; it
>>> died after about 1mil.
>>> So I tried to delete the directory, and now I'm getting
>>>
>>>   Exception running: :query
>>>   com.marklogic.xcc.exceptions.XQueryException: XDMP-MEMORY:
>>>   xdmp:directory-delete("/RxNorm/rxnsat/") -- Memory exhausted
>>>   in /eval, on line 1
>>>
>>> Argh!!!
>>>
>>> I've tried to change various memory settings to no avail. Any clue how to
>>> delete this directory?
>>> Or should I start to delete the files piecemeal?
>>>
>>> Suggestions welcome.
>>>
>>> -David
>>>
>>> ----------------------------------------
>>> David A. Lee
>>> Senior Principal Software Engineer
>>> Epocrates, Inc.
>>> [email protected]
>>> 812-482-5224
>
> _______________________________________________
> General mailing list
> [email protected]
> http://xqzone.com/mailman/listinfo/general
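Editor's note: Mike mentions automating the batched delete with xdmp:spawn() and cts:uris() but doesn't show it. A minimal sketch of that combination follows, assuming the uri lexicon is enabled on the database; the module path '/delete-batch.xqy' and the hard-coded directory are hypothetical, and the code is untested against MarkLogic 4.1.

```xquery
xquery version "1.0-ml";

(: /delete-batch.xqy - hypothetical spawnable task module.
   Deletes one batch of up to 1000 documents from a directory,
   then respawns itself until the directory is empty. Each spawn
   runs as its own small transaction, so no single transaction
   has to lock (or fit) millions of documents. :)

declare variable $PATH := '/RxNorm/rxnsat/';

(: cts:uris() reads from the uri lexicon, so we learn the URIs
   without loading any documents; the subsequence limits the batch. :)
let $uris :=
  cts:uris((), (), cts:directory-query($PATH, 'infinity'))[1 to 1000]
return
  if (empty($uris)) then ()
  else (
    for $uri in $uris
    return xdmp:document-delete($uri),
    (: queue the next batch; the spawned task starts after this
       transaction commits :)
    xdmp:spawn('/delete-batch.xqy')
  )
```

Kicking it off is a single `xdmp:spawn('/delete-batch.xqy')` from Query Console; progress can be watched on the task server status page.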
