Yes, we had similar experiences under Linux. It took us longer than I'm willing to admit to realize that 5GB or so of an 8GB server wasn't really being utilized at all. Then we switched to a 64-bit RedHat and - as they say - Viola (sic): memory allocation problems solved.
-Mike

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Jakob Fix
> Sent: Wednesday, December 09, 2009 2:31 PM
> To: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Cannot delete directory
> with 1mil docs - XDMP-MEMORY
>
> I can confirm that the frequent memory problems we encountered with a
> 4.1 version of MarkLogic on a 32-bit Windows 2003 Server R2 machine
> have completely disappeared once we moved to a 64-bit Windows Server
> 2008 R2 platform, even though this particular version of Windows isn't
> officially supported (yet).
>
> cheers,
> Jakob.
>
> On Wed, Dec 9, 2009 at 19:59, Lee, David <[email protected]> wrote:
> > Thanks Mike. I understand other people (and other configurations)
> > are having success with large directories. I'm simply reporting
> > that *my system* is not successful.
> > I did reset the memory parameters and it doesn't help much.
> > I suspect that your statement is the main one: a 64-bit machine
> > and OS is needed to accommodate this type of usage.
> >
> > -----Original Message-----
> > From: Michael Blakeley [mailto:[email protected]]
> > Sent: Wednesday, December 09, 2009 1:42 PM
> > To: Lee, David
> > Cc: General Mark Logic Developer Discussion
> > Subject: Re: [MarkLogic Dev General] Cannot delete directory with
> > 1mil docs - XDMP-MEMORY
> >
> > David,
> >
> > Directories with millions of documents aren't necessarily a
> > problem: I create them frequently. Last week I built a 20M-document
> > database, and the largest directory contained 9.2M documents.
> >
> > I see the 32-bit kernel as more of a problem. A 32-bit kernel is
> > limited to a 32-bit address space, and the server process only gets
> > 3 GB of that address space, no matter how much RAM or swap you have.
> > So why not install a 64-bit Linux? Your CPU is probably 64-bit
> > capable, unless it pre-dates AMD Opteron or Intel's EM64T technology.
> >
> > Also, Jason reminded me that you've done some past tuning of your
> > database in-memory limits, to accommodate those giant fragmented
> > documents. Now that you're loading smaller documents, you should
> > reset those to the default values. There's a button for this toward
> > the bottom of the database configuration screen, labeled "get
> > default values". Returning to the default values might help you
> > avoid the XDMP-MEMORY error.
> >
> > Getting back to the query in my last message, it is probably slow
> > because it has to read-lock all the documents in the directory,
> > even when the query is only deleting 1000 of them. You can get
> > around this with some xdmp:eval() trickery (caution - sharp
> > tools!). This version uses an outer read-only query to gather the
> > URIs, and an inner update to delete them. So instead of needing
> > millions of read locks and 1000 write locks, it only needs 1000
> > read locks and 1000 write locks.
> >
> > This is essentially a way to relax the query's ACID guarantees.
> > Normally we guarantee that the documents that are present at the
> > start of a transaction, and aren't affected by the transaction,
> > will still be available at the end of the transaction. Hence the
> > need to read-lock all of them. But by telling the update to run in
> > a different transaction, we can relax this requirement and allow
> > the xdmp:directory() portion to run in lockless (timestamped) mode.
> > The assert on line 1 ensures that the xdmp:directory() part really
> > does run in timestamped mode.
> >
> > let $assert :=
> >   if (xdmp:request-timestamp()) then ()
> >   else error((), 'NOTIMESTAMP', text { 'outer query is not read-only' })
> > let $path := '/'
> > let $map := map:map()
> > let $list-uris :=
> >   for $i in xdmp:directory($path, 'infinity')[1 to 1000]
> >   return map:put($map, xdmp:node-uri($i), true())
> > let $do := xdmp:eval('
> >     declare variable $URIS as map:map external;
> >     xdmp:document-delete(map:keys($URIS))
> >   ',
> >   (xs:QName('URIS'), $map),
> >   <options xmlns="xdmp:eval">
> >     <isolation>different-transaction</isolation>
> >     <prevent-deadlocks>true</prevent-deadlocks>
> >   </options>
> > )
> > return count(map:keys($map))
> > , xdmp:elapsed-time()
> >
> > You could keep running that until it returns 0, and you could
> > tinker with the '1 to 1000' range if you like.
> >
> > -- Mike
> >
> > On 2009-12-09 09:46, Lee, David wrote:
> >> Thanks for the suggestion.
> >> I am running 4.1-3, and I have plenty of swap space.
> >>
> >> I tried the bulk deletes, but they were taking about 1 minute per
> >> 1000 documents to delete ... I gave up after a few hours.
> >>
> >> I've created a new DB and am starting the process of reloading now;
> >> I'm about 2/3 through, then I'll delete the old forest.
> >>
> >> I've come to the conclusion that, at least on my system, which is
> >> admittedly not that powerful (32-bit Linux, 4GB RAM, 2.8GHz), ML
> >> doesn't handle directories with > 1mil entries very well.
> >> I try to add more than that and run into all sorts of memory problems.
> >> I try to *delete* that directory and can't.
> >>
> >> It also doesn't handle individual files with > 1mil fragments that
> >> well, but at least it handles them.
> >> For my experimental case, I'm now trying a hybrid approach, which
> >> is to bulk up 1000 "rows" per file and keep the number of files in
> >> a directory in the 1000s, not millions ...
> >>
> >> -----Original Message-----
> >> From: Michael Blakeley [mailto:[email protected]]
> >> Sent: Wednesday, December 09, 2009 12:33 PM
> >> To: General Mark Logic Developer Discussion
> >> Cc: Lee, David
> >> Subject: Re: [MarkLogic Dev General] Cannot delete directory with
> >> 1mil docs - XDMP-MEMORY
> >>
> >> The XDMP-MEMORY message does mean that the host couldn't allocate
> >> the needed memory. In this case that was probably because the
> >> transaction was too large to fit in memory. If you aren't already
> >> using 4.1-3, I'd upgrade - just in case this is a known problem
> >> that has already been fixed.
> >>
> >> If 4.1-3 doesn't help, then I suppose you could increase the swap
> >> space... but I don't think you'd like the performance. You might
> >> be able to reduce the sizes of the group-level caches, but that
> >> might lead to *CACHEFULL errors.
> >>
> >> So as Geert suggested, clearing the forest is probably the fastest
> >> solution. Or if you don't mind spending more time on it, you could
> >> delete in blocks of 1000 documents:
> >>
> >> for $i in xdmp:directory($path, 'infinity')[1 to 1000]
> >> return xdmp:document-delete(xdmp:node-uri($i))
> >>
> >> You could automate this using xdmp:spawn(). You could also use
> >> cts:uris() with a cts:directory-query(), if you have the URI
> >> lexicon available.
> >>
> >> -- Mike
> >>
> >> On 2009-12-09 05:59, Lee, David wrote:
> >>> My joys of success were premature.
> >>> I ran into memory problems trying to load the full set of
> >>> documents; it died after about 1mil.
> >>> So I tried to delete the directory and now I'm getting
> >>>
> >>> Exception running: :query
> >>> com.marklogic.xcc.exceptions.XQueryException: XDMP-MEMORY:
> >>> xdmp:directory-delete("/RxNorm/rxnsat/") -- Memory exhausted
> >>> in /eval, on line 1
> >>>
> >>> Arg !!!!
> >>>
> >>> I've tried to change various memory settings to no avail.
> >>> Any clue how to delete this directory?
> >>> Or should I start to delete the files piecemeal?
> >>>
> >>> Suggestions welcome.
> >>>
> >>> -David
> >>>
> >>> ----------------------------------------
> >>> David A. Lee
> >>> Senior Principal Software Engineer
> >>> Epocrates, Inc.
> >>> [email protected]
> >>> 812-482-5224

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
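For anyone wanting to combine Mike's two suggestions - driving the batched delete from cts:uris() rather than xdmp:directory(), while keeping the lockless outer query and the xdmp:eval() inner update - here is an untested sketch. It assumes the URI lexicon is enabled on the database; the directory path and the 1000-document batch size are placeholders to adjust for your own data.

```xquery
(: Untested sketch: lexicon-driven batched delete.
   Assumes the URI lexicon is enabled; path and batch size are placeholders. :)
let $assert :=
  (: verify the outer query runs lockless, in timestamped mode :)
  if (xdmp:request-timestamp()) then ()
  else error((), 'NOTIMESTAMP', text { 'outer query is not read-only' })
let $map := map:map()
let $gather :=
  (: cts:uris() reads the lexicon instead of locking documents :)
  for $uri in cts:uris((), 'limit=1000',
                       cts:directory-query('/RxNorm/rxnsat/', 'infinity'))
  return map:put($map, $uri, true())
let $do := xdmp:eval('
    declare variable $URIS as map:map external;
    for $uri in map:keys($URIS)
    return xdmp:document-delete($uri)
  ',
  (xs:QName('URIS'), $map),
  <options xmlns="xdmp:eval">
    (: run the update in its own transaction so the outer
       query stays read-only :)
    <isolation>different-transaction</isolation>
    <prevent-deadlocks>true</prevent-deadlocks>
  </options>
)
return count(map:keys($map))
```

As with Mike's version, you would rerun this until it returns 0, tuning the 'limit=1000' option to trade transaction size against the number of passes.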
