I can confirm that the frequent memory problems we encountered with a
4.1 version of MarkLogic on a 32-bit Windows 2003 Server R2 machine
completely disappeared once we moved to a 64-bit Windows 2008 Server R2
platform, even though that particular version of Windows isn't
officially supported (yet).

cheers,
Jakob.



On Wed, Dec 9, 2009 at 19:59, Lee, David <[email protected]> wrote:
> Thanks Mike.  I understand other people (and other configurations) are having 
> success with large directories.  I'm simply reporting that *my system* is not 
> successful.
> I did reset the memory parameters and it doesn't help much.
> I suspect that your statement is the main one: that a 64-bit machine and OS are 
> needed to accommodate this type of usage.
>
> -----Original Message-----
> From: Michael Blakeley [mailto:[email protected]]
> Sent: Wednesday, December 09, 2009 1:42 PM
> To: Lee, David
> Cc: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Cannot delete directory with 1mil docs - 
> XDMP-MEMORY
>
> David,
>
> Directories with millions of documents aren't necessarily a problem: I
> create them frequently. Last week I built a 20M-document database, and
> the largest directory contained 9.2M documents.
>
> I see the 32-bit kernel as more of a problem. A 32-bit kernel is limited
> to a 32-bit address space, and the server process only gets 3 GB of that
> address space, no matter how much RAM or swap you have. So why not
> install a 64-bit Linux? Your CPU is probably 64-bit capable, unless it
> pre-dates AMD Opteron or Intel's EM64T technology.
>
> Also, Jason reminded me that you've done some past tuning of your
> database in-memory limits, to accommodate those giant fragmented
> documents. Now that you're loading smaller documents, you should reset
> those to the default values. There's a button for this, toward the
> bottom of the database config screen: it's labeled "get default values".
> Returning to the default values might help you avoid the XDMP-MEMORY error.
>
> Getting back to the query in my last message, it is probably slow
> because it has to read-lock all the documents in the directory, even
> when the query is only deleting 1000 of them. You can get around this
> with some xdmp:eval() trickery (caution - sharp tools!). This version
> uses an outer read-only query to gather the uris, and an inner update to
> delete them. So instead of needing millions of read locks and 1000 write
> locks, it only needs 1000 read locks and 1000 write locks.
>
> This is essentially a way to relax the query's ACID guarantees. Normally
> we guarantee that the documents that are present at the start of a
> transaction, and aren't affected by the transaction, will still be
> available at the end of the transaction. Hence the need to read-lock all
> of them. But by telling the update to run in a different-transaction, we
> can relax this requirement and allow the xdmp:directory() portion to run
> in lockless (timestamped) mode. The assert on line 1 ensures that the
> xdmp:directory() part really does run in timestamped mode.
>
> let $assert :=
>   if (xdmp:request-timestamp()) then ()
>   else error((), 'NOTIMESTAMP', text { 'outer query is not read-only' })
> let $path := '/'
> let $map := map:map()
> let $list-uris :=
>   for $i in xdmp:directory($path, 'infinity')[1 to 1000]
>   return map:put($map, xdmp:node-uri($i), true())
> let $do := xdmp:eval('
>   declare variable $URIS as map:map external;
>   for $uri in map:keys($URIS)
>   return xdmp:document-delete($uri)
> ',
>   (xs:QName('URIS'), $map),
>   <options xmlns="xdmp:eval">
>     <isolation>different-transaction</isolation>
>     <prevent-deadlocks>true</prevent-deadlocks>
>   </options>
> )
> return (count(map:keys($map)), xdmp:elapsed-time())
>
> You could keep running that until it returns 0, and you could tinker
> with the '1 to 1000' range if you like.
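>
> A minimal sketch of automating those re-runs, under the assumption that
> the batch logic is saved as a module on the app server (the module path
> /delete-batch.xqy is hypothetical) with the uri lexicon not required:
> the module gathers one batch in timestamped mode, deletes it in a
> separate transaction, then respawns itself until nothing is left.

```xquery
(: delete-batch.xqy - hypothetical self-respawning driver module.
   Gathers up to $BATCH uris read-only, deletes them in a separate
   transaction via xdmp:eval, then spawns itself again if any were
   found. Stops when the directory is empty. :)
xquery version "1.0-ml";

declare variable $PATH := '/';
declare variable $BATCH := 1000;

let $map := map:map()
let $put :=
  for $i in xdmp:directory($PATH, 'infinity')[1 to $BATCH]
  return map:put($map, xdmp:node-uri($i), true())
return
  if (empty(map:keys($map))) then ()
  else (
    xdmp:eval(
      'declare variable $URIS as map:map external;
       for $uri in map:keys($URIS)
       return xdmp:document-delete($uri)',
      (xs:QName('URIS'), $map),
      <options xmlns="xdmp:eval">
        <isolation>different-transaction</isolation>
        <prevent-deadlocks>true</prevent-deadlocks>
      </options>),
    (: queue the next batch :)
    xdmp:spawn('/delete-batch.xqy')
  )
```

> (Same sharp-tools caveat as above: the task queue will keep draining
> the directory in the background, so try it on a copy first.)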
>
> -- Mike
>
> On 2009-12-09 09:46, Lee, David wrote:
>> Thanks for the suggestion.
>> I am running 4.1-3 and have plenty of swap space.
>>
>> I tried the bulk deletes but they were taking about 1 minute per 1000 
>> documents to delete ...
>> I gave up after a few hours.
>>
>> I've created a new DB and am starting the process of reloading now; I'm about 
>> 2/3 of the way through, then I'll delete the old forest.
>>
>> I've come to the conclusion that, at least on my system, which is admittedly 
>> not that powerful (32-bit Linux, 4 GB RAM, 2.8 GHz), ML doesn't handle 
>> directories with > 1 mil entries very well.
>> If I try to add more than that, I run into all sorts of memory problems.
>> If I try to *delete* that directory, I can't.
>>
>> It also doesn't handle individual files with > 1 mil fragments that well, but 
>> at least it handles them.
>> For my experimental case, I'm now trying a hybrid approach: bulking up 1000 
>> "rows" per file and keeping the number of files in a directory in the 
>> thousands, not millions ...
>>
>>
>>
>> -----Original Message-----
>> From: Michael Blakeley [mailto:[email protected]]
>> Sent: Wednesday, December 09, 2009 12:33 PM
>> To: General Mark Logic Developer Discussion
>> Cc: Lee, David
>> Subject: Re: [MarkLogic Dev General] Cannot delete directory with 1mil docs 
>> - XDMP-MEMORY
>>
>> The XDMP-MEMORY message does mean that the host couldn't allocate the
>> needed memory. In this case that was probably because the transaction
>> was too large to fit in memory. If you aren't already using 4.1-3, I'd
>> upgrade - just in case this is a known problem that has already been fixed.
>>
>> If 4.1-3 doesn't help, then I suppose you could increase the swap
>> space... but I don't think you'd like the performance. You might be able
>> to reduce the sizes of the group-level caches, but that might lead to
>> *CACHEFULL errors.
>>
>> So as Geert suggested, clearing the forest is probably the fastest
>> solution. Or if you don't mind spending more time on it, you could
>> delete in blocks of 1000 documents.
>>
>>     for $i in xdmp:directory($path, 'infinity')[1 to 1000]
>>     return xdmp:document-delete(xdmp:node-uri($i))
>>
>> You could automate this using xdmp:spawn(). You could also use
>> cts:uris() with a cts:directory-query(), if you have the uri lexicon
>> available.
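>>
>> For example, with the uri lexicon enabled, the batch can be gathered
>> without touching any documents at all. A sketch (the directory path is
>> the one from David's error message; adjust to taste):

```xquery
(: Pull up to 1000 uris straight from the uri lexicon - no document
   reads needed - then delete them one at a time. Requires the uri
   lexicon to be enabled on the database. :)
for $uri in cts:uris(
  (), (),
  cts:directory-query('/RxNorm/rxnsat/', 'infinity'))[1 to 1000]
return xdmp:document-delete($uri)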
>>
>> -- Mike
>>
>> On 2009-12-09 05:59, Lee, David wrote:
>>> My joys of success were premature.
>>> I ran into memory problems trying to load the full set of documents, it 
>>> died after about 1mil.
>>> So I tried to delete the directory and now I’m getting
>>>
>>> Exception running: :query
>>> com.marklogic.xcc.exceptions.XQueryException: XDMP-MEMORY: 
>>> xdmp:directory-delete
>>> ("/RxNorm/rxnsat/") -- Memory exhausted
>>> in /eval, on line 1
>>>
>>> Arg !!!!
>>>
>>> I’ve tried changing various memory settings, to no avail. Any clue how to 
>>> delete this directory?
>>> Or should I start to delete the files piecemeal?
>>>
>>> Suggestions welcome.
>>>
>>> -David
>>>
>>>
>>> ----------------------------------------
>>> David A. Lee
>>> Senior Principal Software Engineer
>>> Epocrates, Inc.
>>> [email protected]<mailto:[email protected]>
>>> 812-482-5224
>>>
>>>
>>>
>>
>>
>
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
