Re: [MarkLogic Dev General] Database Status Troubles

Damon Feldman Wed, 27 Jun 2012 06:05:55 -0700

Mike and Alex,

MarkLogic can reindex and serve queries at the same time, but the intensified 
activity may trigger an underlying system issue. I agree with a previous message 
suggesting the OS is overloaded, swapping or otherwise in serious trouble. The 
system should re-index without affecting queries very much, particularly if the 
reindex throttle is turned down.


Can you check the OS and/or VM status for CPU, swap and I/O activity?

Yours,
Damon

-----Original Message-----
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Mike Sokolov
Sent: Wednesday, June 27, 2012 7:52 AM
To: MarkLogic Developer Discussion
Cc: Danny Sokolsky
Subject: Re: [MarkLogic Dev General] Database Status Troubles

I would agree that you are bumping up against hardware limits.

I get this regularly on a largish database (> 10M docs, ~ 680 GB on one 
server).  Whenever we re-index a significant amount of content the 
server becomes unusable for days - all it can handle is the reindexing 
and merging.  Even if we throttle reindexing down to 1 there will be 
periods of downtime (and of course the whole process takes longer).  The 
admin console responds fine except for the status screens, which 
(mostly) time out, as Alex reports.  If you keep trying, you can usually 
eventually get a response, although this can take 1/2 hour of retrying 
and waiting, which is painful.

The strategy we are pursuing is to maintain two servers; we reindex them 
in sequence, switching production from one to the other and back again.  
Another thing we are trying is increasing RAM (because we can, and the 
system doesn't *seem* to be CPU bound), but I suspect the real issue is 
disk I/O.

I'd like to add more disk, or faster ones (or ideally more servers, but 
that is challenging for a variety of reasons).  I'm also not sure how to 
prove that more I/O bandwidth would help dramatically.  I've looked at 
vmstat: we're not swapping, but I'm also not sure how to read the stats 
clearly enough to understand in detail what's going on with the disks.
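
One way to read vmstat output for this question is sketched below. It is a minimal example, not a MarkLogic tool: it assumes the usual Linux procps column layout (r, b, swpd, ..., si, so, bi, bo, ..., us, sy, id, wa, st), the function names parse_vmstat and summarize are hypothetical, and the SAMPLE numbers are invented for illustration rather than taken from any server in this thread. The idea is that nonzero si/so means swapping, while high wa (I/O wait) together with many blocked processes (b) and modest us+sy points toward the disks rather than the CPU.

```python
def parse_vmstat(text):
    """Parse `vmstat <interval> <count>` output into dicts keyed by column name."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column-name line starts with "r" and includes "wa";
        # vmstat repeats it periodically, so accept it more than once.
        if fields[0] == "r" and "wa" in fields:
            header = fields
            continue
        # Data lines are all integers and match the header width.
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def summarize(rows, skip_first=True):
    """Summarize swap and I/O pressure across vmstat samples."""
    # The first vmstat line reports averages since boot, so skip it by default.
    samples = rows[1:] if skip_first and len(rows) > 1 else rows
    if not samples:
        raise ValueError("no vmstat samples found")

    def avg(key):
        return sum(r[key] for r in samples) / len(samples)

    return {
        "swapping": any(r["si"] > 0 or r["so"] > 0 for r in samples),
        "avg_iowait_pct": avg("wa"),            # CPU time spent waiting on I/O
        "avg_blocked_procs": avg("b"),          # processes in uninterruptible sleep
        "avg_user_sys_pct": avg("us") + avg("sy"),
    }


# Invented sample data, for illustration only.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  10240 901200    0    0   120   340  500  900 10  3 80  7  0
 2  6      0 401200  10240 950000    0    0  8200 41000 2100 3300  8  4 38 50  0
 1  8      0 398800  10240 951000    0    0  7900 39000 2000 3100  6  4 30 60  0
"""

if __name__ == "__main__":
    print(summarize(parse_vmstat(SAMPLE)))
```

To use it on a real host, capture something like `vmstat 5 12` into a file during reindexing and pass the file contents to parse_vmstat. vmstat only gives a system-wide view; per-device detail would come from a tool such as `iostat -x`, which this sketch does not parse.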

Also wondering if you have recommendations as to how to configure 
disks.  Do you use RAID?  What kind?  A centralized SAN or other storage 
device?  I plan to ask support for help, too, but any general 
suggestions from folks on the list here would be appreciated.

-Mike

On 6/26/2012 6:16 PM, Danny Sokolsky wrote:
> Another thing you might try is turning the reindexer throttle down.  Go to 
> the database config page and try setting reindexer throttle to 1 (I think the 
> default is 5).  Then give your system a while to catch up, and see if you can 
> then get to the forest status page.
>
>
> -Danny
>
> -----Original Message-----
> From: general-boun...@developer.marklogic.com 
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Michael Blakeley
> Sent: Tuesday, June 26, 2012 2:37 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Database Status Troubles
>
> On 26 Jun 2012, at 14:12 , Alex Milowski wrote:
>
>> On Tue, Jun 26, 2012 at 2:01 PM, Michael Blakeley <m...@blakeley.com> wrote:
>>> On 26 Jun 2012, at 13:52 , Alex Milowski wrote:
>>>
>>>> The forest status page doesn't return.  The browser just hangs out
>>>> until something times out.
>>> That is decidedly abnormal. I have seen a few systems where database-status 
>>> was slow and would time out during reindexing, but forest-status should 
>>> not. Admittedly I don't do much with geospatial, so this might be a 
>>> geospatial-specific bug.
>> Here's what I don't get.  I dropped a large index.  Shouldn't that be
>> an easy operation?
>>
>> Or is it the index I added that is causing the problem?  It should
>> only hit a very small number of documents/elements.  Is it still
>> required to touch everything?
> MarkLogic never updates in place (except for timestamps). So it doesn't drop 
> existing indexes: it only reindexes. When the indexes for an old fragment are 
> out of date with respect to the current database configuration, the fragment 
> is deleted and its content reinserted as a new fragment. That is happening to 
> all the fragments affected by your latest configuration changes.
>
>>> But my hunch is that the OS is overloaded, so I would start by checking the 
>>> OS status: top and iostat, on Linux or Solaris. The OS may be out of RAM 
>>> and swapping, or possibly reindexing puts too much demand on the available 
>>> disk I/O capacity. Either way, I suspect that your database has outgrown 
>>> the hardware.
>> Yes, that is certainly true.  I don't have enough memory.  Funny bit
>> is that the application works reasonably well for my research
>> purposes.  For production, it would need far more memory to support a
>> load.
> Reindexing is essentially a large series of batched updates. Updates are much 
> more expensive than reads, and unlike reads they cannot be cached.
>
> You can do some monitoring of the reindexer status with 'tail -F' on the 
> error log, while setting the file-log-level to "Debug". However if the OS is 
> paging, there won't be much progress to monitor.
>
> If you like, you could handle a situation like this by adding the new index 
> and allowing reindexing, then turning off reindexing and deleting the 
> obsolete index. 
> The obsolete index won't disappear from the old stands, but new ones won't 
> have it. I'm not sure if that's useful in your situation, but it is an 
> option. It breaks the problem into a small one that you solve, and a larger 
> one that you ignore.
>
> Or you might reduce the sizes of your in-memory limits and host-level caches 
> to a bare minimum, to reduce demands on the OS. Reindexing performance would 
> suffer, but you might be able to get your work done. This isn't easy to get 
> right, though. Usually the best answer is to upgrade the hardware.
>
> -- Mike
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://community.marklogic.com/mailman/listinfo/general


