You can collect good data manually with something like 'iostat -mxz 15' (remove 
the 'z' or the 'm' if not supported, but the 'x' is important).
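
As a rough sketch of how that extended output might be scanned for saturated disks: the function below assumes the device table has a header row starting with "Device" and a column named "%util". It finds the column by name because its position differs between sysstat versions. The name busy_devices and the 90% default threshold are illustrative only and are not part of sysstat.

```shell
# busy_devices: read 'iostat -x' output on stdin and print each device whose
# %util is at or above a threshold (first argument, default 90).
# Assumption: the device table header starts with "Device" and has a
# column literally named "%util".
busy_devices() {
    awk -v limit="${1:-90}" '
        /^Device/ {
            # Header row: remember which field holds %util.
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # A blank line ends the current device table.
        NF == 0 { col = 0; next }
        # Data row inside a device table: compare %util numerically.
        col && NF >= col && $col + 0 >= limit { print $1, $col }
    '
}

# Typical use (needs sysstat installed):
#   iostat -mxz 15 | busy_devices 90
```

A device that sits near 100% here for long stretches during reindexing would be a hint, though not proof, that disk I/O is the bottleneck.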

Or you can configure your sysstat cron job to collect disk statistics 
automatically, for sar. In /etc/cron.d/sysstat you want something like this:

> */10 * * * * root /usr/lib64/sa/sa1 -S DISK 60 10 &
>  
> 53 23 * * * root /usr/lib64/sa/sa2 -A

The "-S DISK" is important, and isn't there by default.

Also make sure sysstat is installed, of course. It includes iostat too.
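
Once sa1 has been collecting for a while, the disk numbers can be read back with 'sar -d'. A minimal sketch, assuming the RHEL-style data directory /var/log/sa with one file per day named saDD (Debian-based systems typically use /var/log/sysstat, and some setups use other file names). The helper name sa_file and the SA_DIR variable are hypothetical, introduced only for this example.

```shell
# sa_file: print the path of the daily sysstat data file for a day of the
# month (1-31, default: today). SA_DIR is an assumed location; adjust it
# to wherever your distribution keeps the sa files.
sa_file() {
    day=${1:-$(date +%d)}
    # Strip one leading zero so printf does not treat "08" as octal.
    day=${day#0}
    printf '%s/sa%02d\n' "${SA_DIR:-/var/log/sa}" "$day"
}

# Typical use (needs sysstat installed and data already collected):
#   sar -d -p -f "$(sa_file 27)"                          # whole day, per device
#   sar -d -p -s 09:00:00 -e 12:00:00 -f "$(sa_file 27)"  # one time window
```

Without "-S DISK" in the sa1 line, the daily file would probably contain no per-device records for 'sar -d' to report.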

-- Mike

On 27 Jun 2012, at 07:11 , Mike Sokolov wrote:

> We will need to trigger a reindex in the next couple of days, so I will 
> capture some stats then and post back - thanks.
> 
> -Mike
> 
> On 06/27/2012 09:05 AM, Damon Feldman wrote:
>> Mike and Alex,
>> 
>> MarkLogic can do both at the same time, but an underlying system issue may 
>> be triggered by the intensified activity. I agree with a previous message 
>> suggesting the OS is overloaded, swapping or otherwise in serious trouble. 
>> The system should re-index without affecting queries very much, particularly 
>> if the reindex throttle is turned down.
>> 
>> Can you check the OS and/or VM status for CPU, swap and I/O activity?
>> 
>> Yours,
>> Damon
>> 
>> -----Original Message-----
>> From: general-boun...@developer.marklogic.com 
>> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Mike Sokolov
>> Sent: Wednesday, June 27, 2012 7:52 AM
>> To: MarkLogic Developer Discussion
>> Cc: Danny Sokolsky
>> Subject: Re: [MarkLogic Dev General] Database Status Troubles
>> 
>> I would agree that you are bumping up against hardware limits.
>> 
>> I get this regularly on a largish database (>  10M docs, ~ 680 GB on one
>> server).  Whenever we re-index a significant amount of content the
>> server becomes unusable for days - all it can handle is the reindexing
>> and merging.  Even if we throttle reindexing down to 1 there will be
>> periods of downtime (and of course the whole process takes longer).  The
>> admin console responds fine except for the status screens, which
>> (mostly) time out, as Alex reports.  If you keep trying, you can usually
>> eventually get a response, although this can take 1/2 hour of retrying
>> and waiting, which is painful.
>> 
>> The strategy we are pursuing is to maintain two servers; we reindex them
>> in sequence, switching production from one to the other and back again.
>> Another thing we are trying is increasing RAM (because we can, and the
>> system doesn't *seem* to be CPU bound), but I suspect the real issue is
>> disk I/O.
>> 
>> I'd like to add more disk, or faster ones (or ideally more servers, but
>> that is challenging for a variety of reasons).  I'm also not sure how to
>> prove that more I/O bandwidth would help dramatically.  I've looked at
>> vmstat: we're not swapping, but I'm also not sure how to read the stats
>> clearly enough to understand in detail what's going on with the disks.
>> 
>> Also wondering if you have recommendations as to how to configure
>> disks.  Do you use RAID?  What kind?  A centralized SAN or other storage
>> device?  I plan to ask support for help, too, but any general
>> suggestions from folks on the list here would be appreciated.
>> 
>> -Mike
>> 
>> On 6/26/2012 6:16 PM, Danny Sokolsky wrote:
>> 
>>> Another thing you might try is turning the reindexer throttle down.  Go to 
>>> the database config page and try setting reindexer throttle to 1 (I think 
>>> the default is 5).  Then give your system a while to catch up, and see if 
>>> you can then get to the forest status page.
>>> 
>>> 
>>> -Danny
>>> 
>>> -----Original Message-----
>>> From: general-boun...@developer.marklogic.com 
>>> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Michael 
>>> Blakeley
>>> Sent: Tuesday, June 26, 2012 2:37 PM
>>> To: MarkLogic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] Database Status Troubles
>>> 
>>> On 26 Jun 2012, at 14:12 , Alex Milowski wrote:
>>> 
>>> 
>>>> On Tue, Jun 26, 2012 at 2:01 PM, Michael Blakeley<m...@blakeley.com>  
>>>> wrote:
>>>> 
>>>>> On 26 Jun 2012, at 13:52 , Alex Milowski wrote:
>>>>> 
>>>>> 
>>>>>> The forest status page doesn't return.  The browser just hangs out
>>>>>> until something times out.
>>>>>> 
>>>>> That is decidedly abnormal. I have seen a few systems where 
>>>>> database-status was slow and would time out during reindexing, but 
>>>>> forest-status should not. Admittedly I don't do much with geospatial, so 
>>>>> this might be a geospatial-specific bug.
>>>>> 
>>>> Here's what I don't get.  I dropped a large index.  Shouldn't that be
>>>> an easy operation?
>>>> 
>>>> Or is it the index I added that is causing the problem?  It should
>>>> only hit a very small number of documents/elements.  Is it still
>>>> required to touch everything?
>>>> 
>>> MarkLogic never updates in place (except for timestamps). So it doesn't 
>>> drop existing indexes: it only reindexes. When the indexes for an old 
>>> fragment are out of date with respect to the current database 
>>> configuration, the fragment is deleted and its content reinserted as a new 
>>> fragment. That is happening to all the fragments affected by your latest 
>>> configuration changes.
>>> 
>>> 
>>>>> But my hunch is that the OS is overloaded, so I would start by checking 
>>>>> the OS status: top and iostat, on linux or Solaris. The OS may be out of 
>>>>> RAM and swapping, or possibly reindexing puts too much demand on the 
>>>>> available disk I/O capacity. Either way, I suspect that your database has 
>>>>> outgrown the hardware.
>>>>> 
>>>> Yes, that is certainly true.  I don't have enough memory.  Funny bit
>>>> is that the application works reasonably well for my research
>>>> purposes.  For production, it would need far more memory to support a
>>>> load.
>>>> 
>>> Reindexing is essentially a large series of batched updates. Updates are 
>>> much more expensive than reads, and unlike reads they cannot be cached.
>>> 
>>> You can do some monitoring of the reindexer status with 'tail -F' on the 
>>> error log, while setting the file-log-level to "Debug". However if the OS 
>>> is paging, there won't be much progress to monitor.
>>> 
>>> If you like, you could handle a situation like this by adding the new index 
>>> and allow reindexing, then turn off reindexing and delete the obsolete 
>>> index. The obsolete index won't disappear from the old stands, but new ones 
>>> won't have it. I'm not sure if that's useful in your situation, but it is 
>>> an option. It breaks the problem into a small one that you solve, and a 
>>> larger one that you ignore.
>>> 
>>> Or you might reduce the sizes of your in-memory limits and host-level 
>>> caches to a bare minimum, to reduce demands on the OS. Reindexing 
>>> performance would suffer, but you might be able to get your work done. This 
>>> isn't easy to get right, though. Usually the best answer is to upgrade the 
>>> hardware.
>>> 
>>> -- Mike
>>> _______________________________________________
>>> General mailing list
>>> General@developer.marklogic.com
>>> http://community.marklogic.com/mailman/listinfo/general
>>> 
>> 
>> 
> 

