Re: loooong saveChanges prep-stage

2024-02-01 Thread Ramsey Gurley via Webobjects-dev
I'm not sure where the OSC locking and unlocking happens off the top of my 
head, but if it's waiting on a lock, it could still be the database. You can 
set up database monitoring on cpu/ram/io just to rule that out as a culprit. 
Otherwise, if it is your app, you might find the problem running your app 
through a profiler.

Also, you can check to see if your app is not returning stats on javamonitor. 
That usually indicates a deadlock. If you're running multiple instances, you 
could be failing over to a new instance after a timeout due to the instance 
being deadlocked. If you see your instances deadlocking, you can configure JMX 
and then use something like jconsole or jstack to detect where your deadlock is 
happening. 
It was a bit of a pain setting that up on my prod apps, but it was definitely 
worth it. You can see all kinds of useful info on your running prod apps and 
find problems you didn't even know you had.
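
For anyone who wants to try the deadlock check without attaching an external client first, here is a minimal sketch of the same idea run from inside the JVM. The class and method names (DeadlockProbe, describeDeadlocks) are made up for illustration. It relies on the standard java.lang.management.ThreadMXBean, which only reports deadlocks on Java monitors and java.util.concurrent ownable synchronizers; whether the locks EOF uses internally show up there is an assumption that would need checking.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of WebObjects or Wonder.
public class DeadlockProbe {

    /**
     * Returns one description per deadlocked thread (name, state, the lock it
     * waits on, and a partial stack trace), or an empty array if the JVM
     * reports no deadlock.
     */
    public static String[] describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when nothing is deadlocked
        if (ids == null) {
            return new String[0];
        }
        List<String> out = new ArrayList<>();
        // true, true: also include locked monitors and locked synchronizers
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have died in the meantime
                out.add(info.toString());
            }
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] found = describeDeadlocks();
        System.out.println(found.length == 0 ? "no deadlock reported" : String.join("\n", found));
    }
}
```

Calling describeDeadlocks() from a periodic background task and logging any non-empty result would be one way to catch the moment an instance stops answering, under the assumptions above.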

Also, set up monitoring on your app server. It might just be you're being 
hammered by a botnet and that's bogging down all your network IO you use for 
your DB. Or you could have a rogue application eating all your app server cpu.

Whenever you're stuck, just set up more and more monitoring. :) It's never a 
waste of time, and you can often find your problem that way.


Re: loooong saveChanges prep-stage

2024-02-01 Thread OCsite via Webobjects-dev
Thanks again!

Though I do not really think the database would be the culprit. Does EOF 
actually do anything with the DB when saving changes before 
DatabaseContextDelegate.databaseContextWillPerformAdaptorOperations is called? 
I might be wrong as so often, but I don't think so; I believe it contacts the 
DB for the first time only after that.

Thanks and all the best,
OC

> On 1. 2. 2024, at 6:12, Ramsey Gurley  wrote:
> 
> Are you hosting your database? If so, don't rule out a problem outside your 
> application and in your database. It could be low memory. If you can't fit 
> all the tables/indexes into memory, you fall off a performance cliff by going 
> to disk. It could also be an antivirus scanner that kicked off on your DB's 
> data directory. Once it starts scanning and locking files, it can ruin 
> performance. If you can catch it during these slow periods, ssh into the box 
> and check with top and iostat. Look for something on the machine tying up 
> resources. It could be a poorly tuned database too: your box has plenty of 
> memory, but your database isn't configured correctly to use it. For example, 
> if you're using Postgres:
> https://pgtune.leopard.in.ua/
> 
> It could even be below your database config and in your OS config. On Linux, 
> it's highly dependent on which file system you are using (ext4, btrfs, xfs, 
> etc), but maybe you have insufficient read-ahead, or some other filesystem 
> setting. Even if you're not using Postgres, you should probably get a copy of 
> PostgreSQL High Performance and read the first few chapters, which cover 
> hardware tuning below the database.
> 
> It could also be your database is busy reindexing or freeing up table space. 
> On Postgres, vacuuming and reindexing can lock tables until the job is done. 
> This can take several minutes or more depending on the size of your table. It 
> could be the autovacuum doing it. If you're not vacuuming/reindexing, that 
> can hurt your performance too, since indexes will grow to exceed your 
> available memory and drop you off the performance cliff. You can do the 
> maintenance and get around the locking problems using something like 
> pg_repack.
> From: OCsite 
> Sent: Wednesday, January 31, 2024 7:59 PM
> To: OCsite via Webobjects-dev 
> Cc: Jérémy DE ROYER ; Ramsey Gurley 
> 
> Subject: Re: loooong saveChanges prep-stage
>  
> Thanks, guys!
> 
> I am pretty sure though the problem can't be a background process either 
> reading for a long time or saving for a long time, for I do use the 
> ERXAdaptorChannelDelegate.trace logs and through 
> DatabaseContextDelegate.databaseContextWillPerformAdaptorOperations I log 
> each save — and there's nothing like that in the log in the vicinity of those 
> long saveChanges, alas. Thus the culprit must be something else.
> 
> Perhaps indeed something locks the OSC pretty often and for a long time, but 
> that something is neither a long SELECT which would log through 
> ERXAdaptorChannelDelegate.trace, nor another unrelated save, which would log 
> through DatabaseContextDelegate.databaseContextWillPerformAdaptorOperations.
> 
> Besides, it does not really feel like OSC locks caused by another thread. 
> Meantime, I've rigged an awk script to compute how long each saveChanges 
> takes, and it looks like this:
> - for a long time, all is OK
> - when the save times begin to grow, they keep consistently long (e.g., about 
> 30 s each, or about 50 s each) for each save for a while (a quarter or half an 
> hour), before things get back to normal
> 
> If another thread locked OSC, it would most probably mean some saveChanges 
> would be long, but some quick; it does not seem probable a background thread 
> would consistently keep OSC locked so that each saveChanges takes roughly the 
> same (long) time.
> 
> It rather feels as if something at the beginning of saveChanges becomes slow. 
> This would most probably happen under the OSC lock, and given the way it 
> works, it does not seem really plausible that it is simply waiting to acquire 
> the lock itself.
> 
> For the moment, I'm rather outta ideas :(
> 
> Thanks again and all the best,
> OC
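
The save-time analysis described above (how long each saveChanges takes, and whether the slow ones cluster together) could be sketched roughly as follows. This is not the awk script from the thread, which was not posted. The log line format ("HH:mm:ss.SSS saveChanges begin" / "... end") and the names SaveTimes, durationsMillis and longestSlowRun are all assumptions made for illustration. It also assumes saves are not interleaved across threads in the log and that no save crosses midnight.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical log analysis helper; the log format is assumed, not taken from the thread.
public class SaveTimes {

    /**
     * Pairs each "saveChanges begin" line with the next "saveChanges end" line
     * and returns the elapsed milliseconds of every save, in log order.
     * Lines that do not match the assumed format are skipped.
     */
    public static List<Long> durationsMillis(List<String> lines) {
        List<Long> result = new ArrayList<>();
        LocalTime begin = null;
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 3 || !parts[1].equals("saveChanges")) {
                continue;
            }
            LocalTime stamp = LocalTime.parse(parts[0]);
            if (parts[2].equals("begin")) {
                begin = stamp;
            } else if (parts[2].equals("end") && begin != null) {
                result.add(Duration.between(begin, stamp).toMillis());
                begin = null;
            }
        }
        return result;
    }

    /** Length of the longest run of consecutive saves at or above thresholdMillis. */
    public static int longestSlowRun(List<Long> durations, long thresholdMillis) {
        int best = 0;
        int current = 0;
        for (long d : durations) {
            current = d >= thresholdMillis ? current + 1 : 0;
            best = Math.max(best, current);
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "10:00:00.000 saveChanges begin",
            "10:00:00.250 saveChanges end",
            "10:00:05.000 saveChanges begin",
            "10:00:35.000 saveChanges end");
        System.out.println(durationsMillis(sample));
    }
}
```

A long run of consecutive slow saves, rather than isolated slow ones mixed with quick ones, would match the pattern described in the message: every save slow for a stretch, then back to normal.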
