hey robert,

so i don't agree with your interpretation of the dtrace data below.  the
data doesn't show a long time spent executing in semop().  instead it
shows that the most common way that a database thread blocks after doing
some processing is by calling semop().

ie, your script takes a timestamp when a thread starts executing
(probably when returning from a semop() operation).  then when the
thread stops executing, you look at the elapsed time and then you're
attributing that execution time to the function where it stopped
execution.  that doesn't really seeem right.  i'll bet that the majority
of execution time was spent somewhere else, not in semop().  semop() is
just the place where the thread is getting swapped off proc.

ed

On Mon, Mar 29, 2010 at 03:58:13AM -0700, Robert Milkowski wrote:
> Hi,
>
> I have an Oracle 8.1.7.4 64bit running in a Solaris 9 branded zone on-top of 
> Solaris 10.
> When I put a tablespace in a hotbackup mode it takes long time - usually more 
> than 4s for each tablespace. Looks like ora_ckpt process is consuming 100% of 
> cpu (in usr) while a tablespace is being put in the hotbackup mode.
>
> It looks like there might be an issue with semop() interposition in S9 
> branded zone.
> I can't easily test it now but IIRC when running Oracle in a global zone 
> there was no such issue.
>
> More details below.
>
> First, lets check a PID of ora_ckpt
>
> # ps -ef | grep 20326
>   oracle 20326     1   0 15:37:26 ?          15:52 ora_ckpt_I321
>
> Now lets put a TEMP tablespace in a backup mode while running a dtrace script 
> as well:
>
> SQL> alter tablespace TEMP begin backup;
>
> Tablespace altered.
>
> Elapsed: 00:00:04.65
>
>
>
> # dtrace -n sched:::on-cpu'/pid==20326 && tid==1/{self->t=timestamp;}' -n 
> sched:::off-cpu'/self->t/{...@u[ustack()]=sum(timestamp-self->t);@c[ustack()]=count();self->t=0;}'
>  -n tick-10s'{printa(@u);printa(@c)}'
> dtrace: description 'sched:::on-cpu' matched 1 probe
> dtrace: description 'sched:::off-cpu' matched 2 probes
> dtrace: description 'tick-10s' matched 1 probe
> ^C
>
>
>               `_s9brand_systemcall+0x30
>               `s9_semsys+0x16c
>               `s9_handler+0x84
>               oracle`sslssalck+0x514
>               libc.so.1`semop+0x18
>               oracle`sskgpwwait+0x13c
>               oracle`ksliwat+0x44c
>               oracle`kslwaitns+0x20
>               oracle`kskthbwt+0x174
>               oracle`kslwait+0x50
>               oracle`ksarcv+0x1bc
>               oracle`ksbabs+0x344
>               oracle`ksbrdp+0x2ec
>               oracle`opirip+0x31c
>               oracle`opidrv+0x6a0
>               oracle`sou2o+0x10
>               oracle`main+0x128
>               oracle`_start+0x17c
>               oracle`_start+0x18
>        3852868427
>
>               oracle`_s9brand_systemcall+0x30
>               oracle`s9_semsys+0x16c
>               oracle`s9_handler+0x84
>               oracle`sslssalck+0x514
>               libc.so.1`semop+0x18
>               oracle`sskgpwwait+0x13c
>               oracle`ksliwat+0x44c
>               oracle`kslwaitns+0x20
>               oracle`kskthbwt+0x174
>               oracle`kslwait+0x50
>               oracle`ksarcv+0x1bc
>               oracle`ksbabs+0x344
>               oracle`ksbrdp+0x2ec
>               oracle`opirip+0x31c
>               oracle`opidrv+0x6a0
>               oracle`sou2o+0x10
>               oracle`main+0x128
>               oracle`_start+0x17c
>               oracle`_start+0x18
>                12
>
> So the same ustack() consumed almost 4s of a CPU time out of 4.6s of a total 
> time of the operation.
>
> Has anyone been hit by it?
>
> Also the 2nd stack looks strange (top 5 lines)...
>
> --
> Robert Milkowski
> http://milek.blogspot.com
> --
> This message posted from opensolaris.org
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss@opensolaris.org
>
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org

Reply via email to