Re: [Mjpeg-users] -M 2/3 on SMP is slower than -M 0

Slepp Lukwai Wed, 17 Dec 2003 02:19:53 -0800

Just a side note: I find it interesting that your name is Andrew Stevens,
whereas mine is Stephen Andrew (Andrew being my middle name).

On Tue, 2003-12-16 at 14:41, Andrew Stevens wrote:
> Yep.   You should (in theory) get a lot closer to that with the current 
> MPEG_DEVEL branch mpeg2enc.   However, your scaling is really remarkably bad 
> as even the -R 2 values where two CPUs should be fairly busy are unusually 
> bad.  I've never heard of worse than 70% utilisation on dual CPU machines.

And I'm wondering why it's not scaling... Hence the original post about
this.

> Here's a fairly typical snapshot of mpeg2enc -M 2 -I 1 -R 2 in action on my 
> dual P-III machine...
> 
>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
> 12620 as        18   0 46464  45M   768 R    80.9 24.3   0:18 lt-mpeg2enc
> 12621 as        18   0 46464  45M   768 R    70.8 24.3   0:18 lt-mpeg2enc
> 12619 as         9   0 46464  45M   768 S     3.9 24.3   0:01 lt-mpeg2enc

Which is nothing like what I see. I rarely see two of them break 60%;
they hover closer to 45%.
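
To put the two sets of figures side by side, the per-thread %CPU values
can be turned into a single whole-machine utilisation number. This is
only a minimal sketch: the first list is copied from the quoted top
snapshot, while the second is an assumption (two threads at roughly 45%
each, with any other threads ignored), not a measurement.

```python
# Per-thread %CPU values copied from the quoted top snapshot.
snapshot = [80.9, 70.8, 3.9]
N_CPUS = 2  # dual-CPU machine


def utilisation(per_thread_pct, n_cpus):
    """Fraction of total CPU capacity in use (0.0 to 1.0)."""
    return sum(per_thread_pct) / (100.0 * n_cpus)


print(f"quoted snapshot:     {utilisation(snapshot, N_CPUS):.1%}")
# Assumption: two threads at ~45% each, everything else ignored.
print(f"two threads at ~45%: {utilisation([45.0, 45.0], N_CPUS):.1%}")
```

That works out to roughly 78% for the quoted snapshot against roughly
45% for the assumed figures.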

> You're getting very very symmetrical CPU loads and very very poor utilisation.  
> What kernel are you using... I vaguely recall 2.6.x series radically changed 
> the threading libs.  It could be something pathological is happening in the 
> scheduling.  

It's 2.4.20-gentoo-r9, actually. I'm wondering if a patch in here is
causing problems, but I'm very hesitant to try any other kernels since
this chipset/board is rather flaky and, now that it's working again, I
don't want it to break (I couldn't run mpeg2enc, let alone
transcode/dvdrip for almost 5 months because it would lock the system
hard when it was under load). The newest kernels, 2.6.x, don't let me
disable the APIC in the kernel itself, and that causes problems. Perhaps
tonight I'll test a 2.4.23 without any patches (just vanilla) and see
what happens with scheduling.

As a side note, I'm also using a 200Hz timer instead of the standard
100Hz. I don't see this doing anything but making things quicker, as
it reduces scheduling latency while slightly increasing scheduler
overhead and the number of context switches (or is an SSE/3DNow! context
switch really expensive? Anyone know?).
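
For a sense of scale, the tick length at each timer frequency is simple
arithmetic. A minimal sketch follows; the function name is made up for
illustration, and it says nothing about how the scheduler actually
picks timeslices, only how often the timer fires.

```python
def tick_ms(hz):
    """Length of one timer tick in milliseconds."""
    return 1000.0 / hz


for hz in (100, 200):
    print(f"{hz}Hz timer: {tick_ms(hz):.1f} ms per tick, "
          f"{hz} timer interrupts per second")
```

So going from 100Hz to 200Hz halves the tick from 10 ms to 5 ms and
doubles the number of timer interrupts.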

> The  2100+ is of course  a lot faster than the P-III but: I doubt the balance 
> between the motion estimation and the rest of the code is hugely shifted.  
> Certainly, the approximate proportions of time spent in each are quite similar 
> on my 2100+ single-CPU machine and a P-III.

On the single 2000 XP we have, it runs about 90% of the speed of my
machine in SMP mode (-M 3).

> > Also, encoding with one B frame is a touch faster in -I 1 mode than
> > encoding without them, but it is slower when you encode two B frames
> ...
>
> Not really. However: I would expect going to two B frames to greatly increase 
> your CPU utilisation without much wall-clock time increase due to the increased 
> scope for parallel computation.

I was more or less pointing out that, in the timings from that message,
-I 1 -R 1 was faster than -I 0 -R 1 for some reason. Not by much, but
noticeably.

> This is what you'd expect: -R 2 offers much more scope for the 3 worker 
> threads of -M 3 to do something useful.

It still worked out that 3-0-1 gave the shortest overall time, even
though CPU usage still wasn't maxed out.

> The usefulness of B frames depends a *lot* on the type of material.  For 
> captured stuff they rarely buy you much apart from free room heating from 
> your CPU. Hence the provision of -R 0 ;-).  They should get a little more 
> useful when I add dynamic frame type selection to mpeg2enc in the new year.

Strictly DVD copies. With three cats and being lazy, I have more DVDs
ruined than I'd like to count. (Speaking of room heating, it's about
-20 degC outside at the moment, my window is open about 10cm, and
it's still a toasty 25 degC in here. My office doubles as the server
closet.)

> > > - There is also a parallel read-ahead thread but this rarely soaks much
> > > CPU on modern CPUs.
> 
> Weirdly enough on your machine the reader thread is exceedingly busy

I use LVM, but I can still read 35MB/s off the disks with it. The memory
buffer cache manages about 250MB/s. I wonder if it comes back to the
increased timer frequency for the scheduler? (Though it's using a
supposed O(1) scheduler, which should offset that.)
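
As a rough sanity check on whether raw read speed could be the limit,
one can estimate how many uncompressed frames per second those rates
would sustain. This is a sketch under loudly stated assumptions: the
720x480 frame size, the planar 4:2:0 layout (1.5 bytes per pixel), and
reading 1MB as 10^6 bytes are all guesses for illustration, not details
from this thread, and the real input may not come straight off disk at
all.

```python
def frame_bytes(width, height):
    """Bytes in one planar YUV 4:2:0 frame (1.5 bytes per pixel)."""
    return width * height * 3 // 2


def max_fps(bytes_per_sec, width, height):
    """Frames per second a given read rate could sustain."""
    return bytes_per_sec / frame_bytes(width, height)


# Assumed frame size; the two read rates are the figures quoted above.
W, H = 720, 480
for label, rate in (("disk, 35MB/s", 35e6), ("buffer cache, 250MB/s", 250e6)):
    print(f"{label}: about {max_fps(rate, W, H):.0f} frames/s")
```

Under those assumptions even the disk rate covers well over 60 frames
per second, which would suggest that raw I/O bandwidth alone is unlikely
to explain a busy reader thread.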

> cvs co -d :ext:[EMAIL PROTECTED]:/cvsroot/mjpeg mjpeg_play
> cd mjpeg_play
> cvs update -r MPEG_DEVEL mpeg2enc

:ext: wanted a password for anonymous, and just pressing Enter didn't
work, so I used :pserver: instead. I sent you a separate message about
the problems I ran into with that.

> The 'mjpeg_play' is a bit of a historical oddity but it is monumentally 
> painful to change directory names in CVS...

I tend to just drop the entire project, clean it up, and reimport it
into a fresh tree to rename it. :> You'd think that in the years and
revisions CVS has undergone, renaming of directories wouldn't be nearly
as painful.

