Re: Kaffeine problem with CFS

2007-04-21 Thread Ingo Molnar

* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> > Instead of that one, i tried CFSv3 and i cannot reproduce the hang 
> > anymore, Thanks!...
>
> And that explains why CFS-v3 on 21-rc7-git3 wouldn't show me the hang. 
> As a matter of fact, nothing I did showed any bad behavior! Note that 
> I was doing actual badly behaved things which do sometimes glitch the 
> standard scheduler, not running benchmarks.
> 
> This scheduler is boring, everything works. [...]

hehe :) Having a 'boring' scheduler in the end is the main goal :)

> [...] I am going to try some tests on a uniprocessor, though; I have 
> been running everything on either SMP or HT CPUs. But so far it looks 
> fine.

yeah, please do - while i do test UP frequently, most of my CFS testing 
was on SMP.

Ingo


Re: Kaffeine problem with CFS

2007-04-20 Thread Bill Davidsen

S.Çağlar Onur wrote:
> On Wednesday 18 April 2007, Ingo Molnar wrote:
> > * S.Çağlar Onur <[EMAIL PROTECTED]> wrote:
> > > -   schedule();
> > > +   msleep(1);
> > >
> > > which Ingo sent me to try also has the same effect on me. I cannot
> > > reproduce hangs anymore with that patch applied on top of CFS while one
> > > console checks out SVN repos and the other one compiles a small test
> > > program.
> >
> > great! Could you please unapply the hack above and try the proper fix
> > below, does this one solve the hangs too?
>
> Instead of that one, i tried CFSv3 and i cannot reproduce the hang anymore,
> Thanks!...


And that explains why CFS-v3 on 21-rc7-git3 wouldn't show me the hang. 
As a matter of fact, nothing I did showed any bad behavior! Note that I 
was doing actual badly behaved things which do sometimes glitch the 
standard scheduler, not running benchmarks.


This scheduler is boring, everything works. I am going to try some tests 
on a uniprocessor, though; I have been running everything on either SMP 
or HT CPUs. But so far it looks fine.



--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot



Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* S.Çağlar Onur <[EMAIL PROTECTED]> wrote:

> > great! Could you please unapply the hack above and try the proper 
> > fix below, does this one solve the hangs too?
> 
> Instead of that one, i tried CFSv3 and i cannot reproduce the hang 
> anymore, Thanks!...

cool, thanks for the quick turnaround!

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread S.Çağlar Onur
On Wednesday 18 April 2007, Ingo Molnar wrote:
> * S.Çağlar Onur <[EMAIL PROTECTED]> wrote:
> > -   schedule();
> > +   msleep(1);
> >
> > which Ingo sent me to try also has the same effect on me. I cannot
> > reproduce hangs anymore with that patch applied on top of CFS while one
> > console checks out SVN repos and the other one compiles a small test
> > program.
>
> great! Could you please unapply the hack above and try the proper fix
> below, does this one solve the hangs too?

Instead of that one, i tried CFSv3 and i cannot reproduce the hang anymore, 
Thanks!...

Cheers
-- 
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* William Lee Irwin III <[EMAIL PROTECTED]> wrote:

> At this point you might as well call the requeue operation something 
> having to do with yield. [...]

agreed - i've just done a requeue_task -> yield_task rename in my tree. 
(patch below)

> [...] I also suspect what goes on during the timer tick may eventually 
> become something different from requeueing the current task, and 
> furthermore class-dependent.

it already is, scheduler tick processing is done in class->task_tick().

Ingo

---
 include/linux/sched.h |    2 +-
 kernel/sched.c        |    7 +------
 kernel/sched_fair.c   |    4 ++--
 kernel/sched_rt.c     |    2 +-
 4 files changed, 5 insertions(+), 10 deletions(-)

Index: linux/include/linux/sched.h
===
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -796,7 +796,7 @@ struct sched_class {
 
void (*enqueue_task) (struct rq *rq, struct task_struct *p);
void (*dequeue_task) (struct rq *rq, struct task_struct *p);
-   void (*requeue_task) (struct rq *rq, struct task_struct *p);
+   void (*yield_task) (struct rq *rq, struct task_struct *p);
 
void (*check_preempt_curr) (struct rq *rq, struct task_struct *p);
 
Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -560,11 +560,6 @@ static void dequeue_task(struct rq *rq, 
p->on_rq = 0;
 }
 
-static void requeue_task(struct rq *rq, struct task_struct *p)
-{
-   p->sched_class->requeue_task(rq, p);
-}
-
 /*
  * __normal_prio - return the priority that is based on the static prio
  */
@@ -3773,7 +3768,7 @@ asmlinkage long sys_sched_yield(void)
schedstat_inc(rq, yld_cnt);
if (rq->nr_running == 1)
schedstat_inc(rq, yld_act_empty);
-   requeue_task(rq, current);
+   current->sched_class->yield_task(rq, current);
 
/*
 * Since we are going to call schedule() anyway, there's
Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -273,7 +273,7 @@ static void dequeue_task_fair(struct rq 
  * dequeue the task and move it to the rightmost position, which
  * causes the task to roundrobin to the end of the tree.
  */
-static void requeue_task_fair(struct rq *rq, struct task_struct *p)
+static void yield_task_fair(struct rq *rq, struct task_struct *p)
 {
dequeue_task_fair(rq, p);
p->on_rq = 0;
@@ -509,7 +509,7 @@ static void task_init_fair(struct rq *rq
 struct sched_class fair_sched_class __read_mostly = {
.enqueue_task   = enqueue_task_fair,
.dequeue_task   = dequeue_task_fair,
-   .requeue_task   = requeue_task_fair,
+   .yield_task = yield_task_fair,
 
.check_preempt_curr = check_preempt_curr_fair,
 
Index: linux/kernel/sched_rt.c
===
--- linux.orig/kernel/sched_rt.c
+++ linux/kernel/sched_rt.c
@@ -165,7 +165,7 @@ static void task_init_rt(struct rq *rq, 
 static struct sched_class rt_sched_class __read_mostly = {
.enqueue_task   = enqueue_task_rt,
.dequeue_task   = dequeue_task_rt,
-   .requeue_task   = requeue_task_rt,
+   .yield_task = requeue_task_rt,
 
.check_preempt_curr = check_preempt_curr_rt,
 


Re: Kaffeine problem with CFS

2007-04-18 Thread William Lee Irwin III
On Wed, Apr 18, 2007 at 05:48:11PM +0200, Ingo Molnar wrote:
>  static void requeue_task_fair(struct rq *rq, struct task_struct *p)
>  {
>   dequeue_task_fair(rq, p);
>   p->on_rq = 0;
> - enqueue_task_fair(rq, p);
> + /*
> +  * Temporarily insert at the last position of the tree:
> +  */
> + p->fair_key = LLONG_MAX;
> + __enqueue_task_fair(rq, p);
>   p->on_rq = 1;
> +
> + /*
> +  * Update the key to the real value, so that when all other
> +  * tasks from before the rightmost position have executed,
> +  * this task is picked up again:
> +  */
> + p->fair_key = rq->fair_clock - p->wait_runtime + p->nice_offset;

At this point you might as well call the requeue operation something
having to do with yield. I also suspect what goes on during the timer
tick may eventually become something different from requeueing the
current task, and furthermore class-dependent.


-- wli


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* S.Çağlar Onur <[EMAIL PROTECTED]> wrote:

> -   schedule();
> +   msleep(1);

> which Ingo sent me to try also has the same effect on me. I cannot 
> reproduce hangs anymore with that patch applied on top of CFS while one 
> console checks out SVN repos and the other one compiles a small test 
> program.

great! Could you please unapply the hack above and try the proper fix 
below, does this one solve the hangs too?

Ingo

Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -264,15 +264,26 @@ static void dequeue_task_fair(struct rq 
 
 /*
  * sched_yield() support is very simple via the rbtree, we just
- * dequeue and enqueue the task, which causes the task to
- * roundrobin to the end of the tree:
+ * dequeue the task and move it to the rightmost position, which
+ * causes the task to roundrobin to the end of the tree.
  */
 static void requeue_task_fair(struct rq *rq, struct task_struct *p)
 {
dequeue_task_fair(rq, p);
p->on_rq = 0;
-   enqueue_task_fair(rq, p);
+   /*
+* Temporarily insert at the last position of the tree:
+*/
+   p->fair_key = LLONG_MAX;
+   __enqueue_task_fair(rq, p);
p->on_rq = 1;
+
+   /*
+* Update the key to the real value, so that when all other
+* tasks from before the rightmost position have executed,
+* this task is picked up again:
+*/
+   p->fair_key = rq->fair_clock - p->wait_runtime + p->nice_offset;
 }
 
 /*
@@ -380,7 +391,10 @@ static void task_tick_fair(struct rq *rq
 * Dequeue and enqueue the task to update its
 * position within the tree:
 */
-   requeue_task_fair(rq, curr);
+   dequeue_task_fair(rq, curr);
+   curr->on_rq = 0;
+   enqueue_task_fair(rq, curr);
+   curr->on_rq = 1;
 
/*
 * Reschedule if another task tops the current one.
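
[ A user-space illustration of the ordering trick the hunk above uses -
  not kernel code; the semantics are taken from the hunk's comments:
  the scheduler picks the queued task with the smallest fair_key (the
  leftmost tree node), so temporarily tagging the yielding task with
  LLONG_MAX sorts it behind everything currently queued, and restoring
  the real key decides when it gets picked up again. ]

/* illustration only: "pick the smallest key", as the rbtree does */
#include <limits.h>
#include <stdio.h>

struct fake_task {
        const char *name;
        long long fair_key;
};

static struct fake_task *pick_next(struct fake_task *t, int n)
{
        struct fake_task *best = &t[0];

        for (int i = 1; i < n; i++)
                if (t[i].fair_key < best->fair_key)
                        best = &t[i];
        return best;
}

int main(void)
{
        struct fake_task t[] = { { "A", 100 }, { "B", 150 }, { "C", 120 } };

        printf("next: %s\n", pick_next(t, 3)->name);    /* A */

        t[0].fair_key = LLONG_MAX;                      /* A yields */
        printf("next: %s\n", pick_next(t, 3)->name);    /* C runs, then B */

        t[0].fair_key = 100;                            /* real key restored */
        return 0;
}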


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> Replacing the sched_yield in demux.c with a usleep(10) stopped those 
> seeking hangs here (at least I was able to pull the slider back and 
> forth for 2 mins without trouble, compared to the few secs I needed 
> earlier to get a hang).

great - thanks for figuring it out!

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread S.Çağlar Onur
On Wednesday 18 April 2007, Christoph Pfister wrote:
> Replacing the sched_yield in demux.c with a usleep(10) stopped those
> seeking hangs here (at least I was able to pull the slider back and
> forth for 2 mins without trouble, compared to the few secs I needed
> earlier to get a hang).

Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -3785,7 +3785,7 @@ asmlinkage long sys_sched_yield(void)
_raw_spin_unlock(&rq->lock);
preempt_enable_no_resched();
 
-   schedule();
+   msleep(1);
 
return 0;
 }

which Ingo sent me to try also has the same effect on me. I cannot reproduce 
hangs anymore with that patch applied on top of CFS while one console checks out 
SVN repos and the other one compiles a small test program.

Cheers
-- 
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Christoph Pfister <[EMAIL PROTECTED]>:

[ Sorry for accidentally dropping CCs ]

2007/4/18, Christoph Pfister <[EMAIL PROTECTED]>:
> 2007/4/18, Ingo Molnar <[EMAIL PROTECTED]>:
> >
> > * Christoph Pfister <[EMAIL PROTECTED]> wrote:
> >
> > > Or I could try playing around a bit with your patchset and trying to
> > > reproduce it over here. Because I already have debug builds for
> > > xine-lib and compiling a new kernel can take place in the background
> > > it wouldn't be much effort for me.
> >
> > that would be great :) Here are the URLs for it. CFS is based on
> > v2.6.21-rc7:
> >
> >   http://kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.21-rc7.tar.bz2
> >
> > And the CFS patch is at:
> >
> >   http://people.redhat.com/mingo/cfs-scheduler/sched-cfs-v2.patch
> >
> > rebuild your kernel as usual and boot into it. No extra configuration is
> > needed, you'll get CFS by default.
> >
> > if this kernel builds/boots fine for you then you might also want to
> > send me a quick note about how it feels, interactivity-wise. And of
> > course i'm interested in any sort of feedback about problems as well.
> > I'd like to make CFS as media-playback friendly as possible, so if
> > there's any problem in that area it would be nice for me to know about
> > it as soon as possible.
> >
> > Ingo
>
> Okay - so here are some results (it's strange that gdb goes nuts
> inside the xine_play call). I have three bts (seems to be fairly easy
> to reproduce that behaviour over here): Twice while playing an audio
> cd and once while playing a normal file. The hang usually ends if you
> wait long enough (something around 30 secs over here).





> Christoph
>
>
> PS: Haven't analyzed them yet - but doing so now :-)

Ok - one nice thing: in all those bts demux_loop is at demux.c:285 -
meaning that demux_lock is held and xine_play is waiting for it ...
The lock should be temporarily released around the sched_yield so that
the main thread can access it. As you wrote, the implementation of this
function seems to have changed a bit - so I'll replace it with a short
sleep and try again ...

Christoph


Replacing the sched_yield in demux.c with a usleep(10) stopped those
seeking hangs here (at least I was able to pull the slider back and
forth for 2 mins without trouble, compared to the few secs I needed
earlier to get a hang).
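
[ A self-contained sketch of the pattern being discussed - plain pthreads,
  not xine-lib source: the demuxer loop briefly drops demux_lock and calls
  sched_yield() in the hope that a waiter such as xine_play() gets the lock;
  if the yield does not actually let the freshly-woken waiter run before the
  loop re-locks, the waiter can be starved for a long time, whereas a real
  sleep such as usleep() reliably opens a window. ]

#include <pthread.h>
#include <sched.h>
#include <unistd.h>

static pthread_mutex_t demux_lock = PTHREAD_MUTEX_INITIALIZER;
static int use_usleep;          /* 0: sched_yield() pattern, 1: workaround */

/* demuxer loop: holds demux_lock for almost the whole iteration */
static void *demux_loop(void *arg)
{
        for (;;) {
                pthread_mutex_lock(&demux_lock);
                /* ... push one buffer downstream ... */
                pthread_mutex_unlock(&demux_lock);

                if (use_usleep)
                        usleep(10);     /* real sleep: waiters get the lock */
                else
                        sched_yield();  /* may not let the waiter in at all */
        }
        return arg;
}

/* seek path (xine_play-like): needs the same lock from another thread */
static void seek(void)
{
        pthread_mutex_lock(&demux_lock);
        /* ... flush fifos, set the new position ... */
        pthread_mutex_unlock(&demux_lock);
}

int main(void)
{
        pthread_t tid;

        pthread_create(&tid, NULL, demux_loop, NULL);
        seek();         /* can block for a long time when use_usleep == 0 */
        return 0;       /* exiting main ends the whole process */
}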

Christoph


Re: Kaffeine problem with CFS

2007-04-18 Thread S.Çağlar Onur
On Wednesday 18 April 2007, Christoph Pfister wrote:
> > Okay - so here are some results (it's strange that gdb goes nuts
> > inside the xine_play call). I have three bts (seems to be fairly easy
> > to reproduce that behaviour over here): Twice while playing an audio
> > cd and once while playing a normal file. The hang usually ends if you
> > wait long enough (something around 30 secs over here).

I can confirm this; the freeze ends after some wait period (~20-30 secs) if 
kaffeine is the only active process. I didn't notice that before, most probably 
because the CPU was busy compiling a kernel at that time...

Now i'm testing Ingo's msleep patch + xine-lib-1.1.6...

Cheers
-- 
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> hm. I've reviewed all uses of demux_lock. ./src/xine-engine/demux.c 
> does this:
> 
> pthread_mutex_unlock( &stream->demux_lock );
> 
> lprintf ("sched_yield\n");
> 
> sched_yield();
> pthread_mutex_lock( &stream->demux_lock );
> 
> why is this done? CFS has definitely changed the yield implementation 
> so there could be some connection.
> 
> OTOH, in the 'hung' state none of the straces suggests any yield() 
> call.

yeah, there's no yield() call in any of the straces so i'd exclude this 
as a possibility.

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > hm. I've reviewed all uses of demux_lock. ./src/xine-engine/demux.c 
> > does this:
> 
> plus it does this too:
> 
>   pthread_mutex_unlock( &stream->demux_lock );
>   xine_usec_sleep(10);
>   pthread_mutex_lock( &stream->demux_lock );
> 
> this would explain the nanosleep() strace entries. But the task stuck 
> on demux_lock never gets the unlock event. Weird.

9303 is stuck here on demux_lock:

#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
#3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1

that mutex related futex is at address 0xb07409e0, but the only sign in 
the strace of that futex being touched is:

9303  futex(0xb07409e0, FUTEX_WAIT, 2, NULL <unfinished ...>

no other event ever happens on futex 0xb07409e0. Other threads dont 
touch it.
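
[ For reference, a simplified sketch of the futex protocol behind those
  trace lines - raw syscalls, not the glibc implementation: a contended
  pthread_mutex_lock() parks the thread in FUTEX_WAIT on the mutex word
  (value 2 = "locked, with waiters") and it can only return once some
  other thread's unlock path does a FUTEX_WAKE on the same address - the
  event that never shows up for 0xb07409e0 above. ]

#include <linux/futex.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* sleeps if *uaddr still equals 'expected', until a FUTEX_WAKE arrives */
static long futex_wait(int *uaddr, int expected)
{
        return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

/* wakes up to 'nr' threads blocked in FUTEX_WAIT on uaddr (unlock path) */
static long futex_wake(int *uaddr, int nr)
{
        return syscall(SYS_futex, uaddr, FUTEX_WAKE, nr, NULL, NULL, 0);
}

int main(void)
{
        int mutex_word = 2;     /* pretend: locked, with waiters */

        printf("woken: %ld\n", futex_wake(&mutex_word, 1));
        /* returns immediately (EWOULDBLOCK) because *uaddr != 0: */
        printf("wait:  %ld\n", futex_wait(&mutex_word, 0));
        return 0;
}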

Maybe thread 9324 is the owner of that mutex, and it's looping somewhere 
that does xine_xmalloc_aligned(), with the lock held? It did this:

#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2539e1 in __lll_mutex_unlock_wake () from /lib/libpthread.so.0
#2  0x4a2506f9 in _L_mutex_unlock_99 () from /lib/libpthread.so.0
#3  0x4a250370 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0
#4  0x4a2506f0 in pthread_mutex_unlock () from /lib/libpthread.so.0
#5  0xb79fce5a in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#6  0xb7a4b90b in dvd_plugin_free_buffer (buf=0xb0745470) at input_dvd.c:570
#7  0xb7a030a2 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#8  0x4a24d2db in start_thread () from /lib/libpthread.so.0
#9  0x4a05820e in clone () from /lib/libc.so.6

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> hm. I've reviewed all uses of demux_lock. ./src/xine-engine/demux.c 
> does this:

plus it does this too:

  pthread_mutex_unlock( &stream->demux_lock );
  xine_usec_sleep(10);
  pthread_mutex_lock( &stream->demux_lock );

this would explain the nanosleep() strace entries. But the task stuck on 
demux_lock never gets the unlock event. Weird.

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

hm. I've reviewed all uses of demux_lock. ./src/xine-engine/demux.c does 
this:

pthread_mutex_unlock( &stream->demux_lock );

lprintf ("sched_yield\n");

sched_yield();
pthread_mutex_lock( &stream->demux_lock );

why is this done? CFS has definitely changed the yield implementation so 
there could be some connection.

OTOH, in the 'hung' state none of the straces suggests any yield() call.

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> It's nearly impossible for me to find out which mutex is deadlocking.

i've disassembled the xine_play function, and here are the function 
calls in it:

  
 pthread_mutex_lock()
 xine_log()
  
 function pointer call
 right after it: pthread_mutex_lock()

this second pthread_mutex_lock() in question is the one that deadlocks. 
It comes right after that function pointer call, maybe that identifies 
it?

[some time passes]

i rebuilt the library from source and while the installed library is 
different from it, looking at the disassembly i'm quite sure it's this 
pthread_mutex_lock() in xine_play_internal():

  pthread_mutex_lock( &stream->demux_lock );

src/xine-engine/xine.c:1201

the function pointer call was:

  stream->xine->port_ticket->acquire(stream->xine->port_ticket, 1);

right before the pthread_mutex_lock() call.
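
[ A minimal stand-alone sketch of the sequence identified above; the
  struct layouts are stand-ins for illustration, not the real xine-lib
  definitions - only the call order is taken from the disassembly:
  xine_play_internal() first makes the port_ticket->acquire() function
  pointer call and then blocks on stream->demux_lock, the mutex thread
  9303 is stuck on in the backtraces. ]

#include <pthread.h>

struct ticket {                         /* stand-in type */
        void (*acquire)(struct ticket *t, int irrevocable);
};

struct xine {                           /* stand-in type */
        struct ticket *port_ticket;
};

struct stream {                         /* stand-in type */
        struct xine     *xine;
        pthread_mutex_t  demux_lock;
};

static void play_internal_sketch(struct stream *stream)
{
        /* the function pointer call right before the lock */
        stream->xine->port_ticket->acquire(stream->xine->port_ticket, 1);

        /* src/xine-engine/xine.c:1201 - where 9303 blocks */
        pthread_mutex_lock(&stream->demux_lock);

        /* ... stop the demuxer, set the new position, restart ... */

        pthread_mutex_unlock(&stream->demux_lock);
}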

> It would be great if you could reproduce the same problem with a 
> xine-lib which has been compiled with debug support (so you'd get line 
> numbers in the back trace - that makes life _a lot_ easier and maybe I 
> could identify the problem that way) and the least optimization 
> possible ... :-)

ok, i'll try that too (but it will take some more time), but given how 
hard it was for me to trigger it, i wanted to get maximum info out of it 
before having to kill the threads.

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Christoph Pfister <[EMAIL PROTECTED]>:

2007/4/18, Ingo Molnar <[EMAIL PROTECTED]>:
>
> * Christoph Pfister <[EMAIL PROTECTED]> wrote:
>
> > >which thread would be the most interesting to you - 9324?
> >
> > The thread which should wake the main thread - but hmm ... 9303 seems
> > to be rather dead-locked than doing pthread_cond_timedwait() ?
>
> ok, here it is, 9303 with better symbol names:
>
> #0  0xe410 in __kernel_vsyscall ()
> #1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
> #2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
> #3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
> #4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
> #5  0xb7a9b0fb in KXineWidget::slotSeekToPosition ()
>from /usr/lib/kde3/libxinepart.so
> #6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
> #7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #8  0x4b55353b in QApplication::internalNotify ()
>from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #9  0x4b55526e in QApplication::notify ()
>from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
> #11 0x4b4dd5de in QETWidget::translateWheelEvent ()
>from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #12 0x4b4eb41d in QETWidget::translateMouseEvent ()
>from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #13 0x4b4e9766 in QApplication::x11ProcessEvent ()
>from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #14 0x4b4fb38b in QEventLoop::processEvents ()
>from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #15 0x4b56ce30 in QEventLoop::enterLoop ()
>from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
> #18 0x0806fc1a in QWidget::setUpdatesEnabled ()
> #19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
> #20 0x0806f7e1 in QWidget::setUpdatesEnabled ()
>
> does this make more sense to you?

It's nearly impossible for me to find out which mutex is deadlocking.
There are 4 mutexes locked / released during xine_play (or one of the
possibly inlined functions) and to be honest I have little idea which
other thread is also involved in the deadlock (maybe some xine-lib
junkie could help you more with that).
It would be great if you could reproduce the same problem with a
xine-lib which has been compiled with debug support (so you'd get line
numbers in the back trace - that makes life _a lot_ easier and maybe I
could identify the problem that way) and the least optimization
possible ... :-)

> Ingo


Or I could try playing around a bit with your patchset and trying to
reproduce it over here. Because I already have debug builds for
xine-lib and compiling a new kernel can take place in the background
it wouldn't be much effort for me.

Christoph


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Ingo Molnar <[EMAIL PROTECTED]>:


* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> >which thread would be the most interesting to you - 9324?
>
> The thread which should wake the main thread - but hmm ... 9303 seems
> to be dead-locked rather than doing pthread_cond_timedwait()?

ok, here it is, 9303 with better symbol names:

#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
#3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
#5  0xb7a9b0fb in KXineWidget::slotSeekToPosition ()
   from /usr/lib/kde3/libxinepart.so
#6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
#7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#8  0x4b55353b in QApplication::internalNotify ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#9  0x4b55526e in QApplication::notify ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
#11 0x4b4dd5de in QETWidget::translateWheelEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#12 0x4b4eb41d in QETWidget::translateMouseEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#13 0x4b4e9766 in QApplication::x11ProcessEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#14 0x4b4fb38b in QEventLoop::processEvents ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#15 0x4b56ce30 in QEventLoop::enterLoop ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#18 0x0806fc1a in QWidget::setUpdatesEnabled ()
#19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
#20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

does this make more sense to you?


It's nearly impossible for me to find out which mutex is deadlocking.
There are 4 mutexes locked / released during xine_play (or one of the
possibly inlined functions) and to be honest I have little idea which
other thread is also involved in the deadlock (maybe some xine-lib
junkie could help you more with that).
It would be great if you could reproduce the same problem with a
xine-lib which has been compiled with debug support (so you'd get line
numbers in the back trace - that makes life _a lot_ easier and maybe I
could identify the problem that way) and the least optimization
possible ... :-)


Ingo


Christoph


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> >which thread would be the most interesting to you - 9324?
> 
> The thread which should wake the main thread - but hmm ... 9303 seems 
> to be dead-locked rather than doing pthread_cond_timedwait()?

ok, here it is, 9303 with better symbol names:

#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
#3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
#5  0xb7a9b0fb in KXineWidget::slotSeekToPosition ()
   from /usr/lib/kde3/libxinepart.so
#6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
#7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#8  0x4b55353b in QApplication::internalNotify ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#9  0x4b55526e in QApplication::notify ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
#11 0x4b4dd5de in QETWidget::translateWheelEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#12 0x4b4eb41d in QETWidget::translateMouseEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#13 0x4b4e9766 in QApplication::x11ProcessEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#14 0x4b4fb38b in QEventLoop::processEvents ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#15 0x4b56ce30 in QEventLoop::enterLoop ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#18 0x0806fc1a in QWidget::setUpdatesEnabled ()
#19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
#20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

does this make more sense to you?

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Ingo Molnar <[EMAIL PROTECTED]>:


* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> >are the updated backtraces in the followup mail i just sent more
> >useful? (I still have that stuck session running so i can do whatever
> >debugging you'd like to see done.)
>
> QWidget::setUpdatesEnabled() is (wrongly) present in every thread
> except the main. So I'm afraid there's nothing which can be done :-/
> Btw the main thread is waiting for the first frame being displayed
> after the seek.

i didnt have all the debuginfo packages installed. I installed some (but
not all yet), here's an updated backtrace:

(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2539e1 in __lll_mutex_unlock_wake () from /lib/libpthread.so.0
#2  0x4a2506f9 in _L_mutex_unlock_99 () from /lib/libpthread.so.0
#3  0x4a250370 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0
#4  0x4a2506f0 in pthread_mutex_unlock () from /lib/libpthread.so.0
#5  0xb79fce5a in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#6  0xb7a4b90b in dvd_plugin_free_buffer (buf=0xb0745470) at input_dvd.c:570
#7  0xb7a030a2 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#8  0x4a24d2db in start_thread () from /lib/libpthread.so.0
#9  0x4a05820e in clone () from /lib/libc.so.6

at least the dvd_plugin_free_buffer() call has been resolved now. (I'll
hunt for the other debuginfo packages too.)

which thread would be the most interesting to you - 9324?


The thread which should wake the main thread - but hmm ... 9303 seems
to be dead-locked rather than doing pthread_cond_timedwait()?


Ingo


Christoph


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> >are the updated backtraces in the followup mail i just sent more 
> >useful? (I still have that stuck session running so i can do whatever 
> >debugging you'd like to see done.)
> 
> QWidget::setUpdatesEnabled() is (wrongly) present in every thread 
> except the main. So I'm afraid there's nothing which can be done :-/ 
> Btw the main thread is waiting for the first frame being displayed 
> after the seek.

i didnt have all the debuginfo packages installed. I installed some (but 
not all yet), here's an updated backtrace:

(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2539e1 in __lll_mutex_unlock_wake () from /lib/libpthread.so.0
#2  0x4a2506f9 in _L_mutex_unlock_99 () from /lib/libpthread.so.0
#3  0x4a250370 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0
#4  0x4a2506f0 in pthread_mutex_unlock () from /lib/libpthread.so.0
#5  0xb79fce5a in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#6  0xb7a4b90b in dvd_plugin_free_buffer (buf=0xb0745470) at input_dvd.c:570
#7  0xb7a030a2 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#8  0x4a24d2db in start_thread () from /lib/libpthread.so.0
#9  0x4a05820e in clone () from /lib/libc.so.6

at least the dvd_plugin_free_buffer() call has been resolved now. (I'll 
hunt for the other debuginfo packages too.)

which thread would be the most interesting to you - 9324?

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Ingo Molnar <[EMAIL PROTECTED]>:


* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> >backtrace:
> >
> > #0  0xe410 in __kernel_vsyscall ()
> > #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from
> > /lib/libpthread.so.0
> > #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
> > #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
> > #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
> > #5  0x4a05820e in clone () from /lib/libc.so.6
>
> This backtrace is useless - QWidget::setUpdatesEnabled() is certainly
> _not_ defined in libxine. So the function names in #2 and #3 are wrong
> because the addresses seem to belong to libxine.

are the updated backtraces in the followup mail i just sent more useful?
(I still have that stuck session running so i can do whatever debugging
you'd like to see done.)


QWidget::setUpdatesEnabled() is (wrongly) present in every thread
except the main. So I'm afraid there's nothing which can be done :-/
Btw the main thread is waiting for the first frame being displayed
after the seek.


Ingo


Christoph


Re: Kaffeine problem with CFS

2007-04-18 Thread Mike Galbraith
On Wed, 2007-04-18 at 11:01 +0200, Ingo Molnar wrote:
> * Christoph Pfister <[EMAIL PROTECTED]> wrote:
> 
> > >backtrace:
> > >
> > > #0  0xe410 in __kernel_vsyscall ()
> > > #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> > > /lib/libpthread.so.0
> > > #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
> > > #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
> > > #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
> > > #5  0x4a05820e in clone () from /lib/libc.so.6
> > 
> > This backtrace is useless - QWidget::setUpdatesEnabled() is certainly 
> > _not_ defined in libxine. So the function names in #2 and #3 are wrong 
> > because the addresses seem to belong to libxine.
> 
> are the updated backtraces in the followup mail i just sent more useful? 
> (I still have that stuck session running so i can do whatever debugging 
> you'd like to see done.)

The xine website release notes say there are playback problems with
xine-lib version 1.1.5, so people encountering this may want to check
whether they're running 1.1.5, and either upgrade to the latest version or
downgrade to 1.1.4.



18.04.2007   xine-lib 1.1.6   A new xine-lib version is now available.
This is mainly a bug-fix release; 1.1.5 has CD audio and DVD playback
problems and a couple of X-related build problems.

-Mike



Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> these were only the threads that showed up in htop. Here's a full 
> analysis about what all threads are doing:
> 
>  Process 9303: stuck in xine_play()/pthread_mutex_lock()
>  Process 9319:  stuck in pthread_cond_timedwait()
>  Process 9320:  stuck in pthread_cond_timedwait()
>  Process 9321: loop of ~3 msec nanosleeps
>  Process 9322: loop of poll() calls every 335 msecs
>  Process 9323:  stuck in pthread_cond_timedwait()
>  Process 9324: stuck in a loop of 1-second futex-waits + mmap/munmap (malloc)
>  Process 9325:  stuck in pthread_cond_timedwait()
>  Process 9326:  stuck in pthread_cond_timedwait()
>  Process 9327:  stuck in pthread_cond_timedwait()

and here's a top snapshot:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9324 mingo 20   0  300m  59m  17m R 96.4  6.8  21:00.55 kaffeine
 9325 mingo 20   0  300m  59m  17m S  2.0  6.8   0:15.57 kaffeine
 9327 mingo 20   0  300m  59m  17m S  2.0  6.8   0:20.10 kaffeine

so 9324 doing the mpeg decoding seems to be stuck somehow?

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> >backtrace:
> >
> > #0  0xe410 in __kernel_vsyscall ()
> > #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> > /lib/libpthread.so.0
> > #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
> > #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
> > #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
> > #5  0x4a05820e in clone () from /lib/libc.so.6
> 
> This backtrace is useless - QWidget::setUpdatesEnabled() is certainly 
> _not_ defined in libxine. So the function names in #2 and #3 are wrong 
> because the addresses seem to belong to libxine.

are the updated backtraces in the followup mail i just sent more useful? 
(I still have that stuck session running so i can do whatever debugging 
you'd like to see done.)

Ingo


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

Hi,

2007/4/18, Ingo Molnar <[EMAIL PROTECTED]>:


[ i've Cc:-ed Ulrich Drepper, this CFS-triggered hang seems to have some
  futex and pthread_cond_wait() relevance. ]

* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> >> > [1] http://cekirdek.pardus.org.tr/~caglar/strace.kaffeine
>
> Could you try xine-ui or gxine? Because I suspect rather xine-lib for
> freezing issues. In any way I think a gdb backtrace would be much
> nicer - but if you can't reproduce the freeze issue with other xine
> based players and want to run kaffeine in gdb, you need to execute
> "gdb --args kaffeine --nofork".

update: i've reproduced one kind of a hang but i'm not sure it's the
same hang Ismail is seeing. It was quite hard to trigger it under CFS, i
had to do wild forward/backward button seeks on a real DVD and i mixed
it with CPU-intense workloads on the same box. Here are the straces and
gdb backtraces:

kaffeine thread PID 9303, waiting for other threads to do something,
stuck in pthread_mutex_lock():

  futex(0xb07409e0, FUTEX_WAIT, 2, NULL <unfinished ...>

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
 #2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
 #3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
 #4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
 #5  0xb7a9b0fb in KXineWidget::slotSeekToPosition () from 
/usr/lib/kde3/libxinepart.so
 #6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
 #7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #8  0x4b55353b in QApplication::internalNotify () from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #9  0x4b55526e in QApplication::notify ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
 #11 0x4b4dd5de in QETWidget::translateWheelEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #12 0x4b4eb41d in QETWidget::translateMouseEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #13 0x4b4e9766 in QApplication::x11ProcessEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #14 0x4b4fb38b in QEventLoop::processEvents ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #15 0x4b56ce30 in QEventLoop::enterLoop ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #18 0x0806fc1a in QWidget::setUpdatesEnabled ()
 #19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
 #20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

Kaffeine thread 9324, seems to be in an infinite pthread_cond_wait()
loop that does:

 futex(0xb0740b78, FUTEX_WAIT, 3559, NULL) = 0
 futex(0xb0740b5c, FUTEX_WAKE, 1)= 0
 munmap(0xaacb1000, 1662976) = 0
 mmap2(NULL, 1662976, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xaacb1000
 gettimeofday({1176891363, 347259}, NULL) = 0
 munmap(0xab309000, 1662976) = 0

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
 #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6


This backtrace is useless - QWidget::setUpdatesEnabled() is certainly
_not_ defined in libxine. So the function names in #2 and #3 are wrong
because the addresses seem to belong to libxine.


Kaffeine thread 9325 does a loop of short pthread_cond_wait() futex
sleeps:

 1176891721.419314 futex(0xb07527e8, FUTEX_WAIT, 8537, NULL) = 0 <0.011710>
 1176891721.431068 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 <0.06>
 1176891721.431429 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 <0.08>
 1176891721.431458 futex(0xb0740be8, FUTEX_WAKE, 1) = 1 <0.12>
 1176891721.431489 futex(0xb07527e8, FUTEX_WAIT, 8539, NULL) = 0 <0.007339>
 1176891721.439008 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 <0.52>
 1176891721.439510 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 <0.55>
 1176891721.439636 futex(0xb0740be8, FUTEX_WAKE, 1) = 1 <0.89>
 1176891721.439789 futex(0xb07527e8, FUTEX_WAIT, 8541, NULL) = 0 <0.007045>
 1176891721.447017 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 <0.54>
 1176891721.447682 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 <0.65>

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
 #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0xb7a04079 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6


Ditto.


library versions:

 xine-lib-1.1.5-1.fc7
 xine-plugin-1.0-3.fc7
 glibc-headers-2.5.90-21
 

Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> update: i've reproduced one kind of a hang but i'm not sure it's the 
> same hang Ismail is seeing. It was quite hard to trigger it under CFS, 
> i had to do wild forward/backward button seeks on a real DVD and i 
> mixed it with CPU-intense workloads on the same box. Here are the 
> straces and gdb backtraces:

these were only the threads that showed up in htop. Here's a full 
analysis about what all threads are doing:

 Process 9303: stuck in xine_play()/pthread_mutex_lock()
 Process 9319:  stuck in pthread_cond_timedwait()
 Process 9320:  stuck in pthread_cond_timedwait()
 Process 9321: loop of ~3 msec nanosleeps
 Process 9322: loop of poll() calls every 335 msecs
 Process 9323:  stuck in pthread_cond_timedwait()
 Process 9324: stuck in a loop of 1-second futex-waits + mmap/munmap (malloc)
 Process 9325:  stuck in pthread_cond_timedwait()
 Process 9326:  stuck in pthread_cond_timedwait()
 Process 9327:  stuck in pthread_cond_timedwait()

now here's a weird thing: occasionally, when i strace one of the 
threads, i can get a single frame refreshed in the Kaffeine window - but 
the general picture does not change, the same 'stuck' state is still 
there.

most threads are sitting in:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a25134c in pthread_cond_timedwait@@GLIBC_2.3.2 ()   from 
/lib/libpthread.so.0
 #2  0xb79f9a05 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #4  0x4a05820e in clone () from /lib/libc.so.6

9324 is looping around this place, apparently in the opengl video output 
driver, but the backtrace is not always this one:

 (gdb) bt
 #0  0x49ff7257 in memset () from /lib/libc.so.6
 #1  0x49ff1877 in calloc () from /lib/libc.so.6
 #2  0xb7a224d6 in xine_xmalloc_aligned () from /usr/lib/libxine.so.1
 #3  0xb708c8f6 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/xineplug_vo_out_opengl.so
 #4  0xb7a0525a in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #5  0xb78944e4 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/post/xineplug_post_tvtime.so
 #6  0xb7895234 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/post/xineplug_post_tvtime.so
 #7  0xad4e5439 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/xineplug_decode_mpeg2.so
 #8  0xad4fa8e1 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/xineplug_decode_mpeg2.so
 #9  0xb7a032d6 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #10 0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #11 0x4a05820e in clone () from /lib/libc.so.6

9321 is sitting in:

(gdb) bt
 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2544a6 in nanosleep () from /lib/libpthread.so.0
 #2  0xb7a222fa in xine_usec_sleep () from /usr/lib/libxine.so.1
 #3  0xb7a073bb in QWidget::setUpdatesEnabled () from  /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6

9322 is in poll():

(gdb) bt
 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a04e533 in poll () from /lib/libc.so.6
 #2  0xb12e1f75 in QWidget::setUpdatesEnabled () from 
/usr/lib/xine/plugins/1.1.5/xineplug_ao_out_alsa.so
 #3  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #4  0x4a05820e in clone () from /lib/libc.so.6

9303 is stuck in xine_play(), pthread_mutex_lock():

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
 #2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
 #3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
 #4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
 #5  0xb7a9b0fb in KXineWidget::slotSeekToPosition () from 
/usr/lib/kde3/libxinepart.so
 #6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
 #7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #8  0x4b55353b in QApplication::internalNotify () from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #9  0x4b55526e in QApplication::notify ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
 #11 0x4b4dd5de in QETWidget::translateWheelEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #12 0x4b4eb41d in QETWidget::translateMouseEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #13 0x4b4e9766 in QApplication::x11ProcessEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #14 0x4b4fb38b in QEventLoop::processEvents ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #15 0x4b56ce30 in QEventLoop::enterLoop ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #18 0x0806fc1a in QWidget::setUpdatesEnabled ()
 #19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
 #20 0x0806f7e1 in 

Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

[ i've Cc:-ed Ulrich Drepper, this CFS-triggered hang seems to have some 
  futex and pthread_cond_wait() relevance. ]

* Christoph Pfister <[EMAIL PROTECTED]> wrote:

> >> > [1] http://cekirdek.pardus.org.tr/~caglar/strace.kaffeine
> 
> Could you try xine-ui or gxine? Because I suspect rather xine-lib for 
> freezing issues. In any way I think a gdb backtrace would be much 
> nicer - but if you can't reproduce the freeze issue with other xine 
> based players and want to run kaffeine in gdb, you need to execute 
> "gdb --args kaffeine --nofork".

update: i've reproduced one kind of a hang but i'm not sure it's the 
same hang Ismail is seeing. It was quite hard to trigger it under CFS, i 
had to do wild forward/backward button seeks on a real DVD and i mixed 
it with CPU-intense workloads on the same box. Here are the straces and 
gdb backtraces:

kaffeine thread PID 9303, waiting for other threads to do something, 
stuck in pthread_mutex_lock():

  futex(0xb07409e0, FUTEX_WAIT, 2, NULL <unfinished ...>

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
 #2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
 #3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
 #4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
 #5  0xb7a9b0fb in KXineWidget::slotSeekToPosition () from 
/usr/lib/kde3/libxinepart.so
 #6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
 #7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #8  0x4b55353b in QApplication::internalNotify () from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #9  0x4b55526e in QApplication::notify ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
 #11 0x4b4dd5de in QETWidget::translateWheelEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #12 0x4b4eb41d in QETWidget::translateMouseEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #13 0x4b4e9766 in QApplication::x11ProcessEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #14 0x4b4fb38b in QEventLoop::processEvents ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #15 0x4b56ce30 in QEventLoop::enterLoop ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #18 0x0806fc1a in QWidget::setUpdatesEnabled ()
 #19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
 #20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

Kaffeine thread 9324, seems to be in an infinite pthread_cond_wait() 
loop that does:

 futex(0xb0740b78, FUTEX_WAIT, 3559, NULL) = 0
 futex(0xb0740b5c, FUTEX_WAKE, 1)= 0
 munmap(0xaacb1000, 1662976) = 0
 mmap2(NULL, 1662976, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xaacb1000
 gettimeofday({1176891363, 347259}, NULL) = 0
 munmap(0xab309000, 1662976) = 0

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
 #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6
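
[ One assumption worth noting about the loop above - glibc behaviour, not
  something visible in the trace itself: malloc() requests larger than the
  mmap threshold (128 kB by default) are served by a private anonymous mmap
  and given back with munmap on free(), so a frame-sized allocate/release
  per iteration looks exactly like those mmap2()/munmap() pairs. ]

#include <stdlib.h>

int main(void)
{
        for (int i = 0; i < 3; i++) {
                void *frame = malloc(1662976); /* > 128 kB: mmap2() in strace */

                if (!frame)
                        return 1;
                /* ... decode into the buffer ... */
                free(frame);                   /* munmap() in strace */
        }
        return 0;
}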

Kaffeine thread 9325 does a loop of short pthread_cond_wait() futex 
sleeps:

 1176891721.419314 futex(0xb07527e8, FUTEX_WAIT, 8537, NULL) = 0 <0.011710>
 1176891721.431068 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 <0.06>
 1176891721.431429 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 <0.08>
 1176891721.431458 futex(0xb0740be8, FUTEX_WAKE, 1) = 1 <0.12>
 1176891721.431489 futex(0xb07527e8, FUTEX_WAIT, 8539, NULL) = 0 <0.007339>
 1176891721.439008 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 <0.52>
 1176891721.439510 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 <0.55>
 1176891721.439636 futex(0xb0740be8, FUTEX_WAKE, 1) = 1 <0.89>
 1176891721.439789 futex(0xb07527e8, FUTEX_WAIT, 8541, NULL) = 0 <0.007045>
 1176891721.447017 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 <0.54>
 1176891721.447682 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 <0.65>

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
 #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0xb7a04079 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6

library versions:

 xine-lib-1.1.5-1.fc7
 xine-plugin-1.0-3.fc7
 glibc-headers-2.5.90-21
 glibc-common-2.5.90-21
 glibc-2.5.90-21
 glibc-devel-2.5.90-21
 gxine-0.5.11-3.fc7
 kaffeine-0.8.3-4.fc7
 xine-0.99.4-11.lvn7
 xine-lib-extras-1.1.5-1.fc7
 gxine-mozplugin-0.5.11-3.fc7

what's weird is that all threads are in a pthread op and seem to be 

Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

[ i've Cc:-ed Ulrich Drepper, this CFS-triggered hang seems to have some 
  futex and pthread_cond_wait() relevance. ]

* Christoph Pfister [EMAIL PROTECTED] wrote:

   [1] http://cekirdek.pardus.org.tr/~caglar/strace.kaffeine
 
 Could you try xine-ui or gxine? Because I suspect rather xine-lib for 
 freezing issues. In any way I think a gdb backtrace would be much 
 nicer - but if you can't reproduce the freeze issue with other xine 
 based players and want to run kaffeine in gdb, you need to execute 
 gdb --args kaffeine --nofork.

update: i've reproduced one kind of a hang but i'm not sure it's the 
same hang Ismail is seeing. It was quite hard to trigger it under CFS, i 
had to do wild forward/backward button seeks on a real DVD and i mixed 
it with CPU-intense workloads on the same box. Here are the straces and 
gdb backtraces:

kaffeine thread PID 9303, waiting for other threads to do something, 
stuck in pthread_mutex_lock():

  futex(0xb07409e0, FUTEX_WAIT, 2, NULL <unfinished ...>

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
 #2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
 #3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
 #4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
 #5  0xb7a9b0fb in KXineWidget::slotSeekToPosition () from 
/usr/lib/kde3/libxinepart.so
 #6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
 #7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #8  0x4b55353b in QApplication::internalNotify () from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #9  0x4b55526e in QApplication::notify ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
 #11 0x4b4dd5de in QETWidget::translateWheelEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #12 0x4b4eb41d in QETWidget::translateMouseEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #13 0x4b4e9766 in QApplication::x11ProcessEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #14 0x4b4fb38b in QEventLoop::processEvents ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #15 0x4b56ce30 in QEventLoop::enterLoop ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #18 0x0806fc1a in QWidget::setUpdatesEnabled ()
 #19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
 #20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

Kaffeine thread 9324, seems to be in an infinite pthread_cond_wait() 
loop that does:

 futex(0xb0740b78, FUTEX_WAIT, 3559, NULL) = 0
 futex(0xb0740b5c, FUTEX_WAKE, 1)= 0
 munmap(0xaacb1000, 1662976) = 0
 mmap2(NULL, 1662976, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xaacb1000
 gettimeofday({1176891363, 347259}, NULL) = 0
 munmap(0xab309000, 1662976) = 0

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
 #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6

Kaffine thread 9325 does a loop of short pthread_cond_wait() futex 
sleeps:

 1176891721.419314 futex(0xb07527e8, FUTEX_WAIT, 8537, NULL) = 0 0.011710
 1176891721.431068 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 0.06
 1176891721.431429 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 0.08
 1176891721.431458 futex(0xb0740be8, FUTEX_WAKE, 1) = 1 0.12
 1176891721.431489 futex(0xb07527e8, FUTEX_WAIT, 8539, NULL) = 0 0.007339
 1176891721.439008 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 0.52
 1176891721.439510 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 0.55
 1176891721.439636 futex(0xb0740be8, FUTEX_WAKE, 1) = 1 0.89
 1176891721.439789 futex(0xb07527e8, FUTEX_WAIT, 8541, NULL) = 0 0.007045
 1176891721.447017 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 0.54
 1176891721.447682 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 0.65

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
 #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0xb7a04079 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6

library versions:

 xine-lib-1.1.5-1.fc7
 xine-plugin-1.0-3.fc7
 glibc-headers-2.5.90-21
 glibc-common-2.5.90-21
 glibc-2.5.90-21
 glibc-devel-2.5.90-21
 gxine-0.5.11-3.fc7
 kaffeine-0.8.3-4.fc7
 xine-0.99.4-11.lvn7
 xine-lib-extras-1.1.5-1.fc7
 gxine-mozplugin-0.5.11-3.fc7

what's weird is that all threads are in a pthread op and seem to be kind 
of busy-looping. 

Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

Hi,

2007/4/18, Ingo Molnar [EMAIL PROTECTED]:


[ i've Cc:-ed Ulrich Drepper, this CFS-triggered hang seems to have some
  futex and pthread_cond_wait() relevance. ]

* Christoph Pfister [EMAIL PROTECTED] wrote:

   [1] http://cekirdek.pardus.org.tr/~caglar/strace.kaffeine

 Could you try xine-ui or gxine? Because I suspect rather xine-lib for
 freezing issues. In any way I think a gdb backtrace would be much
 nicer - but if you can't reproduce the freeze issue with other xine
 based players and want to run kaffeine in gdb, you need to execute
 gdb --args kaffeine --nofork.

update: i've reproduced one kind of a hang but i'm not sure it's the
same hang Ismail is seeing. It was quite hard to trigger it under CFS, i
had to do wild forward/backward button seeks on a real DVD and i mixed
it with CPU-intense workloads on the same box. Here are the straces and
gdb backtraces:

kaffeine thread PID 9303, waiting for other threads to do something,
stuck in pthread_mutex_lock():

  futex(0xb07409e0, FUTEX_WAIT, 2, NULL <unfinished ...>

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
 #2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
 #3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
 #4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
 #5  0xb7a9b0fb in KXineWidget::slotSeekToPosition () from 
/usr/lib/kde3/libxinepart.so
 #6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
 #7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #8  0x4b55353b in QApplication::internalNotify () from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #9  0x4b55526e in QApplication::notify ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
 #11 0x4b4dd5de in QETWidget::translateWheelEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #12 0x4b4eb41d in QETWidget::translateMouseEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #13 0x4b4e9766 in QApplication::x11ProcessEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #14 0x4b4fb38b in QEventLoop::processEvents ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #15 0x4b56ce30 in QEventLoop::enterLoop ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #18 0x0806fc1a in QWidget::setUpdatesEnabled ()
 #19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
 #20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

Kaffeine thread 9324, seems to be in an infinite pthread_cond_wait()
loop that does:

 futex(0xb0740b78, FUTEX_WAIT, 3559, NULL) = 0
 futex(0xb0740b5c, FUTEX_WAKE, 1)= 0
 munmap(0xaacb1000, 1662976) = 0
 mmap2(NULL, 1662976, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xaacb1000
 gettimeofday({1176891363, 347259}, NULL) = 0
 munmap(0xab309000, 1662976) = 0

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
 #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6


This backtrace is useless - QWidget::setUpdatesEnabled() is certainly
_not_ defined in libxine. So the function names in #2 and #3 are wrong
because the addresses seem to belong to libxine.


Kaffine thread 9325 does a loop of short pthread_cond_wait() futex
sleeps:

 1176891721.419314 futex(0xb07527e8, FUTEX_WAIT, 8537, NULL) = 0 0.011710
 1176891721.431068 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 0.06
 1176891721.431429 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 0.08
 1176891721.431458 futex(0xb0740be8, FUTEX_WAKE, 1) = 1 0.12
 1176891721.431489 futex(0xb07527e8, FUTEX_WAIT, 8539, NULL) = 0 0.007339
 1176891721.439008 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 0.52
 1176891721.439510 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 0.55
 1176891721.439636 futex(0xb0740be8, FUTEX_WAKE, 1) = 1 0.89
 1176891721.439789 futex(0xb07527e8, FUTEX_WAIT, 8541, NULL) = 0 0.007045
 1176891721.447017 futex(0xb07527cc, FUTEX_WAKE, 1) = 0 0.54
 1176891721.447682 futex(0xb0740c04, 0x5 /* FUTEX_??? */, 1) = 1 0.65

backtrace:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
 #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0xb7a04079 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6


Ditto.


library versions:

 xine-lib-1.1.5-1.fc7
 xine-plugin-1.0-3.fc7
 glibc-headers-2.5.90-21
 glibc-common-2.5.90-21
 glibc-2.5.90-21
 

Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 update: i've reproduced one kind of a hang but i'm not sure it's the 
 same hang Ismail is seeing. It was quite hard to trigger it under CFS, 
 i had to do wild forward/backward button seeks on a real DVD and i 
 mixed it with CPU-intense workloads on the same box. Here are the 
 straces and gdb backtraces:

these were only the threads that showed up in htop. Here's a full 
analysis about what all threads are doing:

 Process 9303: stuck in xine_play()/pthread_mutex_lock()
 Process 9319:  stuck in pthread_cond_timedwait()
 Process 9320:  stuck in pthread_cond_timedwait()
 Process 9321: loop of ~3 msec nanosleeps
 Process 9322: loop of poll() calls every 335 msecs
 Process 9323:  stuck in pthread_cond_timedwait()
 Process 9324: stuck in a loop of 1-second futex-waits + mmap/munmap (malloc)
 Process 9325:  stuck in pthread_cond_timedwait()
 Process 9326:  stuck in pthread_cond_timedwait()
 Process 9327:  stuck in pthread_cond_timedwait()

now here's a weird thing: occasionally, when i strace one of the 
threads, i can get a single frame refreshed in the Kaffeine window - but 
the general picture does not change, the same 'stuck' state is still 
there.

most threads are sitting in:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a25134c in pthread_cond_timedwait@@GLIBC_2.3.2 ()   from 
/lib/libpthread.so.0
 #2  0xb79f9a05 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #3  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #4  0x4a05820e in clone () from /lib/libc.so.6

9324 is looping around this place, apparently in the opengl video output 
driver, but the backtrace is not always this one:

 (gdb) bt
 #0  0x49ff7257 in memset () from /lib/libc.so.6
 #1  0x49ff1877 in calloc () from /lib/libc.so.6
 #2  0xb7a224d6 in xine_xmalloc_aligned () from /usr/lib/libxine.so.1
 #3  0xb708c8f6 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/xineplug_vo_out_opengl.so
 #4  0xb7a0525a in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #5  0xb78944e4 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/post/xineplug_post_tvtime.so
 #6  0xb7895234 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/post/xineplug_post_tvtime.so
 #7  0xad4e5439 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/xineplug_decode_mpeg2.so
 #8  0xad4fa8e1 in QWidget::setUpdatesEnabled ()
from /usr/lib/xine/plugins/1.1.5/xineplug_decode_mpeg2.so
 #9  0xb7a032d6 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
 #10 0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #11 0x4a05820e in clone () from /lib/libc.so.6

9321 is sitting in:

(gdb) bt
 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2544a6 in nanosleep () from /lib/libpthread.so.0
 #2  0xb7a222fa in xine_usec_sleep () from /usr/lib/libxine.so.1
 #3  0xb7a073bb in QWidget::setUpdatesEnabled () from  /usr/lib/libxine.so.1
 #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #5  0x4a05820e in clone () from /lib/libc.so.6

9322 is in poll():

(gdb) bt
 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a04e533 in poll () from /lib/libc.so.6
 #2  0xb12e1f75 in QWidget::setUpdatesEnabled () from 
/usr/lib/xine/plugins/1.1.5/xineplug_ao_out_alsa.so
 #3  0x4a24d2db in start_thread () from /lib/libpthread.so.0
 #4  0x4a05820e in clone () from /lib/libc.so.6

9303 is stuck in xine_play(), pthread_mutex_lock():

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
 #2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
 #3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
 #4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
 #5  0xb7a9b0fb in KXineWidget::slotSeekToPosition () from 
/usr/lib/kde3/libxinepart.so
 #6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
 #7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #8  0x4b55353b in QApplication::internalNotify () from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #9  0x4b55526e in QApplication::notify ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
 #11 0x4b4dd5de in QETWidget::translateWheelEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #12 0x4b4eb41d in QETWidget::translateMouseEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #13 0x4b4e9766 in QApplication::x11ProcessEvent ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #14 0x4b4fb38b in QEventLoop::processEvents ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #15 0x4b56ce30 in QEventLoop::enterLoop ()   from 
/usr/lib/qt-3.3/lib/libqt-mt.so.3
 #16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #18 0x0806fc1a in QWidget::setUpdatesEnabled ()
 #19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
 #20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister [EMAIL PROTECTED] wrote:

 backtrace:
 
  #0  0xe410 in __kernel_vsyscall ()
  #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
  /lib/libpthread.so.0
  #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
  #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
  #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
  #5  0x4a05820e in clone () from /lib/libc.so.6
 
 This backtrace is useless - QWidget::setUpdatesEnabled() is certainly 
 _not_ defined in libxine. So the function names in #2 and #3 are wrong 
 because the addresses seem to belong to libxine.

are the updated backtraces in the followup mail i just sent more useful? 
(I still have that stuck session running so i can do whatever debugging 
you'd like to see done.)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 these were only the threads that showed up in htop. Here's a full 
 analysis about what all threads are doing:
 
  Process 9303: stuck in xine_play()/pthread_mutex_lock()
  Process 9319:  stuck in pthread_cond_timedwait()
  Process 9320:  stuck in pthread_cond_timedwait()
  Process 9321: loop of ~3 msec nanosleeps
  Process 9322: loop of poll() calls every 335 msecs
  Process 9323:  stuck in pthread_cond_timedwait()
  Process 9324: stuck in a loop of 1-second futex-waits + mmap/munmap (malloc)
  Process 9325:  stuck in pthread_cond_timedwait()
  Process 9326:  stuck in pthread_cond_timedwait()
  Process 9327:  stuck in pthread_cond_timedwait()

and here's a top snapshot:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9324 mingo 20   0  300m  59m  17m R 96.4  6.8  21:00.55 kaffeine
 9325 mingo 20   0  300m  59m  17m S  2.0  6.8   0:15.57 kaffeine
 9327 mingo 20   0  300m  59m  17m S  2.0  6.8   0:20.10 kaffeine

so 9324 doing the mpeg decoding seems to be stuck somehow?

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Mike Galbraith
On Wed, 2007-04-18 at 11:01 +0200, Ingo Molnar wrote:
 * Christoph Pfister [EMAIL PROTECTED] wrote:
 
  backtrace:
  
   #0  0xe410 in __kernel_vsyscall ()
   #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
   /lib/libpthread.so.0
   #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
   #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
   #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
   #5  0x4a05820e in clone () from /lib/libc.so.6
  
  This backtrace is useless - QWidget::setUpdatesEnabled() is certainly 
  _not_ defined in libxine. So the function names in #2 and #3 are wrong 
  because the addresses seem to belong to libxine.
 
 are the updated backtraces in the followup mail i just sent more useful? 
 (I still have that stuck session running so i can do whatever debugging 
 you'd like to see done.)

The xine website release note says there are problems with playback with
xine-lib version 1.1.5, so people encountering this may want to check to
see if they're running 1.1.5, and either upgrade to the latest, or
downgrade to 1.1.4.

snippet from xine website

18.04.2007   xine-lib 1.1.6   A new xine-lib version is now available.
This is mainly a bug-fix release; 1.1.5 has CD audio and DVD playback
problems and a couple of X-related build problems.

-Mike

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Ingo Molnar [EMAIL PROTECTED]:


* Christoph Pfister [EMAIL PROTECTED] wrote:

 backtrace:
 
  #0  0xe410 in __kernel_vsyscall ()
  #1  0x4a2510c6 in pthread_cond_wait@@GLIBC_2.3.2 () from
  /lib/libpthread.so.0
  #2  0xb79fd1a8 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
  #3  0xb7a030ab in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
  #4  0x4a24d2db in start_thread () from /lib/libpthread.so.0
  #5  0x4a05820e in clone () from /lib/libc.so.6

 This backtrace is useless - QWidget::setUpdatesEnabled() is certainly
 _not_ defined in libxine. So the function names in #2 and #3 are wrong
 because the addresses seem to belong to libxine.

are the updated backtraces in the followup mail i just sent more useful?
(I still have that stuck session running so i can do whatever debugging
you'd like to see done.)


QWidget::setUpdatesEnabled() is (wrongly) present in every thread
except the main. So I'm afraid there's nothing which can be done :-/
Btw the main thread is waiting for the first frame being displayed
after the seek.


Ingo


Christoph
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister [EMAIL PROTECTED] wrote:

 are the updated backtraces in the followup mail i just sent more 
  useful? (I still have that stuck session running so i can do whatever 
 debugging you'd like to see done.)
 
 QWidget::setUpdatesEnabled() is (wrongly) present in every thread 
 except the main. So I'm afraid there's nothing which can be done :-/ 
 Btw the main thread is waiting for the first frame being displayed 
 after the seek.

i didn't have all the debuginfo packages installed. I installed some (but 
not all yet), here's an updated backtrace:

(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2539e1 in __lll_mutex_unlock_wake () from /lib/libpthread.so.0
#2  0x4a2506f9 in _L_mutex_unlock_99 () from /lib/libpthread.so.0
#3  0x4a250370 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0
#4  0x4a2506f0 in pthread_mutex_unlock () from /lib/libpthread.so.0
#5  0xb79fce5a in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#6  0xb7a4b90b in dvd_plugin_free_buffer (buf=0xb0745470) at input_dvd.c:570
#7  0xb7a030a2 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#8  0x4a24d2db in start_thread () from /lib/libpthread.so.0
#9  0x4a05820e in clone () from /lib/libc.so.6

at least the dvd_plugin_free_buffer() call has been resolved now. (I'll 
hunt for the other debuginfo packages too.)

which thread would be the most interesting to you - 9324?

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Ingo Molnar [EMAIL PROTECTED]:


* Christoph Pfister [EMAIL PROTECTED] wrote:

 are the updated backtraces in the followup mail i just sent more
  useful? (I still have that stuck session running so i can do whatever
 debugging you'd like to see done.)

 QWidget::setUpdatesEnabled() is (wrongly) present in every thread
 except the main. So I'm afraid there's nothing which can be done :-/
 Btw the main thread is waiting for the first frame being displayed
 after the seek.

i didn't have all the debuginfo packages installed. I installed some (but
not all yet), here's an updated backtrace:

(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2539e1 in __lll_mutex_unlock_wake () from /lib/libpthread.so.0
#2  0x4a2506f9 in _L_mutex_unlock_99 () from /lib/libpthread.so.0
#3  0x4a250370 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0
#4  0x4a2506f0 in pthread_mutex_unlock () from /lib/libpthread.so.0
#5  0xb79fce5a in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#6  0xb7a4b90b in dvd_plugin_free_buffer (buf=0xb0745470) at input_dvd.c:570
#7  0xb7a030a2 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#8  0x4a24d2db in start_thread () from /lib/libpthread.so.0
#9  0x4a05820e in clone () from /lib/libc.so.6

at least the dvd_plugin_free_buffer() call has been resolved now. (I'll
hunt for the other debuginfo packages too.)

which thread would be the most interesting to you - 9324?


The thread which should wake the main thread - but hmm ... 9303 seems
to be dead-locked rather than doing pthread_cond_timedwait()?


Ingo


Christoph
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister [EMAIL PROTECTED] wrote:

 which thread would be the most interesting to you - 9324?
 
 The thread which should wake the main thread - but hmm ... 9303 seems 
 to be dead-locked rather than doing pthread_cond_timedwait()?

ok, here it is, 9303 with better symbol names:

#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
#3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
#5  0xb7a9b0fb in KXineWidget::slotSeekToPosition ()
   from /usr/lib/kde3/libxinepart.so
#6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
#7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#8  0x4b55353b in QApplication::internalNotify ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#9  0x4b55526e in QApplication::notify ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
#11 0x4b4dd5de in QETWidget::translateWheelEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#12 0x4b4eb41d in QETWidget::translateMouseEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#13 0x4b4e9766 in QApplication::x11ProcessEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#14 0x4b4fb38b in QEventLoop::processEvents ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#15 0x4b56ce30 in QEventLoop::enterLoop ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#18 0x0806fc1a in QWidget::setUpdatesEnabled ()
#19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
#20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

does this make more sense to you?

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Ingo Molnar [EMAIL PROTECTED]:


* Christoph Pfister [EMAIL PROTECTED] wrote:

 which thread would be the most interesting to you - 9324?

  The thread which should wake the main thread - but hmm ... 9303 seems
  to be dead-locked rather than doing pthread_cond_timedwait()?

ok, here it is, 9303 with better symbol names:

#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
#3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
#5  0xb7a9b0fb in KXineWidget::slotSeekToPosition ()
   from /usr/lib/kde3/libxinepart.so
#6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
#7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#8  0x4b55353b in QApplication::internalNotify ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#9  0x4b55526e in QApplication::notify ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
#11 0x4b4dd5de in QETWidget::translateWheelEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#12 0x4b4eb41d in QETWidget::translateMouseEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#13 0x4b4e9766 in QApplication::x11ProcessEvent ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#14 0x4b4fb38b in QEventLoop::processEvents ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#15 0x4b56ce30 in QEventLoop::enterLoop ()
   from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
#18 0x0806fc1a in QWidget::setUpdatesEnabled ()
#19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
#20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

does this make more sense to you?


It's nearly impossible for me to find out which mutex is deadlocking.
There are 4 mutexes locked / released during xine_play (or one of the
possibly inlined functions) and to be honest I have little idea which
other thread is also involved in the deadlock (maybe some xine-lib
junkie could help you more with that).
It would be great if you could reproduce the same problem with a
xine-lib which has been compiled with debug support (so you'd get line
numbers in the back trace - that makes life _a lot_ easier and maybe I
could identify the problem that way) and the least optimization
possible ... :-)


Ingo


Christoph
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Christoph Pfister [EMAIL PROTECTED]:

2007/4/18, Ingo Molnar [EMAIL PROTECTED]:

 * Christoph Pfister [EMAIL PROTECTED] wrote:

  which thread would be the most interesting to you - 9324?
 
   The thread which should wake the main thread - but hmm ... 9303 seems
   to be dead-locked rather than doing pthread_cond_timedwait()?

 ok, here it is, 9303 with better symbol names:

 #0  0xe410 in __kernel_vsyscall ()
 #1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
 #2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
 #3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
 #4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1
 #5  0xb7a9b0fb in KXineWidget::slotSeekToPosition ()
from /usr/lib/kde3/libxinepart.so
 #6  0xb7a9b3bc in KXineWidget::wheelEvent () from /usr/lib/kde3/libxinepart.so
 #7  0x4b5f9150 in QWidget::event () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #8  0x4b55353b in QApplication::internalNotify ()
from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #9  0x4b55526e in QApplication::notify ()
from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #10 0x4a72065e in KApplication::notify () from /usr/lib/libkdecore.so.4
 #11 0x4b4dd5de in QETWidget::translateWheelEvent ()
from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #12 0x4b4eb41d in QETWidget::translateMouseEvent ()
from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #13 0x4b4e9766 in QApplication::x11ProcessEvent ()
from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #14 0x4b4fb38b in QEventLoop::processEvents ()
from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #15 0x4b56ce30 in QEventLoop::enterLoop ()
from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #16 0x4b56cce6 in QEventLoop::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #17 0x4b55317f in QApplication::exec () from /usr/lib/qt-3.3/lib/libqt-mt.so.3
 #18 0x0806fc1a in QWidget::setUpdatesEnabled ()
 #19 0x49f9df10 in __libc_start_main () from /lib/libc.so.6
 #20 0x0806f7e1 in QWidget::setUpdatesEnabled ()

 does this make more sense to you?

It's nearly impossible for me to find out which mutex is deadlocking.
There are 4 mutexes locked / released during xine_play (or one of the
possibly inlined functions) and to be honest I have little idea which
other thread is also involved in the deadlock (maybe some xine-lib
junkie could help you more with that).
It would be great if you could reproduce the same problem with a
xine-lib which has been compiled with debug support (so you'd get line
numbers in the back trace - that makes life _a lot_ easier and maybe I
could identify the problem that way) and the least optimization
possible ... :-)

 Ingo


Or I could try playing around a bit with your patchset and trying to
reproduce it over here. Because I already have debug builds for
xine-lib and compiling a new kernel can take place in the background
it wouldn't be much effort for me.

Christoph
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister [EMAIL PROTECTED] wrote:

 It's nearly impossible for me to find out which mutex is deadlocking.

i've disassembled the xine_play function, and here are the function 
calls in it:

  unresolved widget call?
 pthread_mutex_lock()
 xine_log()
  unresolved widget call?
 function pointer call
 right after it: pthread_mutex_lock()

this second pthread_mutex_lock() in question is the one that deadlocks. 
It comes right after that function pointer call, maybe that identifies 
it?

[some time passes]

i rebuilt the library from source and while the installed library is 
different from it, looking at the disassembly i'm quite sure it's this 
pthread_mutex_lock() in xine_play_internal():

  pthread_mutex_lock( stream->demux_lock );

src/xine-engine/xine.c:1201

the function pointer call was:

  stream->xine->port_ticket->acquire(stream->xine->port_ticket, 1);

right before the pthread_mutex_lock() call.
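
(For illustration, here is a minimal C paraphrase of that sequence, with 
stand-in types and hypothetical names rather than the real xine-lib 
structures:)

  #include <pthread.h>

  struct port_ticket { void (*acquire)(struct port_ticket *t, int irrevocable); };
  struct stream_sketch {
          struct port_ticket *port_ticket;
          pthread_mutex_t     demux_lock;
  };

  static void xine_play_sketch(struct stream_sketch *s)
  {
          /* the function pointer call seen right before the hang */
          s->port_ticket->acquire(s->port_ticket, 1);

          /* the lock that 9303 is stuck on (xine.c:1201) */
          pthread_mutex_lock(&s->demux_lock);

          /* ... seek / demuxer restart happens here in the real code ... */

          pthread_mutex_unlock(&s->demux_lock);
  }

so whichever thread currently holds demux_lock (and never lets go of it) is 
the one keeping xine_play() stuck.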

 It would be great if you could reproduce the same problem with a 
 xine-lib which has been compiled with debug support (so you'd get line 
 numbers in the back trace - that makes life _a lot_ easier and maybe I 
 could identify the problem that way) and the least optimization 
 possible ... :-)

ok, i'll try that too (but it will take some more time), but given how 
hard it was for me to trigger it, i wanted to get maximum info out of it 
before having to kill the threads.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

hm. I've reviewed all uses of demux_lock. ./src/xine-engine/demux.c does 
this:

pthread_mutex_unlock( stream->demux_lock );

lprintf ("sched_yield\n");

sched_yield();
pthread_mutex_lock( stream->demux_lock );

why is this done? CFS has definitely changed the yield implementation so 
there could be some connection.

OTOH, in the 'hung' state none of the straces suggests any yield() call.
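
To make the suspected failure mode concrete, here is a minimal sketch 
(hypothetical helper, not the actual demux loop) of why that 
unlock/yield/lock pattern depends entirely on what sched_yield() does:

  #include <pthread.h>
  #include <sched.h>

  /* demuxer side: briefly drop the lock so a blocked waiter
     (e.g. xine_play) gets a chance to take it */
  static void demux_handoff(pthread_mutex_t *demux_lock)
  {
          pthread_mutex_unlock(demux_lock);
          sched_yield();                   /* hope: the waiter runs right now */
          pthread_mutex_lock(demux_lock);  /* if it did not, we re-take the
                                              lock immediately and the waiter
                                              keeps losing the race */
  }

If yield does not actually deschedule the caller, the demuxer can win the 
lock back on every iteration and the thread blocked in xine_play() may not 
get it for a long time - though, as noted above, the straces of this 
particular hang do not show the yield itself.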

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 hm. I've reviewed all uses of demux_lock. ./src/xine-engine/demux.c 
 does this:

plus it does this too:

  pthread_mutex_unlock( stream->demux_lock );
  xine_usec_sleep(10);
  pthread_mutex_lock( stream->demux_lock );

this would explain the nanosleep() strace entries. But the task stuck on 
demux_lock never gets the unlock event. Weird.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

  hm. I've reviewed all uses of demux_lock. ./src/xine-engine/demux.c 
  does this:
 
 plus it does this too:
 
    pthread_mutex_unlock( stream->demux_lock );
    xine_usec_sleep(10);
    pthread_mutex_lock( stream->demux_lock );
 
 this would explain the nanosleep() strace entries. But the task stuck 
 on demux_lock never gets the unlock event. Weird.

9303 is stuck here on demux_lock:

#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2538ce in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2  0x4a24f71c in _L_mutex_lock_79 () from /lib/libpthread.so.0
#3  0x4a24f24d in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xb79f64f9 in xine_play () from /usr/lib/libxine.so.1

that mutex related futex is at address 0xb07409e0, but the only sign in 
the strace of that futex being touched is:

9303  futex(0xb07409e0, FUTEX_WAIT, 2, NULL <unfinished ...>

no other event ever happens on futex 0xb07409e0. Other threads don't 
touch it.

Maybe thread 9324 is the owner of that mutex, and it's looping somewhere 
that does xine_xmalloc_aligned(), with the lock held? It did this:

#0  0xe410 in __kernel_vsyscall ()
#1  0x4a2539e1 in __lll_mutex_unlock_wake () from /lib/libpthread.so.0
#2  0x4a2506f9 in _L_mutex_unlock_99 () from /lib/libpthread.so.0
#3  0x4a250370 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0
#4  0x4a2506f0 in pthread_mutex_unlock () from /lib/libpthread.so.0
#5  0xb79fce5a in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#6  0xb7a4b90b in dvd_plugin_free_buffer (buf=0xb0745470) at input_dvd.c:570
#7  0xb7a030a2 in QWidget::setUpdatesEnabled () from /usr/lib/libxine.so.1
#8  0x4a24d2db in start_thread () from /lib/libpthread.so.0
#9  0x4a05820e in clone () from /lib/libc.so.6
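
One way to check that directly (a debugging hack that assumes glibc's NPTL 
mutex layout - these fields are internal and can differ between versions) is 
to read the owner TID straight out of the mutex:

  #include <pthread.h>
  #include <stdio.h>

  /* prints the kernel TID recorded in an NPTL mutex (0 = unlocked);
     relies on glibc internals, for debugging only */
  static void print_mutex_owner(pthread_mutex_t *m)
  {
          printf("mutex %p owner tid: %d\n", (void *)m, m->__data.__owner);
  }

since the futex word is the first field of the mutex, the futex address 
0xb07409e0 above should be the address of the mutex itself, so the same 
check can be done from gdb with something like 
print ((pthread_mutex_t *) 0xb07409e0)->__data.__owner (given glibc 
debuginfo); if it prints 9324, that would confirm the hypothesis.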

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 hm. I've reviewed all uses of demux_lock. ./src/xine-engine/demux.c 
 does this:
 
 pthread_mutex_unlock( stream->demux_lock );
 
 lprintf ("sched_yield\n");
 
 sched_yield();
 pthread_mutex_lock( stream->demux_lock );
 
 why is this done? CFS has definitely changed the yield implementation 
 so there could be some connection.
 
 OTOH, in the 'hung' state none of the straces suggests any yield() 
 call.

yeah, there's no yield() call in any of the straces so i'd exclude this 
as a possibility.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread S.Çağlar Onur
On Wednesday 18 April 2007, Christoph Pfister wrote: 
  Okay - so here are some results (it's strange that gdb goes nuts
  inside the xine_play call). I have three bts (seems to be fairly easy
  to reproduce that behaviour over here): Twice while playing an audio
  cd and once while playing a normal file. The hang usually ends if you
  wait long enough (something around 30 secs over here).

I can confirm this, the freeze ends after some wait period (~20-30 secs) if 
kaffeine is the only active process. I didn't notice that before, most probably 
because the CPU was busy compiling a kernel at that time...

Now i'm testing Ingo's msleep patch + xine-lib-1.1.6...

Cheers
-- 
S.Çağlar Onur [EMAIL PROTECTED]
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!


signature.asc
Description: This is a digitally signed message part.


Re: Kaffeine problem with CFS

2007-04-18 Thread Christoph Pfister

2007/4/18, Christoph Pfister [EMAIL PROTECTED]:

[ Sorry for accidentally dropping CCs ]

2007/4/18, Christoph Pfister [EMAIL PROTECTED]:
 2007/4/18, Ingo Molnar [EMAIL PROTECTED]:
 
  * Christoph Pfister [EMAIL PROTECTED] wrote:
 
   Or I could try playing around a bit with your patchset and trying to
   reproduce it over here. Because I already have debug builds for
   xine-lib and compiling a new kernel can take place in the background
   it wouldn't be much effort for me.
 
  that would be great :) Here are the URLs for it. CFS is based on
  v2.6.21-rc7:
 
http://kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.21-rc7.tar.bz2
 
  And the CFS patch is at:
 
http://people.redhat.com/mingo/cfs-scheduler/sched-cfs-v2.patch
 
  rebuild your kernel as usual and boot into it. No extra configuration is
  needed, you'll get CFS by default.
 
  if this kernel builds/boots fine for you then you might also want to
  send me a quick note about how it feels, interactivity-wise. And of
  course i'm interested in any sort of feedback about problems as well.
  I'd like to make CFS as media-playback friendly as possible, so if
  there's any problem in that area it would be nice for me to know about
  it as soon as possible.
 
  Ingo

 Okay - so here are some results (it's strange that gdb goes nuts
 inside the xine_play call). I have three bts (seems to be fairly easy
 to reproduce that behaviour over here): Twice while playing an audio
 cd and once while playing a normal file. The hang usually ends if you
 wait long enough (something around 30 secs over here).


big snip


 Christoph


 PS: Haven't analyzed them yet - but doing so now :-)

Ok - one nice thing: In all those bts demux_loop is at demux.c:285 -
meaning that demux_lock is held and xine_play is waiting for it ...
The lock should be temporarily released with a sched_yield so that
the main thread can access it. As you wrote, the implementation of this
function seems to have changed a bit - so I'll replace it with a short
sleep and try again ...

Christoph


Replacing the sched_yield in demux.c with a usleep(10) stopped those
seeking hangs here (at least I was able to pull the slider back and
forth for 2 mins without trouble, compared to the few secs I needed
earlier to get a hang).

Christoph
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread S.Çağlar Onur
On Wednesday 18 April 2007, Christoph Pfister wrote: 
 Replacing the sched_yield in demux.c with a usleep(10) stopped those
 seeking hangs here (at least I was able to pull the slider back and
 forth for 2 mins without trouble, compared to the few secs I needed
 earlier to get a hang).

Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -3785,7 +3785,7 @@ asmlinkage long sys_sched_yield(void)
_raw_spin_unlock(&rq->lock);
preempt_enable_no_resched();
 
-   schedule();
+   msleep(1);
 
return 0;
 }

which Ingo sends me to try also has the same effect on me. I cannot reproduce 
hangs anymore with that patch applied top of CFS while one console checks out 
SVN repos and other one compiles a small test software.

Cheers
-- 
S.Çağlar Onur [EMAIL PROTECTED]
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!


signature.asc
Description: This is a digitally signed message part.


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* Christoph Pfister [EMAIL PROTECTED] wrote:

 Replacing the sched_yield in demux.c with a usleep(10) stopped those 
 seeking hangs here (at least I was able to pull the slider back and 
 forth for 2 mins without trouble, compared to the few secs I needed 
 earlier to get a hang).

great - thanks for figuring it out!

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* S.Çağlar Onur [EMAIL PROTECTED] wrote:

 -   schedule();
 +   msleep(1);

 which Ingo sends me to try also has the same effect on me. I cannot 
 reproduce hangs anymore with that patch applied top of CFS while one 
 console checks out SVN repos and other one compiles a small test 
 software.

great! Could you please unapply the hack above and try the proper fix 
below, does this one solve the hangs too?

Ingo

Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -264,15 +264,26 @@ static void dequeue_task_fair(struct rq 
 
 /*
  * sched_yield() support is very simple via the rbtree, we just
- * dequeue and enqueue the task, which causes the task to
- * roundrobin to the end of the tree:
+ * dequeue the task and move it to the rightmost position, which
+ * causes the task to roundrobin to the end of the tree.
  */
 static void requeue_task_fair(struct rq *rq, struct task_struct *p)
 {
dequeue_task_fair(rq, p);
p->on_rq = 0;
-   enqueue_task_fair(rq, p);
+   /*
+* Temporarily insert at the last position of the tree:
+*/
+   p->fair_key = LLONG_MAX;
+   __enqueue_task_fair(rq, p);
p->on_rq = 1;
+
+   /*
+* Update the key to the real value, so that when all other
+* tasks from before the rightmost position have executed,
+* this task is picked up again:
+*/
+   p->fair_key = rq->fair_clock - p->wait_runtime + p->nice_offset;
 }
 
 /*
@@ -380,7 +391,10 @@ static void task_tick_fair(struct rq *rq
 * Dequeue and enqueue the task to update its
 * position within the tree:
 */
-   requeue_task_fair(rq, curr);
+   dequeue_task_fair(rq, curr);
+   curr->on_rq = 0;
+   enqueue_task_fair(rq, curr);
+   curr->on_rq = 1;
 
/*
 * Reschedule if another task tops the current one.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread William Lee Irwin III
On Wed, Apr 18, 2007 at 05:48:11PM +0200, Ingo Molnar wrote:
  static void requeue_task_fair(struct rq *rq, struct task_struct *p)
  {
   dequeue_task_fair(rq, p);
   p->on_rq = 0;
 - enqueue_task_fair(rq, p);
 + /*
 +  * Temporarily insert at the last position of the tree:
 +  */
  + p->fair_key = LLONG_MAX;
 + __enqueue_task_fair(rq, p);
   p->on_rq = 1;
 +
 + /*
 +  * Update the key to the real value, so that when all other
 +  * tasks from before the rightmost position have executed,
 +  * this task is picked up again:
 +  */
  + p->fair_key = rq->fair_clock - p->wait_runtime + p->nice_offset;

At this point you might as well call the requeue operation something
having to do with yield. I also suspect what goes on during the timer
tick may eventually become something different from requeueing the
current task, and furthermore class-dependent.


-- wli
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* William Lee Irwin III [EMAIL PROTECTED] wrote:

 At this point you might as well call the requeue operation something 
 having to do with yield. [...]

agreed - i've just done a requeue_task - yield_task rename in my tree. 
(patch below)

 [...] I also suspect what goes on during the timer tick may eventually 
 become something different from requeueing the current task, and 
 furthermore class-dependent.

it already is, scheduler tick processing is done in class->task_tick().
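
(Conceptually - not the exact code in the CFS tree - the tick path dispatches 
through the same kind of per-class hook table shown in the patch below; a 
rough sketch, with only a subset of the real hooks:)

  struct rq;
  struct task_struct;

  /* each scheduling class supplies its own methods */
  struct sched_class {
          void (*enqueue_task)(struct rq *rq, struct task_struct *p);
          void (*dequeue_task)(struct rq *rq, struct task_struct *p);
          void (*yield_task)(struct rq *rq, struct task_struct *p);
          void (*task_tick)(struct rq *rq, struct task_struct *p);
  };

  /* scheduler_tick() then only needs something like:
   *         curr->sched_class->task_tick(rq, curr);
   * and each class decides for itself what a tick means.
   */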

Ingo

---
 include/linux/sched.h |2 +-
 kernel/sched.c|7 +--
 kernel/sched_fair.c   |4 ++--
 kernel/sched_rt.c |2 +-
 4 files changed, 5 insertions(+), 10 deletions(-)

Index: linux/include/linux/sched.h
===
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -796,7 +796,7 @@ struct sched_class {
 
void (*enqueue_task) (struct rq *rq, struct task_struct *p);
void (*dequeue_task) (struct rq *rq, struct task_struct *p);
-   void (*requeue_task) (struct rq *rq, struct task_struct *p);
+   void (*yield_task) (struct rq *rq, struct task_struct *p);
 
void (*check_preempt_curr) (struct rq *rq, struct task_struct *p);
 
Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -560,11 +560,6 @@ static void dequeue_task(struct rq *rq, 
p->on_rq = 0;
 }
 
-static void requeue_task(struct rq *rq, struct task_struct *p)
-{
-   p->sched_class->requeue_task(rq, p);
-}
-
 /*
  * __normal_prio - return the priority that is based on the static prio
  */
@@ -3773,7 +3768,7 @@ asmlinkage long sys_sched_yield(void)
schedstat_inc(rq, yld_cnt);
if (rq->nr_running == 1)
schedstat_inc(rq, yld_act_empty);
-   requeue_task(rq, current);
+   current->sched_class->yield_task(rq, current);
 
/*
 * Since we are going to call schedule() anyway, there's
Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -273,7 +273,7 @@ static void dequeue_task_fair(struct rq 
  * dequeue the task and move it to the rightmost position, which
  * causes the task to roundrobin to the end of the tree.
  */
-static void requeue_task_fair(struct rq *rq, struct task_struct *p)
+static void yield_task_fair(struct rq *rq, struct task_struct *p)
 {
dequeue_task_fair(rq, p);
p->on_rq = 0;
@@ -509,7 +509,7 @@ static void task_init_fair(struct rq *rq
 struct sched_class fair_sched_class __read_mostly = {
.enqueue_task   = enqueue_task_fair,
.dequeue_task   = dequeue_task_fair,
-   .requeue_task   = requeue_task_fair,
+   .yield_task = yield_task_fair,
 
.check_preempt_curr = check_preempt_curr_fair,
 
Index: linux/kernel/sched_rt.c
===
--- linux.orig/kernel/sched_rt.c
+++ linux/kernel/sched_rt.c
@@ -165,7 +165,7 @@ static void task_init_rt(struct rq *rq, 
 static struct sched_class rt_sched_class __read_mostly = {
.enqueue_task   = enqueue_task_rt,
.dequeue_task   = dequeue_task_rt,
-   .requeue_task   = requeue_task_rt,
+   .yield_task = requeue_task_rt,
 
.check_preempt_curr = check_preempt_curr_rt,
 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-18 Thread S.Çağlar Onur
On Wednesday 18 April 2007, Ingo Molnar wrote: 
 * S.Çağlar Onur [EMAIL PROTECTED] wrote:
  -   schedule();
  +   msleep(1);
 
  which Ingo sends me to try also has the same effect on me. I cannot
  reproduce hangs anymore with that patch applied top of CFS while one
  console checks out SVN repos and other one compiles a small test
  software.

 great! Could you please unapply the hack above and try the proper fix
 below, does this one solve the hangs too?

Instead of that one, i tried CFSv3 and i cannot reproduce the hang anymore, 
Thanks!...

Cheers
-- 
S.Çağlar Onur [EMAIL PROTECTED]
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!


signature.asc
Description: This is a digitally signed message part.


Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar

* S.Çağlar Onur [EMAIL PROTECTED] wrote:

  great! Could you please unapply the hack above and try the proper 
  fix below, does this one solve the hangs too?
 
 Instead of that one, i tried CFSv3 and i cannot reproduce the hang 
 anymore, Thanks!...

cool, thanks for the quick turnaround!

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-15 Thread S.Çağlar Onur
On Sunday 15 April 2007, Christoph Pfister wrote: 
> Could you try xine-ui or gxine? Because I suspect rather xine-lib for
> freezing issues. In any way I think a gdb backtrace would be much
> nicer - but if you can't reproduce the freeze issue with other xine
> based players and want to run kaffeine in gdb, you need to execute
> "gdb --args kaffeine --nofork".

I just tested xine-ui and i can easily reproduce the exact same problem with 
xine-ui also, so you are right, it seems to be a xine-lib problem triggered by 
CFS changes.

> > > thanks. This does has the appearance of a userspace race condition of
> > > some sorts. Can you trigger this hang with the patch below applied to
> > > the vanilla tree as well? (with no CFS patch applied)
> >
> > oops, please use the patch below instead.

Tomorrow i'll test that patch and also try to get a backtrace.

Cheers
-- 
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!


signature.asc
Description: This is a digitally signed message part.


Re: Kaffeine problem with CFS

2007-04-15 Thread Christoph Pfister

Hi,

2007/4/15, Ingo Molnar <[EMAIL PROTECTED]>:


* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > [1] http://cekirdek.pardus.org.tr/~caglar/strace.kaffeine


Could you try xine-ui or gxine? Because I suspect rather xine-lib for
freezing issues. In any way I think a gdb backtrace would be much
nicer - but if you can't reproduce the freeze issue with other xine
based players and want to run kaffeine in gdb, you need to execute
"gdb --args kaffeine --nofork".


> thanks. This does have the appearance of a userspace race condition of
> some sorts. Can you trigger this hang with the patch below applied to
> the vanilla tree as well? (with no CFS patch applied)

oops, please use the patch below instead.

Ingo



Christoph
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-15 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > [1] http://cekirdek.pardus.org.tr/~caglar/strace.kaffeine
> 
> thanks. This does have the appearance of a userspace race condition of 
> some sorts. Can you trigger this hang with the patch below applied to 
> the vanilla tree as well? (with no CFS patch applied)

oops, please use the patch below instead.

Ingo

---
 kernel/sched.c |   69 -
 1 file changed, 5 insertions(+), 64 deletions(-)

Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1653,77 +1653,18 @@ void fastcall sched_fork(struct task_str
  */
 void fastcall wake_up_new_task(struct task_struct *p, unsigned long 
clone_flags)
 {
-   struct rq *rq, *this_rq;
unsigned long flags;
-   int this_cpu, cpu;
+   struct rq *rq;
 
rq = task_rq_lock(p, &flags);
BUG_ON(p->state != TASK_RUNNING);
-   this_cpu = smp_processor_id();
-   cpu = task_cpu(p);
-
-   /*
-* We decrease the sleep average of forking parents
-* and children as well, to keep max-interactive tasks
-* from forking tasks that are max-interactive. The parent
-* (current) is done further down, under its lock.
-*/
-   p->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(p) *
-   CHILD_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
 
p->prio = effective_prio(p);
+   __activate_task(p, rq);
+   if (TASK_PREEMPTS_CURR(p, rq))
+   resched_task(rq->curr);
 
-   if (likely(cpu == this_cpu)) {
-   if (!(clone_flags & CLONE_VM)) {
-   /*
-* The VM isn't cloned, so we're in a good position to
-* do child-runs-first in anticipation of an exec. This
-* usually avoids a lot of COW overhead.
-*/
-   if (unlikely(!current->array))
-   __activate_task(p, rq);
-   else {
-   p->prio = current->prio;
-   p->normal_prio = current->normal_prio;
-   list_add_tail(&p->run_list, &current->run_list);
-   p->array = current->array;
-   p->array->nr_active++;
-   inc_nr_running(p, rq);
-   }
-   set_need_resched();
-   } else
-   /* Run child last */
-   __activate_task(p, rq);
-   /*
-* We skip the following code due to cpu == this_cpu
-*
-*   task_rq_unlock(rq, &flags);
-*   this_rq = task_rq_lock(current, &flags);
-*/
-   this_rq = rq;
-   } else {
-   this_rq = cpu_rq(this_cpu);
-
-   /*
-* Not the local CPU - must adjust timestamp. This should
-* get optimised away in the !CONFIG_SMP case.
-*/
-   p->timestamp = (p->timestamp - this_rq->most_recent_timestamp)
-   + rq->most_recent_timestamp;
-   __activate_task(p, rq);
-   if (TASK_PREEMPTS_CURR(p, rq))
-   resched_task(rq->curr);
-
-   /*
-* Parent and child are on different CPUs, now get the
-* parent runqueue to update the parent's ->sleep_avg:
-*/
-   task_rq_unlock(rq, &flags);
-   this_rq = task_rq_lock(current, &flags);
-   }
-   current->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(current) *
-   PARENT_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
-   task_rq_unlock(this_rq, &flags);
+   task_rq_unlock(rq, &flags);
 }
 
 /*
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-15 Thread Ingo Molnar

* S.Çağlar Onur <[EMAIL PROTECTED]> wrote:

> > hm, could you try to strace it and/or attach gdb to it and figure 
> > out what's wrong? (perhaps involving the Kaffeine developers too?) 
> > As long as it's not a kernel level crash i cannot see how the 
> > scheduler could directly cause this - other than by accident 
> > creating a scheduling pattern that triggers a user-space bug more 
> > often than with other schedulers.
> 
> ...
> futex(0x89ac218, FUTEX_WAIT, 2, NULL)   = 0
> futex(0x89ac218, FUTEX_WAIT, 2, NULL)   = 0
> futex(0x89ac218, FUTEX_WAIT, 2, NULL)   = 0
> futex(0x89ac218, FUTEX_WAIT, 2, NULL)   = 0
> futex(0x89ac218, FUTEX_WAIT, 2, NULL)   = 0
> futex(0x89ac218, FUTEX_WAIT, 2, NULL)   = -1 EINTR (Interrupted system call)
> --- SIGINT (Interrupt) @ 0 (0) ---
> +++ killed by SIGINT +++
> 
> is where the freeze occurs. Full log can be found at [1]

> [1] http://cekirdek.pardus.org.tr/~caglar/strace.kaffeine
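
The repeated futex(..., FUTEX_WAIT, ...) lines above are what a blocked glibc
pthread lock or wakeup looks like from the kernel's side. A minimal sketch that
produces the same kind of trace under "strace -f" - hypothetical code and file
name, not taken from Kaffeine or xine-lib, and the exact futex flags depend on
the glibc version:

/* sketch.c - build with: gcc -pthread -o sketch sketch.c, run under strace -f */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
	/*
	 * The mutex is already held and never released, so this call
	 * parks the thread in futex(..., FUTEX_WAIT, ...) until a signal
	 * (e.g. SIGINT) kills the process.
	 */
	pthread_mutex_lock(&lock);
	puts("never reached");
	return NULL;
}

int main(void)
{
	pthread_t tid;

	pthread_mutex_lock(&lock);		/* take the lock and keep it */
	pthread_create(&tid, NULL, worker, NULL);
	pause();				/* wait here until e.g. SIGINT */
	return 0;
}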

thanks. This does have the appearance of a userspace race condition of 
some sort. Can you trigger this hang with the patch below applied to 
the vanilla tree as well? (with no CFS patch applied)

if yes then this would suggest that Kaffeine has some sort of 
child-runs-first problem. (which CFS changes to parent-runs-first. 
Kaffeine starts a couple of threads and the futex calls are a sign of 
thread<->thread communication.)
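
A tiny way to see the ordering difference being described - a standalone sketch
with a hypothetical file name, unrelated to Kaffeine; pinning to one CPU just
keeps SMP placement from hiding the ordering:

/* fork-order.c - build: gcc -o fork-order fork-order.c
 * run pinned to one CPU, e.g.: taskset -c 0 ./fork-order */
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
	pid_t pid = fork();

	if (pid < 0) {
		perror("fork");
		return 1;
	}
	if (pid == 0) {
		printf("child\n");	/* comes out first if the child runs first */
		_exit(0);
	}
	printf("parent\n");		/* comes out first if the parent runs first */
	waitpid(pid, NULL, 0);
	return 0;
}

With the mainline child-runs-first behaviour the child's line tends to appear
first; with CFS (or with the patch below) the parent's tends to. An application
that silently depends on one ordering can misbehave under the other, which is
the kind of bug being suspected here.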

[ i have also Cc:-ed the Kaffeine folks - maybe your strace gives them
  an idea about what the problem could be :) ]

Ingo

---
 kernel/sched.c |   70 ++---
 1 file changed, 3 insertions(+), 67 deletions(-)

Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1653,77 +1653,13 @@ void fastcall sched_fork(struct task_str
  */
 void fastcall wake_up_new_task(struct task_struct *p, unsigned long clone_flags)
 {
-   struct rq *rq, *this_rq;
unsigned long flags;
-   int this_cpu, cpu;
+   struct rq *rq;
 
rq = task_rq_lock(p, &flags);
BUG_ON(p->state != TASK_RUNNING);
-   this_cpu = smp_processor_id();
-   cpu = task_cpu(p);
-
-   /*
-* We decrease the sleep average of forking parents
-* and children as well, to keep max-interactive tasks
-* from forking tasks that are max-interactive. The parent
-* (current) is done further down, under its lock.
-*/
-   p->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(p) *
-   CHILD_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
-
-   p->prio = effective_prio(p);
-
-   if (likely(cpu == this_cpu)) {
-   if (!(clone_flags & CLONE_VM)) {
-   /*
-* The VM isn't cloned, so we're in a good position to
-* do child-runs-first in anticipation of an exec. This
-* usually avoids a lot of COW overhead.
-*/
-   if (unlikely(!current->array))
-   __activate_task(p, rq);
-   else {
-   p->prio = current->prio;
-   p->normal_prio = current->normal_prio;
-   list_add_tail(&p->run_list, &current->run_list);
-   p->array = current->array;
-   p->array->nr_active++;
-   inc_nr_running(p, rq);
-   }
-   set_need_resched();
-   } else
-   /* Run child last */
-   __activate_task(p, rq);
-   /*
-* We skip the following code due to cpu == this_cpu
-*
-*   task_rq_unlock(rq, &flags);
-*   this_rq = task_rq_lock(current, &flags);
-*/
-   this_rq = rq;
-   } else {
-   this_rq = cpu_rq(this_cpu);
-
-   /*
-* Not the local CPU - must adjust timestamp. This should
-* get optimised away in the !CONFIG_SMP case.
-*/
-   p->timestamp = (p->timestamp - this_rq->most_recent_timestamp)
-   + rq->most_recent_timestamp;
-   __activate_task(p, rq);
-   if (TASK_PREEMPTS_CURR(p, rq))
-   resched_task(rq->curr);
-
-   /*
-* Parent and child are on different CPUs, now get the
-* parent runqueue to update the parent's ->sleep_avg:
-*/
-   task_rq_unlock(rq, &flags);
-   this_rq = task_rq_lock(current, &flags);
-   }
-   current->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(current) *
-   PARENT_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
-   task_rq_unlock(this_rq, &flags);
+   __activate_task(p, rq);
+   task_rq_unlock(rq, &flags);
 }
 
 /*
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-15 Thread Christoph Pfister

Hi,

2007/4/15, Ingo Molnar <[EMAIL PROTECTED]>:

> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > > [1] http://cekirdek.pardus.org.tr/~caglar/strace.kaffeine


Could you try xine-ui or gxine? Because I rather suspect xine-lib for
the freezing issues. In any case I think a gdb backtrace would be much
nicer - but if you can't reproduce the freeze issue with other xine-based
players and want to run kaffeine in gdb, you need to execute
gdb --args kaffeine --nofork.
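
For example, a session along these lines should be enough to capture where it
hangs (just a sketch - the freeze point and the output will of course differ):

  $ gdb --args kaffeine --nofork
  (gdb) run
        ... reproduce the freeze, then press Ctrl-C ...
  (gdb) info threads
  (gdb) thread apply all bt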


> > thanks. This does have the appearance of a userspace race condition of
> > some sort. Can you trigger this hang with the patch below applied to
> > the vanilla tree as well? (with no CFS patch applied)

> oops, please use the patch below instead.
>
> Ingo

<snip>

Christoph
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kaffeine problem with CFS

2007-04-15 Thread S.Çağlar Onur
On Sunday 15 April 2007, Christoph Pfister wrote:
> Could you try xine-ui or gxine? Because I rather suspect xine-lib for
> the freezing issues. In any case I think a gdb backtrace would be much
> nicer - but if you can't reproduce the freeze issue with other xine-based
> players and want to run kaffeine in gdb, you need to execute
> gdb --args kaffeine --nofork.

I just tested xine-ui and I can easily reproduce the exact same problem with 
it as well, so you are right: it seems to be a xine-lib problem triggered by 
the CFS changes.

> > > thanks. This does have the appearance of a userspace race condition of
> > > some sort. Can you trigger this hang with the patch below applied to
> > > the vanilla tree as well? (with no CFS patch applied)
> >
> > oops, please use the patch below instead.

Tomorrow I'll test that patch and also try to get a backtrace.

Cheers
-- 
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!

