Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Nick Piggin
On Sun, 2005-04-03 at 20:55 -0700, Paul Jackson wrote:

> But if we knew the CPU hierarchy in more detail, and if we had some
> other use for that detail (we don't that I know), then I take it from
> your comment that we should be reluctant to push those details into the
> sched domains.  Put them someplace else if we need them.
> 

In a sense, the information *is* already there - in node_distance.
What I think should be done is probably to use node_distance when
calculating costs, and correlate that with sched-domains as best
we can.

I've got an idea of how to do it, but I'll wait until Ingo gets the
fundamentals working well before I have a look.

> 
> One question - how serious do you view difference in migration cost
> between say 21.7 and 25.3, two of the cacheflush times I reported on a
> small SN2?
> 
> I'm guessing that this is probably below the noise threshold, at least
> as far as scheduler domains, schedulers and migration care, unless and
> until some persuasive measurements show a situation in which it matters.
> 

Yes, likely below noise. There is an issue with a behavioural
transition point in the wakeup code where you might see good
behaviour with 21 and bad with 25, or vice versa on some workloads.
This is fixed in the scheduler patches coming through -mm though.

But I wasn't worried so much about the absolute value not being
right, rather it maybe not being deterministic. So maybe depending
on what CPU gets assigned what cpuid, you might get different
values on identical machines.

> As you say - not an exact science.
> 




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


how to cope with "Scheduling in interrupt" problem

2005-04-03 Thread MingChieh
Dear all,

I am trying to modify inet_sendmsg() and inet_recvmsg().
To defer notifying the receiver, I use a timer.
But it causes a "Scheduling in interrupt" error.
Is there any way to fix it?

Thank you for your help


MingChieh Chang,
Taiwan


Scheduling in interrupt
invalid operand: 
CPU: 0
EIP: 0819:[] Not tainted
EFLAGS: 00010286
eax: 0018 ebx: c19c2000 ecx: c0170894 edx: fbff9000
esi: c19c2000 edi: c1d42da0 ebp: c19c3cf4 esp: c19c3cd0
ds: 0821 es: 0821 ss: 0821
Process ftp (pid: 1312, stackpage=c19c3000)<1>



EX:
inet_sendmsg()
{
	.
	.
	.
BYE:

	if(sock->send_nonnotify_size>0&&0==sock->send_set_timer)
	{
		sock->send_notify_timer.function=notify_receiver;
		sock->send_notify_timer.expires=MY_EXT_NOTIFY_TIME + jiffies;
		sock->send_notify_timer.data=(unsigned long)(sock);

		dbprintk("set notify timer, sock addr=%p\n",sock);
		add_timer(&sock->send_notify_timer);
		sock->send_set_timer=1;
	}
	release_sock(sock->sk);
}

/* timer callback, called when send_notify_timer expires */
static void notify_receiver(unsigned long data)
{
	struct socket* sock=(struct socket*)data;
	struct SHM_INFO shm_tmp;

	if(!sock||!sock->sk)
		return;

	lock_sock(sock->sk);
	sock->send_set_timer=0;

	if(sock->send_nonnotify_size)
	{
		dbprintk("notify_receiver:notify receivers,size=%d\n",sock->send_nonnotify_size);
		sock->send_nonnotify_size=0;

		shm_tmp.saddr=ntohl(sock->sk->saddr);
		shm_tmp.sport=ntohl(sock->sk->sport);

		shm_tmp.reqaddr=shm_tmp.saddr;
		shm_tmp.reqport=shm_tmp.sport;

		shm_tmp.daddr=ntohl(sock->sk->daddr);
		shm_tmp.dport=ntohl(sock->sk->dport);
		shm_tmp.maddr=NULL;
		release_sock(sock->sk);

		dbprintk("notify_recv: call send_data()..");
		HYPERVISOR_send_data(&shm_tmp);
		dbprintk("done\n");
		return;
	}

	release_sock(sock->sk);
}




Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ingo wrote:
> There's no other place to push them 

One could make a place, if the need arose.

> but trying and benchmarking it is necessary to tell for sure.

Hard to argue with that ... ;).

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar

* Paul Jackson <[EMAIL PROTECTED]> wrote:

> Ingo, if I understood correctly, suggested pushing any necessary 
> detail of the CPU hierarchy into the scheduler domains, so that his 
> latest work tuning migration costs could pick it up from there.
> 
> It makes good sense for the migration cost estimation to be based on 
> whatever CPU hierarchy is visible in the sched domains.
> 
> But if we knew the CPU hierarchy in more detail, and if we had some 
> other use for that detail (we don't that I know), then I take it from 
> your comment that we should be reluctant to push those details into 
> the sched domains.  Put them someplace else if we need them.

There's no other place to push them - most of the hierarchy related 
decisions are done based on the domain tree. So the decision to make is: 
"is it worth complicating the domain tree, in exchange for more accurate 
handling of the real hierarchy?".

In general, the pros are potentially more accuracy and thus higher 
application performance, the cons are overhead (more tree walking) and 
artifacts (the sched-domains logic is good but not perfect, and even if 
there were no bugs in it, the decisions are approximations. One more 
domain level might make things worse.)

but trying and benchmarking it is necessary to tell for sure.

Ingo


Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-03 Thread Li Shaohua
Hi,
On Mon, 2005-04-04 at 13:28, Nathan Lynch wrote:
> On Mon, Apr 04, 2005 at 10:07:02AM +0800, Li Shaohua wrote:
> > Clean up all CPU states including its runqueue and idle thread, 
> > so we can use boot time code without any changes.
> > Note this makes /sys/devices/system/cpu/cpux/online unworkable.
> 
> In what sense does it make the online attribute unworkable?
I removed the idle thread and other CPU state, and put the dead CPU
into a 'halt' busy loop.

> 
> > diff -puN kernel/exit.c~cpu_state_clean kernel/exit.c
> > --- linux-2.6.11/kernel/exit.c~cpu_state_clean  2005-03-31 10:50:27.0 +0800
> > +++ linux-2.6.11-root/kernel/exit.c 2005-03-31 10:50:27.0 +0800
> > @@ -845,6 +845,65 @@ fastcall NORET_TYPE void do_exit(long co
> > for (;;) ;
> >  }
> >  
> > +#ifdef CONFIG_STR_SMP
> > +void do_exit_idle(void)
> > +{
> > +   struct task_struct *tsk = current;
> > +   int group_dead;
> > +
> > +   BUG_ON(tsk->pid);
> > +   BUG_ON(tsk->mm);
> > +
> > +   if (tsk->io_context)
> > +   exit_io_context();
> > +   tsk->flags |= PF_EXITING;
> > +   tsk->it_virt_expires = cputime_zero;
> > +   tsk->it_prof_expires = cputime_zero;
> > +   tsk->it_sched_expires = 0;
> > +
> > +   acct_update_integrals(tsk);
> > +   update_mem_hiwater(tsk);
> > +   group_dead = atomic_dec_and_test(&tsk->signal->live);
> > +   if (group_dead) {
> > +   del_timer_sync(&tsk->signal->real_timer);
> > +   acct_process(-1);
> > +   }
> > +   exit_mm(tsk);
> > +
> > +   exit_sem(tsk);
> > +   __exit_files(tsk);
> > +   __exit_fs(tsk);
> > +   exit_namespace(tsk);
> > +   exit_thread();
> > +   exit_keys(tsk);
> > +
> > +   if (group_dead && tsk->signal->leader)
> > +   disassociate_ctty(1);
> > +
> > +   module_put(tsk->thread_info->exec_domain->module);
> > +   if (tsk->binfmt)
> > +   module_put(tsk->binfmt->module);
> > +
> > +   tsk->exit_code = -1;
> > +   tsk->exit_state = EXIT_DEAD;
> > +
> > +   /* in release_task */
> > +   atomic_dec(&tsk->user->processes);
> > +   write_lock_irq(&tasklist_lock);
> > +   __exit_signal(tsk);
> > +   __exit_sighand(tsk);
> > +   write_unlock_irq(&tasklist_lock);
> > +   release_thread(tsk);
> > +   put_task_struct(tsk);
> > +
> > +   tsk->flags |= PF_DEAD;
> > +#ifdef CONFIG_NUMA
> > +   mpol_free(tsk->mempolicy);
> > +   tsk->mempolicy = NULL;
> > +#endif
> > +}
> > +#endif
> 
> I don't understand why this is needed at all.  It looks like a fair
> amount of code from do_exit is being duplicated here.  
Yes, exactly. Could someone who understands do_exit please help clean up the
code? I'd like to remove the idle thread, since the smpboot code will
create a new idle thread.

> We've been
> doing cpu removal on ppc64 logical partitions for a while and never
> needed to do anything like this. 
Did it remove the idle thread? Or does the dead CPU sit in an idle busy loop?

>  Maybe idle_task_exit would suffice?
idle_task_exit seems to just drop the mm. We need to destroy the idle task for
physical CPU hotplug, right?

> 
> 
> > diff -puN kernel/sched.c~cpu_state_clean kernel/sched.c
> > --- linux-2.6.11/kernel/sched.c~cpu_state_clean 2005-03-31 10:50:27.0 +0800
> > +++ linux-2.6.11-root/kernel/sched.c 2005-04-04 09:06:40.362357104 +0800
> > @@ -4028,6 +4028,58 @@ void __devinit init_idle(task_t *idle, i
> >  }
> >  
> >  /*
> > + * Initial dummy domain for early boot and for hotplug cpu. Being static,
> > + * it is initialized to zero, so all balancing flags are cleared which is
> > + * what we want.
> > + */
> > +static struct sched_domain sched_domain_dummy;
> > +
> > +#ifdef CONFIG_STR_SMP
> > +static void __devinit exit_idle(int cpu)
> > +{
> > +   runqueue_t *rq = cpu_rq(cpu);
> > +   struct task_struct *p = rq->idle;
> > +   int j, k;
> > +   prio_array_t *array;
> > +
> > +   /* init runqueue */
> > +   spin_lock_init(&rq->lock);
> > +   rq->active = rq->arrays;
> > +   rq->expired = rq->arrays + 1;
> > +   rq->best_expired_prio = MAX_PRIO;
> > +
> > +   rq->prev_mm = NULL;
> > +   rq->curr = rq->idle = NULL;
> > +   rq->expired_timestamp = 0;
> > +
> > +   rq->sd = &sched_domain_dummy;
> > +   rq->cpu_load = 0;
> > +   rq->active_balance = 0;
> > +   rq->push_cpu = 0;
> > +   rq->migration_thread = NULL;
> > +   INIT_LIST_HEAD(&rq->migration_queue);
> > +   atomic_set(&rq->nr_iowait, 0);
> > +
> > +   for (j = 0; j < 2; j++) {
> > +   array = rq->arrays + j;
> > +   for (k = 0; k < MAX_PRIO; k++) {
> > +   INIT_LIST_HEAD(array->queue + k);
> > +   __clear_bit(k, array->bitmap);
> > +   }
> > +   // delimiter for bitsearch
> > +   __set_bit(MAX_PRIO, array->bitmap);
> > +   }
> > +   /* Destroy IDLE thread.
> > +* it's safe now, the CPU is in busy loop
> > +*/
> > +   if (p->active_mm)
> > +   mmdrop(p->active_mm);
> > +   p->active_mm = NULL;
> > +   put_task_struct(p);
> > +}
> > +#endif
> > +
> > +/*
> >   * In a system that switches off the HZ timer nohz_cpu_mask
> >   * indicates 

Re: [PATCH 4/4] psmouse: dynamic protocol switching via sysfs

2005-04-03 Thread Dmitry Torokhov
Hi Kenan,

On Sunday 03 April 2005 14:49, Kenan Esau wrote:
> Patches 1-3 are fine.
> 

Thank you very much for testing the patches. Based on the feedback I
received I am going to drop that DMI patch - it does not save enough to
justify the ifdefs...

> Protocol switching via sysfs works too but if I switch from LBPS/2 to
> PS/2 the device name changes from "/dev/event1" to "/dev/event2" -- is
> this intended?

Yes - we are in fact getting a somewhat "new" device with new capabilities, so
the driver unregisters the old input device and creates a new one. I strongly
believe that we should not change input device attributes "on the fly".

> If I do "echo -n 50 > resolution" "0xe8 0x01" is sent. I don't know if
> this is correct for "usual" PS/2-devices but for the lifebook it's
> wrong.
> 
> For the lifebook the parameters are as following:
> 
> 50cpi  <=> 0x00
> 100cpi <=> 0x01
> 200cpi <=> 0x02
> 400cpi <=> 0x03
> 

"Classic" PS/2 protocol specifies available resolutions of 1, 2, 4 and 8
units per mm which gives you 25, 50, 100 and 200 cpi respectively. I am
surprised that Lifebook simply doubles the rates, but if it does I guess
the patch below will suffice. 
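
For illustration only, here is a small standalone C sketch of the mapping, assuming the table Kenan reported (50/100/200/400 cpi <=> 0x00..0x03). The helper names lifebook_res_param() and lifebook_res_cpi() are made up for this example and are not part of lifebook.c; the sketch only shows how a requested value is rounded down to a supported setting.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): map a requested
 * resolution in cpi to the SETRES parameter byte, using the
 * Lifebook table 50 -> 0, 100 -> 1, 200 -> 2, 400 -> 3.
 * Out-of-range requests (0 or > 400) fall back to 400 cpi.
 */
unsigned char lifebook_res_param(unsigned int cpi)
{
	static const unsigned char params[] = { 0, 1, 2, 2, 3 };
	size_t idx;

	if (cpi == 0 || cpi > 400)
		cpi = 400;

	idx = cpi / 100;
	/* cpi <= 400 here, so idx is at most 4 */
	assert(idx < sizeof(params) / sizeof(params[0]));
	return params[idx];
}

/* Effective resolution actually selected, in cpi: 50 << param. */
unsigned int lifebook_res_cpi(unsigned int cpi)
{
	return 50u << lifebook_res_param(cpi);
}
```

With this table a request for 300 cpi is rounded down to 200 cpi, and anything below 100 selects 50 cpi.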

-- 
Dmitry

===

Input: apparently Lifebook touchscreens have double resolution
   compared to "classic" PS/2 mice, provide appropriate
   resolution setting handler.

Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]>


 lifebook.c |   12 
 1 files changed, 12 insertions(+)

Index: dtor/drivers/input/mouse/lifebook.c
===
--- dtor.orig/drivers/input/mouse/lifebook.c
+++ dtor/drivers/input/mouse/lifebook.c
@@ -82,6 +82,17 @@ static int lifebook_absolute_mode(struct
return 0;
 }
 
+static void lifebook_set_resolution(struct psmouse *psmouse, unsigned int resolution)
+{
+   unsigned char params[] = { 0, 1, 2, 2, 3 };
+
+   if (resolution == 0 || resolution > 400)
+   resolution = 400;
+
+   ps2_command(&psmouse->ps2dev, &params[resolution / 100], PSMOUSE_CMD_SETRES);
+   psmouse->resolution = 50 << params[resolution / 100];
+}
+
 static void lifebook_disconnect(struct psmouse *psmouse)
 {
psmouse_reset(psmouse);
@@ -114,6 +125,7 @@ int lifebook_init(struct psmouse *psmous
input_set_abs_params(&psmouse->dev, ABS_Y, 0, 1024, 0, 0);
 
psmouse->protocol_handler = lifebook_process_byte;
+   psmouse->set_resolution = lifebook_set_resolution;
psmouse->disconnect = lifebook_disconnect;
psmouse->reconnect  = lifebook_absolute_mode;
psmouse->pktsize = 3;


[PATCH] ppc: fix single-stepping of emulated instructions

2005-04-03 Thread Paul Mackerras
On ppc, we emulate instructions that cause alignment exceptions.  If
we are single-stepping an instruction and it causes an alignment
exception, we will currently do the next instruction as well before
taking the single-step exception.  This patch fixes that, so we take
the single-step exception after emulating the instruction.
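
As a rough userspace model of the intended behaviour (not the kernel code), the sketch below assumes that the helper checks a single-step flag in the saved MSR, clears it, and raises a trace trap. The struct, the MODEL_MSR_SE value and the function names are all invented for illustration; only the call site in the diff below comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the MSR single-step enable bit. */
#define MODEL_MSR_SE 0x400UL

/* Minimal stand-in for the saved register state. */
struct model_regs {
	unsigned long nip;	/* next instruction pointer */
	unsigned long msr;	/* saved machine state register */
	bool trace_pending;	/* models "single-step exception delivered" */
};

/* Assumed behaviour: if single-stepping, clear the flag and trap. */
void model_emulate_single_step(struct model_regs *regs)
{
	assert(regs);
	if (regs->msr & MODEL_MSR_SE) {
		regs->msr &= ~MODEL_MSR_SE;
		regs->trace_pending = true;
	}
}

/* Success path of the alignment handler after the fix. */
void model_alignment_fixed(struct model_regs *regs)
{
	assert(regs);
	regs->nip += 4;			/* skip over emulated instruction */
	model_emulate_single_step(regs);	/* trap now, not one insn later */
}
```

In this model the trap is reported with nip pointing just past the emulated instruction, which is the behaviour described above.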

Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc/kernel/traps.c pmac-2.5/arch/ppc/kernel/traps.c
--- linux-2.5/arch/ppc/kernel/traps.c   2005-03-29 16:24:53.0 +1000
+++ pmac-2.5/arch/ppc/kernel/traps.c    2005-03-31 08:37:53.0 +1000
@@ -679,6 +701,7 @@
fixed = fix_alignment(regs);
if (fixed == 1) {
regs->nip += 4; /* skip over emulated instruction */
+   emulate_single_step(regs);
return;
}
if (fixed == -EFAULT) {


[PATCH] ppc: oops on kernel altivec assist exceptions

2005-04-03 Thread Paul Mackerras
If we should happen to get an altivec assist exception while executing
in the kernel, we will currently try to handle it and fail, and end up
oopsing with (apparently) a segfault.  (An altivec assist exception
occurs for floating-point altivec instructions with denormalized
inputs or outputs if the altivec unit is in java mode.)

This patch checks explicitly if we are in user mode and prints a
useful message if not.

Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc/kernel/traps.c pmac-2.5/arch/ppc/kernel/traps.c
--- linux-2.5/arch/ppc/kernel/traps.c   2005-03-29 16:24:53.0 +1000
+++ pmac-2.5/arch/ppc/kernel/traps.c    2005-03-31 08:37:53.0 +1000
@@ -805,6 +828,13 @@
if (regs->msr & MSR_VEC)
giveup_altivec(current);
preempt_enable();
+   if (!user_mode(regs)) {
+   printk(KERN_ERR "altivec assist exception in kernel mode"
+  " at %lx\n", regs->nip);
+   debugger(regs);
+   die("altivec assist exception", regs, SIGFPE);
+   return;
+   }
 
err = emulate_altivec(regs);
if (err == 0) {


Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-03 Thread Nathan Lynch
On Mon, Apr 04, 2005 at 10:07:02AM +0800, Li Shaohua wrote:
> Clean up all CPU states including its runqueue and idle thread, 
> so we can use boot time code without any changes.
> Note this makes /sys/devices/system/cpu/cpux/online unworkable.

In what sense does it make the online attribute unworkable?


> diff -puN kernel/exit.c~cpu_state_clean kernel/exit.c
> --- linux-2.6.11/kernel/exit.c~cpu_state_clean 2005-03-31 10:50:27.0 +0800
> +++ linux-2.6.11-root/kernel/exit.c   2005-03-31 10:50:27.0 +0800
> @@ -845,6 +845,65 @@ fastcall NORET_TYPE void do_exit(long co
>   for (;;) ;
>  }
>  
> +#ifdef CONFIG_STR_SMP
> +void do_exit_idle(void)
> +{
> + struct task_struct *tsk = current;
> + int group_dead;
> +
> + BUG_ON(tsk->pid);
> + BUG_ON(tsk->mm);
> +
> + if (tsk->io_context)
> + exit_io_context();
> + tsk->flags |= PF_EXITING;
> + tsk->it_virt_expires = cputime_zero;
> + tsk->it_prof_expires = cputime_zero;
> + tsk->it_sched_expires = 0;
> +
> + acct_update_integrals(tsk);
> + update_mem_hiwater(tsk);
> + group_dead = atomic_dec_and_test(&tsk->signal->live);
> + if (group_dead) {
> + del_timer_sync(&tsk->signal->real_timer);
> + acct_process(-1);
> + }
> + exit_mm(tsk);
> +
> + exit_sem(tsk);
> + __exit_files(tsk);
> + __exit_fs(tsk);
> + exit_namespace(tsk);
> + exit_thread();
> + exit_keys(tsk);
> +
> + if (group_dead && tsk->signal->leader)
> + disassociate_ctty(1);
> +
> + module_put(tsk->thread_info->exec_domain->module);
> + if (tsk->binfmt)
> + module_put(tsk->binfmt->module);
> +
> + tsk->exit_code = -1;
> + tsk->exit_state = EXIT_DEAD;
> +
> + /* in release_task */
> + atomic_dec(&tsk->user->processes);
> + write_lock_irq(&tasklist_lock);
> + __exit_signal(tsk);
> + __exit_sighand(tsk);
> + write_unlock_irq(&tasklist_lock);
> + release_thread(tsk);
> + put_task_struct(tsk);
> +
> + tsk->flags |= PF_DEAD;
> +#ifdef CONFIG_NUMA
> + mpol_free(tsk->mempolicy);
> + tsk->mempolicy = NULL;
> +#endif
> +}
> +#endif

I don't understand why this is needed at all.  It looks like a fair
amount of code from do_exit is being duplicated here.  We've been
doing cpu removal on ppc64 logical partitions for a while and never
needed to do anything like this.  Maybe idle_task_exit would suffice?


> diff -puN kernel/sched.c~cpu_state_clean kernel/sched.c
> --- linux-2.6.11/kernel/sched.c~cpu_state_clean   2005-03-31 10:50:27.0 +0800
> +++ linux-2.6.11-root/kernel/sched.c  2005-04-04 09:06:40.362357104 +0800
> @@ -4028,6 +4028,58 @@ void __devinit init_idle(task_t *idle, i
>  }
>  
>  /*
> + * Initial dummy domain for early boot and for hotplug cpu. Being static,
> + * it is initialized to zero, so all balancing flags are cleared which is
> + * what we want.
> + */
> +static struct sched_domain sched_domain_dummy;
> +
> +#ifdef CONFIG_STR_SMP
> +static void __devinit exit_idle(int cpu)
> +{
> + runqueue_t *rq = cpu_rq(cpu);
> + struct task_struct *p = rq->idle;
> + int j, k;
> + prio_array_t *array;
> +
> + /* init runqueue */
> + spin_lock_init(&rq->lock);
> + rq->active = rq->arrays;
> + rq->expired = rq->arrays + 1;
> + rq->best_expired_prio = MAX_PRIO;
> +
> + rq->prev_mm = NULL;
> + rq->curr = rq->idle = NULL;
> + rq->expired_timestamp = 0;
> +
> + rq->sd = &sched_domain_dummy;
> + rq->cpu_load = 0;
> + rq->active_balance = 0;
> + rq->push_cpu = 0;
> + rq->migration_thread = NULL;
> + INIT_LIST_HEAD(&rq->migration_queue);
> + atomic_set(&rq->nr_iowait, 0);
> +
> + for (j = 0; j < 2; j++) {
> + array = rq->arrays + j;
> + for (k = 0; k < MAX_PRIO; k++) {
> + INIT_LIST_HEAD(array->queue + k);
> + __clear_bit(k, array->bitmap);
> + }
> + // delimiter for bitsearch
> + __set_bit(MAX_PRIO, array->bitmap);
> + }
> + /* Destroy IDLE thread.
> +  * it's safe now, the CPU is in busy loop
> +  */
> + if (p->active_mm)
> + mmdrop(p->active_mm);
> + p->active_mm = NULL;
> + put_task_struct(p);
> +}
> +#endif
> +
> +/*
>   * In a system that switches off the HZ timer nohz_cpu_mask
>   * indicates which cpus entered this state. This is used
>   * in the rcu update to wait only for active cpus. For system
> @@ -4432,6 +4484,9 @@ static int migration_call(struct notifie
>   complete(&req->done);
>   }
>   spin_unlock_irq(&rq->lock);
> +#ifdef CONFIG_STR_SMP
> + exit_idle(cpu);
> +#endif

I don't understand the need for this, either.  The existing cpu
hotplug notifier in the scheduler takes care of initializing the sched
domains and groups appropriately for online/offline events; why do you
need to touch the runqueue structures?


Nathan

[no subject]

2005-04-03 Thread MingJie Chang
subscribe


[PATCH] ppc: improve timebase sync for SMP

2005-04-03 Thread Paul Mackerras
Currently the procedure in the ppc32 kernel that synchronizes the
timebase registers across an SMP powermac system does so by setting
both timebases to zero.  That is OK at boot but causes problems if
done later.  So that we can do hotplug CPU on these machines, this
patch changes the code so it reads the timebase from one CPU and
transfers the value to the other CPU.  (Hotplug CPU is needed for
sleep (aka suspend to RAM) to work.)
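
The handshake can be sketched as a single-threaded C model. This is only an illustration under simplifying assumptions: there is no real concurrency, no memory barriers, no GPIO freeze and no timeout, and the struct and function names are invented. Only the flag values (0, 1, 2) and the pri_tb_hi/pri_tb_lo names follow the patch below.

```c
#include <assert.h>
#include <stdint.h>

/* Shared state: 0 = idle, 1 = secondary waiting, 2 = value published. */
struct tb_sync {
	int sec_tb_reset;
	uint32_t pri_tb_hi, pri_tb_lo;
};

/* Model of one CPU's 64-bit timebase register. */
struct model_cpu {
	uint64_t timebase;
};

/* Secondary, step 1: tell the primary we're here. */
void secondary_arrive(struct tb_sync *s)
{
	assert(s);
	s->sec_tb_reset = 1;
}

/*
 * Primary: if the secondary has arrived, publish our (frozen)
 * timebase and signal readiness. Returns -1 if it has not arrived,
 * which stands in for the timeout path.
 */
int primary_give(struct tb_sync *s, const struct model_cpu *pri)
{
	assert(s && pri);
	if (s->sec_tb_reset != 1)
		return -1;
	s->pri_tb_hi = (uint32_t)(pri->timebase >> 32);
	s->pri_tb_lo = (uint32_t)pri->timebase;
	s->sec_tb_reset = 2;
	return 0;
}

/* Secondary, step 2: copy the published value, then report done. */
int secondary_take(struct tb_sync *s, struct model_cpu *sec)
{
	assert(s && sec);
	if (s->sec_tb_reset != 2)
		return -1;
	sec->timebase = ((uint64_t)s->pri_tb_hi << 32) | s->pri_tb_lo;
	s->sec_tb_reset = 0;
	return 0;
}
```

The point of the three-state flag is that the secondary never writes its timebase before the primary has published a value, and the primary does not restart the timebase until the flag has dropped back to 0.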

Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc/platforms/pmac_smp.c pmac-2.5/arch/ppc/platforms/pmac_smp.c
--- linux-2.5/arch/ppc/platforms/pmac_smp.c 2005-03-15 10:18:23.0 +1100
+++ pmac-2.5/arch/ppc/platforms/pmac_smp.c  2005-03-15 11:59:02.0 +1100
@@ -116,6 +116,8 @@
 
 /* Sync flag for HW tb sync */
 static volatile int sec_tb_reset = 0;
+static unsigned int pri_tb_hi, pri_tb_lo;
+static unsigned int pri_tb_stamp;
 
 static void __init core99_init_caches(int cpu)
 {
@@ -453,7 +455,7 @@
 #endif
struct device_node *cpus, *firstcpu;
int i, ncpus = 0, boot_cpu = -1;
-   u32 *tbprop;
+   u32 *tbprop = NULL;
 
if (ppc_md.progress) ppc_md.progress("smp_core99_probe", 0x345);
cpus = firstcpu = find_type_devices("cpu");
@@ -576,46 +578,74 @@
}
 }
 
-void __init smp_core99_take_timebase(void)
+/* not __init, called in sleep/wakeup code */
+void smp_core99_take_timebase(void)
 {
-   /* Secondary processor "takes" the timebase by freezing
-* it, resetting its local TB and telling CPU 0 to go on
-*/
-   pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 4);
-   pmac_call_feature(PMAC_FTR_READ_GPIO, NULL, core99_tb_gpio, 0);
+   unsigned long flags;
+
+   /* tell the primary we're here */
+   sec_tb_reset = 1;
mb();
 
-   set_dec(tb_ticks_per_jiffy);
-   set_tb(0, 0);
-   last_jiffy_stamp(smp_processor_id()) = 0;
+   /* wait for the primary to set pri_tb_hi/lo */
+   while (sec_tb_reset < 2)
+   mb();
 
+   /* set our stuff the same as the primary */
+   local_irq_save(flags);
+   set_dec(1);
+   set_tb(pri_tb_hi, pri_tb_lo);
+   last_jiffy_stamp(smp_processor_id()) = pri_tb_stamp;
+   mb();
+
+   /* tell the primary we're done */
+   sec_tb_reset = 0;
mb();
-   sec_tb_reset = 1;
+   local_irq_restore(flags);
 }
 
-void __init smp_core99_give_timebase(void)
+/* not __init, called in sleep/wakeup code */
+void smp_core99_give_timebase(void)
 {
+   unsigned long flags;
unsigned int t;
 
-   /* Primary processor waits for secondary to have frozen
-* the timebase, resets local TB, and kick timebase again
-*/
-   /* wait for the secondary to have reset its TB before proceeding */
-   for (t = 1000; t > 0 && !sec_tb_reset; --t)
-   udelay(1000);
-   if (t == 0)
+   /* wait for the secondary to be in take_timebase */
+   for (t = 10; t > 0 && !sec_tb_reset; --t)
+   udelay(10);
+   if (!sec_tb_reset) {
printk(KERN_WARNING "Timeout waiting sync on second CPU\n");
+   return;
+   }
 
-   set_dec(tb_ticks_per_jiffy);
-   set_tb(0, 0);
-   last_jiffy_stamp(smp_processor_id()) = 0;
+   /* freeze the timebase and read it */
+   /* disable interrupts so the timebase is disabled for the
+  shortest possible time */
+   local_irq_save(flags);
+   pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 4);
+   pmac_call_feature(PMAC_FTR_READ_GPIO, NULL, core99_tb_gpio, 0);
+   mb();
+   pri_tb_hi = get_tbu();
+   pri_tb_lo = get_tbl();
+   pri_tb_stamp = last_jiffy_stamp(smp_processor_id());
mb();
 
+   /* tell the secondary we're ready */
+   sec_tb_reset = 2;
+   mb();
+
+   /* wait for the secondary to have taken it */
+   for (t = 10; t > 0 && sec_tb_reset; --t)
+   udelay(10);
+   if (sec_tb_reset)
+   printk(KERN_WARNING "Timeout waiting sync(2) on second CPU\n");
+   else
+   smp_tb_synchronized = 1;
+
/* Now, restart the timebase by leaving the GPIO to an open collector */
pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 0);
 pmac_call_feature(PMAC_FTR_READ_GPIO, NULL, core99_tb_gpio, 0);
-
-   smp_tb_synchronized = 1;
+   local_irq_restore(flags);
 }
 
 


Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Andy wrote:
> Not that I really know what I'm talking about here, but this sounds 
> highly parallelizable.

I doubt it.  If we are testing the cost of a migration between CPUs
alpha and beta, and at the same time testing between CPUs gamma and
delta, then often there will be some hardware that is shared by both the
<alpha, beta> path and the <gamma, delta> path.  This would affect the
test results.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: How to make linux ping behaves like MS ping?

2005-04-03 Thread Gene Heskett
On Sunday 03 April 2005 23:43, Beast wrote:
>Hi Gene,
>Is this posting for me?

Yes it was.

>Gene Heskett wrote:
>> I also don't play Sender Confirmation games, particularly when the
>> confirmation message is in html only.
>
>AFAIK, I never set any confirm receipt or sending html format to the
> list.

The sender confirmation form I got was in html, not from your mailer, 
but from the ISP's server, which would not deliver the message unless 
I did some sort of a click thru reply.  As many of these are scams 
designed to collect valid email addresses, and some will lead to 
phishing attacks, it's something I've never, ever considered doing. 
SCP is IMO a solution in search of a problem.

In other words, talk to your ISP, and if some sort of SCP is in 
effect, see if they will shut it off, and then setup your own version 
of spamassassin or similar filter.

>> You've been removed from the Cc: line in this reply.
>>
>> You asked for help, but set the Cc: line as if you weren't
>> subscribed. Anyone who posts to this list, can IMNSHO, accept the
>> fscking replies you set in the Cc: line of your message or do
>> without the reply.
>
>I don't understand this. Could you shed some light on this?
>Tks.
>
>> Read it here, or don't read it at all, doesn't bother me.

In other words, this is the only message you saw from me, and it does 
not contain the answer to your original post, as best as I could 
answer it.  I have occasionally been known to be erroneous, 
though. :)

And, if you don't see this answer, it will be because I didn't do the 
dance to clear thru your server again.  In that event, you'll 
conclude I'm an outstanding jerk for not answering.  Yes, I've been 
that occasionally in my 70 years, usually after something like this 
pulls my trigger.

However, since your server probably won't let this reply through 
either, I've taken the liberty (excuse me please, list readers) of 
adding lkml back into the Cc: line.  Maybe you'll read it there 
eventually.


-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.34% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.


Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Andy Lutomirski
Paul Jackson wrote:
> Ok - that flies, or at least walks.  It took 53 seconds to
> compute this cost matrix.

Not that I really know what I'm talking about here, but this sounds 
highly parallelizable.  It seems like you could do N/2 measurements at a 
time, so this should be O(N) to compute the matrix (ignoring issues of 
how long it takes to write the data to memory, but that should be 
insignificant).

Even if you can't parallelize it all the way, it ought to at least help.
--Andy
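
As a rough illustration of the N/2-at-a-time idea: if each measurement involves one pair of CPUs, a round-robin ("circle method") schedule covers all N(N-1)/2 pairs in N-1 rounds of N/2 disjoint pairs each. The sketch below is plain userspace C, not scheduler code. The names `struct cpu_pair` and `pairs_for_round()` are made up for this example, N is assumed to be even, and nothing here accounts for concurrent measurements possibly perturbing each other through shared caches or the interconnect.

```c
#include <assert.h>
#include <stddef.h>

/* One measurement: migration cost between cpu `a` and cpu `b`. */
struct cpu_pair {
    int a;
    int b;
};

/*
 * Fill `out` with the n/2 disjoint pairs for round `round` (0 .. n-2),
 * using the circle method: CPU n-1 stays fixed, the other n-1 CPUs sit
 * on a ring that is rotated by one position per round.
 * `n` must be even. Returns the number of pairs written (always n/2).
 */
size_t pairs_for_round(int n, int round, struct cpu_pair *out)
{
    assert(n >= 2 && n % 2 == 0);
    assert(round >= 0 && round < n - 1);

    int m = n - 1;          /* size of the rotating ring */
    size_t count = 0;

    /* The fixed CPU meets whichever CPU sits at ring position `round`. */
    out[count].a = n - 1;
    out[count].b = round;
    count++;

    /* The remaining CPUs pair up symmetrically around that position. */
    for (int k = 1; k <= (n - 2) / 2; k++) {
        out[count].a = (round + k) % m;
        out[count].b = (round - k + m) % m;
        count++;
    }
    return count;
}
```

Calling `pairs_for_round(n, r, buf)` for r = 0 .. n-2 visits every unordered CPU pair exactly once, and the pairs within one round share no CPU, so they could in principle be measured at the same time.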


Re: [PATCH][SELINUX] Add name_connect permission check

2005-04-03 Thread James Morris
On Sun, 3 Apr 2005, Dave Airlie wrote:

> On a standard FC3 with selinux enabled, booting the latest -bk breaks
> all my outgoing TCP connections at a guess due to this patch.. this
> probably isn't something that people really want to happen.. or maybe
> Fedora can release an updated policy to deal with it?

You need an updated policy, which you can grab from rawhide for FC3 or via 
CVS at http://selinux.sourceforge.net/



- James
-- 
James Morris
<[EMAIL PROTECTED]>




Re: Can't use SYSFS for "Proprietry" driver modules !!!.

2005-04-03 Thread Paul Jackson
Mark wrote:
> Probably all Linux binary drivers *are* compiled using GPL'd header files,
> and thus are themselves subject to the GPL.

I doubt that there is a consensus that simply compiling something with
a GPL header necessarily and always subjects it to the GPL.  See your lawyer.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


[PATCH] ppc: fix bogosity in process-freezing code

2005-04-03 Thread Paul Mackerras
The code that went into arch/ppc/kernel/signal.c recently to handle
process freezing seems to contain a dubious assumption: that a process
that calls do_signal when PF_FREEZE is set will have entered the
kernel because of a system call.  This patch removes that assumption.

Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc/kernel/signal.c pmac-2.5/arch/ppc/kernel/signal.c
--- linux-2.5/arch/ppc/kernel/signal.c  2005-03-15 10:18:23.0 +1100
+++ pmac-2.5/arch/ppc/kernel/signal.c   2005-04-02 15:12:21.0 +1000
@@ -708,7 +708,6 @@
if (current->flags & PF_FREEZE) {
refrigerator(PF_FREEZE);
signr = 0;
-   ret = regs->gpr[3];
if (!signal_pending(current))
goto no_signal;
}
@@ -719,7 +718,7 @@
newsp = frame = 0;
 
signr = get_signal_to_deliver(, , regs, NULL);
-
+ no_signal:
if (TRAP(regs) == 0x0C00/* System Call! */
&& regs->ccr & 0x1000   /* error signalled */
&& ((ret = regs->gpr[3]) == ERESTARTSYS
@@ -735,7 +734,6 @@
regs->gpr[3] = EINTR;
/* note that the cr0.SO bit is already set */
} else {
-no_signal:
regs->nip -= 4; /* Back up & retry system call */
regs->result = 0;
regs->trap = 0;


Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Paul wrote:
> I should push in the direction of improving the
> SN2 sched domain hierarchy.

Nick wrote:
> I'd just be a bit careful about this.

Good point - thanks.

I will - be careful.  I have no delusions that I know what would be an
"improvement" to the scheduler - if anything.

Ingo, if I understood correctly, suggested pushing any necessary detail
of the CPU hierarchy into the scheduler domains, so that his latest work
tuning migration costs could pick it up from there.

It makes good sense for the migration cost estimation to be based on
whatever CPU hierarchy is visible in the sched domains.

But if we knew the CPU hierarchy in more detail, and if we had some
other use for that detail (we don't that I know), then I take it from
your comment that we should be reluctant to push those details into the
sched domains.  Put them someplace else if we need them.


One question - how serious do you view difference in migration cost
between say 21.7 and 25.3, two of the cacheflush times I reported on a
small SN2?

I'm guessing that this is probably below the noise threshold, at least
as far as scheduler domains, schedulers and migration care, unless and
until some persuasive measurements show a situation in which it matters.

As you say - not an exact science.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: [PATCH 2.6.12-rc1-mm5 1/3] perfctr: ppc64 arch hooks

2005-04-03 Thread David Gibson
On Fri, Apr 01, 2005 at 02:46:53PM +0200, Mikael Pettersson wrote:
> Andrew Morton writes:
>  > David Gibson <[EMAIL PROTECTED]> wrote:
>  > >
>  > > On Thu, Mar 31, 2005 at 03:11:29PM -0800, Andrew Morton wrote:
>  > > > Mikael Pettersson <[EMAIL PROTECTED]> wrote:
>  > > > >
>  > > > > Here's a 3-part patch kit which adds a ppc64 driver to perfctr,
>  > > > > written by David Gibson <[EMAIL PROTECTED]>.
>  > > > 
>  > > > Well that seems like progress.  Where do we feel that we stand wrt
>  > > > preparedness for merging all this up?
>  > > 
>  > > I'm still uneasy about it.  There were sufficient changes made getting
>  > > this one ready to go that I'm not confident there aren't more
>  > > important things to be found.
>  > 
>  > That's a bit open-ended.  How do we determine whether more things will be
>  > needed?  How do we know when we're done?
> 
> I have two planned changes that will be done RSN:
> - On x86/x86-64, user-space uses the mmap()ed state's TSC start
>   value as a way to detect if a user-space sampling operation
>   (which needs to be "virtually atomic") was preempted by the kernel.
>   On ppc{32,64} we've used the TB for the same thing up to now,
>   but that doesn't quite work because the TB is about a magnitude
>   or two too slow. So the plan is to change ppc to store a
>   software generation counter in the mmap()ed state, and change
>   the ppc user-space to check that one instead.

If we're going to do it for ppc, we might as well do it for all
platforms.  That gets us one step closer to eliminating cstatus from
the user visible stuff, too, which I think should be done.
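
A minimal sketch of the generation-counter check described in the quoted plan, using a made-up layout: `struct sample_state`, `struct sample` and `read_consistent()` are hypothetical names and not the perfctr ABI. The reader snapshots the counter, copies the fields, and retries if the counter moved in the meantime. It assumes the writer only ever runs on the same CPU, while the reader is preempted, so `volatile` is there just to stop the compiler from reordering or caching the loads. A writer on another CPU would need real memory barriers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared state; the "kernel" bumps `generation` on each update. */
struct sample_state {
    volatile uint32_t generation;
    volatile uint64_t sum;
    volatile uint32_t start;
};

/* A private, consistent copy of the fields the reader cares about. */
struct sample {
    uint64_t sum;
    uint32_t start;
};

/*
 * Copy `sum` and `start` out of the shared state, retrying whenever the
 * generation counter changed while the copy was in progress.
 */
struct sample read_consistent(const struct sample_state *st)
{
    assert(st != NULL);

    struct sample s;
    uint32_t gen;

    do {
        gen = st->generation;   /* snapshot before reading */
        s.sum = st->sum;
        s.start = st->start;
    } while (gen != st->generation);    /* changed underneath us: retry */

    return s;
}
```

Unlike a timestamp, the counter changes on every update no matter how fast or slow the clock source ticks, which is the property the TB apparently lacks.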

> - Move  common stuff to .
> 
> In addition, there is one unresolved issue:
> - A counter's value is represented by a 64-bit software sum,
>   a 32-bit start value containing the HW counter's value at the
>   start of the current time slice, and the current HW counter's value
>   (now). The actual value is computed as sum + (now - start).
>   This is reflected in the mmap()ed state, which contains a variable-
>   length { u32 map; u32 start; u64 sum; } pmc[] array.
>   This layout is very cache-efficient on current 32 and 64-bit CPUs,
>   but there is a _possible_ concern that it won't do on 10+ GHz CPUs.
>   So the question is, should we change it to use 64-bit start values
>   already now (and take more cache misses), or should that wait a few
>   years until it becomes a necessity (causing ABI change issues)?

Is there any way we could rearrange the user visible stuff to not
include the 'map' field?  After all userspace set up the counters, so
it ought to know what the mapping is already...

That would mean we could fit in a 64-bit start value without having to
mess around to get good alignment.
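
To make the arithmetic concrete, here is a small sketch of the `sum + (now - start)` computation for the layout quoted above, next to a variant without the `map` field. `pmc_value()`, `struct pmc_state_wide` and `pmc_value_wide()` are names invented for this example. The 32-bit form relies on unsigned subtraction wrapping modulo 2^32, so it stays correct across a counter wrap only while fewer than 2^32 events occur between updates of `start`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as described in the quoted text: 16 bytes per counter. */
struct pmc_state {
    uint32_t map;
    uint32_t start;
    uint64_t sum;
};

/* Hypothetical variant without `map`: also 16 bytes, with a 64-bit start. */
struct pmc_state_wide {
    uint64_t start;
    uint64_t sum;
};

/* Current value with a 32-bit hardware reading `now`. */
uint64_t pmc_value(const struct pmc_state *p, uint32_t now)
{
    assert(p != NULL);
    /* The cast keeps the difference in 32 bits, so wraparound cancels out. */
    return p->sum + (uint32_t)(now - p->start);
}

/* Same computation when both start and now are kept as 64-bit values. */
uint64_t pmc_value_wide(const struct pmc_state_wide *p, uint64_t now)
{
    assert(p != NULL);
    return p->sum + (now - p->start);
}
```

Both structs occupy 16 bytes, which is the point of dropping `map`: the wider start value fits in the space that `map` and the 32-bit `start` take up today.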

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/people/dgibson


[PATCH] network configs: disconnect network options from drivers

2005-04-03 Thread Randy.Dunlap
Sam Ravnborg wrote:
> On Thu, Mar 31, 2005 at 12:02:13PM -0800, Randy.Dunlap wrote:
>> Other than "sounds good," are there some comments on:
>> a.  leaving IrDA and Bluetooth subsystem (with drivers) where they
>>     are, which is under "Network options and protocols"
>>     (I really don't want to split their drivers away from their
>>     subsystem, just to put them under Network driver support.)
>
> Agreed. All IrDA / Bluetooth stuff belongs together.
> Leave them where they are for now.
>
>> b.  leaving SLIP, PPP, and PLIP where they are under Network driver
>>     support, even though they say that they are "protocols" ?
>
> SLIP and PLIP are not that common. PPP is more common for cable-modem/ADSL
> I suppose. But still it would make sense to create a Misc protocols
> menu, like we have a misc filesystems menu.

While looking into this suggestion, I see that SLIP, PLIP,
and PPP depend on NETDEVICES, and they use some netdev
interfaces, so they appear to be more like net devices
than protocols even though they are called
protocols in Kconfig text, so I am leaving them alone
for now.  Don't hesitate to correct me.

Any comments on this new version?

Thanks,
--
~Randy

A few people dislike that the Networking Options menu is inside
the Device Drivers/Networking menu.  This patch moves the
Networking Options menu to immediately before the Device Drivers
menu, renames it to "Networking options and protocols", & moves
most protocols to more logical places.

Notes:
- IrDA & Bluetooth subsystems include protocols & drivers, yet
  they are displayed under Networking protocols.  I don't see
  much good reason to split them up.  (See, this is an example
  of why the Networking Options and Network Drivers were close
  together)
- SLIP, PLIP, and PPP option names say that they are protocols,
  but they are sort of a hybrid device and protocol, and they
  use network device interfaces, so they remain listed under
  Network devices.

 drivers/Kconfig  |4
 drivers/net/Kconfig  |5
 net/Kconfig  |  450 ++-
 net/bridge/netfilter/Kconfig |1
 4 files changed, 241 insertions(+), 219 deletions(-)

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

diff -Naurp -X /home/rddunlap/doc/dontdiff-osdl linux-2612-rc1-bk5-pv/drivers/Kconfig linux-2612-rc1-bk5-netconfigs/drivers/Kconfig
--- linux-2612-rc1-bk5-pv/drivers/Kconfig	2005-03-01 23:38:26.0 -0800
+++ linux-2612-rc1-bk5-netconfigs/drivers/Kconfig	2005-04-03 19:45:18.330102257 -0700
@@ -1,5 +1,7 @@
 # drivers/Kconfig
 
+source "net/Kconfig"
+
 menu "Device Drivers"
 
 source "drivers/base/Kconfig"
@@ -28,7 +30,7 @@ source "drivers/message/i2o/Kconfig"
 
 source "drivers/macintosh/Kconfig"
 
-source "net/Kconfig"
+source "drivers/net/Kconfig"
 
 source "drivers/isdn/Kconfig"
 
diff -Naurp -X /home/rddunlap/doc/dontdiff-osdl linux-2612-rc1-bk5-pv/drivers/net/Kconfig linux-2612-rc1-bk5-netconfigs/drivers/net/Kconfig
--- linux-2612-rc1-bk5-pv/drivers/net/Kconfig	2005-04-03 19:42:32.0 -0700
+++ linux-2612-rc1-bk5-netconfigs/drivers/net/Kconfig	2005-04-03 19:45:18.335101815 -0700
@@ -1,8 +1,9 @@
-
 #
 # Network device configuration
 #
 
+menu "Network device support"
+
 config NETDEVICES
 	depends on NET
 	bool "Network device support"
@@ -2536,3 +2537,5 @@ config NETCONSOLE
 	If you want to log kernel messages over the network, enable this.
 	See  for details.
 
+endmenu
+
diff -Naurp -X /home/rddunlap/doc/dontdiff-osdl linux-2612-rc1-bk5-pv/net/bridge/netfilter/Kconfig linux-2612-rc1-bk5-netconfigs/net/bridge/netfilter/Kconfig
--- linux-2612-rc1-bk5-pv/net/bridge/netfilter/Kconfig	2005-03-01 23:37:50.0 -0800
+++ linux-2612-rc1-bk5-netconfigs/net/bridge/netfilter/Kconfig	2005-04-03 19:45:18.0 -0700
@@ -139,6 +139,7 @@ config BRIDGE_EBT_VLAN
 config BRIDGE_EBT_ARPREPLY
 	tristate "ebt: arp reply target support"
 	depends on BRIDGE_NF_EBTABLES
+	depends on INET
 	help
 	  This option adds the arp reply target, which allows
 	  automatically sending arp replies to arp requests.
diff -Naurp -X /home/rddunlap/doc/dontdiff-osdl linux-2612-rc1-bk5-pv/net/Kconfig linux-2612-rc1-bk5-netconfigs/net/Kconfig
--- linux-2612-rc1-bk5-pv/net/Kconfig	2005-04-03 19:42:35.0 -0700
+++ linux-2612-rc1-bk5-netconfigs/net/Kconfig	2005-04-03 19:45:18.0 -0700
@@ -2,7 +2,7 @@
 # Network configuration
 #
 
-menu "Networking support"
+menu "Networking options and protocols"
 
 config NET
 	bool "Networking support"
@@ -10,7 +10,9 @@ config NET
 	  Unless you really know what you are doing, you should say Y here.
 	  The reason is that some programs need kernel networking support even
 	  when running on a stand-alone machine that isn't connected to any
-	  other computer. If you are upgrading from an older kernel, you
+	  other computer.
+
+	  If you are upgrading from an older kernel, you
 	  should consider updating your networking tools too because changes
 	  in the kernel and the tools often go 

Re: Use of C99 int types

2005-04-03 Thread Herbert Xu
Dag Arne Osvik <[EMAIL PROTECTED]> wrote:
>
>>... and with such name 99% will assume (at least at the first reading)
>>that it _is_ 32bits.  We have more than enough portability bugs as it
>>is, no need to invite more by bad names.
> 
> Agreed.  The way I see it there are two reasonable options.  One is to 
> just use u32, which is always correct but sacrifices speed (at least 
> with the current gcc).  The other is to introduce C99 types, which Linus 
> doesn't seem to object to when they are kept away from interfaces 
> (http://infocenter.guardiandigital.com/archive/linux-kernel/2004/Dec/0117.html).

There is a third option which has already been pointed out before:

Use unsigned long.
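
For illustration only, this is roughly what the unsigned long option looks like for arithmetic that has to wrap at 32 bits: accumulate in the native word and mask at the point where a 32-bit result is required. `sum32()` is a made-up helper, written as userspace C with `<stdint.h>` types rather than the kernel's `u32`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 32-bit wrapping sum of `n` values. The accumulator is an unsigned long,
 * which is at least 32 bits wide and may be 64 bits, so the result is
 * masked down to 32 bits explicitly before it is returned.
 */
uint32_t sum32(const uint32_t *v, size_t n)
{
    assert(v != NULL || n == 0);

    unsigned long acc = 0;

    for (size_t i = 0; i < n; i++)
        acc += v[i];

    /* Unsigned overflow wraps, so masking gives the sum modulo 2^32. */
    return (uint32_t)(acc & 0xffffffffUL);
}
```

The mask is what keeps the result identical whether unsigned long turns out to be 32 or 64 bits wide, which is the portability trap the quoted naming objection is about.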

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] RCU in kernel/intermodule.c

2005-04-03 Thread Dave Jones
On Sun, Apr 03, 2005 at 01:38:33PM -0400, Kyle Moffett wrote:

 > Also, the intermodule stuff is slated for removal, as soon as MTD and
 > such are fixed; the interface has been deprecated for a while.

Actually 'just' mtd now iirc. agpgart was the penultimate user which
got fixed a few months back.

Dave



Re: [RFC 0/6] S3 SMP support with physcial CPU hotplug

2005-04-03 Thread Li Shaohua
On Mon, 2005-04-04 at 10:48, Andrew Morton wrote:
> Li Shaohua <[EMAIL PROTECTED]> wrote:
> >
> > On Mon, 2005-04-04 at 10:37, Andrew Morton wrote:
> > > Li Shaohua <[EMAIL PROTECTED]> wrote:
> > > >
> > > > The patches are against 2.6.11-rc1 with Zwane's CPU hotplug patch in -mm
> > > >  tree.
> > > 
> > > Should I merge that thing into mainline?  It seems that a few people are
> > > needing it.
> > I'd like to listen to some comments first. There are still some things
> > I'm not sure, such as the do_exit_idle.
> > 
> 
> I was referring to Zwane's i386-cpu-hotplug-updated-for-mm.patch
Yep, great. Pavel's swsusp also needs it.

Thanks,
Shaohua



Re: [RFC 0/6] S3 SMP support with physcial CPU hotplug

2005-04-03 Thread Andrew Morton
Li Shaohua <[EMAIL PROTECTED]> wrote:
>
> On Mon, 2005-04-04 at 10:37, Andrew Morton wrote:
> > Li Shaohua <[EMAIL PROTECTED]> wrote:
> > >
> > > The patches are against 2.6.11-rc1 with Zwane's CPU hotplug patch in -mm
> > >  tree.
> > 
> > Should I merge that thing into mainline?  It seems that a few people are
> > needing it.
> I'd like to listen to some comments first. There are still some things
> I'm not sure, such as the do_exit_idle.
> 

I was referring to Zwane's i386-cpu-hotplug-updated-for-mm.patch


Re: [RFC 0/6] S3 SMP support with physcial CPU hotplug

2005-04-03 Thread Li Shaohua
On Mon, 2005-04-04 at 10:37, Andrew Morton wrote:
> Li Shaohua <[EMAIL PROTECTED]> wrote:
> >
> > The patches are against 2.6.11-rc1 with Zwane's CPU hotplug patch in -mm
> >  tree.
> 
> Should I merge that thing into mainline?  It seems that a few people are
> needing it.
I'd like to listen to some comments first. There are still some things
I'm not sure, such as the do_exit_idle.

Thanks,
Shaohua



Re: [RFC 0/6] S3 SMP support with physcial CPU hotplug

2005-04-03 Thread Andrew Morton
Li Shaohua <[EMAIL PROTECTED]> wrote:
>
> The patches are against 2.6.11-rc1 with Zwane's CPU hotplug patch in -mm
>  tree.

Should I merge that thing into mainline?  It seems that a few people are
needing it.



Re: Winnings

2005-04-03 Thread Support
Congratulations.

You are one of the lucky people to whom we can make the following super offer:
Play at Casino Royal Las Vegas and look forward to up to $500 extra
on your first purchase of chips!!
Yes, you read that correctly: on your first chip purchase we give you up to
$500 extra, as a gift with no obligations.

Get the casino game for convenient play from home.

NOW WITH GREAT NEW GAMES AND CHANCES TO WIN

You can also play the game without staking any money, but then you cannot
win anything either.
You can top up your account in a variety of ways: by credit card,
bank transfer, and so on.

So what are you waiting for? You too could soon be one of the lucky
winners...

http://www.yournews99.com/a22werde/



Kind regards,


The Casino Team




Software Suspend on Sony Vaio R505E

2005-04-03 Thread Aaron Gaudio
(I'm not currently subscribed to the kernel mailing list, please CC any
replies to my account.)

I've been trying to hibernate my Sony Vaio R505E using Software Suspend.
I'm using the following command:
echo -n "disk" > /sys/power/state
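
For anyone who wants to trigger the same thing from a program instead of the shell, a small C sketch follows. `write_state()` is a made-up helper that just writes a string, without a trailing newline, to the given path. Pointing it at /sys/power/state with "disk" needs root and really does suspend the machine, so try it on a scratch file first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write `state` (no trailing newline, like `echo -n`) to `path`.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 */
int write_state(const char *path, const char *state)
{
    assert(path != NULL && state != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    size_t len = strlen(state);
    int ok = fwrite(state, 1, len, f) == len;

    /* stdio buffers the data, so a rejected write may only show up here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

On the real sysfs file the call would be `write_state("/sys/power/state", "disk")`.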

The basic suspend/resume seems to be working now (after many failed
attempts in the past). 

However, there are still a few problems...

I'm using Fedora Core 3 with the Fedora 2.6.10-1.770 kernel,
rebuilt to enable software suspend (and USB suspend). At some point,
I'll be trying the software suspend 2 patch to see how far that can get
me. Attached is the output of my syslog after the resume.

My wireless adapter doesn't work on resume. It's a built-in using the
orinoco_cs driver. I get the following messages in /var/log/messages:

Apr  3 13:07:12 localhost kernel: NETDEV WATCHDOG: eth1: transmit timed out
Apr  3 13:07:12 localhost kernel: eth1: Tx timeout! ALLOCFID=00c0, TXCOMPLFID=00bf, EVSTAT=808b

I also get a bunch of messages about usb problems, though I don't have a
USB device connected at the time:

Apr  3 13:07:25 localhost kernel: usb 3-1: 05-wait_for_sys timed out on ep0in
Apr  3 13:07:25 localhost kernel: usb 3-1: khubd timed out on ep0in
Apr  3 13:07:30 localhost kernel: usb 3-1: khubd timed out on ep0in
Apr  3 13:07:30 localhost kernel: usb 3-1: string descriptor 0 read error: -110

... and so forth...

Finally, trying to use 3D fails after resuming (Intel 830 using the i915
DRM module). glxgears gives this error: "intelWaitIrq: drmI830IrqWait:
-16" and syslog has "Apr  3 20:29:53 localhost kernel:
[drm:i915_wait_irq] *ERROR* i915_wait_irq: EBUSY -- rec: 0 emitted: 5".
After restarting X, 3D works as usual.

If there's any further debug or investigation I can do, or any tricks to
try, lemme know. I'll try out Software Suspend 2 when I get a chance.

-- 
Aaron Gaudio <[EMAIL PROTECTED]>


suspend.sh
Description: application/shellscript


[RFC 6/6]Physcial CPU hotadd and S3 SMP support

2005-04-03 Thread Li Shaohua
Boot a CPU at runtime and use it to support S3 SMP.

Thanks,
Shaohua

---

 linux-2.6.11-root/arch/i386/kernel/smpboot.c |   79 +++
 linux-2.6.11-root/include/asm-i386/smp.h |4 +
 linux-2.6.11-root/kernel/power/main.c|   30 ++
 3 files changed, 104 insertions(+), 9 deletions(-)

diff -puN arch/i386/kernel/smpboot.c~warmboot_cpu arch/i386/kernel/smpboot.c
--- linux-2.6.11/arch/i386/kernel/smpboot.c~warmboot_cpu  2005-04-04 09:13:48.600255048 +0800
+++ linux-2.6.11-root/arch/i386/kernel/smpboot.c  2005-04-04 09:13:48.607253984 +0800
@@ -76,6 +76,12 @@ cpumask_t cpu_callin_map;
 cpumask_t cpu_callout_map;
 static cpumask_t smp_commenced_mask;
 
+/* This is ugly, but TSC's upper 32 bits can't be written in eariler CPU
+ * (before prescott), there is no way to resync one AP against BP
+ * TBD: for prescott and above, we should use IA64's algorithm
+ */
+static int __devinit tsc_sync_disabled;
+
 /* Per CPU bogomips and other parameters */
 struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;
 
@@ -412,7 +418,7 @@ static void __devinit smp_callin(void)
/*
 *  Synchronize the TSC with the BP
 */
-   if (cpu_has_tsc && cpu_khz)
+   if (cpu_has_tsc && cpu_khz && !tsc_sync_disabled)
synchronize_tsc_ap();
 }
 
@@ -781,8 +787,19 @@ wakeup_secondary_cpu(int phys_apicid, un
 #endif /* WAKE_SECONDARY_VIA_INIT */
 
 extern cpumask_t cpu_initialized;
+static inline int alloc_cpu_id(void)
+{
+   cpumask_t   tmp_map;
+   int cpu;
 
-static int __devinit do_boot_cpu(int apicid)
+   cpus_complement(tmp_map, cpu_present_map);
+   cpu = first_cpu(tmp_map);
+   if (cpu >= NR_CPUS)
+   return -ENODEV;
+   return cpu;
+}
+
+static int __devinit do_boot_cpu(int apicid, int cpu)
 /*
  * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
  * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
@@ -791,15 +808,10 @@ static int __devinit do_boot_cpu(int api
 {
struct task_struct *idle;
unsigned long boot_error;
-   int timeout, cpu;
+   int timeout;
unsigned long start_eip;
unsigned short nmi_high = 0, nmi_low = 0;
-   cpumask_t   tmp_map;
 
-   cpus_complement(tmp_map, cpu_present_map);
-   cpu = first_cpu(tmp_map);
-   if (cpu >= NR_CPUS)
-   return -ENODEV;
++cpucount;
/*
 * We can't use kernel_thread since we must avoid to
@@ -920,6 +932,53 @@ void cpu_exit_clear(int cpu)
 
do_exit_idle();
 }
+
+struct warm_boot_cpu_info {
+   struct completion *complete;
+   int apicid;
+   int cpu;
+};
+
+static void __devinit do_warm_boot_cpu(void *p)
+{
+   struct warm_boot_cpu_info *info = p;
+   do_boot_cpu(info->apicid, info->cpu);
+   complete(info->complete);
+}
+
+int __devinit smp_prepare_cpu(int apicid)
+{
+   DECLARE_COMPLETION(done);
+   struct warm_boot_cpu_info info;
+   struct work_struct task;
+   int cpu;
+
+   lock_cpu_hotplug();
+   cpu = alloc_cpu_id();
+
+   if (cpu < 0)
+   goto exit;
+
+   info.complete = 
+   info.apicid = apicid;
+   info.cpu = cpu;
+   INIT_WORK(, do_warm_boot_cpu, );
+
+   tsc_sync_disabled = 1;
+
+   /* init low mem mapping */
+   memcpy(swapper_pg_dir, swapper_pg_dir + USER_PGD_PTRS,
+   sizeof(swapper_pg_dir[0]) * KERNEL_PGD_PTRS);
+   flush_tlb_all();
+   schedule_work();
+   wait_for_completion();
+
+   tsc_sync_disabled = 0;
+   zap_low_mappings();
+exit:
+   unlock_cpu_hotplug();
+   return cpu;
+}
 #endif
 static void smp_tune_scheduling (void)
 {
@@ -1064,7 +1123,7 @@ static void __init smp_boot_cpus(unsigne
if (max_cpus <= cpucount+1)
continue;
 
-   if (do_boot_cpu(apicid))
+   if (((cpu = alloc_cpu_id()) > 0) && do_boot_cpu(apicid, cpu))
printk("CPU #%d not responding - cannot use it.\n",
apicid);
else
@@ -1253,10 +1312,12 @@ void __init smp_cpus_done(unsigned int m
setup_ioapic_dest();
 #endif
zap_low_mappings();
+#ifndef CONFIG_STR_SMP
/*
 * Disable executability of the SMP trampoline:
 */
set_kernel_exec((unsigned long)trampoline_base, trampoline_exec);
+#endif
 }
 
 void __init smp_intr_init(void)
diff -puN kernel/power/main.c~warmboot_cpu kernel/power/main.c
--- linux-2.6.11/kernel/power/main.c~warmboot_cpu   2005-04-04 09:13:48.601254896 +0800
+++ linux-2.6.11-root/kernel/power/main.c   2005-04-04 09:13:48.607253984 +0800
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 #include "power.h"
@@ -137,6 +138,24 @@ static char * pm_states[] = {
 static int enter_state(suspend_state_t state)
 {
int error;
+#ifdef CONFIG_STR_SMP
+  
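
The alloc_cpu_id() helper in the patch above takes the complement of cpu_present_map and returns its first set bit, in other words the lowest CPU number not yet present. A userspace sketch of the same idea with a plain 32-bit mask follows. `alloc_cpu_id_sketch()` and its `nr_cpus` parameter are stand-ins invented for this example, not the kernel's cpumask API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Return the lowest CPU number below `nr_cpus` whose bit is clear in
 * `present_map`, or -ENODEV when every slot is already taken.
 */
int alloc_cpu_id_sketch(uint32_t present_map, int nr_cpus)
{
    assert(nr_cpus > 0 && nr_cpus <= 32);

    for (int cpu = 0; cpu < nr_cpus; cpu++) {
        if (!(present_map & (UINT32_C(1) << cpu)))
            return cpu;
    }
    return -ENODEV;
}
```

Since 0 is a valid id in this sketch, a caller has to treat only negative return values as failure.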

[RFC 5/6]clean cpu state after hotremove CPU

2005-04-03 Thread Li Shaohua
Clean up all CPU states including its runqueue and idle thread, 
so we can use boot time code without any changes.
Note this makes /sys/devices/system/cpu/cpux/online unworkable.

Thanks,
Shaohua

---

 linux-2.6.11-root/arch/i386/kernel/cpu/common.c |   12 
 linux-2.6.11-root/arch/i386/kernel/irq.c|5 +
 linux-2.6.11-root/arch/i386/kernel/process.c|   20 +++
 linux-2.6.11-root/arch/i386/kernel/smpboot.c|   44 -
 linux-2.6.11-root/include/asm-i386/irq.h|2 
 linux-2.6.11-root/kernel/exit.c |   59 +++
 linux-2.6.11-root/kernel/sched.c|   61 +---
 7 files changed, 195 insertions(+), 8 deletions(-)

diff -puN arch/i386/kernel/process.c~cpu_state_clean arch/i386/kernel/process.c
--- linux-2.6.11/arch/i386/kernel/process.c~cpu_state_clean  2005-03-31 10:50:27.0 +0800
+++ linux-2.6.11-root/arch/i386/kernel/process.c  2005-04-04 09:07:29.172936768 +0800
@@ -144,12 +144,32 @@ static void poll_idle (void)
 
 #ifdef CONFIG_HOTPLUG_CPU
 #include 
+
+#ifdef CONFIG_STR_SMP
+extern void cpu_exit_clear(int);
+#endif
+
 /* We don't actually take CPU down, just spin without interrupts. */
 static inline void play_dead(void)
 {
+#ifdef CONFIG_STR_SMP
+   cpu_exit_clear(_smp_processor_id());
+#endif
+
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;
 
+#ifdef CONFIG_STR_SMP
+   /*
+* With physical CPU hotplug, we should halt the CPU
+* Note: release idle task struct requires the CPU doesn't
+* touch stack or anything else.
+*/
+   local_irq_disable();
+   while (1)
+   __asm__ __volatile__ ("hlt": : :"memory");
+#endif
+
/* We shouldn't have to disable interrupts while dead, but
 * some interrupts just don't seem to go away, and this makes
 * it "work" for testing purposes. */
diff -puN arch/i386/kernel/smpboot.c~cpu_state_clean arch/i386/kernel/smpboot.c
--- linux-2.6.11/arch/i386/kernel/smpboot.c~cpu_state_clean  2005-03-31 10:50:27.0 +0800
+++ linux-2.6.11-root/arch/i386/kernel/smpboot.c  2005-04-04 09:05:41.699275248 +0800
@@ -794,8 +794,13 @@ static int __devinit do_boot_cpu(int api
int timeout, cpu;
unsigned long start_eip;
unsigned short nmi_high = 0, nmi_low = 0;
+   cpumask_t   tmp_map;
 
-   cpu = ++cpucount;
+   cpus_complement(tmp_map, cpu_present_map);
+   cpu = first_cpu(tmp_map);
+   if (cpu >= NR_CPUS)
+   return -ENODEV;
+   ++cpucount;
/*
 * We can't use kernel_thread since we must avoid to
 * reschedule the child.
@@ -867,13 +872,16 @@ static int __devinit do_boot_cpu(int api
inquire_remote_apic(apicid);
}
}
-   x86_cpu_to_apicid[cpu] = apicid;
+
if (boot_error) {
/* Try to put things back the way they were before ... */
unmap_cpu_to_logical_apicid(cpu);
cpu_clear(cpu, cpu_callout_map); /* was set here (do_boot_cpu()) */
cpu_clear(cpu, cpu_initialized); /* was set by cpu_init() */
cpucount--;
+   } else {
+   x86_cpu_to_apicid[cpu] = apicid;
+   cpu_set(cpu, cpu_present_map);
}
 
/* mark "stuck" area as not stuck */
@@ -882,6 +890,37 @@ static int __devinit do_boot_cpu(int api
return boot_error;
 }
 
+#ifdef CONFIG_STR_SMP
+extern void do_exit_idle(void);
+extern void cpu_uninit(void);
+void cpu_exit_clear(int cpu)
+{
+   int sibling;
+   cpucount --;
+
+   cpu_uninit();
+
+   irq_ctx_exit(cpu);
+
+   cpu_clear(cpu, cpu_callout_map);
+   cpu_clear(cpu, cpu_callin_map);
+   cpu_clear(cpu, cpu_present_map);
+
+   x86_cpu_to_apicid[cpu] = BAD_APICID;
+
+   for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
+   cpu_clear(cpu, cpu_sibling_map[sibling]);
+   cpus_clear(cpu_sibling_map[cpu]);
+
+   phys_proc_id[cpu] = BAD_APICID;
+
+   cpu_clear(cpu, smp_commenced_mask);
+
+   unmap_cpu_to_logical_apicid(cpu);
+
+   do_exit_idle();
+}
+#endif
 static void smp_tune_scheduling (void)
 {
unsigned long cachesize;   /* kB   */
@@ -1104,6 +1143,7 @@ void __devinit smp_prepare_boot_cpu(void
 {
cpu_set(smp_processor_id(), cpu_online_map);
cpu_set(smp_processor_id(), cpu_callout_map);
+   cpu_set(smp_processor_id(), cpu_present_map);
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
diff -puN arch/i386/kernel/cpu/common.c~cpu_state_clean arch/i386/kernel/cpu/common.c
--- linux-2.6.11/arch/i386/kernel/cpu/common.c~cpu_state_clean  2005-03-31 10:50:27.0 +0800
+++ linux-2.6.11-root/arch/i386/kernel/cpu/common.c 2005-03-31 10:50:27.0 +0800
@@ -621,3 +621,15 @@ void __devinit cpu_init (void)
clear_used_math();
mxcsr_feature_mask_init();
 }
+
+#ifdef CONFIG_STR_SMP
+void 

[RFC 3/6]init call cleanup

2005-04-03 Thread Li Shaohua
Trivial patch for CPU hotplug. In the CPU identification code, the cleanup
has only been done for Intel CPUs; other CPUs need the same treatment if they
support S3 SMP.

Thanks,
Shaohua

---

 linux-2.6.11-root/arch/i386/kernel/apic.c|   14 +++
 linux-2.6.11-root/arch/i386/kernel/cpu/common.c  |   30 +++
 linux-2.6.11-root/arch/i386/kernel/cpu/intel.c   |   10 ++---
 linux-2.6.11-root/arch/i386/kernel/cpu/intel_cacheinfo.c |4 +-
 linux-2.6.11-root/arch/i386/kernel/cpu/mcheck/mce.c  |4 +-
 linux-2.6.11-root/arch/i386/kernel/cpu/mcheck/p4.c   |4 +-
 linux-2.6.11-root/arch/i386/kernel/cpu/mcheck/p5.c   |2 -
 linux-2.6.11-root/arch/i386/kernel/cpu/mcheck/p6.c   |2 -
 linux-2.6.11-root/arch/i386/kernel/process.c |2 -
 linux-2.6.11-root/arch/i386/kernel/smpboot.c |   18 -
 linux-2.6.11-root/arch/i386/kernel/timers/timer_tsc.c|2 -
 11 files changed, 46 insertions(+), 46 deletions(-)

diff -puN arch/i386/kernel/process.c~init_call_cleanup arch/i386/kernel/process.c
--- linux-2.6.11/arch/i386/kernel/process.c~init_call_cleanup   2005-03-31 10:48:40.721107104 +0800
+++ linux-2.6.11-root/arch/i386/kernel/process.c 2005-03-31 10:48:40.745103456 +0800
@@ -242,7 +242,7 @@ static void mwait_idle(void)
}
 }
 
-void __init select_idle_routine(const struct cpuinfo_x86 *c)
+void __devinit select_idle_routine(const struct cpuinfo_x86 *c)
 {
if (cpu_has(c, X86_FEATURE_MWAIT)) {
printk("monitor/mwait feature present.\n");
diff -puN arch/i386/kernel/smpboot.c~init_call_cleanup arch/i386/kernel/smpboot.c
--- linux-2.6.11/arch/i386/kernel/smpboot.c~init_call_cleanup   2005-03-31 10:48:40.722106952 +0800
+++ linux-2.6.11-root/arch/i386/kernel/smpboot.c 2005-03-31 10:48:40.746103304 +0800
@@ -59,7 +59,7 @@
 #include 
 
 /* Set if we find a B stepping CPU */
-static int __initdata smp_b_stepping;
+static int __devinitdata smp_b_stepping;
 
 /* Number of siblings per CPU package */
 int smp_num_siblings = 1;
@@ -103,7 +103,7 @@ DEFINE_PER_CPU(int, cpu_state) = { 0 };
  * has made sure it's suitably aligned.
  */
 
-static unsigned long __init setup_trampoline(void)
+static unsigned long __devinit setup_trampoline(void)
 {
memcpy(trampoline_base, trampoline_data, trampoline_end - trampoline_data);
return virt_to_phys(trampoline_base);
@@ -133,7 +133,7 @@ void __init smp_alloc_memory(void)
  * a given CPU
  */
 
-static void __init smp_store_cpu_info(int id)
+static void __devinit smp_store_cpu_info(int id)
 {
struct cpuinfo_x86 *c = cpu_data + id;
 
@@ -327,7 +327,7 @@ extern void calibrate_delay(void);
 
 static atomic_t init_deasserted;
 
-static void __init smp_callin(void)
+static void __devinit smp_callin(void)
 {
int cpuid, phys_id;
unsigned long timeout;
@@ -423,7 +423,7 @@ extern void enable_sep_cpu(void *);
 /*
  * Activate a secondary processor.
  */
-static void __init start_secondary(void *unused)
+static void __devinit start_secondary(void *unused)
 {
int siblings = 0;
int i;
@@ -486,7 +486,7 @@ static void __init start_secondary(void 
  * from the task structure
  * This function must not return.
  */
-void __init initialize_secondary(void)
+void __devinit initialize_secondary(void)
 {
/*
 * We don't actually need to load the full TSS,
@@ -600,7 +600,7 @@ static inline void __inquire_remote_apic
  * INIT, INIT, STARTUP sequence will reset the chip hard for us, and this
  * won't ... remember to clear down the APIC, etc later.
  */
-static int __init
+static int __devinit
 wakeup_secondary_cpu(int logical_apicid, unsigned long start_eip)
 {
unsigned long send_status = 0, accept_status = 0;
@@ -646,7 +646,7 @@ wakeup_secondary_cpu(int logical_apicid,
 #endif /* WAKE_SECONDARY_VIA_NMI */
 
 #ifdef WAKE_SECONDARY_VIA_INIT
-static int __init
+static int __devinit
 wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
 {
unsigned long send_status = 0, accept_status = 0;
@@ -782,7 +782,7 @@ wakeup_secondary_cpu(int phys_apicid, un
 
 extern cpumask_t cpu_initialized;
 
-static int __init do_boot_cpu(int apicid)
+static int __devinit do_boot_cpu(int apicid)
 /*
  * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
  * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
diff -puN arch/i386/kernel/cpu/common.c~init_call_cleanup arch/i386/kernel/cpu/common.c
--- linux-2.6.11/arch/i386/kernel/cpu/common.c~init_call_cleanup 2005-03-31 10:48:40.724106648 +0800
+++ linux-2.6.11-root/arch/i386/kernel/cpu/common.c 2005-03-31 10:48:40.747103152 +0800
@@ -21,9 +21,9 @@
 DEFINE_PER_CPU(struct desc_struct, cpu_gdt_table[GDT_ENTRIES]);
 EXPORT_PER_CPU_SYMBOL(cpu_gdt_table);
 
-static int cachesize_override __initdata = -1;
-static int disable_x86_fxsr __initdata = 0;
-static int disable_x86_serial_nr __initdata = 1;
+static int cachesize_override __devinitdata = 

[RFC 4/6]Add kconfig for S3 SMP

2005-04-03 Thread Li Shaohua
Add kconfig for IA32 S3 SMP.

Thanks,
Shaohua

---

 linux-2.6.11-root/kernel/power/Kconfig |7 +++
 1 files changed, 7 insertions(+)

diff -puN kernel/power/Kconfig~smp_s3_kconfig kernel/power/Kconfig
--- linux-2.6.11/kernel/power/Kconfig~smp_s3_kconfig 2005-03-31 10:49:57.156487160 +0800
+++ linux-2.6.11-root/kernel/power/Kconfig  2005-03-31 10:49:57.158486856 +0800
@@ -72,3 +72,10 @@ config PM_STD_PARTITION
  suspended image to. It will simply pick the first available swap 
  device.
 
+config STR_SMP
+   bool "Suspend to RAM SMP support (EXPERIMENTAL)"
+   depends on EXPERIMENTAL && ACPI_SLEEP && !X86_64
+   depends on HOTPLUG_CPU
+   default y
+   ---help---
+enable Suspend to RAM SMP support. Some HT systems require this.
_




Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Nick Piggin
Paul Jackson wrote:
> Ingo wrote:
>> if you create a sched-domains hierarchy (based on the SLIT tables, or in
>> whatever other way) that matches the CPU hierarchy then you'll
>> automatically get the proper distances detected.
>
> Yes - agreed.  I should push in the direction of improving the
> SN2 sched domain hierarchy.

I'd just be a bit careful about this. Your biggest systems will have
what? At least 7 or 8 domains if you're just going by the number of
hops, right? And maybe more if there is more to your topology than
just number of hops.

sched-domains firstly has a few problems even with your 2 level NUMA
domains (although I'm looking at fixing them if possible), but also
everything just has to do more work as you traverse the domains and
scan all CPUs for balancing opportunities. And it's not like the cpu
scheduler uses any sort of exact science to make choices...

Nick


[RFC 2/6]cpu_sibling_map rework

2005-04-03 Thread Li Shaohua

Make the sibling map initialization per-CPU, since CPU hotplug may change the
map at runtime. The cpuhotplug semaphore should be used to protect the map.


Thanks,
Shaohua

---

 linux-2.6.11-root/arch/i386/kernel/smpboot.c |   56 +--
 1 files changed, 29 insertions(+), 27 deletions(-)

diff -puN arch/i386/kernel/smpboot.c~sibling_map_init_cleanup arch/i386/kernel/smpboot.c
--- linux-2.6.11/arch/i386/kernel/smpboot.c~sibling_map_init_cleanup 2005-03-28 16:29:55.0 +0800
+++ linux-2.6.11-root/arch/i386/kernel/smpboot.c 2005-03-31 10:46:51.572700184 +0800
@@ -63,9 +63,12 @@ static int __initdata smp_b_stepping;
 
 /* Number of siblings per CPU package */
 int smp_num_siblings = 1;
-int phys_proc_id[NR_CPUS]; /* Package ID of each logical CPU */
+/* Package ID of each logical CPU */
+int phys_proc_id[NR_CPUS] = {[0 ... NR_CPUS-1] = BAD_APICID};
 EXPORT_SYMBOL(phys_proc_id);
 
+cpumask_t cpu_sibling_map[NR_CPUS] __cacheline_aligned;
+
 /* bitmap of online cpus */
 cpumask_t cpu_online_map;
 
@@ -422,6 +425,9 @@ extern void enable_sep_cpu(void *);
  */
 static void __init start_secondary(void *unused)
 {
+   int siblings = 0;
+   int i;
+   int self = smp_processor_id();
/*
 * Dont put anything before smp_callin(), SMP
 * booting is too fragile that we want to limit the
@@ -443,6 +449,27 @@ static void __init start_secondary(void 
 * the local TLBs too.
 */
local_flush_tlb();
+
+   /* This must be done before setting cpu_online_map */
+   if (smp_num_siblings > 1) {
+   for (i = 0; i < NR_CPUS; i++) {
+   if (!cpu_isset(i, cpu_callout_map))
+   continue;
+   if (phys_proc_id[self] == phys_proc_id[i]) {
+   siblings ++;
+   cpu_set(i, cpu_sibling_map[self]);
+   cpu_set(self, cpu_sibling_map[i]);
+   }
+   }
+   } else {
+   siblings ++;
+   cpu_set(self, cpu_sibling_map[self]);
+   }
+
+   if (siblings != smp_num_siblings)
+   printk(KERN_WARNING "WARNING: %d siblings found for CPU%d, should be %d\n", siblings, self, smp_num_siblings);
+   wmb();
+
cpu_set(smp_processor_id(), cpu_online_map);
 
/* We can take interrupts now: we're officially "up". */
@@ -893,8 +920,6 @@ static int boot_cpu_logical_apicid;
 /* Where the IO area was mapped on multiquad, always 0 otherwise */
 void *xquad_portio;
 
-cpumask_t cpu_sibling_map[NR_CPUS] __cacheline_aligned;
-
 static void __init smp_boot_cpus(unsigned int max_cpus)
 {
int apicid, cpu, bit, kicked;
@@ -1049,30 +1074,7 @@ static void __init smp_boot_cpus(unsigne
 */
for (cpu = 0; cpu < NR_CPUS; cpu++)
cpus_clear(cpu_sibling_map[cpu]);
-
-   for (cpu = 0; cpu < NR_CPUS; cpu++) {
-   int siblings = 0;
-   int i;
-   if (!cpu_isset(cpu, cpu_callout_map))
-   continue;
-
-   if (smp_num_siblings > 1) {
-   for (i = 0; i < NR_CPUS; i++) {
-   if (!cpu_isset(i, cpu_callout_map))
-   continue;
-   if (phys_proc_id[cpu] == phys_proc_id[i]) {
-   siblings++;
-   cpu_set(i, cpu_sibling_map[cpu]);
-   }
-   }
-   } else {
-   siblings++;
-   cpu_set(cpu, cpu_sibling_map[cpu]);
-   }
-
-   if (siblings != smp_num_siblings)
-   printk(KERN_WARNING "WARNING: %d siblings found for CPU%d, should be %d\n", siblings, cpu, smp_num_siblings);
-   }
+   cpu_set(0, cpu_sibling_map[0]);
 
if (nmi_watchdog == NMI_LOCAL_APIC)
check_nmi_watchdog();
_
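
The per-CPU update in the patch above can be modelled outside the kernel. The following is a minimal user-space sketch, not the kernel code: plain 32-bit masks stand in for cpumask_t, and NR_CPUS, BAD_PKG_ID, sibling_init(), sibling_cpu_up() and sibling_cpu_down() are hypothetical names chosen for illustration. It shows the idea that each CPU links itself symmetrically with the already-started CPUs of its package when it comes up, and is unlinked again when it goes away.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS    8      /* toy value, not the kernel's */
#define BAD_PKG_ID (-1)   /* stand-in for BAD_APICID */

static uint32_t sibling_map[NR_CPUS];  /* bit i set: CPU i is a sibling */
static uint32_t callout_map;           /* CPUs that have been started */
static int      phys_proc_id[NR_CPUS]; /* package ID of each logical CPU */

/* Reset the toy state: no CPU started, no package known. */
void sibling_init(void)
{
    int i;

    callout_map = 0;
    for (i = 0; i < NR_CPUS; i++) {
        sibling_map[i] = 0;
        phys_proc_id[i] = BAD_PKG_ID;
    }
}

/*
 * Called as a CPU comes up: link it with every started CPU in the same
 * package, in both directions. Returns the number of siblings found,
 * counting the CPU itself.
 */
int sibling_cpu_up(int self, int package)
{
    int i, siblings = 0;

    assert(self >= 0 && self < NR_CPUS);
    phys_proc_id[self] = package;
    callout_map |= 1u << self;

    for (i = 0; i < NR_CPUS; i++) {
        if (!(callout_map & (1u << i)))
            continue;
        if (phys_proc_id[i] == package) {
            siblings++;
            sibling_map[self] |= 1u << i;    /* self sees i */
            sibling_map[i]    |= 1u << self; /* i sees self */
        }
    }
    return siblings;
}

/* Called as a CPU goes away: drop it from its siblings' maps. */
void sibling_cpu_down(int cpu)
{
    int i;

    assert(cpu >= 0 && cpu < NR_CPUS);
    for (i = 0; i < NR_CPUS; i++)
        if (i != cpu && (sibling_map[cpu] & (1u << i)))
            sibling_map[i] &= ~(1u << cpu);
    sibling_map[cpu] = 0;
    phys_proc_id[cpu] = BAD_PKG_ID;
    callout_map &= ~(1u << cpu);
}
```

The sketch leaves out locking entirely; in the kernel the map would need the protection mentioned in the patch description.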




[RFC 0/6] S3 SMP support with physical CPU hotplug

2005-04-03 Thread Li Shaohua
Hi,
The following 6 patches add suspend-to-RAM (S3) SMP support for IA32.
For now the goal is suspend/resume on HT-based systems, but most of the
code is also useful for physical CPU hotplug.

On an SMP system, after S3 resume, the BP starts executing at the ACPI
wakeup address, just as in the UP case, while the APs are possibly
sitting in a BIOS busy loop. This looks just like the boot-time case, so
we must use a SIPI sequence to wake up the APs.

We use the CPU hotplug infrastructure. In order to reuse the SMP boot
code, we clean up all CPU state after a CPU is dead, including its
idle thread, runqueue and other per-CPU state. Since the CPU is in its
idle thread before suspend, most CPU state does not need to be saved
and restored across resume.

The S3 sequence is now:
1. hotremove all APs, put them into idle thread.
2. follow UP S3 code path.
3. warm boot all APs.
4. UP all APs.
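
The four steps above can be sketched as a toy user-space model. This is only an illustration of the ordering, not the patch's code: the state names, NR_CPUS value, s3_cycle() and the trace strings are all hypothetical, and CPU 0 is assumed to play the role of the BP.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NR_CPUS 4   /* toy value; CPU 0 plays the BP */

enum toy_cpu_state { TOY_ONLINE, TOY_DEAD, TOY_BOOTED };

static enum toy_cpu_state cpu_state[NR_CPUS];
static char s3_trace[128];   /* records the order of the steps */

/* Append "<what><cpu> " to the trace buffer. */
static void trace_step(const char *what, int cpu)
{
    size_t len = strlen(s3_trace);

    snprintf(s3_trace + len, sizeof(s3_trace) - len, "%s%d ", what, cpu);
}

/* Walk through one suspend/resume cycle in the order listed above. */
void s3_cycle(void)
{
    int cpu;

    s3_trace[0] = '\0';
    for (cpu = 0; cpu < NR_CPUS; cpu++)
        cpu_state[cpu] = TOY_ONLINE;

    /* 1. hot-remove all APs */
    for (cpu = 1; cpu < NR_CPUS; cpu++) {
        cpu_state[cpu] = TOY_DEAD;
        trace_step("down", cpu);
    }

    /* 2. follow the UP S3 path on the BP only */
    assert(cpu_state[0] == TOY_ONLINE);
    trace_step("sleep", 0);
    trace_step("wake", 0);

    /* 3. warm boot all APs, as at boot time */
    for (cpu = 1; cpu < NR_CPUS; cpu++) {
        cpu_state[cpu] = TOY_BOOTED;
        trace_step("boot", cpu);
    }

    /* 4. bring all APs back online */
    for (cpu = 1; cpu < NR_CPUS; cpu++) {
        cpu_state[cpu] = TOY_ONLINE;
        trace_step("up", cpu);
    }
}

/* Returns 1 if every CPU is online again. */
int s3_all_online(void)
{
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_state[cpu] != TOY_ONLINE)
            return 0;
    return 1;
}
```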

The patches are against 2.6.11-rc1 with Zwane's CPU hotplug patch from the
-mm tree. To test SMP S3, please don't enable the MTRR driver (it is broken
on SMP for suspend/resume). Also, please kill syslogd: there is a bug in
the suspend/resume refrigerator mechanism, which the swsusp2 refrigerator
can fix.
I'm looking forward to your comments. Thanks in advance!

Thanks,
Shaohua



[RFC 1/6]SEP initialization rework

2005-04-03 Thread Li Shaohua

Make SEP init per-CPU, so that it is hotplug safe.

Thanks,
Shaohua

---

 linux-2.6.11-root/arch/i386/kernel/smpboot.c   |6 ++
 linux-2.6.11-root/arch/i386/kernel/sysenter.c  |   10 ++
 linux-2.6.11-root/arch/i386/mach-voyager/voyager_smp.c |6 ++
 3 files changed, 18 insertions(+), 4 deletions(-)

diff -puN arch/i386/kernel/sysenter.c~sep_init_cleanup arch/i386/kernel/sysenter.c
--- linux-2.6.11/arch/i386/kernel/sysenter.c~sep_init_cleanup   2005-03-28 09:32:30.936304248 +0800
+++ linux-2.6.11-root/arch/i386/kernel/sysenter.c   2005-03-28 09:58:20.703703792 +0800
@@ -26,6 +26,11 @@ void enable_sep_cpu(void *info)
int cpu = get_cpu();
struct tss_struct *tss = &per_cpu(init_tss, cpu);
 
+   if (!boot_cpu_has(X86_FEATURE_SEP)) {
+   put_cpu();
+   return;
+   }
+
tss->ss1 = __KERNEL_CS;
tss->esp1 = sizeof(struct tss_struct) + (unsigned long) tss;
wrmsr(MSR_IA32_SYSENTER_CS, __KERNEL_CS, 0);
@@ -41,7 +46,7 @@ void enable_sep_cpu(void *info)
 extern const char vsyscall_int80_start, vsyscall_int80_end;
 extern const char vsyscall_sysenter_start, vsyscall_sysenter_end;
 
-static int __init sysenter_setup(void)
+int __init sysenter_setup(void)
 {
void *page = (void *)get_zeroed_page(GFP_ATOMIC);
 
@@ -58,8 +63,5 @@ static int __init sysenter_setup(void)
   &vsyscall_sysenter_start,
   &vsyscall_sysenter_end - &vsyscall_sysenter_start);
 
-   on_each_cpu(enable_sep_cpu, NULL, 1, 1);
return 0;
 }
-
-__initcall(sysenter_setup);
diff -puN arch/i386/kernel/smpboot.c~sep_init_cleanup arch/i386/kernel/smpboot.c
--- linux-2.6.11/arch/i386/kernel/smpboot.c~sep_init_cleanup 2005-03-28 09:33:49.972288952 +0800
+++ linux-2.6.11-root/arch/i386/kernel/smpboot.c 2005-03-28 09:46:01.814032096 +0800
@@ -415,6 +415,8 @@ static void __init smp_callin(void)
 
 static int cpucount;
 
+extern int sysenter_setup(void);
+extern void enable_sep_cpu(void *);
 /*
  * Activate a secondary processor.
  */
@@ -445,6 +447,7 @@ static void __init start_secondary(void 
 
/* We can take interrupts now: we're officially "up". */
local_irq_enable();
+   enable_sep_cpu(NULL);
 
wmb();
cpu_idle();
@@ -913,6 +916,9 @@ static void __init smp_boot_cpus(unsigne
cpus_clear(cpu_sibling_map[0]);
cpu_set(0, cpu_sibling_map[0]);
 
+   sysenter_setup();
+   enable_sep_cpu(NULL);
+
/*
 * If we couldn't find an SMP configuration at boot time,
 * get out of here now!
diff -puN arch/i386/mach-voyager/voyager_smp.c~sep_init_cleanup arch/i386/mach-voyager/voyager_smp.c
--- linux-2.6.11/arch/i386/mach-voyager/voyager_smp.c~sep_init_cleanup  2005-03-28 09:48:27.909822160 +0800
+++ linux-2.6.11-root/arch/i386/mach-voyager/voyager_smp.c  2005-03-28 09:51:37.896939728 +0800
@@ -441,6 +441,8 @@ setup_trampoline(void)
return virt_to_phys((__u8 *)trampoline_base);
 }
 
+extern void enable_sep_cpu(void *);
+extern int sysenter_setup(void);
 /* Routine initially called when a non-boot CPU is brought online */
 static void __init
 start_secondary(void *unused)
@@ -499,6 +501,7 @@ start_secondary(void *unused)
while (!cpu_isset(cpuid, smp_commenced_mask))
rep_nop();
local_irq_enable();
+   enable_sep_cpu(NULL);
 
local_flush_tlb();
 
@@ -696,6 +699,9 @@ smp_boot_cpus(void)
printk("CPU%d: ", boot_cpu_id);
print_cpu_info(&cpu_data[boot_cpu_id]);
 
+   sysenter_setup();
+   enable_sep_cpu(NULL);
+
if(is_cpu_quad()) {
/* booting on a Quad CPU */
printk("VOYAGER SMP: Boot CPU is Quad\n");
_




RE: [patch] sched: improve pinned task handling again!

2005-04-03 Thread Chen, Kenneth W
Siddha, Suresh B wrote on Friday, April 01, 2005 8:05 PM
> On Sat, Apr 02, 2005 at 01:11:20PM +1000, Nick Piggin wrote:
> > How important is this? Any application to real workloads? Even if
> > not, I agree it would be nice to improve this more. I don't know
> > if I really like this approach - I guess due to what it adds to
> > fastpaths.
>
> Ken initially observed with older kernels(2.4 kernel with Ingo's sched),
> it was happening with few hundred processes. 2.6 is not that bad and it
> improved with recent fixes. It is not very important. We want to raise
> the flag and see if we can comeup with a decent solution.

The livelock was observed with an in-house stress test suite.  The original
intent of that test is only remotely connected to stressing the kernel; it
triggered a kernel issue by accident.  We are now worried, though, that
this could be used as a DoS attack.


Nick Piggin wrote on Friday, April 01, 2005 7:11 PM
> > Now presumably if the all_pinned logic is working properly in the
> > first place, and it is correctly causing balancing to back-off, you
> > could tweak that a bit to avoid livelocks? Perhaps the all_pinned
> > case should back off faster than the usual doubling of the interval,
> > and be allowed to exceed max_interval?

This sounds plausible, though my first try did not yield the desired result
(i.e., it still hangs the kernel; I might have missed a few things here and
there).
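
For reference, the back-off Nick suggests could look roughly like the sketch below. It is a user-space illustration under stated assumptions, not the scheduler's code: struct balance_domain, the factor of 4, and PINNED_CAP_FACTOR are hypothetical choices; only the ideas of "usual doubling up to max_interval" and "all_pinned backs off faster and may exceed max_interval" come from the quoted text.

```c
#include <assert.h>

/* Hypothetical stand-in for the interval fields of a sched domain. */
struct balance_domain {
    unsigned int min_interval;
    unsigned int max_interval;
    unsigned int balance_interval;
};

/* Assumed hard ceiling for the pinned case, as a multiple of max_interval. */
#define PINNED_CAP_FACTOR 8

/* Grow the balance interval after a balancing attempt moved nothing. */
void backoff_after_failed_balance(struct balance_domain *sd, int all_pinned)
{
    assert(sd->max_interval >= sd->min_interval);

    if (all_pinned) {
        /* Back off faster than doubling, and allow exceeding max_interval. */
        unsigned int cap = sd->max_interval * PINNED_CAP_FACTOR;

        sd->balance_interval *= 4;
        if (sd->balance_interval > cap)
            sd->balance_interval = cap;
    } else {
        /* Usual case: double, clamped to max_interval. */
        sd->balance_interval *= 2;
        if (sd->balance_interval > sd->max_interval)
            sd->balance_interval = sd->max_interval;
    }
}

/* A successful balance returns the domain to its shortest interval. */
void reset_after_successful_balance(struct balance_domain *sd)
{
    sd->balance_interval = sd->min_interval;
}
```

Whether a bounded ceiling like this is enough to avoid the livelock is exactly the open question in this thread; the sketch only makes the proposal concrete.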




linux-2.4.30 released

2005-04-03 Thread Marcelo Tosatti
final:

- 2.4.30-rc4 was released as 2.4.30 with no changes.



Summary of changes from v2.4.30-rc3 to v2.4.30-rc4


Herbert Xu:
  o [NETLINK] Fix bogus mc_list deletion

Marcelo Tosatti:
  o Cset exclude: [EMAIL PROTECTED]|ChangeSet|20050226095914|25750
  o Change VERSION to 2.4.30-rc4

Willy Tarreau:
  o Keith Owens: modutils >= 2.4.14 is required for 
MODVERSIONS+EXPORT_SYMBOL_GPL() combination


Summary of changes from v2.4.30-rc2 to v2.4.30-rc3


Marcelo Tosatti:
  o Andreas Arens: Fix deadly mismerge of binfmt_elf DoS fix
  o Change VERSION to 2.4.30-rc3


Summary of changes from v2.4.30-rc1 to v2.4.30-rc2


:
  o [TG3]: Add missing CHIPREV_5750_{A,B}X defines
  o [TG3]: Missing counter bump in tigon3_4gb_hwbug_workaround()
  o [TG3]: Update driver version and reldate

:
  o eepro100: fix module parameter description typo

:
  o CAN-2005-0400: ext2 mkdir() directory entry random kernel memory leak

:
  o fs/hpfs/*: fix HPFS support under 64-bit kernel

:
  o [NETFILTER]: Fix another DECLARE_MUTEX in header file

Bjorn Helgaas:
  o ia64: force all kernel sections into one and the same segment
  o ia64: round iommu allocations to power-of-two sizes
  o ia64: fix perfmon typo in /proc/pal/CPU*/processor_info w.r.t. BERR
  o ia64: add missing syscall-slot
  o ia64: Update defconfigs

Chris Wright:
  o isofs: Some more defensive checks to keep corrupt isofs images from 
corrupting memory/oopsing

Dave Kleikamp:
  o JFS: remove aops from directory inodes

David Mosberger:
  o Fix pte_modify() bug which allowed mprotect() to change too many bits
  o ia64: Fix _PAGE_CHG_MASK so PROT_NONE works again.  Caught by Linus

Greg Banks:
  o link_path_walk refcount problem allows umount of active filesystem

Herbert Xu:
  o [CRYPTO]: Mark myself as co-maintainer
  o [NETLINK]: Fix multicast bind/autobind race
  o CAN-2005-0794: Potential DOS in load_elf_library

Keith Owens:
  o [IA64] Sanity check unw_unwind_to_user
  o [IA64] Tighten up unw_unwind_to_user check

Linus Torvalds:
  o isofs: Handle corupted rock-ridge info slightly better
  o isofs: more "corrupted iso image" error cases

Marcel Holtmann:
  o CAN-2005-0750: Fix af_bluetooth range checking bug, discovered by Ilja van 
Sprundel <[EMAIL PROTECTED]>

Marcelo Tosatti:
  o Change VERSION to 2.4.30-rc2

Michael Chan:
  o [TG3]: Add 5705_plus flag
  o [TG3]: Flush status block in tg3_interrupt()
  o [TG3]: Add unstable PLL workaround for 5750
  o [TG3]: Fix jumbo frames phy settings
  o [TG3]: Fix ethtool set functions
  o [TG3]: Add Broadcom copyright

Neil Brown:
  o nlm: fix f_count leak
  o [PATCH md: allow degraded raid1 array to resync after an unclean shutdown

Pablo Neira:
  o [NETFILTER]: Fix DECLARE_MUTEX in header file

Patrick McHardy:
  o [NETFILTER]: fix return values of ipt_recent checkentry
  o [NETFILTER]: Fix ip_ct_selective_cleanup(), and rename 
ip_ct_iterate_cleanup()
  o [NETFILTER]: Fix cleanup in ipt_recent
  o [NETFILTER]: Fix ip6tables ESP matching with "-p all"
  o [NETFILTER]: Fix refreshing of overlapping expectations
  o [NETFILTER]: Fix IP/TCP option logging
  o [TUN]: Fix check for underflow

Pete Zaitcev:
  o USB: fix oops in serial_write
  o USB: Fix baud selection in mct_u232

Simon Horman:
  o [IPVS]: Fix comment typos
  o Backport v2.6 ATM copy-to-user signedness fix
  o earlyquirk.o is needed for CONFIG_ACPI_BOOT

Stephen Hemminger:
  o [TCP]: BIC not binary searching correctly

Wensong Zhang:
  o [IPVS]: Update mark->cw in the WRR scheduler while service is updated

Yanmin Zhang:
  o [IA64] clean up ptrace corner cases



Summary of changes from v2.4.30-pre3 to v2.4.30-rc1


:
  o [SPARC32]: Fix build dependencies for vmlinux.o
  o [SPARC32]: Fix sun4d sbus and current handling
  o [SPARC32]: sun4d needs ZS_WSYNC() zilog reg flushing too

:
  o [SPARC64]: Fix semtimedop compat ipc code

:
  o Fix softdog no reboot on unexpected close

Alan Hourihane:
  o agpgart Intel i915GM ID's and tweaks

Andrea Arcangeli:
  o Write throttling should not take free highmem into account

Chris Wedgwood:
  o early boot code references check_acpi_pci()

Linus Torvalds:
  o Workaround possible pty line discipline race

Marcelo Tosatti:
  o Andrea Arcangeli: get_user_pages() shall not grab PG_reserved pages
  o Paul Mackerras: Remote Linux DoS on ppp servers (CAN-2005-0384)
  o Change VERSION to 2.4.30-rc1

Roland McGrath:
  o i386/x86_64 fpu: fix x87 tag word simulation using fxsave

Solar Designer:
  o Enable gcc warnings for vsprintf/vsnprintf with "format" attribute

Stephen Hemminger:
  o TCP BIC not binary searching correctly

Willy Tarreau:
  o acpi.h needs 


Summary of changes from v2.4.30-pre2 to v2.4.30-pre3


:
  o [SPARC64]: Tomatillo PCI controller bug fixes
  o [TIGON3]: Do not touch NIC_SRAM_FIRMWARE_MBOX 

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-03 Thread Kevin Puetz
Linus Torvalds wrote:

> 
> 
> On Fri, 1 Apr 2005, Chen, Kenneth W wrote:
>> 
>> Paul, you definitely want to check this out on your large numa box.  I
>> booted a kernel with this patch on a 32-way numa box and it took a long
>>  time to produce the cost matrix.
> 
> Is there anything fundamentally wrong with the notion of just initializing
> the cost matrix to something that isn't completely wrong at bootup, and
> just letting user space fill it in?

Wouldn't getting rescheduled (and thus having another program trash the
cache on you) really mess up the data collection though? I suppose by
spawning off threads, each with a fixed affinity and SCHED_FIFO one could
hang onto the CPU to collect the data. But then it's not (a lot) different
than doing it in-kernel.
 
> Then you couple that with a program that can do so automatically (ie
> move the in-kernel heuristics into user-land), and something that can
> re-load it on demand.

This part seems sensible though :-)

> Voila - you have something potentially expensive that you run once, and
> then you have a matrix that can be edited by the sysadmin later and just
> re-loaded at each boot.. That sounds pretty optimal, especially in the
> sense that it allows the sysadmin to tweak things depending on the use of
> the box if he really wants to.
> 
> Hmm? Or am I just totally on crack?
> 
> Linus
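
To make the "default at boot, reload from user space" idea concrete, here is a minimal user-space sketch. Everything in it is an assumption made for illustration: the "src dst cost" text format, the nanosecond unit, the DEFAULT_COST_NS value, the symmetric treatment of the matrix, and the function names. It is not an existing kernel interface.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS 4                        /* toy value */
#define DEFAULT_COST_NS 10000000ULL      /* hypothetical "not completely wrong" default */

static unsigned long long migration_cost[NR_CPUS][NR_CPUS];

/* Boot-time step: fill the matrix with a rough default. */
void cost_matrix_init(void)
{
    int i, j;

    for (i = 0; i < NR_CPUS; i++)
        for (j = 0; j < NR_CPUS; j++)
            migration_cost[i][j] = (i == j) ? 0 : DEFAULT_COST_NS;
}

/*
 * Later step: apply sysadmin-supplied overrides, one "src dst cost"
 * triple per line. Out-of-range or diagonal entries are skipped.
 * Returns the number of entries applied; parsing stops at the first
 * token that is not a number.
 */
int cost_matrix_load(const char *text)
{
    int applied = 0, src, dst, used;
    unsigned long long cost;

    while (sscanf(text, " %d %d %llu%n", &src, &dst, &cost, &used) == 3) {
        text += used;
        if (src < 0 || src >= NR_CPUS || dst < 0 || dst >= NR_CPUS || src == dst)
            continue;
        migration_cost[src][dst] = cost;
        migration_cost[dst][src] = cost;   /* assumed symmetric */
        applied++;
    }
    return applied;
}
```

The measurement program itself, which is the part Kevin's reply is worried about, is deliberately left out of this sketch.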




Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-03 Thread Andrew Morton
Mingming Cao <[EMAIL PROTECTED]> wrote:
>
> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
>  2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
>  hours the system hit OOM, and OOM keep killing processes one by one. I
>  could reproduce this problem very constantly on a 2 way PIII 700MHZ with
>  512MB RAM. Also the problem could be reproduced on running the same test
>  on reiser fs.
> 
>  The fsx command is:
> 
>  ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &


This ext3 bug goes all the way back to 2.6.6.

I don't know yet why you saw problems with reiser3 and I'm pretty sure I
saw problems with ext2.  More testing is needed there.

Without the below patch it's possible to make ext3 leak at around a
megabyte per minute by arranging for the fs to run a commit every 50
milliseconds, btw.

(Stephen, please review...)



This fixes the lots-of-fsx-linux-instances-cause-a-slow-leak bug.

It's been there since 2.6.6, caused by:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.5/2.6.5-mm4/broken-out/jbd-move-locked-buffers.patch

That patch moves under-writeout ordered-data buffers onto a separate journal
list during commit.  It took out the old code which was based on a single
list.

The old code (necessarily) had logic which would restart I/O against buffers
which had been redirtied while they were on the committing transaction's
t_sync_datalist list.  The new code only writes buffers once, ignoring
redirtyings by a later transaction, which is good.

But over on the truncate side of things, in journal_unmap_buffer(), we're
treating buffers on the t_locked_list as inviolable things which belong to the
committing transaction, and we just leave them alone during concurrent
truncate-vs-commit.

The net effect is that when truncate tries to invalidate a page whose buffers
are on t_locked_list and have been redirtied, journal_unmap_buffer() just
leaves those buffers alone.  truncate will remove the page from its mapping
and we end up with an anonymous clean page with dirty buffers, which is an
illegal state for a page.  The JBD commit will not clean those buffers as they
are removed from t_locked_list.  The VM (try_to_free_buffers) cannot reclaim
these pages.

The patch teaches journal_unmap_buffer() about buffers which are on the
committing transaction's t_locked_list.  These buffers have been written and
I/O has completed.  We can take them off the transaction and undirty them
within the context of journal_invalidatepage()->journal_unmap_buffer().

Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 25-akpm/fs/jbd/transaction.c |   13 +++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff -puN fs/jbd/transaction.c~jbd-dirty-buffer-leak-fix fs/jbd/transaction.c
--- 25/fs/jbd/transaction.c~jbd-dirty-buffer-leak-fix   2005-04-03 15:12:12.0 -0700
+++ 25-akpm/fs/jbd/transaction.c2005-04-03 15:14:40.0 -0700
@@ -1812,7 +1812,17 @@ static int journal_unmap_buffer(journal_
}
}
} else if (transaction == journal->j_committing_transaction) {
-   /* If it is committing, we simply cannot touch it.  We
+   if (jh->b_jlist == BJ_Locked) {
+   /*
+* The buffer is on the committing transaction's locked
+* list.  We have the buffer locked, so I/O has
+* completed.  So we can nail the buffer now.
+*/
+   may_free = __dispose_buffer(jh, transaction);
+   goto zap_buffer;
+   }
+   /*
+* If it is committing, we simply cannot touch it.  We
 * can remove it's next_transaction pointer from the
 * running transaction if that is set, but nothing
 * else. */
@@ -1887,7 +1897,6 @@ int journal_invalidatepage(journal_t *jo
unsigned int next_off = curr_off + bh->b_size;
next = bh->b_this_page;
 
-   /* AKPM: doing lock_buffer here may be overly paranoid */
if (offset <= curr_off) {
/* This block is wholly outside the truncation point */
lock_buffer(bh);
_
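
The decision the patch changes can be illustrated with a toy model. This is a sketch under loud assumptions, not JBD code: struct toy_buffer, the TOY_BJ_* list names and toy_unmap_buffer() are hypothetical, and the `patched` flag merely switches between the old and new behaviour described in the changelog above.

```c
#include <assert.h>

/* Toy stand-ins for the journal lists a buffer can be on. */
enum toy_jlist { TOY_BJ_NONE, TOY_BJ_SYNCDATA, TOY_BJ_LOCKED };

struct toy_buffer {
    int dirty;             /* buffer was redirtied after writeout */
    enum toy_jlist list;   /* which list it currently sits on */
    int on_committing;     /* belongs to the committing transaction */
};

/*
 * Toy version of the truncate-side decision. Returns 1 if the buffer
 * was cleaned and may be freed, 0 if it was left alone.
 */
int toy_unmap_buffer(struct toy_buffer *bh, int patched)
{
    if (bh->on_committing) {
        if (patched && bh->list == TOY_BJ_LOCKED) {
            /* I/O has completed: take it off the transaction, undirty it. */
            bh->on_committing = 0;
            bh->list = TOY_BJ_NONE;
            bh->dirty = 0;
            return 1;
        }
        /* Old behaviour: a committing buffer is never touched. */
        return 0;
    }

    /* Not on the committing transaction: safe to clean. */
    assert(bh->list != TOY_BJ_LOCKED);
    bh->dirty = 0;
    return 1;
}
```

With `patched` set to 0, a redirtied buffer on the locked list stays dirty after truncate, which is the leaked state the changelog describes; with `patched` set to 1 it is cleaned and can be reclaimed.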



RE: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Chen, Kenneth W
Ingo Molnar wrote on Sunday, April 03, 2005 7:30 AM
> how close are these numbers to the real worst-case migration costs on
> that box?

I booted your latest patch on a 4-way SMP box (1.5 GHz, 9MB ia64). This
is what it produces.  I think the estimate is excellent.

[00]:     -     10.4(0) 10.4(0) 10.4(0)
[01]:  10.4(0)    -     10.4(0) 10.4(0)
[02]:  10.4(0) 10.4(0)    -     10.4(0)
[03]:  10.4(0) 10.4(0) 10.4(0)    -
-
cacheflush times [1]: 10.4 (10448800)


One other minor thing: when booting a NUMA kernel on an SMP box, there is
a NUMA scheduler domain at the top level, and cache_hot_time will be set
to 0 in that case.  Though this will be a moot point with the recent
patch from Suresh Siddha for removing the extra bogus scheduler
domains.  http://marc.theaimsgroup.com/?t=11124020801=1=2




out of vmalloc space - but vmalloc parameter does not allow boot

2005-04-03 Thread Ranko Zivojnovic
Greetings,

(Please CC responses as I am not subscribed to the list. Thanks!)

I've recently started experiencing the following problem on one of my
Linux servers:

allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x2d0)
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x2d0)

...and so on until it completely locks up and needs reboot.

From what I can tell from fs/xfs/linux-2.6/kmem.c, the XFS message is
just another confirmation that the machine has run out of vmalloc space.

The machine has 4GB of RAM and is running 2.6.11.5 kernel.

I have tried to specify vmalloc=256m to start with, but no luck - the
machine does not even want to boot. It panics with:
EXT2-fs: unable to read superblock
isofs_fill_super: bread failed, dev=md0, iso_blknum=16, block=32
XFS: SB read failed
VFS: Cannot open root device "md0" or unknown-block(9,0)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(9,0)

If I remove the "vmalloc" parameter, it boots just fine but then after
some hours, when the load on the server goes up, I get the above
"request" to increase vmalloc. Being desperate to find the way out, I
have also tried increasing the hardcoded value in arch/i386/mm/init.c,
but ended up with the same effect as with the parameter - panic on boot.

/proc/meminfo says (while the system is up and running):
MemTotal:        4073244 kB
MemFree:          144356 kB
Buffers:            1184 kB
Cached:          2735576 kB
SwapCached:            0 kB
Active:           921804 kB
Inactive:        2408800 kB
HighTotal:       3193792 kB
HighFree:            896 kB
LowTotal:         879452 kB
LowFree:          143460 kB
SwapTotal:       7341600 kB
SwapFree:        7341600 kB
Dirty:             50940 kB
Writeback:             0 kB
Mapped:           613172 kB
Slab:             498936 kB
CommitLimit:     9378220 kB
Committed_AS:     736392 kB
PageTables:         1760 kB
VmallocTotal:     114680 kB
VmallocUsed:       88996 kB
VmallocChunk:      20988 kB
HugePages_Total:       0
HugePages_Free:        0
Hugepagesize:       4096 kB
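
For readers who want to watch the vmalloc headroom over time, here is a minimal userspace sketch that pulls the two relevant fields out of meminfo-style text. The helper names meminfo_kb() and vmalloc_free_kb() are made up for this illustration and are not kernel or libc interfaces; the sketch only assumes the "Key: value kB" layout shown in the listing.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find "Key:   <value> kB" in meminfo-style text.
 * Returns the value in kB, or -1 if the key is missing.
 * (Hypothetical helper, written for this illustration.)
 */
static long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		long val;

		/* Match the whole key, followed directly by ':' */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':' &&
		    sscanf(line + klen + 1, "%ld", &val) == 1)
			return val;
		/* Advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Unused vmalloc address space in kB, or -1 if either field is missing. */
static long vmalloc_free_kb(const char *text)
{
	long total = meminfo_kb(text, "VmallocTotal");
	long used = meminfo_kb(text, "VmallocUsed");

	if (total < 0 || used < 0)
		return -1;
	return total - used;
}
```

Fed the three Vmalloc lines from the listing, vmalloc_free_kb() gives 114680 - 88996 = 25684 kB unused, while VmallocChunk reports the largest contiguous free area as 20988 kB.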

I have also tried changing the following parameters - but no luck
either:
vm.lower_zone_protection = 900
vm.min_free_kbytes = 3
vm.vfs_cache_pressure = 150

Please help! What am I doing wrong?

Also, if this question does not belong here - please point to the right
direction :).

Best regards,

Ranko





RE: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Chen, Kenneth W
Ingo Molnar wrote on Saturday, April 02, 2005 11:04 PM
> the default on ia64 (32MB) was way too large and caused the search to
> start from 64MB. That can take a _long_ time.
>
> i've attached a new patch with your changes included, and a couple of
> new things added:
>
>  - removed the 32MB max_cache_size hack from ia64 - it should now fall
>back to the default 5MB and do a search from 10MB downwards. This
>should speed up the search.

The cache size information on ia64 is already available at the finger tip.
Here is a patch that I whipped up to set max_cache_size for ia64.


--- linux-2.6.12-rc1/arch/ia64/kernel/setup.c.orig  2005-04-03 17:14:40.0 -0700
+++ linux-2.6.12-rc1/arch/ia64/kernel/setup.c   2005-04-03 17:55:46.0 -0700
@@ -561,6 +561,7 @@ static void
 get_max_cacheline_size (void)
 {
 	unsigned long line_size, max = 1;
+	unsigned int cache_size = 0;
 	u64 l, levels, unique_caches;
 	pal_cache_config_info_t cci;
 	s64 status;
@@ -585,8 +586,11 @@ get_max_cacheline_size (void)
 		line_size = 1 << cci.pcci_line_size;
 		if (line_size > max)
 			max = line_size;
+		if (cache_size < cci.pcci_cache_size)
+			cache_size = cci.pcci_cache_size;
 	}
   out:
+	max_cache_size = max(max_cache_size, cache_size);
 	if (max > ia64_max_cacheline_size)
 		ia64_max_cacheline_size = max;
 }
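
The idea in the patch, tracking the largest cache size alongside the largest line size while walking the cache levels, can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct cache_info, struct cache_summary and scan_caches() are hypothetical stand-ins for the firmware query, not kernel code, and any sample values fed to it are made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for one cache level as reported by firmware. */
struct cache_info {
	unsigned int line_size_log2;	/* log2 of the line size in bytes */
	unsigned int cache_size;	/* total size in bytes */
};

/* Result of one pass over all cache levels. */
struct cache_summary {
	unsigned long max_line_size;
	unsigned int max_cache_size;
};

/* Walk all cache levels once and keep the largest line size and size seen. */
static struct cache_summary scan_caches(const struct cache_info *c, size_t n)
{
	struct cache_summary s = { 1, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long line_size = 1UL << c[i].line_size_log2;

		if (line_size > s.max_line_size)
			s.max_line_size = line_size;
		if (s.max_cache_size < c[i].cache_size)
			s.max_cache_size = c[i].cache_size;
	}
	return s;
}
```

Doing both comparisons in the same loop means the cache levels are queried only once, which is why the patch can piggyback on the existing line-size scan.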





Re: 2.6.12-rc1-mm4 crash while mounting a reiserfs3 filesystem

2005-04-03 Thread Mathieu Bérard
Andrew Morton wrote:

> It appears that we might have jumped from flagged_taskfile into something
> at 0xdf6d1211, which is rather odd.
>
> You have two different low-level IDE drivers configured.  Which one is
> driving that filesystem?  VIA or Promise?

hdg is connected to my Promise PDC20268 (Ultra100 TX2)
--
Mathieu Bérard


Re: Use of C99 int types

2005-04-03 Thread Adrian Bunk
On Mon, Apr 04, 2005 at 12:48:04AM +0200, Dag Arne Osvik wrote:
> Andreas Schwab wrote:
> 
> >Dag Arne Osvik <[EMAIL PROTECTED]> writes:
> >
> > 
> >
> >>Yes, but wouldn't it be much better to avoid code like the following, 
> >>which may also be wrong (in terms of speed)?
> >>
> >>#ifdef CONFIG_64BIT  // or maybe CONFIG_X86_64?
> >>#define fast_u32 u64
> >>#else
> >>#define fast_u32 u32
> >>#endif
> >>   
> >>
> >
> >How about using just unsigned long instead?
> > 
> >
> 
> unsigned long happens to coincide with uint_fast32_t for x86 and x86-64, 
> but there's no guarantee that it will on other architectures.
>...

The stdint.h shipped with glibc says:

<--  snip  -->

/* Unsigned.  */
typedef unsigned char           uint_fast8_t;
#if __WORDSIZE == 64
typedef unsigned long int       uint_fast16_t;
typedef unsigned long int       uint_fast32_t;
typedef unsigned long int       uint_fast64_t;
#else
typedef unsigned int            uint_fast16_t;
typedef unsigned int            uint_fast32_t;
__extension__
typedef unsigned long long int  uint_fast64_t;
#endif

<--  snip  -->

>  Dag Arne

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [SCSI] Driver broken in 2.6.x?

2005-04-03 Thread Adrian Bunk
On Sun, Apr 03, 2005 at 07:58:10PM +0200, |TEcHNO| wrote:
> Hi,
> 
> As told, I tested it w/o nvidia module loaded, here's what I found:
> 1. It now doesn't hang on scanning for devices.
> 2. It now hangs on acquiring preview, logs will follow.
>...
> Apr  3 15:54:27 techno kernel: Unable to handle kernel NULL pointer
> dereference at virtual address 014c
> Apr  3 15:54:27 techno kernel:  printing eip:
> Apr  3 15:54:27 techno kernel: c03d8143
> Apr  3 15:54:27 techno kernel: *pde = 
> Apr  3 15:54:27 techno kernel: Oops:  [#1]
> Apr  3 15:54:27 techno kernel: PREEMPT
> Apr  3 15:54:27 techno kernel: Modules linked in: nvidia
>...

Still with nvidia.

An Oops with the nvidia module loaded since the last boot is simply not 
debuggable for anyone except nvidia.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



RE: Isn't there race issue during fput() and the dentry_open()?

2005-04-03 Thread Tomita, Haruo
Hi Viro,

Thank you for your reply.

> > Stack traceback for pid 2130
> > 0xf717f1b0  21301   1   0   R   0xf717f400 *irqbalance
> > ESP EIP Function (args)
> > 0xf75bfe38 0xc02d04b2 _spin_lock+0x2e (0xf7441a80)
> > 0xf75bff34 0xc015667c file_move+0x14 (0xf63080e4, 0xf75bff58, 0x0, 0xf74bf000)
> > 0xf75bff40 0xc0154e37 dentry_open+0xb9 (0xf63080e4, 0xf7f5ad80, 0xc02d00e6, 0x100100, 0x246)
> > 0xf75bff58 0xc0154d78 filp_open+0x36
> > 0xf75bffb4 0xc0155079 sys_open+0x31
> > 0xf75bffc4 0xc02d196f syscall_call+0x7
> >
> > The patch was made. Is this patch right?
> >
> > diff -urN linux-2.6.12-rc1.orig/fs/file_table.c linux-2.6.12-rc1/fs/file_table.c
> > --- linux-2.6.12-rc1.orig/fs/file_table.c   2005-03-02 16:37:47.0 +0900
> > +++ linux-2.6.12-rc1/fs/file_table.c   2005-03-31 17:50:46.323999320 +0900
> > @@ -209,11 +209,11 @@
> >
> >  void file_kill(struct file *file)
> >  {
> > +	file_list_lock();
> >  	if (!list_empty(&file->f_list)) {
> > -		file_list_lock();
> >  		list_del_init(&file->f_list);
> > -		file_list_unlock();
> >  	}
> > +	file_list_unlock();
> >  }
>
> This is absolutely useless.  What are you trying to protect and how the
> hell could keeping a lock around that check prevent any sort of deadlock?

I think it is true that file_list_lock(), which is called from
file_move(), cannot be acquired.

> Besides, who could possibly call fput() on struct file allocated by
> dentry_open() and do that before the latter returns a reference to
> that struct file?

Indeed. Is there a good method of debugging this issue?
Checking the source, I found no doubtful place except file_kill().
Perhaps I should call this not a race between fput() and dentry_open()
but a deadlock on file_list_lock().
--
Haruo

Re: Can't use SYSFS for "Proprietry" driver modules !!!.

2005-04-03 Thread Mark Lord
Zan Lynx wrote:
> That does not really make sense, as the driver model code could be used
> for ndiswrapper, for example.  That would not make the Windows net
> drivers derived code of the Linux kernel.  ndiswrapper, yes it would be.
> Binary driver blobs, no.

The Windows net drivers are not (we believe) compiled using GPL'd
header files and their included macros, asm code, etc.

Probably all Linux binary drivers *are* compiled using GPL'd header files,
and thus are themselves subject to the GPL.

Cheers


Re: Logitech MX1000 Horizontal Scrolling

2005-04-03 Thread Jeremy Nickurak
On Sun, 2005-04-03 at 18:01 +0200, Juergen Kreileder wrote:
> Esben Stien <[EMAIL PROTECTED]> writes:
> 
> > Jeremy Nickurak <[EMAIL PROTECTED]> writes:
> >
> >> I'm playing with this under 2.6.11.4 
> >
> > I got 2.6.12-rc1 
> >
> >> The vertical cruise control buttons work properly, with the
> >> exception of the extra button press.
> >
> > Yup, nice, I see the same
> 
> Same here.
> 
> >> But the horizontal buttons are mapping to 6/7 as non-repeat
> >> buttons, and adding simultaneously the 4/5 events auto-repeated for
> >> as long as the button is down. That is to say, pressing the
> >> horizontal scroll in a 2d scrolling area will scroll *diagonally*
> >> one step, then vertically until the button is released.
> >
> > Yup, seeing exactly the same here. 
> 
> Horizontal scrolling works fine for me.  I just get repeated 6/7
> events, nothing else.
> 
> I'm using the configuration described at:
> http://blog.blackdown.de/2005/04/03/logitech-mx1000-configuration/

Well that's a big step up. My horizontal scrolling is now working
perfectly. I had never in my life seen a ZAxisMapping line with 4
buttons, but it seems to do the trick, and it's even worked its way into
the mouse man page since the last time I remember seeing it. (Running
current Xorg here, can't speak for XFree86 users)

Now it's just the vertical scroller issue.

-
Jeremy Nickurak




Re: Oops in set_spdif_output in i810_audio

2005-04-03 Thread SuD (Alex)
Triffid Hunter wrote:
> try turning off your internal modem in bios until someone works out
> whats going on here

* It's one of those modern BIOSes; there is no way of configuring that.
* It seems to me that it detects only one card with only one codec, which is
the sound card (sound works if I avoid the null pointer oops). So one of
the problems is the wrong detection.
Googling, I found that jgarzik already has a patch for this
(ac97_codec.c:158):
+{0x43585430, "CXT48",_ops,
AC97_DELUDED_MODEM },

* About fixing i810_probe/i810_ac97_init, the safest and simplest
solution may be changing "continue" to "break" (i810_audio.c:3089),
i.e. give up scanning for sound codecs when the first modem is found. I
don't know if that would prevent any real-world device from working, but the
alternative is to add a lot of checks everywhere.
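
The trade-off between "continue" and "break" in a codec scan can be shown with a toy loop. This is not the i810_audio code: the enum, the slot array and toy_count_audio() are invented for illustration, and the real driver does far more per slot.

```c
#include <stddef.h>

/* Toy classification of what sits in a codec slot (illustrative only). */
enum toy_codec_kind { TOY_AUDIO, TOY_MODEM };

/*
 * Count the audio codecs a probe loop would register.
 * stop_at_modem = 0 models "continue" (skip the modem, keep scanning);
 * stop_at_modem = 1 models "break" (give up at the first modem).
 */
static size_t toy_count_audio(const enum toy_codec_kind *slots, size_t n,
			      int stop_at_modem)
{
	size_t i, found = 0;

	for (i = 0; i < n; i++) {
		if (slots[i] == TOY_MODEM) {
			if (stop_at_modem)
				break;
			continue;
		}
		found++;
	}
	return found;
}
```

With slots ordered audio, modem, audio, the "continue" variant registers two audio codecs and the "break" variant only one, which is the kind of device the question about real-world hardware is worried about.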

See you.


Re: Use of C99 int types

2005-04-03 Thread Dag Arne Osvik
Grzegorz Kulewski wrote:
> On Mon, 4 Apr 2005, Dag Arne Osvik wrote:
> > (...) And, at least in theory, long may even provide less than 32 bits.
>
> Are you sure?
>
> My copy of famous C book by B. W. Kernighan and D. Ritchie says that
>
> sizeof(short) <= sizeof(int) <= sizeof(long)
>
> and
>
> sizeof(short) >= 16,
> sizeof(int) >= 16,
> sizeof(long) >= 32.
>
> The book is about ANSI C not C99 but I think this is still valid.
>
> Am I wrong?

No, I just looked it up (section 2.2), and you're right.
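
As a side note, the lower bounds quoted from the book are widths in bits, while sizeof counts bytes, so a direct check has to multiply by CHAR_BIT. A small sketch follows; BITS_OF and c_minimum_widths_hold() are names made up for this example, and the sizeof ordering is checked as the quoted text states it.

```c
#include <limits.h>
#include <stddef.h>

/* Width of a type in bits: sizeof counts bytes, CHAR_BIT is bits per byte. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Returns 1 if the minimum widths quoted above hold on this target. */
static int c_minimum_widths_hold(void)
{
	return BITS_OF(short) >= 16 &&
	       BITS_OF(int) >= 16 &&
	       BITS_OF(long) >= 32 &&
	       sizeof(short) <= sizeof(int) &&
	       sizeof(int) <= sizeof(long);
}
```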
--
 Dag Arne


Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ingo wrote:
> how close are these numbers to the real worst-case migration costs on 
> that box? What are the cache sizes and what is their hierarchies?
>  ...
> is there any workload that shows the same scheduling related performance 
> regressions, other than Ken's $1m+ benchmark kit?

I'll have to talk to some people Monday and get back to you.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: Use of C99 int types

2005-04-03 Thread Grzegorz Kulewski
On Mon, 4 Apr 2005, Dag Arne Osvik wrote:
> (...) And, at least in theory, long may even provide less than 32 bits.

Are you sure?

My copy of the famous C book by B. W. Kernighan and D. Ritchie says that

sizeof(short) <= sizeof(int) <= sizeof(long)

and

sizeof(short) >= 16,
sizeof(int) >= 16,
sizeof(long) >= 32.

The book is about ANSI C not C99 but I think this is still valid.

Am I wrong?

Grzegorz Kulewski


Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ingo wrote:
> if you create a sched-domains hierarchy (based on the SLIT tables, or in 
> whatever other way) that matches the CPU hierarchy then you'll 
> automatically get the proper distances detected.

Yes - agreed.  I should push in the direction of improving the
SN2 sched domain hierarchy.


It would be a good idea to rename 'cpu_distance()' to something more
specific, like 'cpu_dist_ndx()', and reserve the generic name
'cpu_distance()' for later use to return a scaled integer distance,
rather like 'node_distance()' does now.  For example, 'cpu_distance()'
might, someday, return integer values such as:

40  217  252  253

as are displayed (in tenths) in the debug line:

---------------------
cacheflush times [4]: 4.0 (4080540) 21.7 (21781380) 25.2 (25259428) 25.3 (25372682)
---------------------

(that is, the integer (long)cost / 100000 - one less zero).
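
The scaling can be checked with a few lines of C. This sketch assumes the parenthesised numbers in the debug line are costs in nanoseconds, so that dividing by 100000 yields tenths of a millisecond; cost_to_tenths_of_ms() is a hypothetical name, not a function from the patch.

```c
/* Scale a migration cost in nanoseconds to tenths of a millisecond. */
static long cost_to_tenths_of_ms(long long cost_ns)
{
	/* Integer division truncates, matching the 4.0 / 21.7 / 25.2 / 25.3 display */
	return (long)(cost_ns / 100000);
}
```

Applied to the four costs in the debug line, this gives 40, 217, 252 and 253.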

I don't know that we have any use, yet, for this 'cpu_distance()' as a
scaled integer value.  But I'd be more comfortable reserving that name
for that purpose.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: sched /HT processor

2005-04-03 Thread Steven Rostedt
On Mon, 2005-04-04 at 04:22 +0530, Arun Srinivas wrote:
> Thanks. yes, a reschedule may not take place after a ms, if the currently 
> running task cannot be preempted by another task.
> 
> (1) But, can a reschedule happen within a millisec (or once a process is 
> scheduled can schedule() be called before the next millisec.) ?
> 

Yes.  For example: a high priority task may be waiting for some IO to
come in. Right after the normal timer interrupt scheduled another task,
the IO may come in and wake the high priority process up. This process
will preempt the other task right away. (ie. less than 1 ms).

> 2) Also in case argument (1) is not true, and I want rescheduling to be done 
> (i.e., schedule() called) in less than 1 ms , can I directly change the HZ 
> value in  and recompile my kernel so that my timer 
> interrupt will occur frequently?
> 

Well, 1) is true, but you can also increase HZ over 1000 if you like,
but that will usually cause more overhead, since, although a schedule
may not take place every HZ, a timer interrupt will.
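
To put numbers on that overhead, here is a back-of-the-envelope sketch. The helper names are invented for this example; it only uses the fact that HZ is the number of timer interrupts per second.

```c
/* Interval between timer interrupts in microseconds; hz must be nonzero. */
static unsigned long tick_period_us(unsigned long hz)
{
	return 1000000UL / hz;
}

/* Timer interrupts taken per CPU over a span of seconds. */
static unsigned long ticks_in(unsigned long hz, unsigned long seconds)
{
	return hz * seconds;
}
```

At HZ=1000 the tick period is 1000 us and a CPU takes 60000 timer interrupts per minute; at HZ=4000 the period drops to 250 us and the interrupt count quadruples, whether or not any of those ticks ends in a schedule.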

-- Steve




Re: Use of C99 int types

2005-04-03 Thread Al Viro
On Mon, Apr 04, 2005 at 12:48:04AM +0200, Dag Arne Osvik wrote:
> unsigned long happens to coincide with uint_fast32_t for x86 and x86-64, 
> but there's no guarantee that it will on other architectures.  And, at 
> least in theory, long may even provide less than 32 bits.

To port on such platform we'd have to do a lot of rewriting - so much that
the impact of this issue will be lost in noise.

Look, it's very simple:
* too many people blindly assume that all world is 32bit l-e.
* too many of those who try to do portable code have very little
idea of what that means - see the drivers that try and mix e.g. size_t with
int, etc.
* stdint is not widely understood, to put it mildly.
* ...fast... types have very unfortunate names - these are guaranteed
to create a lot of confusion.
* pretty much everything in the kernel assumes that
4 = sizeof(int) <=
sizeof(long) = sizeof(pointer) = sizeof(size_t) = sizeof(ptrdiff_t) <=
sizeof(long long) = 8
and any platform that doesn't satisfy the above will require very serious
work on porting anyway.
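
The size relations in the last item can be tested on any given compiler with a short userspace sketch. kernel_size_assumptions_hold() is a made-up name for this example; it returns 0 on targets where the relations fail (for instance where long is narrower than a pointer) rather than refusing to compile.

```c
#include <stddef.h>

/* Returns 1 if the size relations listed above hold for this target. */
static int kernel_size_assumptions_hold(void)
{
	return sizeof(int) == 4 &&
	       sizeof(int) <= sizeof(long) &&
	       sizeof(long) == sizeof(void *) &&
	       sizeof(long) == sizeof(size_t) &&
	       sizeof(long) == sizeof(ptrdiff_t) &&
	       sizeof(long) <= sizeof(long long) &&
	       sizeof(long long) == 8;
}
```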


Re: Use of C99 int types

2005-04-03 Thread Dag Arne Osvik
Al Viro wrote:
> On Sun, Apr 03, 2005 at 02:30:11PM +0200, Dag Arne Osvik wrote:
> > Yes, but wouldn't it be much better to avoid code like the following,
> > which may also be wrong (in terms of speed)?
> >
> > #ifdef CONFIG_64BIT  // or maybe CONFIG_X86_64?
> > #define fast_u32 u64
> > #else
> > #define fast_u32 u32
> > #endif
>
> ... and with such name 99% will assume (at least at the first reading)
> that it _is_ 32bits.  We have more than enough portability bugs as it
> is, no need to invite more by bad names.

Agreed.  The way I see it there are two reasonable options.  One is to 
just use u32, which is always correct but sacrifices speed (at least 
with the current gcc).  The other is to introduce C99 types, which Linus 
doesn't seem to object to when they are kept away from interfaces 
(http://infocenter.guardiandigital.com/archive/linux-kernel/2004/Dec/0117.html).

--
 Dag Arne


Re: [ACPI] Re: 2.6.12-rc1-mm4 and suspend2ram (and synaptics)

2005-04-03 Thread Pavel Machek
Hi!

> it is only suspend2ram which stopped working after 2.6.11-mm4 (at least
> in 2.6.12-rc1-mm3,4).
> 
> Concerning tests with minimal kernel config: I guess since it *was*
> working there should not be a change necessary -- but well, from
> 2.6.11-mm2 to 2.6.11-mm4 I had to stop hotplug/usb to get it working, so
> maybe now I have to stop all the other things. G. This is not what I
> want!
> 
> Isn't there a different way?

I'd like to fix the problem, but first I need to know where the
problem is.  If it works with minimal config, I know that it is one of
drivers you deselected.
Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


Re: Re(2): fix u32 vs. pm message t in usb

2005-04-03 Thread Pavel Machek
Hi!

> > Okay, you obviously have easy access to usb development trees...
> > Do you think you could just take this patch as a basis and fix
> > remaining u32 vs pm-message-t in usb? --p  
> 
> Fixing the "sparse -Wbitwise" messages, and addressing some other
> behavior changes/bugs that crept in, was the idea.  That's already
> done, but _without_ taking this as a basis (or breaking the sysfs
> support etc).

Okay, if you fixed -Wbitwise, it should be all fixed...

> The patches I sent fix everything I had time to test (just a subset
> of the dozens of cases previously tested, probably covering the main
> stuff that got broken) except the non-PCI platform_bus drivers where
> pm_message_t has discarded essential functionality.  (Notably, info
> about whether device clocks and/or power must be turned off.)

At what places is essential functionality lost? I thought that u32
state is pretty much always 3 ;-). Is there platform where it is not
the case?

Could you push also trivial bits that you could not test? I'd like to
get rid of all "u32 state"s...

> p.s. PCI-express patches don't belong with USB patches.  :)

Oops, sorry. Same maintainer, though ;-).
Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


Re: sched /HT processor

2005-04-03 Thread Arun Srinivas
Thanks. Yes, a reschedule may not take place after a ms, if the currently 
running task cannot be preempted by another task.

(1) But, can a reschedule happen within a millisec (or once a process is 
scheduled can schedule() be called before the next millisec.) ?

2) Also in case argument (1) is not true, and I want rescheduling to be done 
(i.e., schedule() called) in less than 1 ms , can I directly change the HZ 
value in  and recompile my kernel so that my timer 
interrupt will occur frequently?

Thanks
Arun
From: Steven Rostedt <[EMAIL PROTECTED]>
To: Jesper Juhl <[EMAIL PROTECTED]>
CC: Arun Srinivas <[EMAIL PROTECTED]>,LKML 

Subject: Re: sched /HT processor
Date: Sun, 03 Apr 2005 11:31:03 -0400

On Sun, 2005-04-03 at 13:17 +0200, Jesper Juhl wrote:
>
> A reschedule can happen once every ms, but also upon returning to
> userspace and when returning from an interrupt handler, and also when
> something in the kernel explicitly calls schedule() or sleeps (which in
> turn results in a call to schedule()). And each CPU runs schedule()
> independently.
> At least that's my understanding of it - if I'm wrong I hope someone on
> the list will correct me.
You're correct, but I'll add some more details here.  The actual
schedule happens when needed.  A schedule may not take place at every
ms, if the task running is not done with its time slice and no events
happened where another task should preempt it. If an RT task is running
in a FIFO policy, then it will continue to run until it calls schedule
itself or another process of higher priority preempts it.
Now if you don't have PREEMPT turned on, then the schedule won't take
place at all while a task is in the kernel, unless the task explicitly
calls schedule.
What happens on a timer interrupt where a task is done with its time
slice or another event where a schedule should take place, is just the
need_resched flag is set for the task.  On return from the interrupt the
flag is checked, and if set a schedule is called.
This is still a pretty basic description of what really happens, and if
you want to learn more, just start searching the kernel code for
schedule and need_resched. Don't forget to look in the asm code (ie
entry.S, and dependent on your arch other *.S files).
-- Steve
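
The flag-then-check sequence described above can be modelled with a toy program. Everything here is invented for illustration (struct toy_task, toy_timer_tick(), toy_irq_return() and so on); it is not how the kernel's structures look, only a sketch of the ordering: the timer interrupt merely sets a flag, and the switch happens on the way out of the interrupt.

```c
#include <stdbool.h>

/* Toy model only: not the kernel's data structures. */
struct toy_task {
	int time_slice;		/* ticks left before the slice expires */
	bool need_resched;	/* set by the tick, consumed on interrupt return */
};

static int schedule_calls;	/* counts how often the toy scheduler ran */

static void toy_schedule(struct toy_task *t, int new_slice)
{
	schedule_calls++;
	t->time_slice = new_slice;	/* pretend the next task got a fresh slice */
	t->need_resched = false;
}

/* Timer interrupt body: only marks the task, never switches directly. */
static void toy_timer_tick(struct toy_task *t)
{
	if (t->time_slice > 0 && --t->time_slice == 0)
		t->need_resched = true;
}

/* Interrupt return path: the flag is checked here and acted on. */
static void toy_irq_return(struct toy_task *t, int new_slice)
{
	if (t->need_resched)
		toy_schedule(t, new_slice);
}

/* Run a number of timer interrupts and report how many schedules happened. */
static int toy_run_ticks(int ticks, int slice)
{
	struct toy_task t = { slice, false };
	int i;

	schedule_calls = 0;
	for (i = 0; i < ticks; i++) {
		toy_timer_tick(&t);
		toy_irq_return(&t, slice);
	}
	return schedule_calls;
}
```

With a slice of 4 ticks, 10 timer interrupts lead to only 2 schedules in this model, which mirrors the point that a schedule does not take place on every tick.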





Re: Use of C99 int types

2005-04-03 Thread Dag Arne Osvik
Andreas Schwab wrote:
> Dag Arne Osvik <[EMAIL PROTECTED]> writes:
> > Yes, but wouldn't it be much better to avoid code like the following,
> > which may also be wrong (in terms of speed)?
> >
> > #ifdef CONFIG_64BIT  // or maybe CONFIG_X86_64?
> > #define fast_u32 u64
> > #else
> > #define fast_u32 u32
> > #endif
>
> How about using just unsigned long instead?

unsigned long happens to coincide with uint_fast32_t for x86 and x86-64,
but there's no guarantee that it will on other architectures.  And, at
least in theory, long may even provide less than 32 bits.

--
 Dag Arne


Re: [ACPI] Re: 2.6.12-rc1-mm4 and suspend2ram (and synaptics)

2005-04-03 Thread Norbert Preining
On Fri, 01 Apr 2005, Pavel Machek wrote:
> > It suspends fine, but never resumes. No CapsLock, no sysrq, no network is
> > working. Nothing in the log files. Is there anything which may cause
> > these troubles when compiled into the kernel and not loaded as module?
> > (as it was with my usb stuff until 2.6.11-mm2, after this I had to stop
> > hotplug, before I could go with usb running).
> 
> Try suspend2disk, first, and try suspend2ram with minimal kernel
> config.

suspend2disk runs without any problem, even with X running.

it is only suspend2ram which stopped working after 2.6.11-mm4 (at least
in 2.6.12-rc1-mm3,4).

Concerning tests with minimal kernel config: I guess since it *was*
working there should not be a change necessary -- but well, from
2.6.11-mm2 to 2.6.11-mm4 I had to stop hotplug/usb to get it working, so
maybe now I have to stop all the other things. G. This is not what I
want!

Isn't there a different way?

Best wishes

Norbert

---
Norbert Preining  Università di Siena
sip:[EMAIL PROTECTED] +43 (0) 59966-690018
gpg DSA: 0x09C5B094  fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
---
DORRIDGE (n.)
Technical term for one of the lame excuses written in very small print
on the side of packets of food or washing powder to explain why
there's hardly anything inside. Examples include 'Contents may have
settled in transit' and 'To keep each biscuit fresh they have been
individually wrapped in silver paper and cellophane and separated with
corrugated lining, a cardboard flap, and heavy industrial tyres'.
--- Douglas Adams, The Meaning of Liff


Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ingo wrote:
> if_ there is a significant hierarchy between CPUs it
> should be represented via a matching sched-domains hierarchy,

Agreed.

I'll see how the sched domains hierarchy looks on bigger SN2 systems.

If the CPU hierarchy is not reflected in the sched-domain hierarchy any
better there, then I will look to involve the "SN2 sched domain
hierarchy experts" in improving the SN2 sched-domain hierarchy.

Ok - that works.  Your patch of yesterday provides just the tool
I need to measure this.  Cool.

> i'll first try the bottom-up approach to speed up detection (getting to
> the hump is very fast most of the time).

Good.

> then we can let the arch override the cpu_distance() method

I'm not aware we need that, yet anyway.  First I should see if
the SN2 sched_domains need improving.  Take a shot at doing it
'the right way' before we go inventing overrides.  I suspect
you agree.

> the migration cost matrix we can later use to tune all the other 
> sched-domains balancing related tunables as well

That comes close to my actual motivation here.  I hope to expose a
"cpu_distance" such as based on this cost matrix, to userland.

We already expose the SLIT table node distances (using SN2 specific
/proc files today, others are working on an arch-neutral mechanism).

As we push more cores and hyperthreads into a single package on one end,
and more complex numa topologies on the other end, this becomes
increasingly interesting to NUMA aware user software.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: [RFC,PATCH 2/4] Deprecate synchronize_kernel, GPL replacement

2005-04-03 Thread Arnd Bergmann
On Sünndag 03 April 2005 20:50, Paul E. McKenney wrote:
> I couldn't find any way to suppress the "deprecated" warning that is
> generated by the "" in the last line of the __EXPORT_SYMBOL()
> macro.  Anyone know a way of doing this?  There doesn't seem to me
> to be any point to giving the warning on the EXPORT_SYMBOL() -- and
> it does clutter up compiler output with useless "deprecated" warnings.

You can define an inline function that is marked __deprecated and calls
the exported function:

extern void __synchronize_kernel(void);
static inline __deprecated void synchronize_kernel(void)
{
        __synchronize_kernel();
}
===
void __synchronize_kernel(void)
{
        synchronize_rcu();
}
EXPORT_SYMBOL(__synchronize_kernel);

You could even make __synchronize_kernel() static to let it only be used
by modules, but that might create some confusion about the interface.
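
For anyone who wants to try the pattern outside the kernel tree, here is a minimal userspace sketch of the same idea. It assumes GCC or Clang (for the deprecated attribute); the grace_periods counter is purely hypothetical and only stands in for the real synchronize_rcu() call so that the effect is observable.

```c
/* Userspace stand-in for the kernel's __deprecated marker (a GCC/Clang attribute). */
#ifndef __deprecated
#define __deprecated __attribute__((deprecated))
#endif

/* Hypothetical counter standing in for the work done by synchronize_rcu(). */
static int grace_periods;

/* Plays the role of the exported symbol. It carries no deprecated
 * attribute, so defining or exporting it produces no warning. */
extern void __synchronize_kernel(void);

void __synchronize_kernel(void)
{
        grace_periods++;
}

/* Only code that calls this wrapper gets the "deprecated" warning. */
static inline __deprecated void synchronize_kernel(void)
{
        __synchronize_kernel();
}
```

Calling synchronize_kernel() from another function should produce the compiler warning, while the definition of __synchronize_kernel() itself stays quiet.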

Arnd <><


Re: Use of C99 int types

2005-04-03 Thread Kyle Moffett
On Apr 03, 2005, at 16:25, Kenneth Johansson wrote:
> But is this not exactly what Dag Arne Osvik was trying to do ??
> uint_fast32_t means that we want at least 32 bits but it's OK with
> more if that happens to be faster on this particular architecture.
> The problem was that the C99 standard types are not defined anywhere
> in the kernel headers so they can not be used.

Uhh, so what's wrong with "int" or "long"?  On all existing archs
supported by linux, "int" is 32 bits, "long long" is 64 bits, and
"long" is an efficient word-sized value that can hold a casted
pointer.  I suppose it's theoretical that linux could be ported to
some arch where int is 16 bits, but so much stuff implicitly depends
on at least 32-bits in int that I think that's unlikely.  GCC will
generally do the right thing if you just tell it "int".

Cheers,
Kyle Moffett
-BEGIN GEEK CODE BLOCK-
Version: 3.12
GCM/CS/IT/U d- s++: a18 C>$ UB/L/X/*(+)>$ P+++()>$
L(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b(++) DI+ D+ G e->$ h!*()>++$ r  
!y?(-)
--END GEEK CODE BLOCK--



Re: 2.6.12-rc1-mm4 crash while mounting a reiserfs3 filesystem

2005-04-03 Thread Andrew Morton
Mathieu Bérard <[EMAIL PROTECTED]> wrote:
>
> Hi,
> I get a 100% reproducible oops while booting linux 2.6.12-rc1-mm4.
> (Everything runs smoothly using 2.6.11-mm1)
> It seems to be related with mounting a reiserfs3 filesystem.

It looks more like an IDE bug.

> ReiserFS: hdg1: checking transaction log (hdg1)
> Unable to handle kernel paging request at virtual address 0a373138
>   printing eip:
> df6d1211
> *pde = 
> Oops: 0002 [#1]
> PREEMPT
> Modules linked in: ext2 mbcache w83627hf i2c_sensor i2c_isa ppp_generic 
> slhc w83627hf_wdt msr cpuid
> rtc
> CPU:0
> EIP:0060:[]Not tainted VLI
> EFLAGS: 00010202   (2.6.12-rc1-mm4)
> EIP is at 0xdf6d1211
> eax: c9393266   ebx: df6d1c84   ecx: d84eab1e   edx: c155ccf8
> esi: c039242c   edi: c039239c   ebp: 700d580a   esp: df6d1c80
> ds: 007b   es: 007b   ss: 0068
> Process mount (pid: 1132, threadinfo=df6d1000 task=df711a50)
> Stack: c039242c c0229945 c039239c df6d1000 df6d1000 c039242c c155ccf8 
> c0223051
> 0088 1388 c159ae28 df6d1000 c039242c c155ccf8 c039239c 
> c022333e
> df6d1d1c  c153d6e0 c155bd78  df6d1d1c c14007f0 
> c0212260
> Call Trace:
>   [] flagged_taskfile+0x125/0x380
>   [] start_request+0x1f1/0x2a0
>   [] ide_do_request+0x20e/0x3c0
>   [] __generic_unplug_device+0x20/0x30
>   [] generic_unplug_device+0x11/0x30
>   [] blk_backing_dev_unplug+0xc/0x10
>   [] sync_buffer+0x26/0x40
>   [] __wait_on_bit+0x42/0x70
>   [] sync_buffer+0x0/0x40
>   [] sync_buffer+0x0/0x40
>   [] out_of_line_wait_on_bit+0x7d/0x90
>   [] wake_bit_function+0x0/0x60
>   [] __wait_on_buffer+0x29/0x30
>   [] _update_journal_header_block+0xf7/0x140
>   [] journal_read+0x31d/0x470
>   [] journal_init+0x4e1/0x650
>   [] printk+0x1b/0x20
>   [] reiserfs_fill_super+0x34d/0x770
>   [] snprintf+0x20/0x30
>   [] disk_name+0x96/0xf0
>   [] get_sb_bdev+0xe5/0x130
>   [] link_path_walk+0x65/0x140
>   [] get_super_block+0x18/0x20
>   [] reiserfs_fill_super+0x0/0x770
>   [] do_kern_mount+0x44/0xf020 30 20 30 20 30 20 30 20 30 20 
> 30 20 30 20 30 20 <1>general p

It appears that we might have jumped from flagged_taskfile into something
at 0xdf6d1211, which is rather odd.

You have two different low-level IDE drivers configured.  Which one is
driving that filesystem?  VIA or Promise?


Re: [PATCH] fix boot hang on some architectures

2005-04-03 Thread David S. Miller
On Sun, 3 Apr 2005 12:21:01 -0300
Arnaldo Carvalho de Melo <[EMAIL PROTECTED]> wrote:

> Em Sat, Apr 02, 2005 at 01:46:03PM -0600, James Bottomley escreveu:
> > Well, this is a brown paper bag for someone.  The new protocol
> 
> /me using such bag now :(
> 
> Thanks a lot for the fix.
> 
> David, Please apply.

Done, thanks everyone.


Re: [PATCH] configfs, a filesystem for userspace-driven kernel object configuration

2005-04-03 Thread Joel Becker
On Sun, Apr 03, 2005 at 12:57:28PM -0700, Joel Becker wrote:
>   I humbly submit configfs.  With configfs, a configfs
> ...
>   Patch is against 2.6.12-rc1-bk3.

Updated for bk5 and the new backing_dev capabilities mask:

http://oss.oracle.com/~jlbec/files/configfs/2.6.12-rc1-bk5/configfs-2.6.12-rc1-bk5-1.patch

Joel

-- 

"Against stupidity the Gods themselves contend in vain."
- Freidrich von Schiller

Joel Becker
Senior Member of Technical Staff
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127


Re: Use of C99 int types

2005-04-03 Thread Kenneth Johansson
On Sun, 2005-04-03 at 21:23 +0200, Renate Meijer wrote:
> On Apr 3, 2005, at 2:30 PM, Dag Arne Osvik wrote:
> 
> > Stephen Rothwell wrote:
> >
> >> On Sun, 03 Apr 2005 13:55:39 +0200 Dag Arne Osvik <[EMAIL PROTECTED]> 
> >> wrote:
> >>
> >>> I've been working on a new DES implementation for Linux, and ran into
> >>> the problem of how to get access to C99 types like uint_fast32_t for
> >>> internal (not interface) use.  In my tests, key setup on Athlon 64 
> >>> slows
> >>> down by 40% when using u32 instead of uint_fast32_t.
> >>>
> >>
> >> If you look in stdint.h you may find that uint_fast32_t is actually
> >> 64 bits on Athlon 64 ... so does it help if you use u64?
> >>
> >>
> >
> > Yes, but wouldn't it be much better to avoid code like the following, 
> > which may also be wrong (in terms of speed)?
> >
> > #ifdef CONFIG_64BIT  // or maybe CONFIG_X86_64?
> >  #define fast_u32 u64
> > #else
> >  #define fast_u32 u32
> > #endif
> 
> Isn't it better to use a general integer type, reflecting the cpu's 
> native register-size and let the compiler sort it out? Restrict all 
> uses of explicit width types to where it's *really* needed, that is, in 

But is this not exactly what Dag Arne Osvik was trying to do ?? 
uint_fast32_t means that we want at least 32 bits but it's OK with more
if that happens to be faster on this particular architecture.  The
problem was that the C99 standard types are not defined anywhere in the
kernel headers so they can not be used.

Perhaps they should be added to asm/types.h ?
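
To make the "at least 32 bits" point concrete, here is a small userspace sketch using the C99 <stdint.h> types. The helper names (bits_exact32, bits_fast32, rol32_fast) are made up for illustration. The assumption it illustrates is that code built on uint_fast32_t has to mask to 32 bits itself, because the type may be wider on some architectures.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bits of each type on the machine this is compiled for. */
static size_t bits_exact32(void) { return sizeof(uint32_t) * CHAR_BIT; }
static size_t bits_fast32(void)  { return sizeof(uint_fast32_t) * CHAR_BIT; }

/* Rotate left within the low 32 bits. The explicit masks are needed
 * because uint_fast32_t is allowed to be wider than 32 bits. */
static uint_fast32_t rol32_fast(uint_fast32_t x, unsigned int n)
{
        x &= UINT32_C(0xffffffff);
        n &= 31;
        if (n == 0)
                return x;
        return ((x << n) | (x >> (32 - n))) & UINT32_C(0xffffffff);
}
```

Whether the wider type is actually faster is a per-architecture measurement question; the sketch only shows what correct use looks like when the width is not fixed.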







Re: Logitech MX1000 Horizontal Scrolling

2005-04-03 Thread Peter Osterlund
Jeremy Nickurak <[EMAIL PROTECTED]> writes:

> On Tue, 2005-03-08 at 21:52 +0100, Vojtech Pavlik wrote:
> > The problem is that the mouse really does report all the double-button
> > stuff and autorepeat, and horizontal wheel together with button press on
> > wheel tilt.
> 
> Okay, I'm playing with this under 2.6.11.4 some more, and it really
> seems out of whack. The vertical cruise control buttons work properly,
> with the exception of the extra button press. But the horizontal buttons
> are mapping to 6/7 as non-repeat buttons, and adding simultaneously the
> 4/5 events auto-repeated for as long as the button is down. That is to
> say, pressing the horizontal scroll in a 2d scrolling area will
> scroll *diagonally* one step, then vertically until the button is
> released. 

Have you tried the Logitech mouse applet?

http://freshmeat.net/projects/logitech_applet/

"logitech_applet --disable-cc" used to work for me when I owned an
MX1000 mouse.

-- 
Peter Osterlund - [EMAIL PROTECTED]
http://web.telia.com/~u89404340


Re: AIM9 slowdowns between 2.6.11 and 2.6.12-rc1 (probably false positive)

2005-04-03 Thread Mel Gorman
On Sun, 3 Apr 2005, Dave Hansen wrote:

> On Sun, 2005-04-03 at 15:37 +0100, Mel Gorman wrote:
> > While testing the page placement policy patches on 2.6.12-rc1, I noticed
> > that aim9 is showing significant slowdowns on page allocation-related
> > tests. An excerpt of the results is at the end of this mail but it shows
> > that page_test is allocating 18000 less pages.
> >
> > I did not check who has been recently changing the buddy allocator but
> > they might want to run a benchmark or two to make sure this is not
> > something specific to my setup.
>
> Can you get some kernel profiles to see what, exactly, is causing the
> decreased performance?  Also, what kind of system do you have?  Does
> backing this out help?  If not, can you test some BK snapshots to see
> when this started occurring?
>
> http://linus.bkbits.net:8080/linux-2.5/[EMAIL PROTECTED]
>

The machine is a quad xeon with P III 733 processors. I don't have profile
information available as I wasn't collecting as I went along. However,
backing out the patch made little difference.

However, I reran the test on 2.6.11 and this time the performance
difference was a lot less. Something else must have been happening when I
collected the first set of poor results. I'll revisit this when I'm next
working on kernel stuff and see if I can reproduce it again reliably.

-- 
Mel Gorman
Part-time Phd Student  Java Applications Developer
University of Limerick IBM Dublin Software Lab


Re: [linux-lvm] Re: 2.6.11ac5 oops while reconstructing md array and moving volumegroup with pvmove

2005-04-03 Thread Alasdair G Kergon
On Sat, Apr 02, 2005 at 09:09:37AM +0300, Antti Salmela wrote:
> % mdadm --create -l 1 -n 2 /dev/md2 /dev/hde /dev/hdg
> % pvcreate /dev/md2
> % vgextend vg1 /dev/md2
> % pvmove /dev/hdf /dev/md2
 
A few similar reports still appearing, possibly still related to 
the md bio_clone changes that fixed some bugs for md but 
created new ones for dm...

Would be good if you could re-test with a current 2.6.12-
I'll look into it later this week if nobody beats me to it - please!

Alasdair



Re: Re(2): fix u32 vs. pm message t in usb

2005-04-03 Thread David Brownell
On Sunday 03 April 2005 12:31 pm, [EMAIL PROTECTED] wrote:
> Okay, you obviously have easy access to usb development trees...
> Do you think you could just take this patch as a basis and fix
> remaining u32 vs pm-message-t in usb? --p  

Fixing the "sparse -Wbitwise" messages, and addressing some other
behavior changes/bugs that crept in, was the idea.  That's already
done, but _without_ taking this as a basis (or breaking the sysfs
support etc).

The patches I sent fix everything I had time to test (just a subset
of the dozens of cases previously tested, probably covering the main
stuff that got broken) except the non-PCI platform_bus drivers where
pm_message_t has discarded essential functionality.  (Notably, info
about whether device clocks and/or power must be turned off.)

Fixing those will be more work than seems reasonable for 2.6.12
kernels.  Among other things, there's still a lot of stuff that
needs to percolate out to arch trees; designing and testing such
fixes takes time, as does percolating it back.

- Dave

p.s. PCI-express patches don't belong with USB patches.  :)



[PATCH] configfs, a filesystem for userspace-driven kernel object configuration

2005-04-03 Thread Joel Becker
Folks,
I humbly submit configfs.  With configfs, a configfs
config_item is created via an explicit userspace operation: mkdir(2).
It is destroyed via rmdir(2).  The attributes appear at mkdir(2) time,
and can be read or modified via read(2) and write(2).  readdir(3)
queries the list of items and/or attributes.
The lifetime of the filesystem representation is completely
driven by userspace.  The lifetime of the objects themselves is managed
by a kref, but at rmdir(2) time they disappear from the filesystem.
configfs is not intended to replace sysfs or procfs, merely to
coexist with them.
An interface in /proc where the API is: 

# echo "create foo 1 3 0x00013" > /proc/mythingy

or an ioctl(2) interface where the API is:

struct mythingy_create {
        char *name;
        int index;
        int count;
        unsigned long address;
};

int do_create(int fd)
{
        /* Same values as the /proc example above. */
        struct mythingy_create create = {"foo", 1, 3, 0x00013};

        return ioctl(fd, MYTHINGY_CREATE, &create);
}

becomes this in configfs:

# cd /config/mythingy
# mkdir foo
# echo 1 > foo/index
# echo 3 > foo/count
# echo 0x00013 > foo/address

Instead of a binary blob that's passed around or a cryptic
string that has to be formatted just so, configfs provides an interface
that's completely scriptable and navigable.
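
The same sequence can be driven from C with nothing but mkdir(2), open(2) and write(2). The following is a rough sketch, not part of the patch: the helper names config_item_create and config_item_set are hypothetical, and the O_CREAT flag is there only so the code can be exercised on an ordinary directory (on configfs the attribute files would already exist once mkdir(2) returns).

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an item directory such as <base>/foo. Returns 0 or -1. */
static int config_item_create(const char *base, const char *name)
{
        char path[512];
        int n = snprintf(path, sizeof(path), "%s/%s", base, name);

        if (n < 0 || n >= (int)sizeof(path))
                return -1;
        return mkdir(path, 0755);
}

/* Write one attribute, e.g. "1" into <base>/foo/index. Returns 0 or -1. */
static int config_item_set(const char *base, const char *name,
                           const char *attr, const char *value)
{
        char path[512];
        size_t len = strlen(value);
        ssize_t written;
        int fd;
        int n = snprintf(path, sizeof(path), "%s/%s/%s", base, name, attr);

        if (n < 0 || n >= (int)sizeof(path))
                return -1;
        /* O_CREAT is only for trying this on a plain directory. */
        fd = open(path, O_WRONLY | O_CREAT, 0644);
        if (fd < 0)
                return -1;
        written = write(fd, value, len);
        close(fd);
        return written == (ssize_t)len ? 0 : -1;
}
```

With the filesystem mounted at /config, the shell example above would correspond to config_item_create("/config/mythingy", "foo") followed by one config_item_set() call per attribute.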
Patch is against 2.6.12-rc1-bk3.

http://oss.oracle.com/~jlbec/files/configfs/2.6.12-rc1-bk3/configfs-2.6.12-rc1-bk3-1.patch

Joel

-- 

"Not everything that can be counted counts, and not everything
 that counts can be counted."
- Albert Einstein 

Joel Becker
Senior Member of Technical Staff
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127


Re: [PATCH 4/4] psmouse: dynamic protocol switching via sysfs

2005-04-03 Thread Kenan Esau
Patches 1-3 are fine.

Protocol switching via sysfs works too but if I switch from LBPS/2 to
PS/2 the device name changes from "/dev/event1" to "/dev/event2" -- is
this intended?

If I do "echo -n 50 > resolution" "0xe8 0x01" is sent. I don't know if
this is correct for "usual" PS/2-devices but for the lifebook it's
wrong.

For the lifebook the parameters are as following:

50cpi  <=> 0x00
100cpi <=> 0x01
200cpi <=> 0x02
400cpi <=> 0x03
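
As a sketch of what a Lifebook-specific mapping could look like, assuming only the four values listed above (the function name is hypothetical and this is not taken from the psmouse patches):

```c
/* Map a requested resolution in cpi to the Lifebook parameter byte
 * from the table above. Returns -1 for values not in the table. */
static int lifebook_resolution_param(unsigned int cpi)
{
        switch (cpi) {
        case 50:
                return 0x00;
        case 100:
                return 0x01;
        case 200:
                return 0x02;
        case 400:
                return 0x03;
        default:
                return -1;
        }
}
```

With such a mapping, "echo -n 50 > resolution" would result in parameter 0x00 rather than the 0x01 observed above.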



Re: [KBUILD] Bug in make deb-pkg when using seperate source and object directories

2005-04-03 Thread Sam Ravnborg
On Sat, Mar 19, 2005 at 07:28:00PM -0500, Ryan Anderson wrote:
> On Mon, Mar 14, 2005 at 11:59:26AM -0800, Ajay Patel wrote:
> > I had a similar problem building binrpm-pkg.
> > Try following patch. It worked for me.
> 
> My problem wasn't actually resolved by this - the make in builddeb still
> caused issues.
> 
> So, a normal, unified diff form of the patch, fixed up, is attached.
> 
> Signed-off-By: Ryan Anderson <[EMAIL PROTECTED]>

Applied.

Sam


Re: kernel stack size

2005-04-03 Thread Manfred Spraul
Steven Rostedt wrote:

> On Sun, 2005-04-03 at 09:10 +0200, Manfred Spraul wrote:
>
>> Yes - sem or spin locks are quicker as long as no cache line transfers
>> are necessary. If the semaphore is accessed by multiple cpus, then
>> kmalloc would be faster: slab tries hard to avoid taking global locks.
>> I'm not speaking about contention, just the cache line ping pong for
>> acquiring a free semaphore.
>
> Without contention, is there still a problem with cache line ping pong
> of acquiring a free semaphore?
> I mean, say only one task is using a given semaphore. Is there still
> going to be cache line transfers for acquiring it? Even if the task in
> question stays on a CPU. Is the "LOCK" on an instruction that expensive
> even if the other CPUs haven't accessed that location of memory.

No. If everything is cpu-local, then there are obviously no cache line 
transfers. LOCK is not that expensive. On a Pentium 3, it was 20 cpu 
cycles. On an Athlon 64, it's virtually free.
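
To illustrate what "acquiring a free lock" costs in the cpu-local case, here is a userspace sketch using C11 atomics rather than the kernel's semaphore or spinlock code. The tiny_lock names are made up. The uncontended acquire is a single atomic read-modify-write, which on x86 is typically a locked exchange on a cache line the CPU already owns.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal test-and-set lock, for illustration only. */
struct tiny_lock {
        atomic_flag held;
};

static void tiny_lock_init(struct tiny_lock *l)
{
        atomic_flag_clear(&l->held);
}

/* One atomic read-modify-write. Returns true if the lock was free
 * and is now held by the caller. */
static bool tiny_trylock(struct tiny_lock *l)
{
        return !atomic_flag_test_and_set_explicit(&l->held,
                                                  memory_order_acquire);
}

static void tiny_unlock(struct tiny_lock *l)
{
        atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The expensive case discussed earlier in the thread arises only when another CPU touched the same cache line in between, so that the line has to be transferred before the exchange can complete.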

--
   Manfred


Re: Use of C99 int types

2005-04-03 Thread Renate Meijer
On Apr 3, 2005, at 2:30 PM, Dag Arne Osvik wrote:

> Stephen Rothwell wrote:
>
>> On Sun, 03 Apr 2005 13:55:39 +0200 Dag Arne Osvik <[EMAIL PROTECTED]> wrote:
>>
>>> I've been working on a new DES implementation for Linux, and ran into
>>> the problem of how to get access to C99 types like uint_fast32_t for
>>> internal (not interface) use.  In my tests, key setup on Athlon 64 slows
>>> down by 40% when using u32 instead of uint_fast32_t.
>>
>> If you look in stdint.h you may find that uint_fast32_t is actually
>> 64 bits on Athlon 64 ... so does it help if you use u64?
>
> Yes, but wouldn't it be much better to avoid code like the following,
> which may also be wrong (in terms of speed)?
>
> #ifdef CONFIG_64BIT  // or maybe CONFIG_X86_64?
>  #define fast_u32 u64
> #else
>  #define fast_u32 u32
> #endif

Isn't it better to use a general integer type, reflecting the cpu's 
native register-size and let the compiler sort it out? Restrict all 
uses of explicit width types to where it's *really* needed, that is, in 
drivers, network-code, etc. I firmly oppose any definition of "#define 
fast_u32 u64". This kind of definition will only create needless 
confusion.

I wonder how much other code is suffering from this kind of overly 
explicit typing. It's much easier to make assumptions about integer 
size unwittingly than it is to avoid them. I used to assume (for 
instance) that sizeof(int) == sizeof(long) == sizeof(void *) at one 
point in my career. Fortunately, reality soon asserted itself again.
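
A short userspace sketch of how to avoid that particular assumption, using the C99 uintptr_t type (optional in the standard, though widely available). The helper names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Round-trip a pointer through an integer type that is defined to be
 * wide enough for it, instead of assuming int or long will do. */
static void *roundtrip_ptr(void *p)
{
        uintptr_t bits = (uintptr_t)p;

        return (void *)bits;
}

/* Report whether the three sizes happen to agree on this ABI.
 * Returns 1 if they do, 0 otherwise. */
static int sizes_agree(void)
{
        return sizeof(int) == sizeof(long) && sizeof(long) == sizeof(void *);
}
```

On a typical 32-bit ABI sizes_agree() returns 1, which is how the assumption creeps in; on LP64 systems int is 4 bytes while long and pointers are 8, so it returns 0.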

Regards,
Renate Meijer.


Re: [PATCH] Auto-append localversion for BK users needs to use CONFIG_SHELL

2005-04-03 Thread Sam Ravnborg
On Sat, Mar 12, 2005 at 11:32:29PM -0500, Ryan Anderson wrote:
> 
> Sam, you'll probably want this on top of the patch I sent.  (I haven't
> built in a clean tree in a while, found a minor problem when I was
> transitioning to quilt today.)

Combined this to one patch and applied it.
Let's see what feedback lkml gives.

Sam


Re: AIM9 slowdowns between 2.6.11 and 2.6.12-rc1

2005-04-03 Thread Dave Hansen
On Sun, 2005-04-03 at 15:37 +0100, Mel Gorman wrote:
> While testing the page placement policy patches on 2.6.12-rc1, I noticed
> that aim9 is showing significant slowdowns on page allocation-related
> tests. An excerpt of the results is at the end of this mail but it shows
> that page_test is allocating 18000 less pages.
> 
> I did not check who has been recently changing the buddy allocator but
> they might want to run a benchmark or two to make sure this is not
> something specific to my setup.

Can you get some kernel profiles to see what, exactly, is causing the
decreased performance?  Also, what kind of system do you have?  Does
backing this out help?  If not, can you test some BK snapshots to see
when this started occurring?  

http://linus.bkbits.net:8080/linux-2.5/[EMAIL PROTECTED]

-- Dave



Re: [RFC,PATCH 2/4] Deprecate synchronize_kernel, GPL replacement

2005-04-03 Thread Paul E. McKenney
On Mon, Apr 04, 2005 at 12:11:56AM +1000, Michael Ellerman wrote:
> Hi Paul,
> 
> I'm not quite clear on the difference between the two synchronize functions , 
> the comment for synchronize_sched() seems to have a bit missing? (see below)
> 
> cheers
> 
> On Sun, 3 Apr 2005 16:21, Paul E. McKenney wrote:
> > +/**
> > + * synchronize_sched - block until all CPUs have exited any non-preemptive
> > + * kernel code sequences.
> > + *
> > + * This means that all preempt_disable code sequences, including NMI and
> > + * hardware-interrupt handlers, in progress on entry will have completed
> > + * before this primitive returns.  However, this does not guarantee that
> > + * softirq handlers will have completed, since in some kernels
> 
> ??
> 
> > + *
> > + * This primitive provides the guarantees made by the (deprecated)
> > + * synchronize_kernel() API.  In contrast, synchronize_rcu() only
> > + * guarantees that rcu_read_lock() sections will have completed.
> > + */
> > +#define synchronize_sched() synchronize_rcu()

Good catch!  How about the following?

+ * This means that all preempt_disable code sequences, including NMI and
+ * hardware-interrupt handlers, in progress on entry will have completed
+ * before this primitive returns.  However, this does not guarantee softirq
+ * handlers have completed, since some configs run them in process context.

Updated patch below.

Thanx, Paul

Signed-off-by: <[EMAIL PROTECTED]>


diff -urpN -X dontdiff linux-2.6.12-rc1/include/linux/rcupdate.h linux-2.6.12-rc1-bettersk/include/linux/rcupdate.h
--- linux-2.6.12-rc1/include/linux/rcupdate.h   Tue Mar  1 23:37:50 2005
+++ linux-2.6.12-rc1-bettersk/include/linux/rcupdate.h  Sat Apr  2 13:06:15 2005
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /**
  * struct rcu_head - callback structure for use with RCU
@@ -157,9 +158,9 @@ static inline int rcu_pending(int cpu)
 /**
  * rcu_read_lock - mark the beginning of an RCU read-side critical section.
  *
- * When synchronize_kernel() is invoked on one CPU while other CPUs
+ * When synchronize_rcu() is invoked on one CPU while other CPUs
  * are within RCU read-side critical sections, then the
- * synchronize_kernel() is guaranteed to block until after all the other
+ * synchronize_rcu() is guaranteed to block until after all the other
  * CPUs exit their critical sections.  Similarly, if call_rcu() is invoked
  * on one CPU while other CPUs are within RCU read-side critical
  * sections, invocation of the corresponding RCU callback is deferred
@@ -256,6 +257,21 @@ static inline int rcu_pending(int cpu)
(p) = (v); \
})
 
+/**
+ * synchronize_sched - block until all CPUs have exited any non-preemptive
+ * kernel code sequences.
+ *
+ * This means that all preempt_disable code sequences, including NMI and
+ * hardware-interrupt handlers, in progress on entry will have completed
+ * before this primitive returns.  However, this does not guarantee softirq
+ * handlers have completed, since some configs run them in process context.
+ * 
+ * This primitive provides the guarantees made by the (deprecated)
+ * synchronize_kernel() API.  In contrast, synchronize_rcu() only
+ * guarantees that rcu_read_lock() sections will have completed.
+ */
+#define synchronize_sched() synchronize_rcu()
+
 extern void rcu_init(void);
 extern void rcu_check_callbacks(int cpu, int user);
 extern void rcu_restart_cpu(int cpu);
@@ -265,7 +281,9 @@ extern void FASTCALL(call_rcu(struct rcu
void (*func)(struct rcu_head *head)));
 extern void FASTCALL(call_rcu_bh(struct rcu_head *head,
void (*func)(struct rcu_head *head)));
-extern void synchronize_kernel(void);
+extern __deprecated_for_modules void synchronize_kernel(void);
+extern void synchronize_rcu(void);
+void synchronize_idle(void);
 
 #endif /* __KERNEL__ */
 #endif /* __LINUX_RCUPDATE_H */
diff -urpN -X dontdiff linux-2.6.12-rc1/kernel/rcupdate.c linux-2.6.12-rc1-bettersk/kernel/rcupdate.c
--- linux-2.6.12-rc1/kernel/rcupdate.c  Tue Mar  1 23:37:30 2005
+++ linux-2.6.12-rc1-bettersk/kernel/rcupdate.c Sat Apr  2 13:10:09 2005
@@ -444,15 +444,18 @@ static void wakeme_after_rcu(struct rcu_
 }
 
 /**
- * synchronize_kernel - wait until a grace period has elapsed.
+ * synchronize_rcu - wait until a grace period has elapsed.
  *
  * Control will return to the caller some time after a full grace
  * period has elapsed, in other words after all currently executing RCU
  * read-side critical sections have completed.  RCU read-side critical
  * sections are delimited by rcu_read_lock() and rcu_read_unlock(),
  * and may be nested.
+ *
+ * If your read-side code is not protected by rcu_read_lock(), do -not-
+ * use synchronize_rcu().
  */
-void synchronize_kernel(void)
+void synchronize_rcu(void)
 {
struct rcu_synchronize rcu;
 
@@ 

Re: [RFC,PATCH 2/4] Deprecate synchronize_kernel, GPL replacement

2005-04-03 Thread Paul E. McKenney
On Sun, Apr 03, 2005 at 02:26:50PM +0530, Dipankar Sarma wrote:
> On Sat, Apr 02, 2005 at 10:21:50PM -0800, Paul E. McKenney wrote:
> > The synchronize_kernel() primitive is used for quite a few different
> > purposes: waiting for RCU readers, waiting for NMIs, waiting for interrupts,
> > and so on.  This makes RCU code harder to read, since synchronize_kernel()
> > might or might not have matching rcu_read_lock()s.  This patch creates
> > a new synchronize_rcu() that is to be used for RCU readers and a new
> > synchronize_sched() that is used for the rest.  These two new primitives
> > currently have the same implementation, but this might well change
> > with additional real-time support.  Both new primitives are GPL-only,
> > the old primitive is deprecated.
> > 
> > Signed-off-by: <[EMAIL PROTECTED]>
> > ---
> > Depends on earlier "Add deprecated_for_modules" patch.
> > 
> > +/*
> > + * Deprecated, use synchronize_rcu() or synchronize_sched() instead.
> > + */
> > +void synchronize_kernel(void)
> > +{
> > +   synchronize_rcu();
> > +}
> > +
> 
> We should probably mark it deprecated - 
> 
> void __deprecated synchronize_kernel(void)
> {
>   synchronize_rcu();
> }

Hello, Dipankar,

That was the first thing I tried.  ;-)

When I did that, I got a "deprecated" warning from gcc on the
EXPORT_SYMBOL() later in that same file.  After futzing around a
bit, I hit on the compromise of marking the rcupdate.h declaration
of synchronize_kernel() as "__deprecated_for_modules".

That said, you are quite right that it would be better if gcc also gave
the "deprecated" warning for use of synchronize_kernel() within in-tree
non-module code.  I suppose I could do something like the following
before the #includes in rcupdate.c:

#define SUPPRESS_DEPRECATION_OF_SYNCHRONIZE_KERNEL

and then something like this in rcupdate.h:

#ifdef SUPPRESS_DEPRECATION_OF_SYNCHRONIZE_KERNEL
extern void synchronize_kernel(void);
#else /* #ifdef SUPPRESS_DEPRECATION_OF_SYNCHRONIZE_KERNEL */
extern __deprecated void synchronize_kernel(void);
#endif /* #else #ifdef SUPPRESS_DEPRECATION_OF_SYNCHRONIZE_KERNEL */

but this seemed a bit ugly at the time.  Maybe it is worthwhile.

I couldn't find any way to suppress the "deprecated" warning that is
generated by the "" in the last line of the __EXPORT_SYMBOL()
macro.  Anyone know a way of doing this?  There doesn't seem to me
to be any point to giving the warning on the EXPORT_SYMBOL() -- and
it does clutter up compiler output with useless "deprecated" warnings.

Thanx, Paul


Re: Use of C99 int types

2005-04-03 Thread Al Viro
On Sun, Apr 03, 2005 at 02:30:11PM +0200, Dag Arne Osvik wrote:
> Yes, but wouldn't it be much better to avoid code like the following, 
> which may also be wrong (in terms of speed)?
> 
> #ifdef CONFIG_64BIT  // or maybe CONFIG_X86_64?
>  #define fast_u32 u64
> #else
>  #define fast_u32 u32
> #endif

... and with such a name 99% will assume (at least at the first reading)
that it _is_ 32 bits.  We have more than enough portability bugs as it
is, no need to invite more by bad names.
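
For comparison, C99's <stdint.h> already has a type with these semantics,
uint_fast32_t, and it carries the same hazard: it only promises at least 32
bits. A small sketch follows; fast_u32_bits() and rotl32_fast() are
hypothetical helpers written for this note, not anything from the kernel tree.

```c
#include <limits.h>
#include <stdint.h>

/* Width in bits of the "fast" 32-bit type on this platform: >= 32, not == 32. */
static unsigned fast_u32_bits(void)
{
	return (unsigned)(sizeof(uint_fast32_t) * CHAR_BIT);
}

/*
 * Rotate a 32-bit quantity left by n (1..31). The explicit masks are what a
 * reader who assumes "the fast type is exactly 32 bits" would leave out; on a
 * platform where the type is wider, the rotate would then leak high bits.
 */
static uint_fast32_t rotl32_fast(uint_fast32_t x, unsigned n)
{
	x &= UINT32_C(0xffffffff);
	return ((x << n) | (x >> (32 - n))) & UINT32_C(0xffffffff);
}
```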


Re: fix u32 vs. pm_message_t in usb

2005-04-03 Thread David Brownell
Actually, please do NOT apply this.  It conflicts with other
patches, which have been in the past few MM releases, have
also been circulated on linux-usb-devel, and actually address
some of the bugs which crept in as things have changed around
the USB stack.

- Dave


Re: kernel stack size

2005-04-03 Thread Steven Rostedt
On Sun, 2005-04-03 at 09:10 +0200, Manfred Spraul wrote:

> Yes - sem or spin locks are quicker as long as no cache line transfers 
> are necessary. If the semaphore is accessed by multiple cpus, then 
> kmalloc would be faster: slab tries hard to avoid taking global locks. 
> I'm not speaking about contention, just the cache line ping pong for 
> acquiring a free semaphore.

Without contention, is there still a problem with cache line ping pong
when acquiring a free semaphore?

I mean, say only one task is using a given semaphore. Are there still
going to be cache line transfers for acquiring it, even if the task in
question stays on one CPU? Is the "LOCK" prefix on an instruction that
expensive even if the other CPUs haven't accessed that location of memory?

Sorry for my ignorance, I don't know all the inner workings of the cache
on SMP systems.  Are there any good references on the Internet? I
definitely want to know so that my coding practices for SMP improve.

Thanks,

-- Steve




Re: [SCSI] Driver broken in 2.6.x?

2005-04-03 Thread |TEcHNO|
Hi,
As told, I tested it w/o the nvidia module loaded; here's what I found:
1. It now doesn't hang on scanning for devices.
2. It now hangs on acquiring a preview; logs follow.
2a. Now it hung on scanning for devices again, don't know why.
3. This is true both with the module loaded and w/o it.
4. xsane wrongly reports my scanner as Plustek 12000 or such.
5. Turning on "partial updates" (or such) in view->updates caused the whole
machine to hang; a hard reset was needed. W/o this, only xsane hung.
Shouldn't the kernel protect from that somehow?
6. Anytime xsane fails/hangs, it hangs the scanner, making it blink its
lamp all the time, and it needs to be disconnected from electricity to
work again.
7. As a side question: is there any way to "reload" the whole SCSI subsystem,
so I don't have to reboot if I connect something new?

Apr  3 15:36:38 techno kernel: scsi0 : aborting command
Apr  3 15:36:38 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:36:38 techno kernel:
Apr  3 15:36:38 techno kernel: NCR5380 core release=7.
Apr  3 15:36:38 techno kernel: Base Addr: 0x0 io_port: d800 IRQ: 11.
Apr  3 15:36:38 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:36:38 techno kernel: command = 42 (0x2a) 00 03 00 00 00 01 10 00 00
Apr  3 15:36:38 techno kernel: scsi0: issue_queue
Apr  3 15:36:38 techno kernel: scsi0: disconnected_queue
Apr  3 15:36:38 techno kernel:
Apr  3 15:36:38 techno kernel: scsi0 : aborting command
Apr  3 15:36:38 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:36:38 techno kernel:
Apr  3 15:36:38 techno kernel: NCR5380 core release=7.
Apr  3 15:36:38 techno kernel: Base Addr: 0x0 io_port: d800 IRQ: 11.
Apr  3 15:36:38 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:36:38 techno kernel: command = 42 (0x2a) 00 03 00 00 00 01 10 00 00
Apr  3 15:36:38 techno kernel: scsi0: issue_queue
Apr  3 15:36:38 techno kernel: scsi0: disconnected_queue
Apr  3 15:36:38 techno kernel:
Apr  3 15:36:38 techno kernel:
Apr  3 15:36:38 techno kernel: NCR5380 core release=7.
Apr  3 15:36:38 techno kernel: Base Addr: 0x0 io_port: d800 IRQ: 11.
Apr  3 15:36:38 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:36:38 techno kernel: command = 42 (0x2a) 00 03 00 00 00 01 10 00 00
Apr  3 15:36:39 techno kernel: scsi0: issue_queue
Apr  3 15:36:39 techno kernel: scsi0: disconnected_queue
Apr  3 15:36:39 techno kernel:
Apr  3 15:40:26 techno kernel: Linux version 2.6.11.3 ([EMAIL PROTECTED]) (gcc version 3.3.5) #1 Tue Mar 15 14:23:17 CET 2005

Apr  3 15:54:07 techno kernel: scsi0 : aborting command
Apr  3 15:54:07 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:54:07 techno kernel:
Apr  3 15:54:07 techno kernel: NCR5380 core release=7.
Apr  3 15:54:07 techno kernel: Base Addr: 0x0 io_port: d800 IRQ: 11.
Apr  3 15:54:07 techno kernel: scsi0: no currently connected command
Apr  3 15:54:07 techno kernel: scsi0: issue_queue
Apr  3 15:54:07 techno kernel: scsi0: disconnected_queue
Apr  3 15:54:07 techno kernel:
Apr  3 15:54:07 techno kernel: scsi0 : aborting command
Apr  3 15:54:07 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:54:07 techno kernel:
Apr  3 15:54:07 techno kernel: NCR5380 core release=7.
Apr  3 15:54:07 techno kernel: Base Addr: 0x0 io_port: d800 IRQ: 11.
Apr  3 15:54:07 techno kernel: scsi0: no currently connected command
Apr  3 15:54:07 techno kernel: scsi0: issue_queue
Apr  3 15:54:07 techno kernel: scsi0: disconnected_queue
Apr  3 15:54:07 techno kernel:
Apr  3 15:54:07 techno kernel: scsi0 : warning : SCSI command probably completed successfully
Apr  3 15:54:07 techno kernel:  before abortion
Apr  3 15:54:07 techno kernel:
Apr  3 15:54:07 techno kernel: NCR5380 core release=7.
Apr  3 15:54:07 techno kernel: Base Addr: 0x0 io_port: d800 IRQ: 11.
Apr  3 15:54:07 techno kernel: scsi0: no currently connected command
Apr  3 15:54:07 techno kernel: scsi0: issue_queue
Apr  3 15:54:07 techno kernel: scsi0: disconnected_queue
Apr  3 15:54:07 techno kernel:
Apr  3 15:54:27 techno kernel: scsi0 : aborting command
Apr  3 15:54:27 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:54:27 techno kernel:
Apr  3 15:54:27 techno kernel: NCR5380 core release=7.
Apr  3 15:54:27 techno kernel: Base Addr: 0x0 io_port: d800 IRQ: 11.
Apr  3 15:54:27 techno kernel: scsi0: no currently connected command
Apr  3 15:54:27 techno kernel: scsi0: issue_queue
Apr  3 15:54:27 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:54:27 techno kernel: command =  0 (0x00) 00 00 00 00 00
Apr  3 15:54:27 techno kernel: scsi0: disconnected_queue
Apr  3 15:54:27 techno kernel:
Apr  3 15:54:27 techno kernel: scsi0 : aborting command
Apr  3 15:54:27 techno kernel: scsi0 : destination target 6, lun 0
Apr  3 15:54:27 techno kernel:
Apr  3 15:54:27 techno kernel: NCR5380 core release=7.
Apr  3 15:54:27 techno kernel: Base Addr: 0x0 io_port: d800 IRQ: 11.
Apr  3 15:54:27 techno kernel: scsi0: 

Re: [PATCH] RCU in kernel/intermodule.c

2005-04-03 Thread Kyle Moffett
On Apr 02, 2005, at 06:28, Luca Falavigna wrote:
-BEGIN PGP SIGNED MESSAGE-
- --- ./kernel/intermodule.c.orig   2005-04-01 19:25:26.0 +
+++ ./kernel/intermodule.c  2005-04-02 02:46:22.0 +
@@ -3,7 +3,7 @@
 /* Written by Keith Owens  Oct 2000 */
 #include 
 #include 
- -#include 
  ^ Ugh, mangled patch
+#include 
 #include 
 #include 

Please don't sign patches with PGP, it mangles them and makes them
much harder to apply.

Also, the intermodule stuff is slated for removal, as soon as MTD and
such are fixed; the interface has been deprecated for a while.

Cheers,
Kyle Moffett



[2.6.12-rc-bk5] Avermedia TV/Phone98 remote control problem (input layer)

2005-04-03 Thread Jose Luis Domingo Lopez
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi all:

I am trying to make my Avermedia TV/Phone 98 remote control work with
Linux kernel 2.6.x input layer support (driver ir-kbd-gpio). Module bttv
detects the remote (bttv0: add subdevice "remote0"), and "ir-kbd-gpio"
makes the remote control show up under /proc/bus/input/devices:

I: Bus=0001 Vendor=1461 Product=0001 Version=0001
N: Name="bttv IR (card=41)"
P: Phys=pci-:00:0b.0/ir0
H: Handlers=kbd event3 
B: EV=13 
B: KEY=fc304 80100040 0 0 3 0 2008000 82 1 9e 7bb80 0 0 

I tried Gerd's input layer utilities "input-20040421-115547.tar.gz" to
check if everything was working ok. "lsinput" shows the following:
/dev/input/event3
   bustype : BUS_PCI
   vendor  : 0x1461
   product : 0x1
   version : 1
   name: "bttv IR (card=41)"
   phys: "pci-:00:0b.0/ir0"
   bits ev : EV_SYN EV_KEY EV_REP

Then I tried "input-events" to check whether keypresses on the remote were
detected OK. I get a never-ending flow of "ghost" keypresses, even though I
didn't touch any key at that time:
waiting for events
11:30:11.198718: EV_KEY KEY_KP0 pressed
11:30:11.198719: EV_SYN code=0 value=0
11:30:11.231712: EV_KEY KEY_KP0 pressed
11:30:11.231714: EV_SYN code=0 value=0
11:30:11.264705: EV_KEY KEY_KP0 pressed
...

Damn it! So next I unloaded "ir-kbd-gpio", and loaded it again passing the
(undocumented) "debug" parameter to it (modprobe ir-kbd-gpio debug=1), and
the following shows up in the logs. Checking the sources at
"drivers/media/video/ir-kbd-gpio.c", it seems that the key supposedly being
pressed is KEY_KP0 from "ir_codes_avermedia", exactly code 34 as shown below:
kernel: ir-kbd-gpio: ir-kbd-gpio: irq gpio=0x8d77c5 code=34 | poll down
kernel: ir-kbd-gpio: bttv IR (card=41) detected at pci-:00:0b.0/ir0

Just for completeness, here is the "lspci" output for the PCI device
"associated" to the remote:
:00:0b.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 02)

I also tried reading from the input device with "lircd" version 0.7.0 with
support for "devinput", and the same happens. However, this time the "key
code" detected was another one, but this happened after a reboot.

I compiled a 2.4.26 kernel with lirc 0.7.1-pre3 lirc-dev and lirc-gpio
kernel module, and version 0.7.1-pre2 lirc userspace, and it works.

Comparing the sources of "lirc-gpio" in "lirc-0.7.1-pre3" and Linux kernel
2.6.12-rc1-bk5 "ir-kbd-gpio.c", I see a possible cause for the problem.
Under 2.4.26 the card is detected as "id=0x11461", and under
2.6.12-rc1-bk5 it is detected as "PCI subsystem ID is 1461:0001". It seems
to me that both are the same.

For 2.4.26 this assigns the card's remote the following configuration
(file lirc_gpio.c). The second line I understand describes newer versions
of the same card (mine is from year 1999):
bttv_id          card_id     gpio_mask   gpio_enable  gpio_lock_mask  gpio_xor_mask  soft_gap  sample_rate  code_length
{BTTV_AVPHONE98, 0x00011461, 0x003b8000, 0x4000,      0x080,          0x0080,        0,        10,          0}
{BTTV_AVPHONE98, 0x00031461, 0x00f88000, 0,           0x001,          0x0001,        0,        10,          32}

For 2.6.12-rc1-bk5, any card that is recognized as BTTV_AVERMEDIA,
BTTV_AVPHONE98 or BTTV_AVERMEDIA98, gets assigned the _same_ configuration 
(file drivers/media/video/ir-kbd-gpio.c starting at line 313):
ir_codes = ir_codes_avermedia;
ir->mask_keycode = 0xf88000;
ir->mask_keydown = 0x01;
ir->polling = 50; // ms
 
I am no expert at lirc kernel internals, nor at the input layer in 2.6.x
kernels, but maybe the problem with my remote comes from being "initialized"
with a generic parameter set that doesn't work at all for it. My knowledge
doesn't go much further, so I would appreciate any help in fixing this issue.
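
To make the role of the masks concrete, here is a small sketch, assuming the
driver compacts the GPIO bits selected by mask_keycode into a key code, lowest
selected bit first (which is how I recall the ir-common helper behaving).
toy_extract_bits() is a hypothetical name written for this note, not the
driver's function.

```c
#include <stdint.h>

/*
 * Pack the bits of data selected by mask into the low bits of the result,
 * lowest selected bit first.
 */
static uint32_t toy_extract_bits(uint32_t data, uint32_t mask)
{
	uint32_t value = 0;
	int mbit, vbit = 0;

	for (mbit = 0; mbit < 32; mbit++) {
		if (!(mask & (UINT32_C(1) << mbit)))
			continue;	/* bit not part of the key code */
		if (data & (UINT32_C(1) << mbit))
			value |= UINT32_C(1) << vbit;
		vbit++;
	}
	return value;
}
```

Under that assumption, toy_extract_bits(0x8d77c5, 0xf88000) gives 34, matching
the "code=34" in the debug line above. The same GPIO word with the older
0x003b8000 mask compacts to 10, so the two parameter sets do look at different
bits; lirc_gpio also applies its lock and xor masks, so 10 is not necessarily
the code lirc itself would report.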

Greetings,

- -- 
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.10-rc3)



Re: [PATCH] RCU in kernel/intermodule.c

2005-04-03 Thread Christoph Hellwig
On Sat, Apr 02, 2005 at 11:28:12AM +, Luca Falavigna wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> This patch, compiled against version 2.6.12-rc1, implements RCU mechanism in
> intermodule functions.

There's no point as these functions are going to go away soon.



Re: initramfs linus tree breakage in last day

2005-04-03 Thread Jon Smirl
On Apr 3, 2005 11:51 AM, Arnaldo Carvalho de Melo
<[EMAIL PROTECTED]> wrote:
> Em Fri, Apr 01, 2005 at 10:30:42PM -0500, Jon Smirl escreveu:
> > This is what I see on boot.
> >
> > --
> > Jon Smirl
> > [EMAIL PROTECTED]
> >
> > Linux version 2.6.12-rc1 ([EMAIL PROTECTED]) (gcc version 3.4.2 20041017
> > (Red Hat 3.4.2-6.fc3)) #21 SMP Fri Apr 1 22:09:28 EST 2005
> > found SMP MP-table at 000fe710
> 
> OK, SMP, could you please try this patch by James Bottomley that fixes
> a brown paper bag bug in my proto_register patch?
> 
> Regards,
> 
> - Arnaldo
> 
> = net/core/sock.c 1.67 vs edited =
> --- 1.67/net/core/sock.c  2005-03-26 17:04:35 -06:00
> +++ edited/net/core/sock.c  2005-04-02 13:37:20 -06:00
> @@ -1352,7 +1352,7 @@
> 
>  EXPORT_SYMBOL(sk_common_release);
> 
> -static rwlock_t proto_list_lock;
> +static DEFINE_RWLOCK(proto_list_lock);
>  static LIST_HEAD(proto_list);
> 
>  int proto_register(struct proto *prot, int alloc_slab)
> 

This patch fixes the boot problem. 
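
For anyone wondering why a missing initializer only bites on SMP: as far as I
recall, the i386 rwlock of this era treats a biased counter value as
"unlocked", so a zero-filled static lock looks as if a writer already holds
it, while on UP the lock operations compile away. Below is a toy,
single-threaded model of that idea. Every TOY_/toy_ name is invented for
illustration and there is no atomicity here at all.

```c
#define TOY_RW_BIAS 0x01000000	/* counter value meaning "nobody holds the lock" */

struct toy_rwlock {
	int count;
};

#define DEFINE_TOY_RWLOCK(name) struct toy_rwlock name = { TOY_RW_BIAS }

/* Readers take one unit each; this succeeds while the counter is positive. */
static int toy_read_trylock(struct toy_rwlock *l)
{
	if (l->count <= 0)
		return 0;
	l->count--;
	return 1;
}

/* A writer needs the whole bias, i.e. no readers and no other writer. */
static int toy_write_trylock(struct toy_rwlock *l)
{
	if (l->count != TOY_RW_BIAS)
		return 0;
	l->count = 0;
	return 1;
}

static struct toy_rwlock zeroed_lock;	/* like "static rwlock_t proto_list_lock;" */
static DEFINE_TOY_RWLOCK(ready_lock);	/* like "static DEFINE_RWLOCK(proto_list_lock);" */
```

In this model neither trylock can ever succeed on zeroed_lock, which is the
toy equivalent of the first locker spinning forever at boot.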

-- 
Jon Smirl
[EMAIL PROTECTED]


RE: [PATCH] fix subarch breakage in intel_cacheinfo.c

2005-04-03 Thread Pallipadi, Venkatesh

Errr. That was my oversight. I will compile-test the patches 
against all sub-archs in future. Thanks for catching this 
and sending the patch.  

Thanks,
Venki

>-Original Message-
>From: James Bottomley [mailto:[EMAIL PROTECTED] 
>Sent: Saturday, April 02, 2005 10:10 AM
>To: Andrew Morton
>Cc: Pallipadi, Venkatesh; Linux Kernel
>Subject: [PATCH] fix subarch breakage in intel_cacheinfo.c
>
>Not all x86 subarchitectures have support for hyperthreading, so every
>piece you add for it has to be predicated on checks for CONFIG_X86_HT.
>
>The patch corrects this hyperthreading leakage problem in
>intel_cacheinfo.c
>
>Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
>
>= arch/i386/kernel/cpu/intel_cacheinfo.c 1.3 vs edited =
>--- 1.3/arch/i386/kernel/cpu/intel_cacheinfo.c 2005-03-31 05:06:44 -06:00
>+++ edited/arch/i386/kernel/cpu/intel_cacheinfo.c  2005-04-02 12:03:39 -06:00
>@@ -311,8 +311,10 @@
> 
>   if (num_threads_sharing == 1)
>   cpu_set(cpu, this_leaf->shared_cpu_map);
>+#ifdef CONFIG_X86_HT
>   else if (num_threads_sharing == smp_num_siblings)
>   this_leaf->shared_cpu_map = cpu_sibling_map[cpu];
>+#endif
>   else
>   printk(KERN_INFO "Number of CPUs sharing cache didn't match "
>   "any known set of CPUs\n");
>
>
>
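
The shape of that fix, sketched outside the kernel: the guarded branch sits in
the middle of an if/else-if chain, and the chain has to stay well-formed
whether or not the option is set. TOY_CONFIG_HT, toy_num_siblings and
toy_classify() are hypothetical names made up for this illustration.

```c
/* #define TOY_CONFIG_HT 1 */	/* uncomment to build the sibling branch in */

enum toy_sharing { TOY_PRIVATE, TOY_SIBLINGS, TOY_UNKNOWN };

#ifdef TOY_CONFIG_HT
static int toy_num_siblings = 2;	/* only exists on HT-capable builds */
#endif

static enum toy_sharing toy_classify(int num_threads_sharing)
{
	if (num_threads_sharing == 1)
		return TOY_PRIVATE;
#ifdef TOY_CONFIG_HT
	/* References a symbol that non-HT builds do not have. */
	else if (num_threads_sharing == toy_num_siblings)
		return TOY_SIBLINGS;
#endif
	else
		return TOY_UNKNOWN;
}
```

With the option left undefined, as here, the build no longer touches
toy_num_siblings and a sharing count of 2 falls through to the "unknown" case.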


Re: Logitech MX1000 Horizontal Scrolling

2005-04-03 Thread Juergen Kreileder
Esben Stien <[EMAIL PROTECTED]> writes:

> Jeremy Nickurak <[EMAIL PROTECTED]> writes:
>
>> I'm playing with this under 2.6.11.4 
>
> I got 2.6.12-rc1 
>
>> The vertical cruise control buttons work properly, with the
>> exception of the extra button press.
>
> Yup, nice, I see the same

Same here.

>> But the horizontal buttons are mapping to 6/7 as non-repeat
>> buttons, and simultaneously adding the 4/5 events auto-repeated for
>> as long as the button is down. That is to say, pressing the
>> horizontal scroll in a 2d scrolling area will scroll *diagonally*
>> one step, then vertically until the button is released.
>
> Yup, seeing exactly the same here. 

Horizontal scrolling works fine for me.  I just get repeated 6/7
events, nothing else.

I'm using the configuration described at:
http://blog.blackdown.de/2005/04/03/logitech-mx1000-configuration/


Juergen

-- 
Juergen Kreileder, Blackdown Java-Linux Team
http://blog.blackdown.de/


Re: initramfs linus tree breakage in last day

2005-04-03 Thread Arnaldo Carvalho de Melo
Em Fri, Apr 01, 2005 at 10:30:42PM -0500, Jon Smirl escreveu:
> This is what I see on boot.
> 
> -- 
> Jon Smirl
> [EMAIL PROTECTED]
> 
> Linux version 2.6.12-rc1 ([EMAIL PROTECTED]) (gcc version 3.4.2 20041017
> (Red Hat 3.4.2-6.fc3)) #21 SMP Fri Apr 1 22:09:28 EST 2005
> found SMP MP-table at 000fe710  

OK, SMP, could you please try this patch by James Bottomley that fixes
a brown paper bag bug in my proto_register patch?

Regards,

- Arnaldo


= net/core/sock.c 1.67 vs edited =
--- 1.67/net/core/sock.c  2005-03-26 17:04:35 -06:00
+++ edited/net/core/sock.c  2005-04-02 13:37:20 -06:00
@@ -1352,7 +1352,7 @@
 
 EXPORT_SYMBOL(sk_common_release);
 
-static rwlock_t proto_list_lock;
+static DEFINE_RWLOCK(proto_list_lock);
 static LIST_HEAD(proto_list);
 
 int proto_register(struct proto *prot, int alloc_slab)


2.6.12-rc1-mm4 crash while mounting a reiserfs3 filesystem

2005-04-03 Thread Mathieu Bérard
Hi,
I get a 100% reproducible oops while booting Linux 2.6.12-rc1-mm4.
(Everything runs smoothly using 2.6.11-mm1.)
It seems to be related to mounting a reiserfs3 filesystem.
Please also note that this kernel was compiled using gcc-4.0-0pre9
(from Debian).
I'm using mount 2.12p.
Please CC: me.
Here is the oops (captured via netconsole):
input: AT Translated Set 2 keyboard on isa0060/serio0
ReiserFS: hda5: using ordered data mode
ReiserFS: hda5: journal params: device hda5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hda5: checking transaction log (hda5)
ReiserFS: hda5: Using r5 hash to sort names
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 136k freed
Adding 497972k swap on /dev/hda6.  Priority:-1 extents:1
reiserfs: enabling write barrier flush mode
reiserfs: disabling flush barriers on hda5
Real Time Clock Driver v1.12
WDT driver for the Winbond(TM) W83627HF Super I/O chip initialising.
w83627hf WDT: initialized. timeout=60 sec (nowayout=0)
CSLIP: code copyright 1989 Regents of the University of California
PPP generic driver version 2.4.2
ACPI: No ACPI bus support for 0-0290
ReiserFS: hda7: found reiserfs format "3.6" with standard journal
ReiserFS: hda7: using ordered data mode
reiserfs: using flush barriers
ReiserFS: hda7: journal params: device hda7, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hda7: checking transaction log (hda7)
reiserfs: disabling flush barriers on hda7
ReiserFS: hda7: Using r5 hash to sort names
ReiserFS: hda8: found reiserfs format "3.6" with standard journal
ReiserFS: hda8: using ordered data mode
reiserfs: using flush barriers
ReiserFS: hda8: journal params: device hda8, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hda8: checking transaction log (hda8)
reiserfs: disabling flush barriers on hda8
ReiserFS: hda8: Using r5 hash to sort names
ReiserFS: hdg1: found reiserfs format "3.6" with standard journal
ReiserFS: hdg1: using ordered data mode
reiserfs: using flush barriers
ReiserFS: hdg1: journal params: device hdg1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdg1: checking transaction log (hdg1)
Unable to handle kernel paging request at virtual address 0a373138
 printing eip:
df6d1211
*pde = 
Oops: 0002 [#1]
PREEMPT
Modules linked in: ext2 mbcache w83627hf i2c_sensor i2c_isa ppp_generic slhc w83627hf_wdt msr cpuid rtc
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010202   (2.6.12-rc1-mm4)
EIP is at 0xdf6d1211
eax: c9393266   ebx: df6d1c84   ecx: d84eab1e   edx: c155ccf8
esi: c039242c   edi: c039239c   ebp: 700d580a   esp: df6d1c80
ds: 007b   es: 007b   ss: 0068
Process mount (pid: 1132, threadinfo=df6d1000 task=df711a50)
Stack: c039242c c0229945 c039239c df6d1000 df6d1000 c039242c c155ccf8 
c0223051
   0088 1388 c159ae28 df6d1000 c039242c c155ccf8 c039239c 
c022333e
   df6d1d1c  c153d6e0 c155bd78  df6d1d1c c14007f0 
c0212260
Call Trace:
 [] flagged_taskfile+0x125/0x380
 [] start_request+0x1f1/0x2a0
 [] ide_do_request+0x20e/0x3c0
 [] __generic_unplug_device+0x20/0x30
 [] generic_unplug_device+0x11/0x30
 [] blk_backing_dev_unplug+0xc/0x10
 [] sync_buffer+0x26/0x40
 [] __wait_on_bit+0x42/0x70
 [] sync_buffer+0x0/0x40
 [] sync_buffer+0x0/0x40
 [] out_of_line_wait_on_bit+0x7d/0x90
 [] wake_bit_function+0x0/0x60
 [] __wait_on_buffer+0x29/0x30
 [] _update_journal_header_block+0xf7/0x140
 [] journal_read+0x31d/0x470
 [] journal_init+0x4e1/0x650
 [] printk+0x1b/0x20
 [] reiserfs_fill_super+0x34d/0x770
 [] snprintf+0x20/0x30
 [] disk_name+0x96/0xf0
 [] get_sb_bdev+0xe5/0x130
 [] link_path_walk+0x65/0x140
 [] get_super_block+0x18/0x20
 [] reiserfs_fill_super+0x0/0x770
 [] do_kern_mount+0x44/0xf020 30 20 30 20 30 20 30 20 30 20 30 20 30 20 30 20
<1>general protection fault:  [#2]
PREEMPT
Modules linked in: ext2 mbcache w83627hf i2c_sensor i2c_isa ppp_generic slhc w83627hf_wdt msr cpuid rtc
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010012   (2.6.12-rc1-mm4)
EIP is at 0xdf6d1dae
eax: df6d1d54   ebx: df6d1d1c   ecx:    edx: 0003
esi: df62190c   edi:    ebp: df7a3e80   esp: df7a3e60
ds: 007b   es: 007b   ss: 0068
Process fsck.reiserfs (pid: 1126, threadinfo=df7a3000 task=df711030)
Stack: c0113849 df7a3ea8 0001 0003 c14007f0 df7a3000 df7a3ea8 
0286
   df7a3e9c c011389f  df7a3ea8 c14007f0 c138be00 dfd36330 
da0db42c
   c012c00a df7a3ea8 c138be00  0001 c0145b47 0054 
000f
Call Trace:
 [] __wake_up_common+0x39/0x60
 [] __wake_up+0x2f/0x60
 [] __wake_up_bit+0x2a/0x30
 [] do_wp_page+0x1e7/0x340
 [] handle_mm_fault+0x184/0x1c0
 [] do_page_fault+0x268/0x657
 [] free_pgtables+0x82/0xb0
 [] unmap_region+0xb2/0x120
 [] unmap_vma_list+0xe/0x20
 [] 

Re: sched /HT processor

2005-04-03 Thread Steven Rostedt
On Sun, 2005-04-03 at 13:17 +0200, Jesper Juhl wrote:
> 
> A reschedule can happen once every ms, but also upon returning to 
> userspace and when returning from an interrupt handler, and also when 
> something in the kernel explicitly calls schedule() or sleeps (which in 
> turn results in a call to schedule()). And each CPU runs schedule() 
> independently.
> At least that's my understanding of it - if I'm wrong I hope someone on 
> the list will correct me.

You're correct, but I'll add some more details here.  The actual
schedule happens when needed.  A schedule may not take place at every
ms, if the task running is not done with its time slice and no events
happened where another task should preempt it. If an RT task is running
in a FIFO policy, then it will continue to run until it calls schedule
itself or another process of higher priority preempts it. 

Now if you don't have PREEMPT turned on, then the schedule won't take
place at all while a task is in the kernel, unless the task explicitly
calls schedule.

What happens on a timer interrupt where a task is done with its time
slice, or on another event where a schedule should take place, is just that
the need_resched flag is set for the task.  On return from the interrupt the
flag is checked, and if it is set, schedule is called.
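
A toy model of that flag handling, purely to illustrate the flow described
above and in no way the kernel's code: all toy_ names, the struct layout and
the fixed-length time slice are assumptions made for this sketch.

```c
struct toy_task {
	int time_slice;		/* ticks left before the task should yield */
	int need_resched;	/* set by the tick, consumed on interrupt return */
};

static int schedule_calls;

/* Stand-in for schedule(): count the call and hand out a fresh slice. */
static void toy_schedule(struct toy_task *t, int slice)
{
	schedule_calls++;
	t->need_resched = 0;
	t->time_slice = slice;
}

/* Timer interrupt: only flags the task, it never switches by itself. */
static void toy_timer_tick(struct toy_task *t)
{
	if (t->time_slice > 0 && --t->time_slice == 0)
		t->need_resched = 1;
}

/* Return-from-interrupt path: the place where the flag is acted on. */
static void toy_return_from_interrupt(struct toy_task *t, int slice)
{
	if (t->need_resched)
		toy_schedule(t, slice);
}

/* Run some ticks and report how often schedule was actually called. */
static int toy_run_ticks(int ticks, int slice)
{
	struct toy_task t = { slice, 0 };
	int i;

	schedule_calls = 0;
	for (i = 0; i < ticks; i++) {
		toy_timer_tick(&t);
		toy_return_from_interrupt(&t, slice);
	}
	return schedule_calls;
}
```

In this model seven ticks with a three-tick slice end up calling toy_schedule()
twice, not seven times, which is the point made above: the tick happens every
time, the schedule only when needed.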

This is still a pretty basic description of what really happens, and if
you want to learn more, just start searching the kernel code for
schedule and need_resched. Don't forget to look in the asm code (i.e.
entry.S and, depending on your arch, other *.S files).

-- Steve



