Re: [PATCH -v6 2/2] Updating ctime and mtime for memory-mapped files

2008-01-18 Thread Ingo Oeser
Hi Linus,

On Friday 18 January 2008, Linus Torvalds wrote:
> On Fri, 18 Jan 2008, Miklos Szeredi wrote:
> > 
> > What I'm saying is that the times could be left un-updated for a long
> > time if program doesn't do munmap() or msync(MS_SYNC) for a long time.
> 
> Sure.
> 
> But in those circumstances, the programmer cannot depend on the mtime 
> *anyway* (because there is no synchronization), so what's the downside?

Can we get "if the write to the page hits the disk, the mtime has hit the disk
already no less than SOME_GRANULARITY before"? 

That is very important for computer forensics. Esp. in saving your ass!

Ok, now back again to making that fast :-)


Best Regards

Ingo Oeser
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] x86: Add config variables for SMP_MAX

2008-01-18 Thread Ingo Oeser
Hi Mike,

On Friday 18 January 2008, [EMAIL PROTECTED] wrote:
> +config THREAD_ORDER
> + int "Kernel stack size (in page order)"
> + range 1 3
> + depends on X86_64_SMP
> + default "3" if X86_SMP_MAX
> + default "1"
> + help
> +   Increases kernel stack size.
> +

Could you please elaborate, why this is needed and put more info about
this requirement into this patch description?

People worked hard to push data allocation from stack to heap to make 
THREAD_ORDER of 0 and 1 possible. So why increase it again and why does this
help scalability?

Many thanks and Best Regards

Ingo Oeser, puzzled a bit :-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


kinit (was: sleep before boot panic)

2008-01-10 Thread Ingo Oeser
On Wednesday 09 January 2008, H. Peter Anvin wrote:
> Pavel Machek wrote:
> > 
> >> Of course, if we'd been using kinit, "soft panic" would 
> >> have been done exclusively in userspace...
> > 
> > What's the status of kinit, btw?
> > Pavel
> 
> It's bitrotted a bit since it was first rejected.  It wouldn't take too 
> much work to bring it back up to speed, however.  klibc, and some of the 
> kinit components, are used for the initramfs in Debian.

Yes, and I like most of it. The only thing really missing for me 
is LVM support. Debian (?) did a evil hack to make it work.
Maybe one day this itches soo much, I'll even scratch it :-)

Then I'll be able to test kernels on a standard LVM installation again.

So please keep up the good work!


Best Regards

Ingo Oeser
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: sleep before boot panic

2008-01-07 Thread Ingo Oeser
On Monday 07 January 2008, Andi Kleen wrote:
> Bernd Schubert <[EMAIL PROTECTED]> writes:
> 
> > Hi,
> >
> > I just switched to libata (pata) on my laptop and the immediate panic made 
> > it 
> > impossible to figure out why my boot partition wasn't available.
> > After applying this little patch I could check boot printk output and then 
> > saw 
> > everything was properly recognized and only scsi-disk support was missing.
> 
> The correct fix would be to make scroll back (and sysrq) still work
> after panic.  It's a little more complicated, but possible (essentially
> it needs a polled keyboard handler)

Customer: "This system could not find the root fs."
Support: "Oh, yeah, just connect a (USB-) keyboard and scroll back."

Hmm, device detection works after panic?

I really like the "soft" panic better, where you still can operate the 
kernel debugging features, but just have no user space supporting it.

One better hopes, that keyboards never need external firmware to be loaded
at this stage :-)

Best Regards

Ingo Oeser, who just hit the same problem yesterday...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: sleep before boot panic

2008-01-06 Thread Ingo Oeser
Hi Bernd,

CC'ed hpa, since I'm sure he can give useful advise on that :-)

On Sunday 06 January 2008, Bernd Schubert wrote:
> On Sunday 06 January 2008, Ingo Oeser wrote:
> > Hi Bernd,
> >
> > On Sunday 06 January 2008, you wrote:
> > > Index: zd1211rw.git.beno/init/do_mounts.c
> > > ===
> > > --- zd1211rw.git.beno.orig/init/do_mounts.c   2008-01-06 
> > > 18:44:23.0
> > > +0100
> > > +++ zd1211rw.git.beno/init/do_mounts.c2008-01-06 18:45:44.0
> > > +0100 @@ -330,6 +330,7 @@
> > >   printk("Please append a correct \"root=\" boot option; here are 
> > > the
> > > available partitions:\n");
> > >
> > >   printk_all_partitions();
> > > + msleep(60 * 1000);
> >
> > ssleep(60);
> 
> feel free to replace it replace it :)

Not that urgent, but if you resubmit please do it :-)

> There is no dump_stack() here, but disc detection is relatively early in boot 
> process and on all these information are already scrolled off screen when the 
> panic is done. For this and any other panic it would be optimal if scrolling 
> still would work, but scrolling also requires kernel code, so I see there's a 
> reason not to this for all panics. However, for this boot problem I tend to 
> say there's no need to panic at all...

But the kernel cannot continue from that position. You would need a "soft" 
panic,
which allows behavior of panic=X, but let the kernel continue.

Even better is to continue with the init in the builtin ramfs. That should 
always
be available and can implement any behavior desired (like droping into a dash).

> Btw, not all stack straces are useless, *most* of them are actually very 
> useful.

I didn't say that. Just if you cannot continue due to admin error, 
but the kernel is in a perfect valid state otherwise,
dumping stack is next to useless.


Best Regards

Ingo Oeser
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: sleep before boot panic

2008-01-06 Thread Ingo Oeser
Hi Bernd,

On Sunday 06 January 2008, you wrote:
> Index: zd1211rw.git.beno/init/do_mounts.c
> ===
> --- zd1211rw.git.beno.orig/init/do_mounts.c   2008-01-06 18:44:23.0 
> +0100
> +++ zd1211rw.git.beno/init/do_mounts.c2008-01-06 18:45:44.0 
> +0100
> @@ -330,6 +330,7 @@
>   printk("Please append a correct \"root=\" boot option; here are 
> the 
> available partitions:\n");
>  
>   printk_all_partitions();
> + msleep(60 * 1000);

ssleep(60);

>   panic("VFS: Unable to mount root fs on %s", b);
>   }

Better would be for this and similiar panic()s
(fatal user/admin errors on boot) to NOT print a stack trace+registers,
since it is useless and actually hides useful information.


Best Regards

Ingo Oeser
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/4] add task handling notifier

2007-12-20 Thread Ingo Oeser
Hi Jan,

I like and support your idea!

On Thursday 20 December 2007, Jan Beulich wrote:
> With more and more sub-systems/sub-components leaving their footprint
> in task handling functions, it seems reasonable to add notifiers that
> these components can use instead of having them all patch themselves
> directly into core files.

Yes, but why export variables? Wouldn't it be better to export 
an API? 

That simplifies the callers (they all pass "current" as task 
and "task_notifier_list" as arguments).

It also prevents exposing internal variables (notifier lists 
ARE internal variables) to modules.

What do you think?


Best Regards

Ingo Oeser
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] dynamic pipe resizing

2007-12-15 Thread Ingo Oeser
On Saturday 15 December 2007, Jan Engelhardt wrote:
> On Aug 24 2007 10:52, Jens Axboe wrote:
> >Subject: [PATCH][RFC] dynamic pipe resizing
> >Like with my original splice patches from 2005, I used fcntl()
> >F_GETPIPE_SZ and F_SETPIPE_SZ to change the size of the pipe. I'm not
> >particularly fond of that interface, so suggestions on how to improve it
> >would be appreciated. Even if fcntl() should be the preferred approach,
> >I think it would be better to pass in a byte based value instead of a
> >number of pages.
> >
> Could this patch still make it in?
> Yes, I think its set() and get() parts should use bytes and convert 
> to/from pages.
> 
> Perhaps just round up, and mention the rounding in the manpage, so that 
> noone gets a shock when the pipe is not exactly as small as requested.

Yes, but document only that it is rounding and make the unit arbitrary.
That reduces the ABI requirements.


Best Regards

Ingo Oeser
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [feature] automatically detect hung TASK_UNINTERRUPTIBLE tasks

2007-12-01 Thread Ingo Oeser
On Saturday 01 December 2007, Ingo Molnar wrote:
> maybe, but we'd have to see how often this gets triggered. An OOM is 
> something that could happen in any overloaded system - while a hung task 
> is likely due to a kernel bug.

What about a client using hard mounted NFS shares here? That shouldn't be
killed by the OOM killer in that situation, should it?

Am I missing sth.?


Best Regards

Ingo Oeser
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ata: ahci: Enable enclosure management via LED (resend)

2007-10-25 Thread Ingo Oeser
On Thursday 25 October 2007, Kristen Carlson Accardi wrote:
> I did look into using the LED class for this, but it didn't appropriate
> as I wanted the leds to be associated with a particular disk, and not
> with the platform as a whole.  It seemed to me that the led_class was 
> a bit of overkill for what we needed to do here, since we just need 
> on/off and nothing else.

Maybe. But didn't you want mdadm to control it? Then it would make sense.

But you have a point in the LED API missing the ability to associate 
a LED to a specific device (e.g. where it is installed :-).

So I'm fine either way, since I see you point.


Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ata: ahci: Enable enclosure management via LED (resend)

2007-10-25 Thread Ingo Oeser
Hi Kristen,

On Thursday 25 October 2007, Kristen Carlson Accardi wrote:
> Enable enclosure management via LED
> 
> As described in the AHCI spec, some AHCI controllers may support 
> Enclosure management via a variety of protocols.  This patch
> adds support for the LED message type that is specified in 
> AHCI 1.1 and higher.

Linux has a LED subsystem for that. May I suggest, that you just register
these leds and let userspace handle them via that via the LED API?

The LED userspace API is described in Documentation/leds-class.txt
and the headers for registering LEDs is linux/leds.h under include/

Since you explicitly WANT user space to control these, that should
be the right API. 

Richard, what do YOU think?


Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 13/13] RT: Cache cpus_allowed weight for optimizing migration

2007-10-23 Thread Ingo Oeser
Hi Gregory,

On Tuesday 23 October 2007, Gregory Haskins wrote:
> Calculating the weight is probably relatively expensive, so it is only
> done when the cpus_allowed mask is updated (which should be relatively
> infrequent, especially compared to scheduling frequency) and cached in
> the task_struct.

Why not make it a task flag, since according to your code, you are only 
interested whether this is <= 1 or > 1. Since !(x <= 1) <=> (x > 1)
for any given unsigned integer x, the required data structure is
a "boolean" or a flag.


Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] Extending kbuild syntax

2007-09-30 Thread Ingo Oeser
Hi Sam,

On Saturday 29 September 2007, Sam Ravnborg wrote:
> Introducing the following new variable could make this a oneliner:
> ccflags-y
> 
> ccflags-$(DEBUG) := -DDEBUG
> 
> grep -r -C 1 -B 1 EXTRA_CFLAGS shows that the above is a 
> very common pattern especially in drivers/

ACK. Also ACK for asflags, if done the same way :-)

> The second is the more controversial suggestion.

Yes, but please bear in mind, what the developers are trying
to express in these cases.

> In several Makefile we have simple if expression of the variants:
> if ($(CONFIG_FOO),y)
>   obj-$(CONFIG_BAR) += fubar.o
> endif

This is "feature FOO of module BAR" where the feature itself
cannot be a module. The composition scheme described in 
section 3.3 is at least equally useful. And that is used today.
Maybe the documentation of that scheme is not prominent enough :-)

> obj-y-ifn-

This is the only one needed, because it is cumbersome to express
negative rules in kbuild to include stubs (e.g. nommu stuff). 
But again this can be done with composition rules right now, but
is order dependent. If we could get rid of this requirement,
I would be happy already. So kbuild is just lacking an "else" 
clause here.


Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] Combine path_put and path_put_conditional

2007-09-29 Thread Ingo Oeser
Hi Andreas,

On Friday 28 September 2007, Andreas Gruenbacher wrote:
> The name path_put_conditional (formerly, dput_path) is a little unclear.
> Replace (path_put_conditional + path_put) with path_walk_put_both,
> "put a pair of paths after a path_walk" (see the kerneldoc).
 ^

So why not name it path_walk_put_pair() then?

Rationale: "_both" is just counting, "_pair" 
means they are related somehow.


Best regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[OT] kbuild syntax extension for ccflags and asflags (was: [PATCH 1/3] CodingStyle updates)

2007-09-29 Thread Ingo Oeser
Hi Sam,

On Saturday 29 September 2007, Sam Ravnborg wrote:
> Lately I have considered extending the kbuild syntax a bit.
> 
> Introducing
> ccflags-y
> asflags-y
> 
> [with same functionality as the EXTRA_CFLAGS, EXTRA_AFLAGS]
> would allow us to do:
> 
> ccflags-$(CONFIG_WHATEVER_DEBUG) := -DDEBUG

Please do!

That is very useful for testing and developing new modules.
I learnt a lot from you in this regard and used that kind
of syntax to the extreme in some other non-kernel project
of mine. There it included also ccflags, asflags and so on.

I further split that into -debug-y and -optimize-y flags,
but that was just for my own convenience.


Best Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] Fix coding style

2007-09-25 Thread Ingo Oeser
On Tuesday 25 September 2007, Srivatsa Vaddagiri wrote:
> Index: current/kernel/sched_debug.c
> ===
> --- current.orig/kernel/sched_debug.c
> +++ current/kernel/sched_debug.c
> @@ -239,11 +239,7 @@ static int
>  root_user_share_read_proc(char *page, char **start, off_t off, int count,
>int *eof, void *data)
>  {
> - int len;
> -
> - len = sprintf(page, "%d\n", init_task_grp_load);
> -
> - return len;
> + return sprintf(page, "%d\n", init_task_grp_load);
>  }
>  
>  static int
> @@ -297,7 +293,7 @@ static int __init init_sched_debug_procf
>   pe->proc_fops = &sched_debug_fops;
>  
>  #ifdef CONFIG_FAIR_USER_SCHED
> - pe = create_proc_entry("root_user_share", 0644, NULL);
> + pe = create_proc_entry("root_user_cpu_share", 0644, NULL);
>   if (!pe)
>   return -ENOMEM;

What about moving this debug stuff under debugfs?
Please consider using the functions in  .
They compile into nothing, if DEBUGFS is not compiled in
and have already useful functions for reading/writing integers
and booleans.

Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why do so many machines need "noapic"?

2007-09-15 Thread Ingo Oeser
On Saturday 15 September 2007, Andrew Morton wrote:
> There are 48 bugs in bugzilla which mention "noapic"
> 
> http://bugzilla.kernel.org/buglist.cgi?query_format=advanced&short_desc_type=allwordssubstr&short_desc=&long_desc_type=substring&long_desc=noapic&kernel_version_type=allwordssubstr&kernel_version=&bug_status=NEW&bug_status=REOPENED&bug_status=ASSIGNED&emailassigned_to1=1&emailtype1=substring&email1=&emailassigned_to2=1&emailreporter2=1&emailcc2=1&emailtype2=substring&email2=&bugidtype=include&bug_id=&chfieldfrom=&chfieldto=Now&chfieldvalue=®ression=both&cmdtype=doit&order=Reuse+same+sort+as+last+time&field0-0-0=noop&type0-0-0=noop&value0-0-0=
> 
> And there are 173,000 on the internet ;)
> http://www.google.com/search?hl=en&q=linux+noapic&btnG=Google+Search
> 
> We screwed this pooch a long time ago - years.  Perhaps if some of the many
> noapic users could run a bisection search to work out when it broke we
> could start fixing things.  But they all have a workaround so there's no
> motivation.

I have 2 SMP-Boards and both need noapic. One is from 2001 (AUSUS CUR-DLS),
one is from June 2006 (Gigabyte M57SLI-S4).

There are many reasons:

1. Bugs which have such a simple workaround don't get much attention.

2. Usually SMP boards are used for machines, which just HAVE to work,
   since they have been expensive. These are not consumer boards.

3. I usually had only USB problems (no IRQ), if ommiting noapic.
   USB technology is a cosumer grade technology and enterprise
   grade developers don't have much interest in it (until now?).

4. IRQ routing setup is often a BIOS issue. You might be able
   to fix that by upgrading your BIOS. That often needs a Windows
   tool. Linux people not always (want to) have access to Windows :-)

I reported the all the problems (starting 2001), no developer 
seemed interested.

I can report them against the latest RC6 kernel tomorrow and put them
into bugzilla, if we now REALLY care.


Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] crypto: cleanup: Use max() in blkcipher_get_spot() to state the intention.

2007-09-11 Thread Ingo Oeser
[PATCH] crypto: cleanup: Use max() in blkcipher_get_spot() to state the 
intention.

Signed-off-by: Ingo Oeser <[EMAIL PROTECTED]>
---
Hi Herbert,

here is the requested patch against Linus' latest tree. 
It at least compiles.

Best Regards

Ingo Oeser

 crypto/blkcipher.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c
index d8f8ec3..1c99d92 100644
--- a/crypto/blkcipher.c
+++ b/crypto/blkcipher.c
@@ -65,7 +65,7 @@ static inline void blkcipher_unmap_dst(struct blkcipher_walk 
*walk)
 static inline u8 *blkcipher_get_spot(u8 *start, unsigned int len)
 {
u8 *end_page = (u8 *)(((unsigned long)(start + len - 1)) & PAGE_MASK);
-   return start > end_page ? start : end_page;
+   return max(start, end_page);
 }
 
 static inline unsigned int blkcipher_done_slow(struct crypto_blkcipher *tfm,
-- 
1.5.2.5



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] crypto: blkcipher_get_spot() handling of buffer at end of page

2007-09-10 Thread Ingo Oeser
Hi Herbert,

On Monday 10 September 2007, Herbert Xu wrote:
> On Sat, Sep 08, 2007 at 12:14:23PM +0800, Herbert Xu wrote:
> > 
> > [CRYPTO] blkcipher: Fix handling of kmalloc page straddling
> 
> As Bob correctly noted, I had the boolean test inverted.
> Here is the correction:
> 
> [CRYPTO] blkcipher: Fix inverted test in blkcipher_get_spot
> 
> The previous patch had the conditional inverted.  This patch fixes it
> so that we return the original position if it does not straddle a page.

What about using max() for this to make your intention obvious?

static inline u8 *blkcipher_get_spot(u8 *start, unsigned int len)
{
u8 *end_page = (u8 *)(((unsigned long)(start + len - 1)) & PAGE_MASK); 
return max(start, end_page);
}


Best Regards

Ingo Oeser 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFD] new syscalls: suspend_other/resume_other?

2007-09-02 Thread Ingo Oeser
Hi there,

at the moment implementing a mark and sweep garbage collection 
subsystem is quite a hack, because you always have to use up
some signals for suspend/resume all threads to implement this.

For runtime environments (like D system libraries or JVMs) this is 
a hack, since you take away flexibility from the application.

A possible solution would be a syscall or a PTRACE extension
to realize the suspend/resume. I best describe the possible
syscall manpages here, so you get an idea.

NAME 
suspend_other - suspends execution of all but the calling thread

SYNOPSIS
long suspend_other(void);

RETURN VALUE
Positive count suspended threads on success. 
If 0, then suspend_other was a no-op and there is nothing to resume, 
but the
call should still considered successful.
If the number is -1, the errno has to be checked for possible error 
values.

ERRORS
EDEADLK We run already a suspend_other() and the calling thread 
has just been resumed.

EPERM The calling thread is not allowed to do this. 
(optional case due to security)

DESCRIPTION
After sucessful return of this call, the affected process 
is single threaded and only the calling thread runs 
in this process (==MM struct).

The thread, which calls this is responsible for resuming 
all the suspended threads. One can iterate through 
"/proc/self/task/", to verify for sure that one knows all threads,
if the returned count doesn't match the expected value.

Any per thread queued signals are deferred until resume_other()
or process destruction.

NOTES
This call might be restricted to the main thread.



NAME 
resume_other - resume execution of foreign thread in this process

SYNOPSIS
long resume_other(pid_t tid)

RETURN VALUE
Returns 0, if successful.
Otherwise -1 is returned, the errno has to be checked for 
possible error values and the call has no effect at all.

Any non-blocked signals of that thread which happend during 
suspend/resume are deliverd now.

ERRORS
EINVAL The thread is running already. 
(this is a severe caller BUG).
ESRCH The thread with tid does not exist.
(or doesn't belong to this process).
EPERM The calling thread is not allowed to do this. 
(optional case due to security)

DESCRIPTION
After sucessful return of this call, the affected thread 
is the the state it was before it was suspended 
by calling suspend_other().

NOTES
The value -1 for tid is reserved for future extension 
(e.g. meaning ALL other threads).

This call might be restricted to the main thread.

-

Any opinions?


Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] memchr (trivial) optimization

2007-08-22 Thread Ingo Oeser
On Wednesday 22 August 2007, lode leroy wrote:
> While profiling something completely unrelated, I noticed
> that on the workloads I used memchr for, I saw a 30%-40% improvement
> in performance, with the following trivial changes...
> (basically, it saves 3 operations for each call)

Yes, but then you could be a bit more explicit to the compiler
on what you are doing here:
 
void *memchr(const void *s, int c, size_t n)
{
const unsigned char *p = s;

for (; n != 0; n--, p++) {
   if ((unsigned char)c == *p) {
   return (void *)p;
}
return NULL;
}

Now the compiler should see the loop more clearly.


Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]: proc: export a processes resource limits via proc/

2007-08-13 Thread Ingo Oeser
Hi Neil,

> +static struct limit_names lnames[RLIM_NLIMITS] = {
static const ...

may be better here.

Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Driver-level memory management

2007-08-12 Thread Ingo Oeser
Hi Michael,

On Sunday 12 August 2007, Michael Bourgeous wrote:
> I'm working on a driver for older HDTV cards based on the TL880 chip.
> These cards typically have 16MB of their own memory, which is
> available to me over the PCI bus.  Various functions of the card
> require me to manage this memory, allocating and freeing chunks of it
> as necessary.  I can easily include my own allocation and management
> code, 

Ok.

> but I'm sure this is a problem that has been solved before. 

Yes!

in your Kconfig

select GENERIC_ALLOCATOR

in your driver.c

#include 

Code is in lib/genalloc.c, if you like to take a look.

Memory for MANAGING free/allocated space is NOT taken 
from your on-card memory! That allocator is explicitly developed
for such use cases.

Happy hacking!


Best regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 03/11] fuse: add reference counting to fuse_file

2007-08-04 Thread Ingo Oeser
Hi Miklos,

On Friday 03 August 2007, Miklos Szeredi wrote:
> From: Miklos Szeredi <[EMAIL PROTECTED]>
> 
> Make lifetime of 'struct fuse_file' independent from 'struct file' by
> adding a reference counter and destructor.

What about using krefs to implement that?

see include/linux/kref.h
and lib/kref.c

Just embed that "struct kref" inside your struct fuse_file

struct fuse_file {
...
struct kref ref;
...
}

init in struct fuse_file *fuse_file_alloc(void) where you added the counter.
...
kref_init(&ff->ref);
...

and implement the release function like:

static void fuse_file_release(struct kref *ff_ref)
{
struct fuse_file *ff = container_of(ff_ref, struct fuse_file, ref);
struct fuse_req *req = ff->reserved_req;
struct fuse_conn *fc = get_fuse_conn(req->dentry->d_inode);
request_send_background(fc, req);
kfree(ff);
}

This will also fix the missing smp_barriers, is very simple, saves code,
makes your life easier and is a well known known kernel infrastructure :-)

BTW: FUSE rocks! :-)

You can add my "Signed-off-by: Ingo Oeser <[EMAIL PROTECTED]>",
if you want to use that suggestion.

Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [EXT4 set 7][PATCH 1/1]Remove 32000 subdirs limit.

2007-07-22 Thread Ingo Oeser
On Tuesday 17 July 2007, Kalpak Shah wrote:
> Index: linux-2.6.22/include/linux/ext4_fs.h
> ===
> --- linux-2.6.22.orig/include/linux/ext4_fs.h
> +++ linux-2.6.22/include/linux/ext4_fs.h
> @@ -797,12 +797,18 @@ struct ext4_dir_entry_2 {
>#define is_dx(dir) (EXT4_HAS_COMPAT_FEATURE(dir->i_sb, \
>   EXT4_FEATURE_COMPAT_DIR_INDEX) 
> && \
>   (EXT4_I(dir)->i_flags & EXT4_INDEX_FL))
> -#define EXT4_DIR_LINK_MAX(dir) (!is_dx(dir) && (dir)->i_nlink >= 
> EXT4_LINK_MAX)
> -#define EXT4_DIR_LINK_EMPTY(dir) ((dir)->i_nlink == 2 || (dir)->i_nlink == 1)
> +static inline int ext4_dir_link_max(struct inode *dir)
> +{
> +   return (!is_dx(dir) && (dir)->i_nlink >= EXT4_LINK_MAX);
> +}
> +static inline int ext4_dir_link_empty(struct inode *dir)
> +{
> +   return ((dir)->i_nlink == 2 || (dir)->i_nlink == 1);
> +}

even better:

static inline bool ext4_is_dx(const struct inode *dir)
{
#ifdef FOOBAR
return EXT4_HAS_COMPAT_FEATURE(dir->i_sb,   
EXT4_FEATURE_COMPAT_DIR_INDEX) 
   && (EXT4_I(dir)->i_flags & EXT4_INDEX_FL));
#else
return false;
#endif
}

static inline bool ext4_dir_link_max(const struct inode *dir)
{
return !ext4_is_dx(dir) && (dir->i_nlink >= EXT4_LINK_MAX);
}

static inline bool ext4_dir_link_empty(const struct inode *dir)
{
#ifdef FOOBAR
return dir->i_nlink == 2 || dir->i_nlink == 1;
#else
return dir->i_nlink == 2;
#endif
}


FOOBAR is the define, which enables ext4_is_dx(). 
That is not in the patch, so left as an exercise to the reader :-)

Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] intel-rng: Undo mess made by an 80 column extremist

2007-06-08 Thread Ingo Oeser
On Friday 08 June 2007, John Stoffel wrote:
> Jeff> On Thu, Jun 07, 2007 at 09:56:06PM -0400, John Stoffel wrote:
> >> Thinking about it more, I wonder if Krysztof is bitching more about
> >> the tab width of 8 characters?   I know that it ticks me off,
> 
> Jeff> Even if he is, _that_ is definitely not getting changed.
> 
> Oh sure... I know that part is written in stone.  

Yes, and as a person doing Linux code review for 12 years now,
I'm really thankful for it. 8 char tab, 80 column rule and 25-50 lines
of code per function actually enable effective review of code snippets.
Because you can see more code flow per patch.

And enables high code reuse. If you can get within 1-5min, 
what a functions does and match it with your actually 
written down last 20 code lines, you just reuse it more often.

If you have more to choose from, you reuse naturally. 
Personally I find best candidates by code position in tree and 
function signature.

> Jeff> If code starts creeping way right due to indentation levels,
> Jeff> create a new function.
> 
> Sure... compilers are good, us humans haven't gotten much better, make
> it easier on us and harder on the computer.   

Yes, let compile remove all the abstraction overhead. 
GCC does a pretty good job there, I think.

I recently analyzed some code and it took much, much longer (factor 2-3), 
because of laxer coding rules similiar to the ones you suggest.

I even asked the developers, who wrote that code and to ones who work 
daily with that code base and they had the same problems. They all couldn't 
explain the "Why?" only the "How?". Not to mention, that this was a core 
component.

After refactoring some big messy parts into smaller functions,
identical, missing, unhandled cases became visible, inappropriate usages
were identified and even some loops could be removed.

Now try to find such problems within Linux. They should be a small percentage 
and not within core components.

So a big THANKS to all the code cops here: You actually make the 
damn fast change rate of Linux possible by keeping the base clean 
and neat.

Best Regards

Ingo oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 1/6] Storing ipcs into radix trees

2007-06-08 Thread Ingo Oeser
Hi,

On Friday 08 June 2007, Nadia Derbey wrote:
> Ingo Oeser wrote:
> > ... together with this means 4*256 -> 1k of precious stack space used.
> > Please consider either lowering IPCS_MAX_SCAN_ENTRIES or kmalloc() that.
> You're completely right, but trying to lower the extraction size, I'm 
> afraid this will have an impact on performances.
> 
> Here are the results of a small test I did: I have run ctxbench on both 
> the 256 and and 16 entries versions
> 
> 1) 256 entries:
> 42523679 itterations in 300.005423 seconds = 141743/sec
> 2) 16 entries:
> 41774255 itterations in 300.005334 seconds = 139245/sec

So that is around 1.8% in a benchmark. 

Not bad, if one considers, that this is an expensive syncronisation primitive 
anyway (and thus shouldn't dominate any real workload). At least _much_
better than possible stack underflow :-)

BTW: You forgot to include measurements with the unmodified code 
as it is in Linus' tree now. They woule be a nice data point here.

> Will try with a dynamic allocation.

But than you have an additional error path 
or have to sleep until memory becomes available.

Maybe try doubling IPCS_MAX_SCAN_ENTRIES - until the performance impact
is in the noise - is simpler. Up to 64 seems acceptable.

Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 1/6] Storing ipcs into radix trees

2007-06-07 Thread Ingo Oeser
Hi Nadia,

good to see someone is pounding this old beast again :-)

On Thursday 07 June 2007, [EMAIL PROTECTED] wrote:
> Index: linux-2.6.21/ipc/util.h
> ===
> --- linux-2.6.21.orig/ipc/util.h  2007-06-07 11:00:30.0 +0200
> +++ linux-2.6.21/ipc/util.h   2007-06-07 11:07:22.0 +0200
> @@ -13,6 +13,8 @@
>  #define USHRT_MAX 0x
>  #define SEQ_MULTIPLIER   (IPCMNI)
>  
> +#define IPCS_MAX_SCAN_ENTRIES 256

That ...

> Index: linux-2.6.21/ipc/util.c
> ===
> --- linux-2.6.21.orig/ipc/util.c  2007-06-07 11:00:30.0 +0200
> +++ linux-2.6.21/ipc/util.c   2007-06-07 11:29:43.0 +0200
> @@ -252,72 +241,94 @@ void __init ipc_init_proc_interface(cons
>   *   @key: The key to find
>   *   
>   *   Requires ipc_ids.mutex locked.
> - *   Returns the identifier if found or -1 if not.
> + *   Returns the LOCKED pointer to the ipc structure if found or NULL
> + *   if not.
> + *  If key is found ipc contains its ipc structure
>   */
>   
> -int ipc_findkey(struct ipc_ids* ids, key_t key)
> +struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)
>  {
> - int id;
> - struct kern_ipc_perm* p;
> - int max_id = ids->max_id;
> + struct kern_ipc_perm *ipc;
> + struct kern_ipc_perm *ipcs[IPCS_MAX_SCAN_ENTRIES];

... together with this means 4*256 -> 1k of precious stack space used.
Please consider either lowering IPCS_MAX_SCAN_ENTRIES or kmalloc() that.

Same problem with your third patch called 
"Changing the loops on a single ipcid into radix_tree_gang_lookup() calls"

If you cannot sleep, try to lower that constant (e.g. 16-32). 
The current users use much smaller numbers.

If you can sleep and performance goes down after lowering that constant,
try to kmalloc these arrays (since kmalloc() of that small amount 
should succeed easily).



Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bridge] [BUG] Dropping fragmented IP packets within VLAN frames on bridge

2007-05-26 Thread Ingo Oeser
On Saturday 26 May 2007, Patrick McHardy wrote:
> Adam Osuchowski wrote:
> > if (((skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu) ||
> > (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN)) &&
> > !skb_is_gso(skb))
> > return ip_fragment ...
> 
> 
> net/8021q ignores the VLAN header overhead, so we should probably do the
> same here for consistency. Using IS_VLAN_IP (and IS_PPPOE_IP for current
> -rc) looks fine, additionally we should probably also check for
> skb->nfct != NULL to make sure that at least without connection tracking
> the bridge doesn't perform fragmentation.

And could we separe the conditions for that into a static helper function
explaining each of these conditions? e.g. sth. like that:

static bool br_nf_need_fragment(struct sk_buff *skb)
{
/* Plain IP packet does not fit in MTU */
if (!(skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu))
return true;

/* VLAN encapsulated IP packet does not fit in MTU */
if (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN)
return true;

/* PPPoE encapsulated IP packet does not fit in MTU */
if (IS_PPPOE_IP(skb) && skb->len > skb->dev->mtu - PPPOE_SES_HLEN)
return true;

return false;
}

and then br_nf_dev_queue_xmit() becomes:

static int br_nf_dev_queue_xmit(struct sk_buff *skb)
{
if (br_nf_need_fragment(skb) &&  !skb_is_gso(skb))
return ip_fragment(skb, br_dev_queue_push_xmit);
else
return br_dev_queue_push_xmit(skb);
}

which is much more readable, more documented and doesn't contain a condition 
monster :-)

@Patrick: Could you check, wether the PPPoE case is correct?

What do you think? Should I submit a patch for that?


Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: epoll,threading

2007-05-26 Thread Ingo Oeser
Hi Arunachalam,

On Saturday 26 May 2007, Arunachalam wrote:
> I want to know in detail about , what the events (epoll or /dev/poll or
> select ) achieve in contrast to  thread per client.
> 
> i can have a thread per client and use send and recv system call directly
> right? Why do i go for these event mechanisms?

Try 30.000 clients or more on a x86 32bit box. 
That will show you the difference quite nicely :-)

More seriously: Thread per client scales only to a certain amount of clients
per RAM. If you like to scale beyond that to like to minimize your 
state 
per client. If you have a thread then you have a task structure as 
unswappable memory in kernel, a per-thread stack, which is reducing 
your virtual memory per process (you have only around 3GB of virtual 
memory per process in Linux x86 32bit).

So one uses a process or thread pool to scale beyond that. 
Pool size is typically related to the amount of CPU cores in the system.

Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Introduce boot based time

2007-05-10 Thread Ingo Oeser
Hi John and Tomas, 

On Thursday 10 May 2007, john stultz wrote:
> I'm not sure I follow this. 
> 
> total_sleep_time stores seconds. So on 32bit systems that's 130some
> years, so it shouldn't be an issue.
> 
> Is the reason you want it to be a ktime is because you want a way to
> keep sub-second sleep granularity?

No, I'm just overworked and getting sloppy :-/ 

Sorry for the noise...

Best regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Introduce boot based time

2007-05-10 Thread Ingo Oeser
Hi Tomas,

On Thursday 10 May 2007, Tomas Janousek wrote:
> diff --git a/include/linux/time.h b/include/linux/time.h
> index 8997b61..06f3eaf 100644
> --- a/include/linux/time.h
> +++ b/include/linux/time.h
> @@ -116,6 +116,8 @@ extern int do_setitimer(int which, struct itimerval 
> *value,
>  extern unsigned int alarm_setitimer(unsigned int seconds);
>  extern int do_getitimer(int which, struct itimerval *value);
>  extern void getnstimeofday(struct timespec *tv);
> +extern void getboottime(struct timespec *ts);
> +extern void monotonic_to_bootbased(struct timespec *ts);
>  
>  extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
>  extern int timekeeping_is_continuous(void);
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index f9217bf..dd9647a 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -36,9 +36,17 @@ EXPORT_SYMBOL(xtime_lock);
>   * at zero at system boot time, so wall_to_monotonic will be negative,
>   * however, we will ALWAYS keep the tv_nsec part positive so we can use
>   * the usual normalization.
> + *
> + * wall_to_monotonic is moved after resume from suspend for the monotonic
> + * time not to jump. We need to add total_sleep_time to wall_to_monotonic
> + * to get the real boot based time offset.
> + *
> + * - wall_to_monotonic is no longer the boot time, getboottime must be
> + * used instead.
>   */
>  struct timespec xtime __attribute__ ((aligned (16)));
>  struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
> +static unsigned long total_sleep_time;

Could you make that a ktime_t (or struct ktime)?
There are machines, which sleep more than they are awake. 
Just imagine a surveillance camera triggered by door entrance.

Yes, these things might run Linux (e.g. on "cris" architecture).

Or your VCR.
Yes, these devices might sleep more than they are awake, 
if you are not a TV junkie :-)


Best regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] LogFS proper

2007-05-08 Thread Ingo Oeser
On Tuesday 08 May 2007, Thomas Gleixner wrote:
> On Tue, 2007-05-08 at 00:00 +0200, Jörn Engel wrote:
> > +#define packed __attribute__((__packed__))
> 
> Please use the __attribute__((__packed__)) on your structs instead of
> creating some extra "needs lookup" magic.

Don't worry, we have __packed predefined for this. 
Just look in include/linux/compiler-gcc.h

I love it, because I always forget at least one brace or undescore level :-)




Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-01 Thread Ingo Oeser
On Tuesday 01 May 2007, Dave Airlie wrote:
> >   - what's with the /proc interface?  Don't add new proc code for
> > non-process related things.  This should all go into sysfs
> > somewhere.  And yes, I know /proc/dri/ is there today, but don't add
> > new stuff please.
> 
> Well we should move all that stuff to sysfs, but we have all the
> infrastructure for publishing this stuff under /proc/dri and adding
> new files doesn't take a major amount, as much as I appreciate sysfs,
> it isn't suitable for this sort of information dump, the whole one
> value per file is quite useless to provide this sort of information
> which is uni-directional for users to send to us for debugging without
> have to install some special tool to join all the values into one
> place.. and I don't think drmfs is the answer either... or maybe it
> is

Ok, what about debugfs then? If it is just for debugging blobs -> debugfs,
if it is crucial for operation -> sysfs and representation of one value 
per file.


Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] crypto: Use padlock.ko only as a module

2007-04-29 Thread Ingo Oeser
Hi Scott,

On Sunday 29 April 2007, Simon Arlott wrote:
> Ideally I'd just remove that module completely, all it does is 
> trigger the loading of the other two modules when modules are 
> used - so I'll submit a patch for that instead.

That's much better! 

When you force a feature to be a module on a kernel without 
module support, it will effectivly be disabled.

And if it is so simple to do the same in userspace like you suggest,
than that's much better.

Best Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: init's children list is long and slows reaping children.

2007-04-10 Thread Ingo Oeser
On Tuesday 10 April 2007, Jeff Garzik wrote:
> Thus, rather than forcing authors to make their code more complex, we 
> should find another solution.

What about sth. like the "pre-forking" concept? So just have a thread creator 
thread,
which checks the amount of unused threads and keeps them within certain limits.

So that anything which needs a thread now simply queues up the work and
specifies, that it wants a new thread, if possible.

One problem seems to be, that a thread is nothing else but a statement
on what other tasks I can wait before doing my current one (e.g. I don't want 
to 
mlseep() twice on the same reset timeout). 
But we usually use locking to order that.

Do I miss anything fundamental here?

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: init's children list is long and slows reaping children.

2007-04-10 Thread Ingo Oeser
On Tuesday 10 April 2007, Jeff Garzik wrote:
> That's why I feel thread creation -- cheap under Linux -- is quite 
> appropriate for many of these situations.

Maybe that (thread creation) can be done at open(), socket-creation, 
service request, syscall or whatever event triggers a driver/subsystem 
to actually queue work into a thread.

And when there is a close(), socket-destruction, service completion
or whatever these threads can be marked for destruction and destroyed
by a timer or even immediately.

Regards

Ingo Oeser

-- 
If something is getting cheap, it is getting wasted just because it is cheap.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: getting processor numbers

2007-04-03 Thread Ingo Oeser
Hi Ulrich,

On Tuesday 03 April 2007, Ulrich Drepper wrote:
> So, anybody else has a proposal?  This is a pressing issue and cannot
> wait until someday in the distant future NUMA topology information is
> easily and speedily accessible.

Since for now you just need a fast and dirty hack, which will be replaced 
with better interfaces, I suggest creating a directory with some files in it.
These should just contain, what you need to handle your most pressing cases.

I propose /sys/devices/system/topology_counters/ for that.
These can contain "online_cpu", "proped_cpu", "max_cpu"
and maybe the same for nodes. All that as a simple file with an integer
value.

Since sysfs-attribute files are pollable (if the owners notifies sysfs 
on changes), you also have the notification system you need 
(select, poll, epoll etc.).

If you promise to just keep the slow code around, than one day when the shiny 
NUMA topology stuff is ready, this directory can be completely removed and
glibc (plus all their users) keeps working. It will then even work better with 
a 
new glibc version, which supports the shiny new NUMA topology stuff.

The kernel can create these counters quiete easy, since most of them are 
the hamming weight (or population count) of some bitmaps.

Does this sound like a proper hacky solution? :-)

Regards

Ingo Oeser


pgpeUyaLE4v0G.pgp
Description: PGP signature


Re: [PATCH] sysctl: vfs_cache_divisor

2007-03-20 Thread Ingo Oeser
Hi Randy,

On Monday 19 March 2007, Randy Dunlap wrote:
> Were there any patches written after this?  If so, I missed them.
> If not, does this patch help any?

How is division by zero avoided? Maybe one can avoid setting it to zero.

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 0/2] semaphores: add down_interruptible_timeout() and asm-generic/semaphore.h

2007-02-27 Thread Ingo Oeser
Hi Inaky,

On Tuesday, 27. February 2007, Inaky Perez-Gonzalez wrote:
> On Monday 26 February 2007 18:18, Alan wrote:
> > > Yeah, I need semaphore. This is a hw register that says when the hw
> > > is ready to accept a new command. Code that wants to send commands has
> > > to down the semaphore and then send it. When hw is ready to get a new
> > > command, it sends and IRQ and the IRQ up()s the semaphore.
> >
> > So you need a mutex not a semaphore
> 
> Theoretically I could use a mutex. Practically it would trigger ugly
> complications. Only the owner can unlock a mutex (for example), so
> I could not unlock from an IRQ handler -- not to mention that the
> semantic rules outlined in Documentation/mutex-design.txt explicitly
> forbid IRQ usage. 
> 
> And then, this is what semaphores where designed for, as gates :)
> for once that I get to use a semaphore properly...

But they are not required for that :-)

I would suggest to use an irq-safe spinlock for the hardware access
and a status indicator (ready for command), if this is really just a command 
register. 
If the status indicator is updated (in IRQ) and read under spinlock, that is 
safe.

If that command sending is speed critical, please try a FIFO and batch that 
stuff.

Timeout based locking mechanisms are flawed, because they introduce the hard to 
find
timing sensitive bugs.

Please try sth. different (e.g. like suggested above).

Semaphores aren't good "busy/ready flags", as you might have already noticed.

Many Thanks and Best Regards

Ingo Oeser, the down{_interruptible,}_timeout() implementation of Linux :-)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/1] MM: detach_vmas_to_be_unmapped fix

2007-02-21 Thread Ingo Oeser
Hi,

On Wednesday, 21. February 2007, [EMAIL PROTECTED] wrote:
> 
> ---
> 
>  mm/mmap.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff -puN mm/mmap.c~Avoiding-mmap-fragmentation_fixup mm/mmap.c
> --- linux-2.6_clean/mm/mmap.c~Avoiding-mmap-fragmentation_fixup   
> 2007-02-21 09:49:32.0 -0800
> +++ linux-2.6_clean-akuster/mm/mmap.c 2007-02-21 09:51:26.0 -0800
> @@ -1720,9 +1720,9 @@ detach_vmas_to_be_unmapped(struct mm_str
>   *insertion_point = vma;
>   tail_vma->vm_next = NULL;
>   if (mm->unmap_area == arch_unmap_area)
> - addr = prev ? prev->vm_end : mm->mmap_base;
> + addr = prev ? prev->vm_start : mm->mmap_base;
>   else
> - addr = vma ?  vma->vm_start : mm->mmap_base;
> + addr = vma ?  vma->vm_end : mm->mmap_base;
>   mm->unmap_area(mm, addr);
>   mm->mmap_cache = NULL;  /* Kill the cache. */
>  }

Please comment, why you think this is necessary.


Thanks & Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] MTD: fix DOC2000/2001/2001PLUS build error

2007-02-05 Thread Ingo Oeser
On Monday, 5. February 2007, Linus Torvalds wrote:
> So thank God for the few selects we have, and we should add a whole lot 
> more!

But "select" is not fine grained enough.

I would like to have "require", "recommend", "suggest" for feature A.

require X
does not work without X, but X is way down the tree 
e.g. ext3 and block device or how select currently is intended

recommend X
it is usable but uncomfortable without X, enabled per default
e.g. firewalling recommends connection tracking support 
or NAT recommends all NAT helpers

suggest X
many people use A together with X, 
so you might be interested in enabling it, but I disabled it
per default unless you said "featuritis mode" before.
e.g. highmem and SMP or a network driver and NAPI.

That is what the Debian/Ubuntu package management does and maybe other too.
And this also gives us new keywords to replace select with, 
so migration is doable :-)

This would also make "EMBEDDED" superflous, because it would just mean 
"disable anything not required". 

And this would enable an individual tree for the users current configuration 
problem instead of a global one.

Regards

Ingo "and tomorrow we change the world" Oeser :-)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm 3/10][RFC] aio: use iov_length instead of ki_left

2007-01-16 Thread Ingo Oeser
On Tuesday, 16. January 2007 06:37, Nate Diller wrote:
> On 1/15/07, Christoph Hellwig <[EMAIL PROTECTED]> wrote:
> > On Mon, Jan 15, 2007 at 05:54:50PM -0800, Nate Diller wrote:
> > > Convert code using iocb->ki_left to use the more generic iov_length() 
> > > call.
> >
> > No way.  We need to reduce the numer of iovec traversals, not adding
> > more of them.
> 
> ok, I can work on a version of this that uses struct iodesc.  Maybe
> something like this?
> 
> struct iodesc {
> struct iovec *iov;
> unsigned long nr_segs;
> size_t nbytes;
> };
> 
> I suppose it's worth doing the iodesc thing along with this patchset
> anyway, since it'll avoid an extra round of interface churn.

What about this instead

struct iodesc {
struct iovec *iov;
unsigned long nr_segs;
unsigned long seg_limit;
size_t nr_bytes;
};

That will enable resizeable iodescs with partial completion state and
will enable successive filling of an iodesc with iovs.

This will be needed anyway. I built an complete short userspace 
module for that already. I can post and GPLv2 it somewhere, if people
are interested.

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 31/59] sysctl: C99 convert the ctl_tables in arch/mips/au1000/common/power.c

2007-01-16 Thread Ingo Oeser
Hi Eric,

On Tuesday, 16. January 2007 17:39, Eric W. Biederman wrote:
> diff --git a/arch/mips/au1000/common/power.c b/arch/mips/au1000/common/power.c
> index b531ab7..31256b8 100644
> --- a/arch/mips/au1000/common/power.c
> +++ b/arch/mips/au1000/common/power.c
> @@ -419,15 +419,41 @@ static int pm_do_freq(ctl_table * ctl, int write, 
> struct file *file,

> + {
> + .ctl_name   = CTL_UNNUMBERED,
> + .procname   = "suspend",
> + .data   = NULL,
> + .maxlen = 0,
> + .mode   = 0600,
> + .proc_handler   = &pm_do_suspend
> + },

No need for zero initialization for maxlen.

> + {
> + .ctl_name   = CTL_UNNUMBERED,
> + .procname   = "sleep",
> + .data   = NULL,
> + .maxlen = 0,
> + .mode   = 0600,
> + .proc_handler   = &pm_do_sleep
> + },

dito

> + {
> + .ctl_name   = CTL_UNNUMBERED,
> + .procname   = "freq",
> + .data   = NULL,
> + .maxlen = 0,
> +     .mode   = 0600,
> + .proc_handler   = &pm_do_freq
> + },
> + {}
>  };

dito

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 45/59] sysctl: C99 convert ctl_tables in drivers/parport/procfs.c

2007-01-16 Thread Ingo Oeser
Hi Eric,

On Tuesday, 16. January 2007 17:39, Eric W. Biederman wrote:
> diff --git a/drivers/parport/procfs.c b/drivers/parport/procfs.c
> index 2e744a2..5337789 100644
> --- a/drivers/parport/procfs.c
> +++ b/drivers/parport/procfs.c
> @@ -263,50 +263,118 @@ struct parport_sysctl_table {
> + {
> + .ctl_name   = DEV_PARPORT_BASE_ADDR,
> + .procname   = "base-addr",
> + .data   = NULL,
> + .maxlen = 0,
> + .mode   = 0444,
> + .proc_handler   = &do_hardware_base_addr
> + },

No need to initialize to zero or NULL. Just list any variable, which is NOT 
zero or NULL.

> + {
> + .ctl_name   = DEV_PARPORT_AUTOPROBE + 1,
> + .procname   = "autoprobe0",
> + .data   = NULL,
> + .maxlen = 0,
> + .maxlen = 0444,
> + .proc_handler   =  &do_autoprobe
> + },

Typo here? .mode = 0444 makes mor sense.

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.

2007-01-01 Thread Ingo Oeser
On Monday, 1. January 2007 17:25, Andreas Schwab wrote:
> Ingo Oeser <[EMAIL PROTECTED]> writes:
> > Then this works, because the side effect (+20) is evaluated only once. 
> 
> It's not a side effect, it's a non-lvalue, and you can't take the address
> of a non-lvalue.

Just verified this. So If we cannot make it work in all cases, it will
cause more problems then it will solve.

So we are left with a function, which will 
a) only be used by janitors to provide "kfree(x); x = NULL;" 
with an macro KFREE(x) in all the simple cases.

b) be used by developers, who are aware of the fact that reusable
pointer values should set to NULL after kfree().

Doing a) and b) is "running into open doors", so doesn't prevent any
error, obfuscates code more and works only sometimes.

I give up here and would vote for dropping that idea then.


Regards

Ingo Oeser


pgpcCSfafJsC7.pgp
Description: PGP signature


Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.

2007-01-01 Thread Ingo Oeser
Hi,

On Monday, 1. January 2007 07:37, Amit Choudhary wrote:
> --- Ingo Oeser <[EMAIL PROTECTED]> wrote:
> > #define kfree_nullify(x) do { \
> > if (__builtin_constant_p(x)) { \
> > kfree(x); \
> > } else { \
> > typeof(x) *__addr_x = &x; \

Ok, I should change that line to 
typeof(x) *__addr_x = &(x); \

> > kfree(*__addr_x); \
> > *__addr_x = NULL; \
> > } \
> > } while (0)
> > 
> > Regards
> > 
> > Ingo Oeser
> > 
> 
> This is a nice approach but what if someone does kfree_nullify(x+20).

Then this works, because the side effect (+20) is evaluated only once. 
AFAIK __builtin_constant_p() and typeof() are both free of side effects.

 
> I decided to keep it simple. If someone is calling kfree_nullify() with 
> anything other than a
> simple variable, then they should call kfree().

kfree_nullify() has to replace kfree() to be of any use one day. So this is not 
an option.

Anybody thinking of "Hey, this must be NULL afterwards!", will set it to NULL 
himself.
Anybody else doesn't know or care about it, which is the case we like to catch.

> But definitely an approach that takes care of all 
> situations is the best but I cannot think of a macro that can handle all 
> situations. The simple
> macro that I sent earlier will catch all the other usage at compile time. 

The problems I see are:
1. parameter to kfree is a value not a pointer 
-> solved by using a macro instead of function, 
 but generate new (the other) problems
-> take the address of the value there.
2. possible side effects of macro parameter usage 
   -> solved by assigning once only and using typeof
3. Constants don't have an address 
   -> need to check for constant

So apart from missing braces before taking the address, I don't see
any problem with my solution :-)

Should I send a patch?

> Please let me know if I have missed something.

I reviewed it and you missed side effects (kfree(x); x = NULL).

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.

2006-12-31 Thread Ingo Oeser
On Sunday, 31. December 2006 14:38, Bernd Petrovitsch wrote:
> That depends on the decision/definition if (so called) "double free" is
> an error or not (and "free(NULL)" must work in POSIX-compliant
> environments).

A double free of non-NULL is certainly an error.
So the idea of setting it to NULL is ok, since then you can
kfree the variable over and over again without any harm.

It is just complicated to do this side effect free.

Maybe one should check for builtin-constant and take the address,
if this is not an builtin-constant.

sth, like this

#define kfree_nullify(x) do { \
if (__builtin_constant_p(x)) { \
kfree(x); \
} else { \
typeof(x) *__addr_x = &x; \
kfree(*__addr_x); \
*__addr_x = NULL; \
    } \
} while (0)

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/8] KVM: Implement a few system configuration msrs

2006-12-31 Thread Ingo Oeser
Hi,

On Thursday, 28. December 2006 11:11, Avi Kivity wrote:
> Index: linux-2.6/drivers/kvm/svm.c
> ===
> --- linux-2.6.orig/drivers/kvm/svm.c
> +++ linux-2.6/drivers/kvm/svm.c
> @@ -1068,6 +1068,9 @@ static int emulate_on_interception(struc
>  static int svm_get_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 *data)
>  {
>   switch (ecx) {
> + case 0xc0010010: /* SYSCFG */
> + case 0xc0010015: /* HWCR */
> + case MSR_IA32_PLATFORM_ID:
>   case MSR_IA32_P5_MC_ADDR:
>   case MSR_IA32_P5_MC_TYPE:
>   case MSR_IA32_MC0_CTL:

What about just defining constants for these?
Then you can rip out these comments.

Same for linux-2.6/drivers/kvm/vmx.c


Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 009 of 14] knfsd: SUNRPC: teach svc_sendto() to deal with IPv6 addresses

2006-12-17 Thread Ingo Oeser
On Wednesday, 13. December 2006 00:59, NeilBrown wrote:
> diff .prev/net/sunrpc/svcsock.c ./net/sunrpc/svcsock.c
> --- .prev/net/sunrpc/svcsock.c2006-12-13 10:31:39.0 +1100
> +++ ./net/sunrpc/svcsock.c2006-12-13 10:32:15.0 +1100
> @@ -438,6 +439,47 @@ svc_wake_up(struct svc_serv *serv)
>   }
>  }
>  
> +union svc_pktinfo_u {
> + struct in_pktinfo pkti;
> +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
> + struct in6_pktinfo pkti6;
> +#endif
> +};
> +
> +static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh)
> +{
> + switch (rqstp->rq_sock->sk_sk->sk_family) {
> + case AF_INET:
> + do {
> + struct in_pktinfo *pki =
> + (struct in_pktinfo *) CMSG_DATA(cmh);

struct in_pktinfo *pki = CMSG_DATA(cmh);

Ugly casting not needed here, since CMSG_DATA should return "void *",
which can be casted to any pointer.

> +
> + cmh->cmsg_level = SOL_IP;
> + cmh->cmsg_type = IP_PKTINFO;
> + pki->ipi_ifindex = 0;
> + pki->ipi_spec_dst.s_addr = rqstp->rq_daddr.addr.s_addr;
> + cmh->cmsg_len = CMSG_LEN(sizeof(*pki));
> + } while (0);
> + break;
> +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
> + case AF_INET6:
> + do {
> + struct in6_pktinfo *pki =
> + (struct in6_pktinfo *) CMSG_DATA(cmh);
> +

No casting needed, so:

struct in6_pktinfo *pki = CMSG_DATA(cmh);


Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2.6.19] e1000: replace kmalloc with kzalloc

2006-12-17 Thread Ingo Oeser
On Tuesday, 12. December 2006 18:34, Pekka Enberg wrote:
> On 12/12/06, Yan Burman <[EMAIL PROTECTED]> wrote:
> > size = txdr->count * sizeof(struct e1000_buffer);
> > -   if (!(txdr->buffer_info = kmalloc(size, GFP_KERNEL))) {
> > +   if (!(txdr->buffer_info = kzalloc(size, GFP_KERNEL))) {
> > ret_val = 1;
> > goto err_nomem;
> > }
> > -   memset(txdr->buffer_info, 0, size);
> 
> No one seems to be using size elsewhere so why not convert to
> kcalloc() and get rid of it? (Seems to apply to other places as well.)

Because if done properly that often exceeds the 80 column limit.
The intermediate variable should be optimized away from the compiler.

But kcalloc() is better for another reason: Overflow checking.

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


EEPROM infrastructure (was: [PATCH] eeprom_93cx6: Add write support)

2006-12-15 Thread Ingo Oeser
Lennart Sorensen schrieb:
> On Wed, Dec 13, 2006 at 07:56:50PM +0100, Ivo van Doorn wrote:
> > This patch addes support for writing to the eeprom,
> > this also moves some duplicate code into seperate functions.
> > 
> > Signed-off-by Ivo van Doorn <[EMAIL PROTECTED]>
> 
> Thank you.  I will have a try with that to see if I can get that to work
> with the jsm driver.  Too bad the serial drivers don't have any
> geteeprom/seteeprom standard ioctl's the way ethtool does for network
> devices.

It might be even better to have eeprom writing infrastructure.

Many device types come with eeproms today and they implement
it per driver or subsystem. On embedded platforms these EEPROMs
might even be shared among different devices.

So it might be time to generalize this like we did with LEDs.

Any comments?

Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] Introduce mutex_lock_timeout

2006-11-26 Thread Ingo Oeser
Hi Matthew,

On Saturday, 25. November 2006 17:32, Matthew Wilcox wrote:
> In the qla case, the mutex can be acquired by a thread which then waits
> for the hardware to do something.  If the hardware locks up, it is
> preferable that the system not hang.

Ok, I looked at it (drivers/scsi/qla2xxx/qla_mbx.c) 
and the solution seems simple:
- Introduce an busy flag, check that BEFORE this mutex_lock()
  and don't protect it by that mutex.
- return -EBUSY to the  upper layers, if mailbox still busy
- upper layers can either queue the command or use a retry mechanism

There are many examples for this in the kernel. NICs have the same problems
(transmitter busy or stuck) and have no problem handling that gracefully
since ages.

> I assumed that he'd spent enough time thinking about it that fixing it
> really wasn't feasible.

That doesn't depend on time, just whether you get the right idea or not.

Anyway I CCed the current maintainers.

So my point still stands: Timeout based locking is evil and hides bugs.

In this case the bugs are: 
1. That mutex protects a code path (mailbox command submission 
and retrieve) instead of data.
2. "Mailbox is free" is an event, so you should use wait_event_timout() 
for that


Regards

Ingo Oeser
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/1] ipw2100: remove by-hand function entry/exit debugging

2005-09-07 Thread Ingo Oeser
Hi Jeff,

Jeff Garzik wrote:
> David S. Miller wrote:
> > From: Jeff Garzik <[EMAIL PROTECTED]>
> > Date: Tue, 06 Sep 2005 21:51:21 -0400
> >
> >>NAK.  Rationale: maintainer's choice.  Pavel doesn't get to choose
> >>the debugger of choice for the driver maintainer.
> >
> > If it makes the driver unreadable and thus harder to maintain,
> > I think such changes should seriously be considered.
> >
> > Most of the DEBUG_INFO macro usage is fine, but those "enter"
> > and "exit" ones are just pure noise and should be removed.
>
> I find them useful in my own drivers; they are definitely not pure noise.

gcc -finstrument-functions

can do that completely without adding noise to the sources.

been there, done that. With a gcc-patch you don't even need to 
resolve symbols.


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][CFLART] ipmi procfs bogosity

2005-09-05 Thread Ingo Oeser
Hi,

On Thursday 01 September 2005 22:30, Alexey Dobriyan wrote:
> On Thu, Sep 01, 2005 at 03:00:44PM -0500, Corey Minyard wrote:
> > Plus the scanning function I wrote handles arbitrary leading and 
> > trailing space, etc.  Not a big deal, but a little nicer.
> 
> You can say from the beggining that
> 
>   echo -n "2   " >/proc/FUBAR
>   
> is illegal and don't add bloat to kernel.

No, user interfaces should be robust.
Just remember the mantra "Be liberal in what you accept and
conservative in what you send."

I would suggest adding sth. like Coreys user_strtoul() to lib/string.c
which would reduce bloat and security threats for the kernel.


Regards

Ingo Oeser



pgpZueJv3kxez.pgp
Description: PGP signature


Re: [PATCH] New: Omnikey CardMan 4040 PCMCIA Driver

2005-09-04 Thread Ingo Oeser
On Sunday 04 September 2005 12:12, Harald Welte wrote:
> cmx_llseek

just use

return nonseekable_open(inode, filp);

as your last statement in cmx_open() instead of

return 0;

to really disable any file pointer positioning (e.g. pwrite/pread too).

Addtionally cmx_llseek() is implement already as "no_llseek()"
by the VFS, so you delete it from the driver an use no_llseek() from
the VFS instead.


Regards

Ingo Oeser



pgpfDmvKLYTKl.pgp
Description: PGP signature


Re: [2.6 patch] lib/sort.c: small cleanups

2005-09-04 Thread Ingo Oeser
On Saturday 03 September 2005 15:25, Adrian Bunk wrote:
> This patch contains the following small cleanups:
> - make two needlessly global functions static
> - every file should #include the header files containing the prototypes 
>   of it's global functions

While this is a nice cleanup, does anybody remember,
why the inner loops are duplicated in the source?

If there are no arguments for it, I would like to consolidate them
to a function or a define, if they share to much state.

Or is the duplicate just considered cleaner?


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] acpi: Handle cpu_index greater than 256 properly in processor_core.c

2005-08-27 Thread Ingo Oeser
Hi Venkatesh,

On Saturday 27 August 2005 02:07, Venkatesh Pallipadi wrote:
> Fix convert_acpiid_to_cpu function to handle cpu_index greater than 256. This 
> patch also prevents a warning in IA64 cross-compile of this file 
> (drivers/acpi/processor_core.c:517: warning: comparison is always false due 
> to limited range of data type).

Why don't you just change the datatype to "unsigned int" and 
the return failure value to NR_CPUS?

That reduces the code changes and leaves the code quite clear.
It should also reduce compiled code size by some bytes, but I'm not
sure about that one.


Regards

Ingo Oeser



pgpjbXsBpz0gN.pgp
Description: PGP signature


Re: [PATCH] Use sg_init_one where appropriate

2005-08-27 Thread Ingo Oeser
Hi David,

I appreciate your work on unifying common code, but have some comments.

On Saturday 27 August 2005 02:33, David Härdeman wrote:
> The same code as in sg_init_one can be found in a number of places, this 
> patch changes them to call the function instead.
> Index: linux-sginitone/include/linux/scatterlist.h
> ===
> --- linux-sginitone.orig/include/linux/scatterlist.h  2005-03-02 
> 08:38:32.0 +0100
> +++ linux-sginitone/include/linux/scatterlist.h   2005-08-27 
> 00:20:53.0 +0200
> @@ -1,8 +1,9 @@
>  #ifndef _LINUX_SCATTERLIST_H
>  #define _LINUX_SCATTERLIST_H
>  
> -static inline void sg_init_one(struct scatterlist *sg,
> -u8 *buf, unsigned int buflen)
> +static inline void sg_init_one(const struct scatterlist *sg,
> +const u8 *buf,
> +const unsigned int buflen)
>  {
>   memset(sg, 0, sizeof(*sg));
>  

In short: please remove all "const" markers from the function, 
try to uninline it somewhere and resend.

Explanation:

If this compiles without any warning, then your compiler is clearly broken.

You promise to not modify the memory pointed to by "sg" and set it
to zero then?

You also assign buflen to a variable, which voids the "const" attribute
anyway. For "buf" this is also wrong. The memory pointed to it
will be assigned to a variable whose modification you cannot control.

And while you are at it, please check, wether this can be uninlined,
since it does a lot of things and is called from quite some sites
then.


Regards

Ingo Oeser




pgpmUfxLaepzk.pgp
Description: PGP signature


Re: Multiple virtual address mapping for the same code on IA-64 linux kernel.

2005-08-19 Thread Ingo Oeser
Hi,

On Friday 19 August 2005 00:18, George Anzinger wrote:
> Not to say that is wrong but just to make it clear that saying the 
> itanium speed is  is like saying that a cummings diesel is fast with 
> out saying what sort of car/truck it is mounted in.

Yes, esp. since we all known that the fastest diesel is actually
Vin Diesel :-)


Have Fun!

Ingo Oeser



pgpCokNmuRJzI.pgp
Description: PGP signature


Re: Environment variables inside the kernel?

2005-08-18 Thread Ingo Oeser
Hi Guillermo,

On Thursday 18 August 2005 17:44, Guillermo López Alejos wrote:
> I have a piece of code which uses environment variables. I have been
> told that it is not going to work in kernel space because the concept
> of environment is not applicable inside the kernel.
> 
> I belive that, but I need to demonstrate it. I do not know how to
> proof this, perhaps referring to a solid reference about Linux design
> that points to the idea that it has no sense to use environment
> variables in kernel space.

The Linux kernel is technically one big process with lots of threads.

An environment variable is per process and is usally to be threated
read only within it.

Also the Linux kernel is the first "process" ever. 
Who should set up it's environment variables?

That's why there are none.

These arguments are no real proof in a mathematical sense,
but should help you argumenting.


Regards

Ingo Oeser



pgpQQpmbZAcFx.pgp
Description: PGP signature


Re: [PATCH] Fix mmap_kmem (was: [question] What's the difference between /dev/kmem and /dev/mem)

2005-08-13 Thread Ingo Oeser
Hi Andi,

On Friday 12 August 2005 18:54, Andi Kleen wrote:
> Acessing vmalloc in /dev/mem would be pretty awkward. Yes it doesn't
> also work in mmap of /dev/kmem, but at least in read/write.
> There are quite a lot of scripts that use it for kernel debugging
> like dumping variables. And for that you really want to access modules
> and vmalloc. And it's much easier to parse than /proc/kcore

Perfect! So it should be under CONFIG_DEBUG_KERNEL and default to off.

So you can still debug and we raise the bar higher for rootkits, 
if they are the only other user.

Too simple?


Regards

Ingo Oeser



pgpciqwUeESwg.pgp
Description: PGP signature


Re: [PATCH] ARCH_HAS_IRQ_PER_CPU avoids dead code in __do_IRQ()

2005-08-07 Thread Ingo Oeser
Hi Karsten,

On Sunday 07 August 2005 12:25, Karsten Wiese wrote:
> With my proposal the
>   #if defined(ARCH_HAS_IRQ_PER_CPU)
>   
>   #endif
> lets readers of __do_IRQ() immediately grasp:
>  "this block might not be compiled / depends an ARCH"
> And you'll get compile error's using IRQ_PER_CPU on ie i386,
> letting you immediately know,
> that you've got to change something to be able to use IRQ_PER_CPU.
> 
> That are advantages I think.

That's a valid argument. But an if is an if for the reader.
It is a conditional he has to be aware of and it usally has
no idention, if it is just inside "#if" instead of "if ()".

I have seen people seen missing "#if 0" [1] around code while 
reading it. Missing an normal if () is harder with proper idention.

A normal conditional has also the advantage, that the compiler
checks the code for syntactic and some semantic errors within it.

In an "#if 0" you can basically write any plain text[2] and any error
will go undetected, until it becomes an "#if 1".

Since your define is true for most compilations out there,
this argument is not very strong.

Last argument: Many kernel developers -- including Linus -- 
don't like "#if" in C files and prefer them in headers. 
Their reasons might be similiar to my own.


Regards

Ingo Oeser

[1] Let's just consider the values of the pre-processor symbols here, ok?
[2] Pavel Machek used this already to combine Makefile and C file :-)



pgpS2eoyKJboQ.pgp
Description: PGP signature


Re: [PATCH] ARCH_HAS_IRQ_PER_CPU avoids dead code in __do_IRQ()

2005-08-06 Thread Ingo Oeser
Hi Karsten,

On Saturday 06 August 2005 18:14, Karsten Wiese wrote:
> From: Karsten Wiese <[EMAIL PROTECTED]>
> 
> IRQ_PER_CPU is not used by all architectures.
> To avoid dead code generation in __do_IRQ()
> this patch introduces the macro ARCH_HAS_IRQ_PER_CPU.
> 
> ARCH_HAS_IRQ_PER_CPU is defined by architectures using
> IRQ_PER_CPU in their
>   include/asm_ARCH/irq.h
> file.

Why not the other way around?

Just define IRQ_PER_CPU to 0 on architectures not needing it and
add a FAT comment there, that this disables it. Or make it a config option.

Then just leave the code as is and let GCC optimize the dead code
away without any changes in the C file. It works, I just checked it ;-)


Regards

Ingo Oeser



pgpjkl6LFAYYy.pgp
Description: PGP signature


Re: [patch 3/5] Driver core: Documentation: use snprintf and strnlen

2005-07-31 Thread Ingo Oeser
Hi Domen,

On Sunday 31 July 2005 13:12, [EMAIL PROTECTED] wrote:
> From: Jan Veldeman <[EMAIL PROTECTED]>
> Documentation should give the good example of using snprintf and
> strnlen in stead of sprintf and strlen.
> 
> PAGE_SIZE is used as the maximal length to reflect the behaviour of
> show/store.

The whole part of the Documentation is obsoleted by the fact,
that struct device has no structure member called "name".

People hacking sysfs should also try to hack the docu to match or
at least remove the obsolete parts of it.

So you can drop this patch altogether, I think.


Regards

Ingo Oeser




pgpyZkeZGy25T.pgp
Description: PGP signature


Re: Average instruction length in x86-built kernel?

2005-07-30 Thread Ingo Oeser
Hi Karim,

On Friday 29 July 2005 23:32, Karim Yaghmour wrote:
> Googling around, I can find references claiming that the average
> instruction length on x86 is anywhere from 2.7 to 3.5 bytes, but I
> can't find anything studying Linux specifically.

This is not that hard to find out yourself:

Just study the output od objdump -d and average the differences
of the first hex number in a line printed, which are followed by a ":"

e.g.


scripts/kconfig/mconf.o: file format elf32-i386

Disassembly of section .text:

 :
   0:   83 ec 1csub$0x1c,%esp
   3:   8d 44 24 10 lea0x10(%esp),%eax
   7:   89 44 24 08 mov%eax,0x8(%esp)

so avg(3-7, 3-0) = 2.5

and so on...


Happy analyzing!


Regards

Ingo Oeser



pgpQz9Sa4VgH3.pgp
Description: PGP signature


Re: [patch 1/1] Audit return code of create_proc_*

2005-07-16 Thread Ingo Oeser
Hi Domen,

On Friday 15 July 2005 00:19, you wrote:
> Audit return of create_proc_* functions.

This (and related changes) spam the log, if
kernel is compiled without /proc-support.

Kernels without /proc-support are quite common in the embedded world.

Just provide a function in a suitable header 
(include/linux/proc_fs.h looks promising)
file, which contains the following:

#ifdef CONFIG_PROC_FS
#define procfs_failure(msg) do { printk(msg); } while(0)
#else
#define procfs_failure(msg) do {} while(0)
#endif

and use it instead of the direct printk call.

That way you get both: Your GCC or checking tool warning is silenced
and the log is not spammed for the embedded people.

For code, which is broken without procfs, the code
should be fixed or it should select PROC_FS in its Kconfig file.


Regards

Ingo Oeser




pgp4s9qFpX6R1.pgp
Description: PGP signature


Re: kernel guide to space

2005-07-11 Thread Ingo Oeser
On Monday 11 July 2005 17:44, Dmitry Torokhov wrote:
> >Descendant must be indented at least to the level of the innermost
> >compound expression in the parent. All descendants at the same level
> >are indented the same.
> >if (foobar(.) + barbar * foobar(bar +
> >foo *
> >
> > oof)) {
> >}
> 
> Ugh, that's as ugly as it can get... Something like below is much
> easier to read...
> 
> if (foobar(.) +
> barbar * foobar(bar + foo * oof)) {
> }
 
Even easier is
 if (foobar(.)
 + barbar * foobar(bar + foo * oof)) {
 }

since a statement cannot start with binary operators
and as such we are SURE that there must have been something before.
And this matches with old shop owner calculations like:

   1
+  2
+  3
----
   6

which we all know since early math classes.


Regards

Ingo Oeser




pgpC5TxreXsJl.pgp
Description: PGP signature


Re: [PATCH] add securityfs for all LSMs to use

2005-07-06 Thread Ingo Oeser
Hi Greg,

On Wednesday 06 July 2005 10:17, Greg KH wrote:
> + * TODO:
> + *   I think I can get rid of these default_file_ops, but not quite sure...
> + */
> +static ssize_t default_read_file(struct file *file, char __user *buf,
> +  size_t count, loff_t *ppos)
> +{
> + return 0;
> +}
> +
> +static ssize_t default_write_file(struct file *file, const char __user *buf,
> +size_t count, loff_t *ppos)
> +{
> + return count;
> +}

Yes, you can get rid of both, if you move read_null and write_null from 
drivers/char/mem.c to fs/libfs.c and export them.

But for what do you need a successful dummy read/write?


Regards

Ingo Oeser



pgpsV5J7kgPff.pgp
Description: PGP signature


Re: [ARM] Group device drivers together under their own menu

2005-03-29 Thread Ingo Oeser
Hi,

Randy.Dunlap wrote:
> The real problem AFAICT is that Networking options
> includes some protocols and then Network Device Support
> includes some other protocols.  Maybe if there was a Network Protocol
> section things could be clearer.  ??

I would really welcome that change. 

Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Releasing resources with children

2005-03-14 Thread Ingo Oeser
Russell King wrote:
> The only thing I'd question is whether we really need to BUG_ON() here.
> ISTR Linus' policy for BUG()/BUG_ON() was only if the condition lead
> directly to a filesystem-corrupting bug.

I consider it quite effective to flag interface violations.
Programming by contract anyone? ;-)

Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2.6] fix mprotect() with len=(size_t)(-1) to return -ENOMEM

2005-03-14 Thread Ingo Oeser
Hi Arjan,

You wrote:
> shouldn't we just fix the alignment code instead that the overflow case
> doesn't align to 0???
> that sounds really odd.

How? You have to align and you are out of bits for representing the
next number. What is the next number you can round to? "null" right!

Just remember that integer math with limited bits is always ring math ;-)

I love to abuse this for buffers and save an if.

Regards

Ingo Oeser



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PPC64] Allow emulation of mfpvr on ppc64 kernel

2005-03-10 Thread Ingo Oeser
David Gibson wrote:
> Andrew, please apply.
>
> Allow userspace programs on ppc64 to use the (privileged) mfpvr
> instruction to determine the processor type.  At the moment it
> emulates the instruction to provide the real PVR value, though it
> could be made to lie in future if for some reason we wish to restrict
> what CPU features userspace uses.

Why not putting the required information into the AUX table
when executing your ELF programs? I loved this feature in the
ix86 arch.


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Xterm Hangs - Possible scheduler defect?

2005-02-25 Thread Ingo Oeser
Chris Friesen wrote:
> Ingo Oeser wrote:
> > Stupid applications can starve other applications for a while, but not
> > forever, because the kernel is still running and deciding.
>
> Not so.
>
>
>
> task 1: sched_rr, priority 1, takes mutex
> task 2: sched_rr, priority 2, cpu hog, infinite loop
> task 3: sched_rr, priority 99, tries to get mutex
>
> And now tasks 1 and 3 are starved forever.  Arguably bad application
> design, but it demonstrates a case where applications can starve other
> applications.

You are right.
 
In "If a SCHED_RR process has been running for a time period  equal  to  or 
longer  than  the  time quantum,  it  will  be  put at the end of the list for
its priority" I missed the "for its priority" part.

You would need to change the priority of task 1 until it releases the
mutex. Ideally the owner gets the maximum priority of
his and all the waiters on it, until it releases his mutex, where he regains
its old priority after release of mutex. But this priority elevation happens
only, if he is runnable. If not, he gets his old priority back, until he is 
runnable.

But then again you just need to grab a mutex shared with a high priority
task and consume CPU.

Since this behavior is not defined in POSIX AFAIK, you just have
to write your applications properly or use SCHED_OTHER for CPU hogging.


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Xterm Hangs - Possible scheduler defect?

2005-02-24 Thread Ingo Oeser
Chad N. Tindel wrote:
> I think what we have are the need for two levels of applications:
>
> 1.  That which wishes to be the highest priority userspace application, and
> wishes to preempt all other userspace applications.  Such an application is
> OK being preempted by the kernel when the kernel needs to do work.  IMHO,
> this should be the default behavior for any SCHED_FIFO application.  If one
> of these has a bug and goes CPU-bound, the worst it can do is prevent other
> apps from ever using the CPU it is on.

That is basically, what you do with SCHED_RR.
(Be preempted after maximum quantum, even if having work to do)

> 2.  Applications which actually want to be the highest priority thing on
> the system, including being higher than the kernel.  These applications are
> OK with the fact that they may cause system hangs and deadlocks, and are
> careful not to shoot themselves in the foot.

This is SCHED_FIFO.
(Strict priority scheduling, allowed to starve anything below)

So just try to use the right scheduler for your application right now, ok?

If your system is busy with top priority task, why should the kernel disturb 
it?

Things will stop anyway, if your high priority task is needing a resource,
which is blocked. Than it becomes unrunnable and other tasks have
chances to continue. Kernel threads are likely to execute then, because they
are likely runnable then. Your task could even migrate, if a lot of kernel 
tasks 
are waiting in one CPU and your task is NOT bound to a specific CPU.

So the system is not brought down, but just busy in a infortunate way.
Stupid applications can starve other applications for a while, but not
forever, because the kernel is still running and deciding.


Regards

Ingo Oeser


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch 4/6] Bind Mount Extensions 0.06

2005-02-22 Thread Ingo Oeser
Hi,

Herbert Poetzl wrote:
> +static inline int mnt_may_unlink(struct vfsmount *mnt, struct inode *dir,
> struct dentry *child) { +   if (!child->d_inode)
> +   return -ENOENT;
> +   if (MNT_IS_RDONLY(mnt))
> +   return -EROFS;
> +   return 0;
> +}

The argument "dir" is not used. Please remove it and fix the callers.


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] hotplug-ng 001 release

2005-02-11 Thread Ingo Oeser
Hi,

Greg KH write:
> Very nice stuff.  Ok, that's a good reason not to get rid of these
> files, although they can be generated on the fly from the modules
> themselves (like depmod does it.)

Time to resurrect modinfo? ;-)
Didn't we plan to get rid of that, too?

If we like to use information from modules, there should be a scriptable 
tool to extract this kind of information, otherwise it will be a bitch to 
maintain those tools.


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.11-rc3: Kylix application no longer works?

2005-02-11 Thread Ingo Oeser
Hi,

Rik van Riel wrote:
> On Wed, 9 Feb 2005, Daniel Jacobowitz wrote:
> > On Tue, Feb 08, 2005 at 06:10:18PM -0800, Andrew Morton wrote:
> > It's asking for a lot of unwritable zeroed space.  See this:
> >>   LOAD   0x00 0x08048000 0x08048000 0xb7354 0x1b7354 R E
> >> 0x1000 LOAD   0x0b7354 0x08200354 0x08200354 0x1e3e4 0x1f648 RW 
> >> 0x1000
> >
> > clear_user's probably not the right way to provide the extra zeroing.
>
> Indeed, clear_user() refuses to zero data when it's not writable
> to the user process ...

So if the application wants an read only range of zeroed pages, why
not just map the ZERO_PAGE() multiple times there?

I can imagine _valid_ uses for that (templates for zero intitialized data),
although there are _better_ ways to do that.


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] device-mapper: multipath hardware handler

2005-02-11 Thread Ingo Oeser
Hi Alasdair,

Alasdair G Kergon wrote:
> +/*
> + * Constructs a hardware handler object, takes custom arguments
> + */
> +typedef int (*hwh_ctr_fn) (struct hw_handler *hwh, unsigned arc, char
> **argv); +typedef void (*hwh_dtr_fn) (struct hw_handler *hwh);
> +
> +typedef void (*hwh_pg_init_fn) (struct hw_handler *hwh, unsigned bypassed,
> +struct path *path);
> +typedef unsigned (*hwh_err_fn) (struct hw_handler *hwh, struct bio *bio);
> +typedef int (*hwh_status_fn) (struct hw_handler *hwh,
> + status_type_t type,
> + char *result, unsigned int maxlen);
> +
> +/* Information about a hardware handler type */
> +struct hw_handler_type {
> + char *name;
> + struct module *module;
> +
> + hwh_ctr_fn ctr;
> + hwh_dtr_fn dtr;
> +
> + hwh_pg_init_fn pg_init;
> + hwh_err_fn err;
> + hwh_status_fn status;
> +};

Please loose the prototypes, don't use prefixes/suffixes and use more 
descriptive names. Reasons are in Documentation/CodingStyle, Chapter 4.

So I suggest declaring it like this:

struct hardware_handler_operations {
 char *name;
 struct module *module;

 int (*create) (struct hw_handler *handler, unsigned int argc, char **argv);
 void (*destroy) (struct hw_handler *handler);

 void (*pg_init) (struct hw_handler *handler, unsigned int bypassed, 
   struct path *path);

 unsigned (*error) (struct hw_handler *hwh, struct bio *bio);
 int (*status) (struct hw_handler *hwh, status_type_t type,  char *result, 
  unsigned int maxlen);
};

But you might want to loose status_type_t, too. Also hw_foo is a bit generic, 
isn't it? We are all dealing with "hardware" in any driver (which is 
basically another word for "hardware handler"). So please be a bit more 
creative on WHAT you drive.


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Re: msdos/vfat defaults are annoying

2005-02-07 Thread Ingo Oeser
Michelle Konzack schrieb:
> Am 2005-02-07 09:47:09, schrieb Pozsár Balázs:
> > See? I _have_ that patch applied, that's why it tried vfat and not msdos
> > first.
>
> With this, you will nerver mount a Filesystem "msdos".
>
> Because "vfat" IS "msdos" + "lfn".
>
> You can attach to ALL "msdos" media "lfn" and you will have "vfat".

So msdos is vfat WITHOUT lfn, which is a a restriction like noatime
or mounting ext3 as ext2.

That's why the default should be vfat indeed and the restriction should be
"nolfn", which will not allow lfns to be created and is what you actually 
intend, right?

But this will break API today, so it should be added to list of
features that will change.

Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Make pipe data structure be a circular list of pages, rather

2005-01-17 Thread Ingo Oeser
Hi Linus,

Linus Torvalds wrote:
> +static long do_splice_from(struct inode *pipe, struct file *out, size_t len, 
> unsigned long flags) 
> +static long do_splice_to(struct file *in, struct inode *pipe, size_t len, 
> unsigned long flags) 
> +static long do_splice(struct file *in, struct file *out, size_t len, 
> unsigned long flags) 
> +asmlinkage long sys_splice(int fdin, int fdout, size_t len, unsigned long 
> flags) 

That part looks quite perfect. As long as they stay like this, I'm totally 
happy. I have even no problem about limiting to a length, since I can use that
to measure progress (e.g. a simple progress bar). 

This way I also keep the process as an "actor" like "[EMAIL PROTECTED]" pointed 
out.

It has unnecessary scheduling overhead, but the ability to stop/resume
the transfer by killing the process doing it is worth it, I agree.

So I would put a structure in the inode identifying the special device
and check, whether the "in" and "out" parameters are from devices suitable
for a direct on wire transfer. If they are, I just set up some registers
and wait for the transfer to happen. 

Then I get an interrupt/wakeup, if the requested amount is streamed, increment 
some 
user space pointers, switch to user space, user space tells me abort or stream
more and I follow. 
 
Continue until abort by user or streaming problems happen.

Just to give you an idea: I debugged such a machine and I had a hard hanging 
kernel with interrupts disabled. It still got data from a tuner, through an
MPEG decoder, an MPEG demultiplexer and played it to the audio card.
Not just a buffer like ALSA/OSS, but as long as I would like and it's end to end
without any CPU intervention.

That behavior would be perfect, but I could also live with a "pushing process".


Regards

Ingo Oeser

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: How does ramfs actually fills the page cache with data?

2001-06-23 Thread Ingo Oeser

On Fri, Jun 22, 2001 at 05:45:27PM -0400, Ho Chak Hung wrote:
> In fs/ramfs/inode.c, how does ramfs actually fills the page
> cache with data? In the readpage operation, it only zero-fill
> the page if it didn't already exist in the page cache. However,
> how do I actually fill the page with data?

The page cache does it itself. 

"readpage" is to move pages from the backing store into the page
cache. 

"writepage" and friends is for updating the backing store with
the contents of the page cache.

There is no real backing store of ramfs, since ramfs data lives
completly in page cache. 

But we cannot give the user random memory contents, so we zero it
out on readpage and prepare_write.

The data is copied with copy_{from,to}_user in the generic file
operations (look how ramfs_file_operations is defined and look at
the functions referenced), which read/write through page cache.

Regards

Ingo Oeser
-- 
Use ReiserFS to get a faster fsck and Ext2 to fsck slowly and gently.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: SMP spin-locks

2001-06-15 Thread Ingo Oeser

On Thu, Jun 14, 2001 at 05:05:07PM -0400, Richard B. Johnson wrote:
> The problem is that a data acquisition board across the PCI bus
> gives a data transfer rate of 10 to 11 megabytes per second
> with a UP kernel, and the transfer drops to 5-6 megabytes per
> second with a SMP kernel. The ISR is really simple and copies
> data, that's all.
> 
> The 'read()' routine uses a spinlock when it modifies pointers.
> 
> I started to look into where all the CPU clocks were going. The
> SMP spinlock code is where it's going. There is often contention
> for the lock because interrupts normally occur at 50 to 60 kHz.

Then you need another (better?) queueing mechanism.

Use multiple queues and a _overflowable_ sequence number as
global variable between the queues. 

N Queues (N := no. of CPUs + 1), which have a spin_lock for each
queue.

optionally: One reader packet reassembly priority queue (APQ) ordered by
   sequence number (implicitly or explicitly), if this shouldn't
   be done in user space.

In the writer ISR: 

   Foreach Queue in RR order (start with remebered one):
   - Try to lock it with spin_trylock (totally inline!)
 + Failed
* if we failed to find a free queue for x "rounds", disable
  device (we have no reader) and notify user space somehow
   * increment "rounds" 
   * next queue
 + Succeed
   * Increment sequence number
   * Put data record into queue
  (* remember this queue as last queue used)
  (* mark queue "not empty")
   * do other IRQ work...

In the reader routine:
   Foreach Queue in RR order (start with remebered one):
   - No data counter above threshold -> EAGAIN [1] 
   - Try to lock it with spin_trylock (totally inline!)
 + Failed -> next queue
 + Succeed
   * if queue empty, unlock and try next one
  (* remember this queue as last queue used)
   * Get one data record from queue (in queue order!)
   * Move data record into APQ
   * Unlock queue
   * Deliver as much data from the APQ, as the user wants and
 is available
- if all queues empty or locked -> increment "no data round"
  counter
  

Notes:
   The "last queue used" variable is static, but local to routine.
   It is there to decrease the number of iterations and distribute
   the data to all queues as more equally.


   Statistics about lock contention per queue, per round and per
   try would be nice here to estimate the number of queues
   needed.

   The APQ can be quite large, if the sequences are bad
   distributed and some queues tend to be always locked, if the
   reader wants to read from this queue.

   The above can be solved by 2^N "One entry queues" (aka slots)
   and sequence numbers mapping to this slots. If you need many
   slots (more then 256, I would say) then this is again 
   inaccaptable, because of the iteration cost in the ISR.
   
What do you think? After some polishing this should decrease lock
contention noticibly.


Regards

Ingo Oeser

[1] Blocking will be harder to implement here, since we need to
   notify the reader routine, that we have data available, which
   involves some latency you cannot afford. Maybe this could be
   done via schedule_task(), if needed.
-- 
Use ReiserFS to get a faster fsck and Ext2 to fsck slowly and gently.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [isocompr PATCH]: announcing stable port to kernel 2.2.18

2001-06-12 Thread Ingo Oeser

On Mon, Jun 11, 2001 at 10:59:44PM +0200, Pavel Machek wrote:
> > The current version of the patch for 2.2.18 is very stable
> > (we use it for DemoLinux [see www.demolinux.org] heavily),
> > and I wonder if it could not be a good idea to see if this
> > code can be folded into the official releases sometime in the
> > future (I have been looking at 2.4.x code, but the new page
> > cache means some changes might be needed: I will try to post
> > a first version for 2.4.x soon).
> 
> I think that 2.5.0 should be your target... It is definitely new
> feature, and both 2.4.X and 2.2.X are in feature freeze.

Right. And besides: HPA coded a similar patch for 2.4.x, while he
fixed some issues.

So you might try his work or even come to an agreement on the
format.

Regards

Ingo Oeser
-- 
Use ReiserFS to get a faster fsck and Ext2 to fsck slowly and gently.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: APIC problem or 3com 3c590 driver problem in smp kernel 2.4.x

2001-06-01 Thread Ingo Oeser

On Thu, May 31, 2001 at 12:27:07PM -0400, Feng Xian wrote:
> The driver for my pci device, I have the SA_SHIRQ set.

What kind of PCI device do you have? I had this problem once with
an PCI-Matchmaker[1] based board (for which we still have the wrong
PCI-ID btw, but my patch was rejected twice...).

> Actually what I am thinking it may be APIC support problem. I rebuild my
> kernel to use single cpu without APIC support, my device and 3c905 both
> work fine. they don't work for SMP kernel (APIC is by default enabled)
> Then I configured my uni-processor kernel to enable the APIC support, my
> device won't work with the 3c905, just exactly same as it behaves in the
> SMP kernel.

With 2.2 I also had this without APIC. 

I have been flooded with interrupts which have been intended for
the Cyclone card (3c905B 100BaseTX), and exited the ISR quickly
after querying the interrupt register of my Matchmaker board
without any ACKing, but the Cyclone never got these interrupts
anymore.

But is doesn't seem to be a 3c905 based problem, as I have 

 11:   95772726  XT-PIC  es1371, eth0, eth1
 
in /proc/interrupts where eth0 and eth1 are both Cyclones.
Even the vga card has IRQ 11 assigned.

So this is not really unknown ;-)

Regards

Ingo Oeser

[1] class 0b40, vendor id: 10e8, device id: 807d
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Please help me fill in the blanks.

2001-05-27 Thread Ingo Oeser

On Sat, May 26, 2001 at 10:27:09PM -0400, Jeff Garzik wrote:
> > * Service Location Protocol (SLP)

www.openslp.org

Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Dedicated Interrupt handling on SMP

2001-05-25 Thread Ingo Oeser

On Fri, May 25, 2001 at 12:43:11PM -0400, Randy wrote:
> I'm trying to find the easiest way to to deidcate one CPU to responding
> to a specific Interrupt request.
> That CPU should only listen for that request while all other CPU should
> ignore the interrupt.

cat /proc/irq/*/smp_affinity

There you can select on which if the 32 CPUs Linux should handle
this IRQ.

Read Documentation/IRQ-affinity.txt for more.

Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [RFD w/info-PATCH] device arguments from lookup, partion code

2001-05-21 Thread Ingo Oeser

On Sun, May 20, 2001 at 12:02:35PM -0700, Linus Torvalds wrote:
> The problem with ioctl's is, let me repeat, not technology.
> 
> It's people.
> 
> ioctl's are a way to do ugly things. That's what they have ALWAYS been.
> And because of that, people don't care about following the rules - if
> ioctl's followed the rules, they wouldn't _be_ ioctls in the first place,
> but instead have a good interface (say, read()/write()).
> 
> Basically, ioctl's will _never_ be done right, because of the way people
> think about them. They are a back door. They are by design typeless and
> without rules. They are, in fact, the Microsoft of UNIX.
 
Yes, they are. Why? Because we cannot fit all behavior of a
devices _cleanly_ into read/write/mmap/lseek.

If we do, we would need different device views (which implies
aliasing of devices, which HPA does not like) and it would
still be not that clean, because reading from readonly gives a
stream and writing gives a stream too, not particular order
required until now.

[good points]

> Would fs/ioctl.c be an ugly mess of some special cases? Yes. But would
> that make the ugliness explicit and possibly easier to try to manage and
> fix? Very probably. And it would mean that driver writers could not just
> say "fuck design, I'm going to do this my own really ugly way". 

Ok, then I give you an real world example where I idly fight with
design since nearly 2 years.

A free programmable DSP (or set of DSPs) with several kinds of
memory and additional optional devices (like DAC/ADC, ISDN frames
and sth. like that) on it. This DSP is attached via some glue
logic on Parallel port, PCI, ISA or (soon to come) USB.

This thingie can (once programmed) act as a data sink, data
source or data processing pipe.

OTOH it should be randomly accessable via debuggers and program
loaders. It is also resettable/rebootable, has discontinous
memory of certain kinds (possibly harvard architecture) and many
more funny stuff. And it needs to upload software.

I try to unify all these stuff into a "Generic Processing Device
Layer" for Linux.

Now I like to be shown how I should fit this into clean design
that:

   - uses NO ioctls (Linus)
   - has only one device per DSP (H.P.A)
   - Does not emulate ioctls via read/write transactions (which I
 consider bogus)

Theory is nice, but until someone can show me a clean design for
this (admittedly heavy ;-)) example, I just don't buy your
arguments. 

A *better* ioctl would be nice, but we still need an "catch all
exceptional accesses" interface, IMNSHO.


Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: const __init

2001-05-20 Thread Ingo Oeser

On Sun, May 20, 2001 at 05:34:48PM -0400, Jeff Garzik wrote:
> This might be a very valid point...
> 
> (let me know if the following test is flawed)
 
It is imho.

> > [jgarzik@rum tmp]$ cat > sectest.c
> > #include 
> > #include 
> > static const char version[] __initdata = "foo";
static char version2[] __initdata = "bar";
> > [jgarzik@rum tmp]$ gcc -D__KERNEL__ -I/spare/cvs/linux_2_4/include -Wall 
>-Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe 
>-mpreferred-stack-boundary=2 -march=i686-c -o sectest.o sectest.c
> > [jgarzik@rum tmp]$ 
> 
> No section type conflict appears.

Now it SHOULD conflict on these binutils, but doesn't on mine (2.9.5) ;-)

It is decided to put it into .data.init as expected.

AFAIK "const" is only a promise to the compiler, that we write
this data ONCE and read only after this initial write. So the
decision on the section is implementation defined.

What I don't understand is, why GCC overrides our explicit
override (done by setting the "section attribute" explicitly).

I would consider this a BUG in GCC. I don't understand, why we
support this BUG...

Maybe some GCC people can enlighten me, why GCC ignores such
overrides, that are for the cases where we DO KNOW BETTER than
GCC, what section is correct.


Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [RFC][PATCH] Re: Linux 2.4.4-ac10

2001-05-20 Thread Ingo Oeser

On Sun, May 20, 2001 at 05:29:49AM +0200, Mike Galbraith wrote:
> I'm not sure why that helps.  I didn't put it in as a trick or
> anything though.  I put it in because it didn't seem like a
> good idea to ever have more cleaned pages than free pages at a
> time when we're yammering for help.. so I did that and it helped.

The rationale for this is easy: free pages is wasted memory,
clean pages is hot, clean cache. The best state a cache can be in.

Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: VIA's Southbridge bug: Latest (pseudo-)patch

2001-05-19 Thread Ingo Oeser

On Sat, May 19, 2001 at 05:11:30PM +0100, Alan Cox wrote:
> If it had been a manufacturer in most respectable areas of business they'd be
> recalling and reissuing components, and paying for the end resllers to notify
> each customer 

This is consumer hardware. Consumer products are optimized for a
good buzzword count per $ ratio. Everything else is secondary.

Producing cheap stuff has its price. And being so smart an buing
cheapest available has the same price. QA and recalling are
expensive as hell. That's why cheap products usally have this
quality tradeoff.

Most consumers don't like to pay for quality. 

Germany has learned this lesson and thus "Made in Germany"
doesn't mean anything for certain products anymore :-(

Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [RFD w/info-PATCH] device arguments from lookup, partion code

2001-05-19 Thread Ingo Oeser

On Sat, May 19, 2001 at 11:34:48AM -0700, Linus Torvalds wrote:
[Reasons]
> So the "English is bad" argument is a complete non-argument.

Jepp, I have to agree. 

English is used more or less as an communication protocol in
computer science and for operating computers.

Once you know how to operate an computer in English, you can
operate nearly every computer in the world, because they have
English as default locale.

Let's not repeat Babel please :-(

PS: English is neither mine, nor Linus native language. Why do
   the English natives complain instead of us? ;-)


   And be glad that's not German, that has this role. English
   sentences are WAY easier to parse by computers, because it
   doesn't use much suffixes and prefixes on words and has very
   few exceptions. Also these exceptions are eleminated from
   command languages WITHOUT influencing readability and
   comprehensability.



Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Linux 2.4.4-ac10

2001-05-18 Thread Ingo Oeser

On Fri, May 18, 2001 at 03:23:03PM -0300, Rik van Riel wrote:
> On Fri, 18 May 2001, Ingo Oeser wrote:
> 
> > Rik: Would you take patches for such a tradeoff sysctl?
> 
> "such a tradeoff" ?
> 
> While this sounds reasonable, I have to point out that
> up to now nobody has described exactly WHAT tradeoff
> they'd like to make tunable and why...

Amount of pages reclaimed from swapout_mm() versus amount of
pages reclaimed from caches.

A value that says: "use XX% of my main memory for RSS of
processes, even if I run heavy disk loadf now" would be nice.

For general purpose machines, where I run several services but
also play games, this would allow both to survive.

The external services would go slower. Who cares, if some CVS
updates or NFS services go slower, if I can play my favorite game
at full speed? ;-)

> I'm not against making things tunable, but I would like
> to at least see the proponents of tunable things explain
> WHAT they want tunable and exactly WHY.

Ideally: Every value that the kernel decides by heuristics,
   because heuristics can fail to get even close to an optimal
   result. 

But this is too much. Some tunables from refill_inactive would be
nice. Also the patch for honouring the soft rss limit is good (is
it in?).

Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Linux 2.4.4-ac10

2001-05-18 Thread Ingo Oeser

On Fri, May 18, 2001 at 07:45:15PM +0200, Mike Galbraith wrote:
> Yes, ~exactly!  I chose 30 tasks because they almost do (tool/userland
> dependant.. must recalibrate often) fit.  The bitch is to get the vm
> to automagically detect the rss/cache munch tradeoff point without all
> the manual help.

What about a sysctl for that? Choose decent steps and let 0
(which is an insane value) mean "let's kernel decide" and make
this default.

In the past we could do this by adjusting some watermarks in
/proc/sys/vm but now, we can't do anything but trust the genius
kernel developers.

I doubt that we can test all kinds of workload and even imagine
what pervert stuff some people do with their machines.

Tuning _is_ manual work. Always has been and always will be.

This countinously "I know it better then you" is what I hated
about Windows and now this comes more and more into Linux :-(

Rik: Would you take patches for such a tradeoff sysctl?

Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Linux 2.4.4-ac10

2001-05-17 Thread Ingo Oeser

On Thu, May 17, 2001 at 05:45:38PM +0100, Alan Cox wrote:
> 2.4.4-ac10

I think someone forgot this little return. It removes the
following warning:

serial.c:4208: warning: control reaches end of non-void function


--- linux-2.4.4-ac10/drivers/char/serial.c  Thu May 17 20:41:05 2001
+++ linux-2.4.4-ac10-ioe/drivers/char/serial.c  Thu May 17 20:35:53 2001
@@ -4205,6 +4205,7 @@
 {
__set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(HZ/10);
+   return 0;
 }
 
 /*

Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: cmpci sound chip lockup

2001-05-17 Thread Ingo Oeser

On Wed, May 16, 2001 at 08:02:06PM -0300, Rik van Riel wrote:
> I'm seeing a similar thing on 2.4.4-pre[23], but in a far less
> serious way. Using xmms the music stops after anything between
> a few seconds and a minute, I suspect a race condition somewhere.
> 
> Using mpg123 everything works fine...

Your xmms uses esd[1]?

Friends of mine report problems with esd and 2.4.x. Tested on
SB-Live! and es1371.

Regards

Ingo Oeser

[1] E Sound Deamon - A sound mixing framework
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: CMPXCHG

2001-05-17 Thread Ingo Oeser

On Wed, May 16, 2001 at 03:37:00PM -0700, Scott Huang wrote:
> Four adapters need to produce data to a 
> descriptor queue which is consumed by a
> user process. A lock mechanism was implemented
> to sync the adapters. However, this causes 
> a performance hit.  Is it possible to use
> CMPXCHG on Intel's i-386 to avoid the locking?

What about using atomic operations for that? This is more general
and works on ALL architectures. CMPXCHG is just and special
atomic operation on ia32.

> Where can I find some doc and some sample code?

   Documentation/DocBook/kernel-hacking.tmpl

But better do

   make htmldocs 

in the kernel top level directory and read

   Documentation/DocBook/kernel-hacking/lk-hacking-guide.html

instead.

Sample code is scattered all around in the kernel.


Regards

Ingo Oeser
-- 
To the systems programmer,
users and applications serve only to provide a test load.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: LANANA: To Pending Device Number Registrants

2001-05-16 Thread Ingo Oeser

On Wed, May 16, 2001 at 02:36:44PM -0700, H. Peter Anvin wrote:
> > But all devices which export a CD-ROM interface will do so. So the
> > device node that is associated with the CD-ROM driver will export
> > CD-ROM semantics, and the trailing name will be "/cd".
> > 
> > Other interfaces a device exports, such as a CD-RW, appear as a
> > different device node ("generic" for SCSI, because we have no CD-RW
> > classification at this point).
> > 
> > My scheme works already, and works reliably. Nothing had to be done to
> > support the CD-ROM interface to CD-RW and DVD devices.
> > 
> 
> It's still completely braindamaged: (a) these interfaces aren't
> disjoint.  They refer to the same device, and will interfere with each
> other; (b) it is highly undesirable to tie the naming to the interfaces
> in this way.  It further restricts the namespaces you can export, for one
> thing.

We do this already with ide-scsi. A device is visible as /dev/hda
and /dev/sda at the same time. Or think IDE-CDRW: /dev/hda,
/dev/sr0 and /dev/sg0.

All at the same time.

It is perfectly normal to export different interfaces for the
same device. This is basically, what subfunctions on PCI do: Same
device with different interfaces. 

Just that we do it through a driver with ide and through the
hardware with a multi function PCI card.

Applications don't care about devices. They care about entities
that have capabilities and programming interfaces. What they
_really_ are and if this is only emulated is not important.

Sorry, I don't see your point here :-(


Regards

Ingo Oeser
-- 
10.+11.03.2001 - 3. Chemnitzer LinuxTag <http://www.tu-chemnitz.de/linux/tag>
 <<<<<<<<<<<< been there and had much fun   >>>>>>>>>>>>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: LANANA: To Pending Device Number Registrants

2001-05-16 Thread Ingo Oeser

On Mon, May 14, 2001 at 09:33:35PM -0300, Rik van Riel wrote:
> Agreed. However, if this thing means I cannot use the -linus
> tree without devfs, then it will also mean my VM stuff only
> gets tested on -ac kernels...

No Problem. I test most of your VM stuff anyway and I use devfs
on that machine ;-)

PS: It's not that hard to build a machine, which can support
both. E-Mail me, if you would like to know _how_ to do that.

Regards

Ingo Oeser
-- 
10.+11.03.2001 - 3. Chemnitzer LinuxTag <http://www.tu-chemnitz.de/linux/tag>
 <<<<<<<<<<<< been there and had much fun   >>>>>>>>>>>>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: LANANA: To Pending Device Number Registrants

2001-05-15 Thread Ingo Oeser

On Tue, May 15, 2001 at 10:44:23AM -0700, James Simmons wrote:
> different. I do plan on some day merging drm and fbdev into one interface. So
> I plan to change this behavior. I like to see this interface ioctl-less
> (is their such a word ???). You mmap to alter buffers. Mmap is much more
> flexiable than write for graphics buffers anyways. You use write to pass
> "data" to the driver.

The only problem with mmap(): You cannot know, if the page
changed under you a**.

What would first mmap()ed page of the screen look like, if some
accelerator wrote a line there? Invalidating all mmap()ed pages
for each and every accelerator command would be evil. Forbidding
reads of that page is evil, too.

I have the same problem with DSPs, which like to mmap() some of
their memory into the application, but can alter this memory
every instruction the execute.

mmap() has it's beauties, but ...

Regards

Ingo Oeser
-- 
10.+11.03.2001 - 3. Chemnitzer LinuxTag <http://www.tu-chemnitz.de/linux/tag>
 <<<<<<<<<<<< been there and had much fun   >>>>>>>>>>>>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



  1   2   3   >