RE: Cosmetic JFFS patch.

2001-06-28 Thread Bjorn Wesen

On Thu, 28 Jun 2001, Laramie Leavitt wrote:
> > dmesg buffer space is rather limited and IMHO there isn't space to
> > waste on credit-giving in boot logs.
> 
> Here here.  You don't see annoying log-eating copyright messages
> printed out in the Windows boot. Just imagine:

There's a difference; someone paid for that Windows code and you paid to
get Windows and don't care about who did what. But when someone puts down
a lot of work to contribute something for free which others find useful
and actually use, don't you think it might be prudent to let them at least
write who contributed it, if a line is going to be printed anyway to say
that this or that device has been registered?

I know it sounds a bit like an "advertisement space" but it's always been
so; people have been releasing code for free since no one knows how long
and often one major factor has been that their peers will go "wow did
you do that". Otherwise why would anyone ever write their name in an About
box when they release a freeware program? And dmesg is the Linux kernel's
About box (someone might argue that the code is the about box but
unfortunately most people don't read the headers in every .c file they
use).

See the old BSD license - distribution-wise it's more free than the GPL
but you still had to give credit where credit is due when getting a free
lunch from someone else's work (I think this requirement was dropped in the
current BSD license)

The risk is that some people might take it quite personally to get their
names removed and might not be as interested to see their code in the
kernel in the future. Of course as long as it's GPL nothing would stop it
anyway, but I still think it's a good idea to give credit for others' hard
work.

/Bjorn

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Via-rhine in 2.4.5 still requires cold-boot

2001-06-12 Thread Bjorn Wesen

Just for the record, the via-rhine.c in 2.4.5 still does not work if you
soft-boot the computer (at least on one machine here); the MAC address shows
up as 00:00:00:00:00:00 and it fails - but a cold boot (power cable off, no
standby power) makes it work.

I read something that we'd need to reload the EEPROM on the boards or
something if a cold-boot solves a problem. Well it does. :)

/BW




meaning of vmalloc shortcut comment in fault.c

2001-06-05 Thread Bjorn Wesen

Can someone elaborate on why it's bad to refer to tsk directly below (this
is a 2.4.5 change in x86) and why it's needed on x86 and not other archs..

What should I do for an arch that does not have a "cr3" machine register
to check with ?

/BW

vmalloc_fault:
	{
		/*
		 * Synchronize this task's top level page-table
		 * with the 'reference' page table.
		 *
		 * Do _not_ use "tsk" here. We might be inside
		 * an interrupt in the middle of a task switch..
		 */
		int offset = __pgd_offset(address);
		pgd_t *pgd, *pgd_k;
		pmd_t *pmd, *pmd_k;
		pte_t *pte_k;

		asm("movl %%cr3,%0":"=r" (pgd));
		pgd = offset + (pgd_t *)__va(pgd);





Re: USB requiring PCI

2001-06-05 Thread Bjorn Wesen

On Mon, 4 Jun 2001 [EMAIL PROTECTED] wrote:
> I don't know the details of the implementation, but the CRIS port
> (ETRAX 100LX) has support for USB but no PCI.

A built-in non-PCI USB host controller, that is. And the driver is in the
kernel so we do support it as well :)

/BW

> > > AC> o   Make USB require PCI(me)
> > > Huh?!
> > > How about people from StrongArm sa11x0 port, who have USB host
> > > controller (in sa companion chip) but do not have PCI?
> >
> > The strongarm doesnt have a USB master but a slave.
> >
> > > Probably there are more such embedded architectures with USB
> > > controllers, but not PCI bus.
> >
> > Currently we don't support any of them.
> >
> > > How about ISA USB host controllers?
> >
> > They do not exist.




Re: Missing cache flush.

2001-06-05 Thread Bjorn Wesen

On Tue, 5 Jun 2001, David Woodhouse wrote:
> The flash mapping driver arch/cris/drivers/axisflashmap.c uses a cached
> mapping of the flash chips for bulk reads, but obviously an uncached mapping
> for sending commands and reading status when we're actually writing to or
> erasing parts of the chip.
> 
> However, it fails to flush the dcache for the range used when the flash is 
> accessed through the uncached mapping. So after an erase or write, we may 
> read old data from the cache for the changed area.

I'll start by saying that axisflashmap.c was not meant to be used by any
other archs, that's why it's in arch/cris. But if anyone finds it useful,
that's great. Just be aware that it's not _designed_ for general use, and
a problem like this might be just what that means in practice.

CRIS is cache coherent just like the x86 cache and does not need any
explicit cache flushes for the write case. Even when doing cache bypass
writing, if a cacheline already exists for the referenced memory, the
cacheline is updated.

In the erase case though, yes there should be a flush. However during the
1-2 seconds it takes to erase a sector, you can with very high certainty
guarantee that the direct-mapped unified 8 kB cache on the CRIS is
flushed of any flash references at all.. I mean, it's one-way
associative, and during 1-2 seconds it executes potentially 200 million
instructions. So we haven't really bothered to think about the problem..

For other CPUs it might be more dangerous, although I don't hold my
breath.. 1-2 seconds is a long time when talking about L1 caches.

> However, I can't see a cache operation which performs this function.
> flush_dcache_page() is defined as a NOP on CRIS as, it seems, it is on most
> architectures. On other architectures, there's dma_cache_wback_inv(), but
> that also seems to be a NOP on i386, to pick a random example.

I'd agree that to be really certain, a "flush_dcache()" function
should be implemented and used when an erase finishes. Like David Miller
wrote somewhere in the thread, one way is to use your knowledge of the
arch's cache and do suitable dummy accesses to flush it, if there is no
explicit command to do it. But that's just up to the arch coders..
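
The "dummy accesses" idea can be sketched roughly as below. This is only an
illustration in plain C, not code from axisflashmap.c or any arch tree: the
function name and the scratch buffer are hypothetical, and the cache size and
line size passed in are assumptions the caller has to get right for the CPU
in question. It relies on the cache being direct-mapped, so that reading one
byte per cache line across a cache-sized buffer of ordinary cached RAM
replaces every line that might still hold flash data.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Evict a direct-mapped cache by doing one dummy read per cache line
 * over a scratch buffer as large as the cache itself. In a direct-mapped
 * cache every address maps to exactly one line, so walking cache_size
 * contiguous bytes in steps of line_size touches each line once.
 *
 * "scratch" must point at ordinary cached memory (not the flash itself).
 * Returns the number of dummy reads performed.
 */
size_t evict_by_dummy_reads(const volatile unsigned char *scratch,
                            size_t cache_size, size_t line_size)
{
    size_t reads = 0;
    size_t off;

    assert(line_size != 0 && cache_size % line_size == 0);

    for (off = 0; off < cache_size; off += line_size) {
        (void)scratch[off];   /* volatile read, so it is not optimized away */
        reads++;
    }
    return reads;
}
```

For a set-associative cache this simple walk is not enough, since the
replacement policy decides which way gets evicted; that is where the
arch-specific knowledge comes in.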

-bw




Re: make menuconfig - cosmetic question

2001-05-17 Thread Bjorn Wesen

While we're on cosmetics... how about imprisonment for the person who
chose yellow on light grey for the first letters in each option... 

/Bjorn

On Thu, 17 May 2001, Martin.Knoblauch wrote:
>  this is most likely just a small issue. If I knew where to look, I
> would try to fix it and submit a patch :-)
> 
>  When I diff config files processed by "make [old]config" and "make
> menuconfig", it seems that menuconfig is not writing out some of the
> "comments" that the other versions do write. This is of course nothing
> serious, but it ticks me off. Any idea where to look for this glitch?




Re: [Question] Explanation of zero-copy networking

2001-05-08 Thread Bjorn Wesen

On Mon, 7 May 2001, Richard B. Johnson wrote:
> Basically, "no copy" is an academic exercise. It makes the first
> packet get sent more quickly, after which everything slows to
> the natural bandwidth of the system.
> 
> If you used a server for multicast-only.  In other words,  you
> just spewed out unidirectional data, you still slow to the rate
> at which the media can take the data.  And CPUs can obtain or
> generate these data a lot faster than 100-base can sink them.

This is an awfully PC-centric way of putting things. You assume that the
only ones who use Linux are those with a 1 GHz CPU and those 66 MHz PCI
boards and whatever. You simply cannot make that assumption anymore; the
diversity of Linux HW these days is so broad that the sweet spot between
CPU cycles, memory bandwidth etc. which controls the code optimization
fluctuates wildly.

A simple kernel profile of one of our embedded Linux systems, for example,
shows csum_partial_copy limiting the performance. Now for us zero-copy
cannot be implemented anyway because we don't have a checksumming ethernet
controller, but if we had, we could perhaps enhance performance by 50% by
skipping the copy. And there definitely are no 1 GHz embedded CPUs in the
same price range to choose instead, or Rambus memories etc.. raw power
simply is not an option sometimes.
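
To make it concrete what kind of work shows up in such a profile, here is a
minimal portable sketch of a combined copy-and-checksum pass. It is not the
kernel's csum_partial_copy (which is hand-tuned per architecture); the
function names are made up for illustration. It computes the 16-bit ones'
complement Internet checksum while copying, so every payload byte is still
read and written once by the CPU. That per-byte cost is what a checksumming
controller plus zero-copy would remove.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy len bytes from src to dst and add them, as big-endian 16-bit
 * words, to a running 32-bit sum. A trailing odd byte is padded with
 * zero. The 32-bit accumulator cannot overflow for packet-sized
 * buffers (up to 64 kB).
 */
uint32_t copy_and_sum(void *dst, const void *src, size_t len, uint32_t sum)
{
    const uint8_t *s = src;
    uint8_t *d = dst;

    assert(dst != NULL && src != NULL);

    while (len >= 2) {
        d[0] = s[0];
        d[1] = s[1];
        sum += ((uint32_t)s[0] << 8) | s[1];
        s += 2;
        d += 2;
        len -= 2;
    }
    if (len) {
        d[0] = s[0];
        sum += (uint32_t)s[0] << 8;
    }
    return sum;
}

/* Fold the carries back in and return the ones' complement. */
uint16_t fold_sum(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```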

It's still true of course that it's not obvious that the cycles spent on
copying can be used for anything better in all cases.

However, the beauty of open-source is that there is no need to debate over
whether something should be done or not. If someone feels the need, it
will be coded and if it's good people will use it. In this case, if anyone
gets a 200% boost in performance, they probably won't listen to the
argument that "it's academic" afterwards :) And some others might go
twiddle their hardware and skip the zero-copy mechanism altogether.

-BW




Re: Problem with map_user_kiobuf() not mapping to physical memory

2001-05-02 Thread Bjorn Wesen

On Wed, 2 May 2001, Terry Barnaby wrote:
> However, I note that if the user just mallocs memory and does not
> access it (No physical memory pages created) and then passes this
> virtual address space to the driver which performs a
> map_user_kiobuf() on it, the resulting kiobuf structure has all of
> the pagelist[] physical address entries set to the same value and
> the maplist[] entries set to 0. The device's access to this memory
> now causes system problems.
> Is map_user_kiobuf() working correctly ?
> Should I call some function to map the virtual address space into
> physical memory or at least pages before I call map_user_kiobuf() ?

No.. but you might just have done something wrong.

See the example in arch/cris/drivers/examples/kiobuftest.c

(that example does not deallocate the vectors properly IIRC, but the
actual kiobuf mapping sequence should work)

/BW




Re: ramdisk/tmpfs/ramfs/memfs ?

2001-04-27 Thread Bjorn Wesen

On Fri, 27 Apr 2001, Padraig Brady wrote:
> for a partition. If I understand correctly ramfs just points
> to the file data which are pages in the cache marked not to be

It does not even do that - as of 2.4, the VFS in the kernel also knows how
to cache the file structure itself. It's in the dentry-cache. So ramfs just
provides the thin mapping between VFS operations and the VFS caches
(dentries, inodes, pages) like any other 2.4 filesystem - with the
difference that ramfs does not need to know anything about actually
transferring the cache entries to a backing store (a physical filesystem).

Take a look at fs/ramfs/inode.c, it's just some hundred odd lines of
code and worth reading to find out more about how 2.4's VFS works.

> uncached. Doh! is ramfs supported in 2.2?

Don't think so, for the above reason.

-BW




Re: ramdisk/tmpfs/ramfs/memfs ?

2001-04-26 Thread Bjorn Wesen

On Thu, 26 Apr 2001, Padraig Brady wrote:
> I'm working on an embedded system here which has no harddisk.
> So, I can't swap to disk and need to have /var & /tmp in RAM.
> I'm confused between the various options for in RAM file-
> systems. At the moment I've created a ramdisk and made an 
> ext2 partition in it (which is compressed as I applied the 
> e2compr patch), which is working fine. Anyway questions:

Ouch.. yes you had to do stuff like that in the old days but it's very 
cumbersome and inefficient compared to ramfs for what you're trying to do.

> 1. I presume the kernel is clever enough to not cache any
>    files from these filesystems? Would it ever need to?

You always need to "cache" pages that are read, because a page is the
smallest possible granularity for the MMU, and a block-based filesystem
does not need to be page-aligned, so it's impossible to do it otherwise in
a general way.
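
A small arithmetic sketch of the alignment point, assuming 1 kB filesystem
blocks and 4 kB pages (typical values, but assumptions here; the helper
names are made up). A block sitting at an arbitrary position on the ramdisk
generally does not start on a page boundary, so it cannot be mapped as-is
and has to be copied into a page of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of block number block_no on the backing device. */
size_t block_offset(size_t block_no, size_t block_size)
{
    return block_no * block_size;
}

/* Nonzero if the block starts exactly on a page boundary. */
int block_is_page_aligned(size_t block_no, size_t block_size,
                          size_t page_size)
{
    assert(page_size != 0);
    return block_offset(block_no, block_size) % page_size == 0;
}
```

With these sizes only every fourth block is page-aligned, and the four
blocks that make up one page of a file need not be adjacent on the device.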

> 3. If I've no backing store (harddisk?) is there any advantage
>    of using tmpfs instead of ramfs? Also does tmpfs need a
>    backing store?

I don't know what tmpfs does actually, but if it is like you suggest (a
ramfs that can be swapped out ?) then you don't need it obviously (since
you don't have any swap).

ramfs simply inserts any files written into the kernel's cache and tells it
not to forget them. it can't get much simpler than that.

> 5. Can you set size limits on ramfs/tmpfs/memfs?

i don't think you can set a limit in the current ramfs implementation but
it would not be particularly difficult to make it work I think

> 6. Is a ramdisk resizable like the others. If so, do you have
>    to delete/recreate or umount/resize a fs (e.g. ext2) every
>    time it's resized? Do ramfs/tmpfs/memfs do this transparently?
>    Are ramdisks resizable in kernel 2.2?

ramfs does not need any "resizing" because there is no filesystem behind
it. there is only the actual file data and metadata in the cache itself.
if you delete a file, it disappears, if you create a new one new pages are
brought in.

> 7. What's memfs?
> 8. Is there a way I can get transparent compression like I now
>    have using a ramdisk+ext2+e2compr with ramfs et al?

you could try using jffs2 on a RAM-simulated MTD partition. i think that
would work but i have not tried it..

> 9. Apart from this transparent compression, is there any other
>    functionality ext2 would have over ramfs for e.g, for /tmp
>    & /var? Also would ramfs have less/more speed over ext2?

ramfs has all the bells and whistles you need except size limiting. and
obviously it's faster than simulating a harddisk in ram and using ext2 on
it..

-bw





[PATCH] drivers/ide/ide.c to work with more IDE controllers

2001-04-24 Thread Bjorn Wesen

Hi!

Problem description:

  * drivers/ide/ide.c assumes the IDE controller is mapped in such a way
    that it can access it by "hardcoded" I/O commands (IN_BYTE/OUT_BYTE)

  * drivers/ide/ide.c assumes that polled ide/atapi transfers should be
    done the way a PC would

  * drivers/ide/Makefile assumes that all IDE DMA controllers are PCI

This makes it impossible to use for example the IDE-driver for the Etrax
controller (arch/cris) which is not memory-mapped and is not PCI-based.

The following trivial patches (against 2.4.4-pre6, but probably
applicable to any 2.4.3-ac as well) fix the problem:

  * In include/linux/ide.h, do #ifdef HAVE_ARCH_IN_BYTE etc. around the
    definitions of IN_BYTE and OUT_BYTE (allowing include/asm/ide.h to
    bypass the standard definition - see asm-cris/ide.h for an example)

  * Add the "ideproc" entry in the HW driver structure, and let
    ide_input_bytes and friends in ide.c test that first. If it exists,
    it uses it, otherwise just do the normal PC transfer

  * In the Makefile, let ide-dma.c (which is really PCI DMA only) be
    included by CONFIG_BLK_DEV_IDEDMA_PCI instead of just
    CONFIG_BLK_DEV_IDEDMA

  * (Un)related addition: add ide_etrax100 as a chipset enum and an init
    call to the etrax IDE driver under #ifdef CONFIG_ETRAX_IDE
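
The include/linux/ide.h part is not shown in the diff below, so here is a
rough user-space sketch of the override pattern the first item describes.
Only the names IN_BYTE, OUT_BYTE and HAVE_ARCH_IN_BYTE come from the
description above; HAVE_ARCH_OUT_BYTE, the fake register array and
write_then_read() are assumptions made up for the illustration, and the
fallback bodies are placeholders rather than the real definitions.

```c
#include <assert.h>
#include <stdint.h>

/* ---- what an arch header (e.g. include/asm/ide.h) might provide ---- */

static uint8_t fake_ide_regs[8];        /* stand-in for real hardware */

#define HAVE_ARCH_IN_BYTE
#define IN_BYTE(reg)        (fake_ide_regs[(reg) & 7])

#define HAVE_ARCH_OUT_BYTE              /* assumed name, by analogy */
#define OUT_BYTE(val, reg)  (fake_ide_regs[(reg) & 7] = (uint8_t)(val))

/* ---- what the generic header would then do ---- */

#ifndef HAVE_ARCH_IN_BYTE
#define IN_BYTE(reg)        ((uint8_t)inb(reg))     /* PC-style port I/O */
#endif

#ifndef HAVE_ARCH_OUT_BYTE
#define OUT_BYTE(val, reg)  outb((val), (reg))
#endif

/* Generic driver code only ever uses the macros. */
uint8_t write_then_read(uint8_t val, unsigned int reg)
{
    assert(reg < 8);
    OUT_BYTE(val, reg);
    return IN_BYTE(reg);
}
```

Because the fallbacks sit behind the guards, an arch without port I/O never
has to provide inb()/outb() just to compile the IDE core.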

Please comment. It should all be trivial, but there is one thing I'm unsure
about: whether it's guaranteed that the HWIF structures are nulled
upon creation (or maybe, whether the primordial HWIF is nulled when copies
are made). Obviously the above patch depends on every HWIF having NULL as
'ideproc' if it does not need any alternative function there.
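
The dependency on a NULL 'ideproc' can be shown with a minimal user-space sketch. It assumes simplified stand-in types: hwif_sketch, pc_transfer, special_transfer and sketch_input_data are hypothetical names for illustration, not the kernel's. A zero-initialised interface falls through to the default path, and one that fills in the hook takes over the transfer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
typedef enum { sketch_input, sketch_output } sketch_action_t;

struct hwif_sketch {
	/* Optional override; NULL means "do the normal PC transfer". */
	void (*ideproc)(sketch_action_t action, void *buffer,
			unsigned int wcount);
};

static int default_calls;	/* how often the default path ran */
static int override_calls;	/* how often the override ran */

static void pc_transfer(void *buffer, unsigned int wcount)
{
	(void)buffer;
	(void)wcount;
	default_calls++;
}

static void special_transfer(sketch_action_t action, void *buffer,
			     unsigned int wcount)
{
	(void)action;
	(void)buffer;
	(void)wcount;
	override_calls++;
}

/* Same shape as the patched functions: test the hook first. */
void sketch_input_data(struct hwif_sketch *hwif, void *buffer,
		       unsigned int wcount)
{
	assert(hwif != NULL);
	if (hwif->ideproc) {
		hwif->ideproc(sketch_input, buffer, wcount);
		return;
	}
	pc_transfer(buffer, wcount);
}
```

If a structure is ever created without being zeroed, the test on 'ideproc' reads garbage and the call jumps to a random address, which is why the question above matters.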

Regards,
Bjorn



--- /home/bjornw/tmp/linux/drivers/ide/ide.c	Tue Apr 24 13:30:46 2001
+++ linux/drivers/ide/ide.c	Wed Apr  4 13:20:53 2001
@@ -374,7 +374,19 @@
  */
 void ide_input_data (ide_drive_t *drive, void *buffer, unsigned int wcount)
 {
-	byte io_32bit = drive->io_32bit;
+	byte io_32bit;
+
+	/* first check if this controller has defined a special function
+	 * for handling polled ide transfers
+	 */
+
+	if(HWIF(drive)->ideproc) {
+		HWIF(drive)->ideproc(ideproc_ide_input_data,
+				     drive, buffer, wcount);
+		return;
+	}
+
+	io_32bit = drive->io_32bit;
 
 	if (io_32bit) {
 #if SUPPORT_VLB_SYNC
@@ -407,7 +419,15 @@
  */
 void ide_output_data (ide_drive_t *drive, void *buffer, unsigned int wcount)
 {
-	byte io_32bit = drive->io_32bit;
+	byte io_32bit;
+
+	if(HWIF(drive)->ideproc) {
+		HWIF(drive)->ideproc(ideproc_ide_output_data,
+				     drive, buffer, wcount);
+		return;
+	}
+
+	io_32bit = drive->io_32bit;
 
 	if (io_32bit) {
 #if SUPPORT_VLB_SYNC
@@ -444,6 +464,12 @@
  */
 void atapi_input_bytes (ide_drive_t *drive, void *buffer, unsigned int bytecount)
 {
+	if(HWIF(drive)->ideproc) {
+		HWIF(drive)->ideproc(ideproc_atapi_input_bytes,
+				     drive, buffer, bytecount);
+		return;
+	}
+
 	++bytecount;
 #if defined(CONFIG_ATARI) || defined(CONFIG_Q40)
 	if (MACH_IS_ATARI || MACH_IS_Q40) {
@@ -459,6 +485,12 @@
 
 void atapi_output_bytes (ide_drive_t *drive, void *buffer, unsigned int bytecount)
 {
+	if(HWIF(drive)->ideproc) {
+		HWIF(drive)->ideproc(ideproc_atapi_output_bytes,
+				     drive, buffer, bytecount);
+		return;
+	}
+
 	++bytecount;
 #if defined(CONFIG_ATARI) || defined(CONFIG_Q40)
 	if (MACH_IS_ATARI || MACH_IS_Q40) {
@@ -2092,6 +2123,7 @@
 	hwif->maskproc	= old_hwif.maskproc;
 	hwif->quirkproc	= old_hwif.quirkproc;
 	hwif->rwproc	= old_hwif.rwproc;
+	hwif->ideproc	= old_hwif.ideproc;
 	hwif->dmaproc	= old_hwif.dmaproc;
 	hwif->dma_base	= old_hwif.dma_base;
 	hwif->dma_extra	= old_hwif.dma_extra;
@@ -3193,6 +3225,12 @@
 	}
 #endif /* CONFIG_PCI */
 
+#ifdef CONFIG_ETRAX_IDE
+	{
+		extern void init_e100_ide(void);
+		init_e100_ide();
+	}
+#endif /* CONFIG_ETRAX_IDE */
 #ifdef CONFIG_BLK_DEV_CMD640
 	{
 		extern void ide_probe_for_cmd640x(void);


--- /home/bjornw/tmp/linux/include/linux/ide.h  Thu Jan  4 23:51:21 2001
+++ linux/include/linux/ide.h   Wed Apr 18 13:49:54 2001
@@ -133,14 +133,6 @@
 #define IDE_BCOUNTL_REG		IDE_LCYL_REG
 #define IDE_BCOUNTH_REG		IDE_HCYL_REG
 
-#ifdef REALLY_FAST_IO
-#define OUT_BYTE(b,p)  outb((b),(p))
-#define IN_BYTE(p) (byte)inb(p)
-#else
-#define OUT_BYTE(b,p)  outb_p((b),(p))
-#define IN_BYTE(p) (byte)inb_p(p)
-#endif /* 

Re: Is there a way to turn file caching off ?

2001-04-18 Thread Bjorn Wesen

A similar phenomenon happens when you simply copy a file - file A is read
into the cache and file B is written to the cache, until the memory runs
out. Then both start to flush at the same time, creating a horrible
performance hit (especially if A and B are on the same disk :) 

I don't know a way to fix this except having the kernel correctly identify
the access pattern and optimize for it (i.e. if it recognizes that cache
pages are being flushed to make room for more pages from the same
inode, then it's probably a suboptimal caching pattern, and it should
instead increase the readahead and flush bigger chunks of pages at
the same time). I don't think anything can be done about the write queue
(except maybe making the kernel understand that seek time is more expensive
than transfer time, so that it does not alternate between reading and
writing on every other page..)

I'm still using 2.4.0 though, so maybe this behaviour has been improved
in later kernels..

As a sidenote, try the same thing on a WinNT box and watch it die :) Like
unpacking a 1 GB file on a machine with 128 MB RAM.. after it has unpacked
the first 100 MB or so, performance drops to 1% or something..

-BW

On Tue, 17 Apr 2001, Laurent Chavet wrote:
> First cache grows to the size of RAM (2GB) with transfer rate
> slowing down as the cache grows.
> Then the transfer rates drops a lot (2 to 3 time slower than the
> drive capacity) and there is a very high CPU usage of system time (more
> than a CPU) used by bdflush and kswapd (and some others like kupdated).




parport initialisation

2001-04-09 Thread Bjorn Wesen

Hi,

regarding drivers/parport/*

is there any particular reason as to why the different parport drivers
aren't initialized using module_init() ? Like weird init order
dependencies and stuff.

Looking at parport_init itself (which has hardcoded init calls to the
different drivers right now) it does not look like it does anything
particularly special except some proc filesystem registering.
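
For readers unfamiliar with the difference: as far as I understand it, module_init() in a built-in driver records the init function in a table that the boot code walks in link order, so nobody has to edit a central list of calls. The user-space sketch below models that table with a plain array; every name in it (initcall_table, core_init_sketch, lowlevel_init_sketch, run_initcalls) is made up for illustration and none is a parport function.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*initcall_t)(void);

static int init_order[8];	/* ids in the order the inits ran */
static size_t init_count;

static void record(int id)
{
	assert(init_count < sizeof(init_order) / sizeof(init_order[0]));
	init_order[init_count++] = id;
}

/* Stand-ins for a core init and a low-level port driver init. */
static int core_init_sketch(void)     { record(1); return 0; }
static int lowlevel_init_sketch(void) { record(2); return 0; }

/*
 * The kernel builds the equivalent of this table at link time.
 * The order of the entries is the only ordering guarantee, which is
 * where "weird init order dependencies" would show up.
 */
static const initcall_t initcall_table[] = {
	core_init_sketch,
	lowlevel_init_sketch,
};

/* Run every registered init function; return how many failed. */
int run_initcalls(void)
{
	size_t i;
	int failures = 0;

	for (i = 0; i < sizeof(initcall_table) / sizeof(initcall_table[0]); i++)
		if (initcall_table[i]() != 0)
			failures++;
	return failures;
}
```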

Is it just because nobody has gotten around to "fixing" it or is there a
deeper reason ?

Regards
Bjorn




Re: ERESTARTSYS question.

2001-04-05 Thread Bjorn Wesen

On Thu, 5 Apr 2001, Jani Monoses wrote:
> although the comments in errno.h say that ERESTARTSYS should not be seen
> by userland,many drivers seam to return it from their
> file_operations.Should glibc convert this errno so that the user program
> sees something meaningful?Because it does not.Is EINTR not a better errno 
> to return from the drivers?

ERESTARTSYS is part of the API between the driver and the
signal-handling code in the kernel. It does not reach user-space (provided
of course that it's used appropriately in the drivers :)

When a driver needs to wait, and is woken by a signal (as opposed to
what it's really waiting for), the driver should in most cases abort the
system call so the signal handler can be run (like when you push ctrl-c
while running something that's stuck waiting for an interrupt). The kernel
uses ERESTARTSYS as a "magic" value saying it's ok to restart the
system call automagically after the signal handling is done. The actual
return code is switched to EINTR if the system call could not be
restarted.
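
What user space sees can be tried with a small sketch, assuming a Linux pipe as the thing being waited on. A read() blocks on an empty pipe, a timer signal interrupts it, and the handler puts one byte into the pipe. With SA_RESTART set on the handler the call is restarted and returns the byte; without it the caller gets EINTR. The names interrupted_read and on_alarm are my own, not from any driver.

```c
#define _XOPEN_SOURCE 700	/* sigaction() and setitimer() under strict -std= */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Write end of the pipe, used by the handler to supply data. */
static volatile sig_atomic_t wake_fd = -1;

static void on_alarm(int sig)
{
	(void)sig;
	if (wake_fd >= 0) {
		/* write() is async-signal-safe; the result is ignored. */
		ssize_t n = write(wake_fd, "x", 1);
		(void)n;
	}
}

/*
 * Block in read() on an empty pipe until SIGALRM fires.
 * Returns the byte count from read(), or -errno on failure.
 */
int interrupted_read(int restart)
{
	int fds[2];
	struct sigaction sa, old;
	struct itimerval tv;
	char c;
	ssize_t n;
	int rc, result;

	if (pipe(fds) < 0)
		return -errno;
	wake_fd = fds[1];

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = restart ? SA_RESTART : 0;
	rc = sigaction(SIGALRM, &sa, &old);
	assert(rc == 0);

	memset(&tv, 0, sizeof(tv));
	tv.it_value.tv_usec = 100000;	/* one-shot, 100 ms from now */
	rc = setitimer(ITIMER_REAL, &tv, NULL);
	assert(rc == 0);

	n = read(fds[0], &c, 1);	/* sleeps until the signal arrives */
	result = (n < 0) ? -errno : (int)n;

	sigaction(SIGALRM, &old, NULL);
	wake_fd = -1;
	close(fds[0]);
	close(fds[1]);
	return result;
}
```

Calling interrupted_read(0) should give -EINTR and interrupted_read(1) should give 1. In neither case does the program ever see ERESTARTSYS itself.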

-Bjorn




Re: kernel/sched.c questions

2001-04-04 Thread Bjorn Wesen

On 4 Apr 2001, Andi Kleen wrote:
> > >>  Hello, I would like to know why you put this two functions:
> > >>  void scheduling_functions_start_here(void) { }
> > >>  ...
> > >>  void scheduling_functions_end_here(void) { }

> This is needed for a very bad hack to get the EIP information in ps -lax:
> most programs would be shown as hanging in schedule(), which would not be 
> very useful to show the user. To avoid this sched.c is always compiled with 
> frame pointers and if the EIP is inside these two functions the proc code 
> goes back one level in the stack frame.

That sure is a very bad hack :) (For the original poster: search for
get_wchan in the various ports)

There is no comment anywhere near it that says what it is MEANT to do. You
can guess from the code and the usage that it has to do with stack-frames
and special-casing the scheduler functions..  Thanks for the 
clarification.. now I can go and fix it in arch/cris :) (I had never seen
the WCHAN field in ps before actually)

Just as a reference (everyone should get their daily dose of headache)
here is the i386 version:

unsigned long get_wchan(struct task_struct *p)
{
	unsigned long ebp, esp, eip;
	unsigned long stack_page;
	int count = 0;
	if (!p || p == current || p->state == TASK_RUNNING)
		return 0;
	stack_page = (unsigned long)p;
	esp = p->thread.esp;
	if (!stack_page || esp < stack_page || esp > 8188+stack_page)
		return 0;
	/* include/asm-i386/system.h:switch_to() pushes ebp last. */
	ebp = *(unsigned long *) esp;
	do {
		if (ebp < stack_page || ebp > 8184+stack_page)
			return 0;
		eip = *(unsigned long *) (ebp+4);
		if (eip < first_sched || eip >= last_sched)
			return eip;
		ebp = *(unsigned long *) ebp;
	} while (count++ < 16);
	return 0;
}
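
To make the frame walk easier to follow, here is a user-space sketch of the same idea on a simulated stack. It is not the kernel code: the "stack" is an array, frame pointers are array indices instead of addresses, and wchan_sketch is a made-up name. Each frame holds the caller's frame pointer followed by the return address, and the walk stops at the first return address outside the scheduler range.

```c
#include <assert.h>
#include <stddef.h>

/*
 * stack[fp]     = frame pointer (index) of the calling frame
 * stack[fp + 1] = return address into the caller
 *
 * Returns the first return address outside [first_sched, last_sched),
 * or 0 if the chain leaves the array or is deeper than 16 frames.
 */
unsigned long wchan_sketch(const unsigned long *stack, size_t words,
			   size_t fp, unsigned long first_sched,
			   unsigned long last_sched)
{
	int count = 0;

	assert(stack != NULL);
	do {
		unsigned long ret;

		/* Both words of the frame must lie inside the array. */
		if (fp >= words || words - fp < 2)
			return 0;
		ret = stack[fp + 1];
		if (ret < first_sched || ret >= last_sched)
			return ret;
		fp = (size_t)stack[fp];	/* step to the caller's frame */
	} while (count++ < 16);
	return 0;
}
```

With the scheduler pretending to occupy 0x1000..0x2000, a chain of two scheduler frames followed by a caller at 0x5000 reports 0x5000, which is the kind of value that would end up in the WCHAN column.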

-BW





Re: CML1 cleanup patch, take 2

2001-03-26 Thread Bjorn Wesen

On Mon, 26 Mar 2001, Eric S. Raymond wrote:
> (2) Fix up 20 cris-architecture configuration symbols lacking a CONFIG_
> prefix, so they obey CML1/CML2 conventions and can be detected by
> `make dep', also static-analysis tools and consistency checkers.
> This is a BUG FIX in CML1.

No need for you to fret on this; it's partly fixed in the version in
Alan's tree and the rest will be cleaned up in our next update.

-Bjorn




Re: CRAMFS

2001-03-23 Thread Bjorn Wesen

On Fri, 23 Mar 2001, David Woodhouse wrote:
> > 1. RAMFS is just more stable in terms of less complexity, less bugs reported 
> > over the time, etc.
> > 2. RAMFS is a fairly robust filesystem and all features required as far as I can 
> > tell.

Ok, ramfs is really simple, but heck, cramfs is not much more complex :)
It's as simple a flash-filesystem as you can get.

I don't know why the comparison is made though; they are used for two
completely different things... ramfs is for temporary file storage, cramfs
is for immutable files stored on flash. Each by itself is quite optimal
for what it's designed for, isn't it ?

> I'm not aware of any bugs being found in cramfs recently - unless you 
> wanted to use it on Alpha (or anything else where PAGE_SIZE != the 
> hard-coded 4096 in mkcramfs.c).

I committed a patch that disappeared that added the choice of page size
(trivial yes :), we have PAGE_SIZE == 8192 on our systems. Works fine.

> I wouldn't avoid it for those reasons - although if you're _really_ short 
> of flash space, the same argument applies as for JFFS2 - a single 
> compression stream (tar.gz) will be smaller than compressing individual 
> pages like JFFS2 and cramfs do.

Here are some results from a quite mixed filesystem:

[bjornw@godzilla linux]$ ls -l cram*
-rw-r--r--   1 bjornw   users 1179648 Mar 23 22:38 cram32768
-rw-r--r--   1 bjornw   users 1282048 Mar 23 22:38 cram4096
-rw-r--r--   1 bjornw   users 1220608 Mar 23 22:38 cram8192

(the numbers correspond to blocksize)

There's not any big difference here.

With bigger files though, the difference gets larger. YMMV.
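
To put numbers on "not any big difference", the arithmetic on the three image sizes listed above can be sketched as follows (saving_percent is just a helper name chosen here). Relative to the 4096-byte image, the 8192-byte one comes out roughly 4.8% smaller and the 32768-byte one roughly 8.0% smaller.

```c
#include <assert.h>

/* Percentage by which `size` is smaller than `baseline`. */
double saving_percent(unsigned long baseline, unsigned long size)
{
	assert(baseline > 0);
	return 100.0 * ((double)baseline - (double)size) / (double)baseline;
}
```

For example, saving_percent(1282048, 1179648) compares the 4096 and 32768 images.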

Notice that you can change cramfs so it uses a blocksize that is bigger
than PAGE_SIZE, of course, if it really is necessary. You'll get worse
performance at run-time though since you need to cache the page and hope
for read-ahead or similar (you can stuff the pages in the page-cache even
if they are not requested for example).

-BW





Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available

2001-01-20 Thread Bjorn Wesen

On Sat, 20 Jan 2001, Martin MaD Douda wrote:
> On Fri, 19 Jan 2001, Michael Lindner wrote:
> > data is generated as a result of data received via a select(),
> > the next delivery occurs a clock tick later, with the machine
> > mostly idle.
> 
> The machine is in fact not idle - there is a task running - idle task.
> Could the problem be that scheduler does not preempt this task to run
> something more useful?

Normally, the "idle task" (task[0]) does this pseudo-code:

   while(1) {
      if(need_resched)
         schedule();
   }

to minimize latency out of idle so if that actually is running it should
not be a problem (unless need_resched is not set by the wakeup calls)

Perhaps the kapm-idled kernel thread is killing your latency, you could
try disabling APM and APM-making-idle-calls especially. Also check ps aux
and see if anything else is taking your idle CPU %.

-BW




Re: setfsuid on ext2 weirdness (2.4)

2001-01-08 Thread Bjorn Wesen

On Mon, 8 Jan 2001, Linus Torvalds wrote:
> Please show them, anyway. What does "ls -ld / /etc /etc/passwd" say?

Heh... /etc and /etc/passwd were all right... but / was fscked (or not,
maybe :)

drwx------ 500 0   both locked from other users and 500 as owner..

> 99% says that one of the three will be wrong (probably "/", because you
> probably checked the others already and overlooked root), and you'll
> feel really silly. 

Dunno how that ever happened (unpacking a bad tar-ball maybe) but it's
fixed now and Linux 2.4.0 is completely without blame! :) I'm stupendously
silly but that's just normal, also, it's another warm unix experience to
cherish..

Thanks for the hint!

> And hey, if you think the above is confusing, try making your /dev/null
> a regular (writable) file by mistake.  Now THAT will be confusing as

Been there got the t-shirt :)

/BW




setfsuid on ext2 weirdness (2.4)

2001-01-07 Thread Bjorn Wesen

Ok.. I'm going bananas. It could be a 4am braindeath or a rh7.0 bungholio
but this is annoying:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/fsuid.h>

int main(int argc, char **argv)
{
	int fd;
	setfsuid(atoi(argv[1]));
	fd = open("/etc/passwd", O_RDONLY);
	printf("got fd %d\n", fd);
	return 0;
}

[root@wizball /root]# ./setfstest 0 
got fd 3
[root@wizball /root]# ./setfstest 500
got fd 3
[root@wizball /root]# ./setfstest 501
got fd -1

0 is obviously my root user and 500 is my standard user i log-in with. 501
exists (not that that has anything to do with this)

in fact, 0 and 500 are the ONLY ones who let a filesystem op through after
the setfsuid call. all others cause an EACCES error on the open (or any
other fs op). and yes, the actual file permissions on /etc and /etc/passwd
are correct.

consequence is that i can't login as any other user (or ftp, or anything
that needs to change the uid's) :(

so... the quick question is... is there anything in EXT2 or VFS that can
cause a quite normal ext2 filesystem on a 2.4.0 kernel to behave remotely
like this ?

strace shows the setfsuid call succeeds and nothing funny happens.

[root@wizball /root]# strace ./setfstest 501
execve("./setfstest", ["./setfstest", "501"], [/* 38 vars */]) = 0
uname({sys="Linux", node="wizball.xxx.yyy.zzz", ...}) = 0
brk(0) = 0x80496c8
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=32172, ...}) = 0
old_mmap(NULL, 32172, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0\301\1"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0755, st_size=4851725, ...}) = 0
old_mmap(NULL, 1217864, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4002
mprotect(0x4014, 38216, PROT_NONE) = 0
old_mmap(0x4014, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x11f000) = 0x4014
old_mmap(0x40146000, 13640, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40146000
close(3) = 0
munmap(0x40018000, 32172) = 0
getpid() = 1739
setfsuid32(0x1f5) = 0
open("/etc/passwd", O_RDONLY) = -1 EACCES (Permission denied)

<cut>




Re: IDE-driver not generalized enough ?

2000-11-28 Thread Bjorn Wesen

On Mon, 27 Nov 2000, Andre Hedrick wrote:
> Yes, I have been working on that for some time.
> This requires that the macros be exported the arch-xxx/ide.h
> Additionally it takes more work to modify the request_io and release_io,
> but it is all doable.

Right on! Do you think it would be too big a performance hit if OUT_BYTE
actually was an hwif function call instead of a macro ? OUT_BYTE has more
to do with the specific hw interface than the system architecture, really. 

Actually the entire hwif_unregister function should be handled by the hwif
itself I guess (haven't noticed that yet since I never unregister my
drivers :) 

My "hack" right now involves putting "magic" values in the io_ports array
so that OUT_BYTE separates them correctly (my controller has ONE address
where a 32-bit write does the commands, with a bitfield controlling the
IDE bus address instead of splitting into 7 + 1 separate addresses).
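
To illustrate what such a single-register interface looks like from the driver side, here is a sketch with an invented layout. The bit positions are assumptions made up for this example and are NOT the real Etrax register layout: bits 0-15 carry the data, bits 16-18 select one of the IDE registers, and bit 19 selects the control block (the "+ 1" address). sk_encode_write and the SK_ macros are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Invented layout, for illustration only. */
#define SK_ADDR_SHIFT	16		/* bits 16-18: IDE register 0..7 */
#define SK_CTRL_BIT	(1u << 19)	/* set: control block register */

/* Pack one IDE register write into a single 32-bit command word. */
uint32_t sk_encode_write(unsigned int reg, int control_block, uint16_t data)
{
	uint32_t word = data;

	assert(reg < 8);
	word |= (uint32_t)reg << SK_ADDR_SHIFT;
	if (control_block)
		word |= SK_CTRL_BIT;
	return word;
}
```

An OUT_BYTE replacement for this kind of hardware would build such a word and do one 32-bit store, which is why a plain outb_p() on a port number cannot express it.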

BTW can ide_register_hw be called from the automatic "module_init" chains
during bootup, or is that too early or too late ? It would be nice if that
was the case because otherwise we need to add to the long list in
probe_for_hwifs with initialization calls.

-BW

> On Tue, 28 Nov 2000, Bjorn Wesen wrote:
> > Hi! Quick question: is it possible to write an IDE driver for a controller
> > that is not mappable using outp and those memory-mapped thingys ? 
> > 
> > I see all the nice overrideables in struct hwif_s but the main code still
> > uses OUT_BYTE which is hardcoded to an outb_p.. non-overrideable. Same
> > thing with ide_input/output_bytes, they do direct in/out accesses also
> > without consulting any hwif specific routine.




Re: Address translation

2000-11-23 Thread Bjorn Wesen

On Thu, 23 Nov 2000, Andreas Bombe wrote:
> > I may be wrong on this, but I thought that copy_{to,from}_user are
> > only necessary if the address range you are accessing might cause a
> > fault which Linux cannot handle (ie. one which would cause the
> > application to segfault if it accessed that memory). If it is only a
> 
> It is wrong.  copy_*_user handle the page faults, whether they are good
> faults (swapped out, copy on write) or bad faults (illegal access).
> Without these macros you get the "unable to handle kernel page fault"
> oops message if a fault occurs.

Yes, but only if it's a real fault, not if the address range actually is
a valid VMA which needs paging, COW'ing or related OS ops. copy_*_user
does not do the access any differently than a "manual" access or
memcpy does; it just adds a .fixup section that tells the do_page_fault
handler that it should not segfault the kernel itself if the copy takes a
bad fault at any point. Instead it should jump to the fixup, which makes
the copy routine return an error.
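
The fixup mechanism can be modelled in a few lines of user-space C. This is only a toy model under stated assumptions: the addresses in ex_table are invented, and the names ex_entry, find_fixup and bad_kernel_fault are hypothetical (the real kernel keeps its own exception table and search routine). The idea it illustrates is the one described above: on a bad fault in kernel mode, look up the faulting instruction address; if a fixup is registered, resume there, otherwise oops.

```c
#include <assert.h>
#include <stddef.h>

/* One entry: "if the instruction at `insn` faults, resume at `fixup`". */
struct ex_entry {
    unsigned long insn;
    unsigned long fixup;
};

/* Invented addresses, kept sorted by insn so a binary search works. */
static const struct ex_entry ex_table[] = {
    { 0x1000, 0x9000 },
    { 0x1040, 0x9010 },
    { 0x2200, 0x9020 },
};

/* Return the fixup address for a faulting instruction, or 0 if none. */
static unsigned long find_fixup(unsigned long fault_ip)
{
    size_t lo = 0, hi = sizeof(ex_table) / sizeof(ex_table[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (ex_table[mid].insn == fault_ip)
            return ex_table[mid].fixup;
        if (ex_table[mid].insn < fault_ip)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}

enum fault_result { RESUME_AT_FIXUP, KERNEL_OOPS };

/* Decision taken once the fault is known to be bad (no valid VMA behind it). */
static enum fault_result bad_kernel_fault(unsigned long fault_ip,
                                          unsigned long *resume_ip)
{
    unsigned long fixup = find_fixup(fault_ip);

    assert(resume_ip != NULL);
    if (fixup) {
        *resume_ip = fixup;      /* the copy routine then returns an error */
        return RESUME_AT_FIXUP;
    }
    return KERNEL_OOPS;          /* plain memcpy: nobody registered a fixup */
}
```

Note that the lookup only runs on the failure path, which is why the normal copy path carries no extra cost.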

The fixup stuff is not in-line with the copy code, so there should
be absolutely no penalty in using copy_*_user instead of a memcpy
(provided the copy_*_user is as optimized as the memcpy code). And it's
dangerous to assume anything about pages visible in user-space; they might
be unmapped by another thread while you're doing that memcpy etc.

> >  (1) In a "top half" thread, can I now access this memory without the
> >  access macros (since I know the address range is valid)?
> 
> The address is valid, the pages probably aren't.  In fact, extending the
> address space only creates read-only mappings to the global zeroed page
> if I remember right.

But it does not matter that the pages aren't there physically, any kind of
access (including an access from kernel-mode) will bring about the same
COW/copy-on-write mechanism as copy_to_user or a user-mode access would.

The problem is rather that between your do_brk and when you access the
pages, a thread in the process might do an unmap or brk to remove the
mapping, then you crash the kernel.

> >  (2) Can I also access this memory from an interrupt/exception
> >  context, or must I lock it? (ie. can faults be handled from such
> >  a context) 
> 
> You can't even use copy_*_user in this context (since the current user
> space might be any process, even kernel threads that have no user space
> at all).
> 
> For access to user memory from interrupt context at all and to access
> user memory without the uaccess macros, you have to lock them down in
> memory, with map_user_kiobuf().  This is only recommended if you want
> hardware to DMA to/from buffers provided by user space.

Yup, if you are in the wrong context or in an interrupt context you'll die
horribly if you try to access user-pages that aren't there:

        if (in_interrupt() || !mm)
                goto no_context;

So you need to 1) make sure the pages are in physical memory, 2) make
sure the pages won't get removed from under your feet at any time and 3)
access them using their physical address.

> >  (3) Is the above code sensible at all, or barking? It took me a while
> >  to figure that the above would work, and I think/hope it is the
> >  most elegant way to share memory between kernel and a process.
> 
> It will fail quickly, probably taking the kernel down with it.
> 
> The most elegant way to share memory between user and kernel is to
> allocate the memory in the kernel and map it to user space (by
> implementing mmap  on the kernel side for the file used for
> communication).

Agreed, but that does not cut it for some applications. For example, let's
say you want to grab 16 MB of video frames without copying them from that
mmap area to your malloc'ed 16 MB (let's say your CPU takes a pretty big
hit doing that extra memcpy) and you'd like to DMA directly into the
user-pages. 

You can of course make the kernel grab 16 MB worth of pages for you and
then mmap them into the process, but the kernel driver would be pretty
tightly tied to that demanding user process then..

Actually I'm trying to figure out the best way to do a similar thing for
some hardware we have - I have incoming DMA data containing JPG grabs, and
I want to cache images in a user-mode daemon, which will send pictures
from the cache out on TCP. The images might be generated with many
different JPG settings so they need the right tags in the cache etc. 

Before, when we ran on a chip without an MMU, this was easy - that user-mode
buffer was a contiguous physical area which I could DMA directly into. But
now that we're moving to a CPU with an MMU, it gets more complicated of course
:) 

I have figured the options are

1) let the kernel driver have a buffer big enough for a single grab, mmap
this into user-space and do the memcpy into the cache (might be fast
enough, but our chip isn't super on memcpy's..)

2) let the kernel driver lock down the user-pages in an access and DMA
directly 

Re: Q: network drivers interface changes

2000-09-27 Thread Bjorn Wesen

On Wed, 27 Sep 2000, Hen, Shmulik wrote:
> Is there a good source of information that describes the changes in network
> driver interface between 2.2.x and 2.4.x kernels ?

Try a diff -u of skeleton.c between the two kernels. If the skeleton driver
is correct, that is :) 

It didn't look very complicated from 2.0 -> 2.4 at least so 2.2 -> 2.4
should not be difficult at all.

-Bjorn
