Re: max sizes for files and file systems

2001-07-05 Thread Albert D. Cahalan

Derek Vadala writes:

> It's clear that under 2.4, the kernel imposes a limit of 2TB as the
> maximum file size and that some portion of the kernels before 2.4 had a
> limit of 2GB.
>
> However, it's not clear to me when the file size limit was increased, or
> what the maximum file system sizes under 2.0, 2.2 and 2.4 are. I realize
> that both of these values are also contingent on the filesystem used, but
> I'm wondering about what limits the kernel itself imposes. 
> 
> I'm also a bit unclear as to where the 2GB limit in kernels < 2.4 comes
> from. It appears to be a kernel imposed limit, but there also seems to be
> a lot of conflicting information out there, blaming the problem on
> EXT2. However, from what I can tell, 2.0.39, 2.2.19 and 2.4.5 all use the
> same version (0.5b-95/08/09) of ext2-- either that or EXT2FS_VERSION and
> EXT2FS_DATE in .../include/linux/ext2_fs.h simply haven't been updated.

No 32-bit Linux system could exceed 1 TB on anything until this week.
This is caused by signed 32-bit math on units of 512 bytes:
2^31 units of 512 bytes each is exactly 1 TB.
Now there are experimental patches for larger devices.
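
For anyone who wants to check the arithmetic, here is a rough sketch
(Python, purely for illustration; the function and constant names are
made up for this example and do not come from the kernel):

```python
# Illustration only: where the 1 TB ceiling comes from.
SECTOR_SIZE = 512            # bytes per unit, as stated above
MAX_SIGNED_32 = 2**31 - 1    # largest value a signed 32-bit integer can hold


def max_device_bytes(sector_size=SECTOR_SIZE, max_index=MAX_SIGNED_32):
    """Bytes addressable when sector numbers run from 0 to max_index."""
    return (max_index + 1) * sector_size


print(max_device_bytes())            # 1099511627776
print(max_device_bytes() // 2**40)   # 1
```

Calling max_device_bytes(max_index=2**32 - 1) gives 2 TB, which would
match the figure quoted above if the sector numbers were treated as
unsigned instead.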

The file access API was limited to signed 32-bit byte values.
Officially, this was fixed for the 2.4 series. Most distributions
shipped 2.2 series kernels with patches to allow large files.
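
The 2 GB figure is the same kind of limit, this time on byte offsets.
A minimal sketch, assuming nothing beyond "the offset must fit in a
signed 32-bit field" (the helper name is hypothetical):

```python
import struct


def fits_signed_32(offset):
    """Return True if a byte offset can be stored in a signed 32-bit field."""
    try:
        struct.pack("<i", offset)   # "<i" is a little-endian signed 32-bit int
        return True
    except struct.error:
        return False


LARGEST_OFFSET = 2**31 - 1          # one byte short of 2 GB

print(fits_signed_32(LARGEST_OFFSET))       # True
print(fits_signed_32(LARGEST_OFFSET + 1))   # False
```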

The ext2, FAT, and NFSv2 filesystems all had a 32-bit file
size limit. For ext2 this was lifted just as the 2.2 series
came out, but only Alpha systems could use the large files.
FAT has not been fixed. NFSv2 has been replaced by NFSv3.

EXT2FS_VERSION has not been updated because feature flag bits
are being used instead.

I have a graph of ext2 limits:
http://www.cs.uml.edu/~acahalan/linux/ext2.gif

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] more SAK stuff

2001-07-05 Thread Albert D. Cahalan

Rob Landley writes:

> Off the top of my head, fun things you can't do suid root:
...
> ps  (What the...?  Worked in Red Hat 7, but not in suse 7.1.
> Huh?  "suid-to  apache ps ax" works fine, though...)

The ps command used to require setuid root. People would set the
bit by habit.

> I keep bumping into more of these all the time.  Often it's fun
> little warnings "you shouldn't have the suid bit on this
> executable", which is frustrating 'cause I haven't GOT the suid bit
> on that executable, it inherited it from its parent process, which
> DOES explicitly set the $PATH and blank most of the environment
> variables and other fun stuff...)

Oh, cry me a river. You can set the RUID, EUID, SUID, and FUID
in that same parent process or after you fork().

Since you didn't set all the UID values, I have to wonder what
else you forgot to do. Maybe you shouldn't be messing with
setuid programming.
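
A minimal sketch of setting all of them at once, assuming Linux and
Python's os.setresuid() wrapper (the function name drop_to_uid is made
up). As far as I know there is no separate call needed for the FUID
here, because on Linux the filesystem UID follows the effective UID:

```python
import os


def drop_to_uid(uid):
    """Set the real, effective and saved UID to `uid` in a single call."""
    os.setresuid(uid, uid, uid)
    # On Linux the filesystem UID is set to the new effective UID as well.
    return os.getresuid()


# Setting all three to the current real UID is allowed without privilege.
print(drop_to_uid(os.getuid()))
```

The same call can be made in the parent before it execs anything, or in
the child right after fork().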



Re: [OT] Re: LILO calling modprobe?

2001-07-05 Thread Albert D. Cahalan

Wakko Warner writes:

> I believe there is.  It wants to find what drive is bios drive 80h.  Really
> annoying since there's no way (correct me if I'm wrong) to read bios from
> linux.  If there is, lilo should do that.  But since it's an old copy, this
> probably was fixed.
>
> I had a machine at work with both ide and scsi.  ide hdd was hdc and ide
> cdrom was hda just to keep lilo from thinking hdc is the first bios drive
> which in fact sda was

The easy way to handle this is to md5 checksum the disks at boot.
Read the first and last track of the first and last cylinder of
every BIOS drive. Then match up the disks when partition tables
get scanned.
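
A rough sketch of the checksum step. It assumes the drive is readable
as a flat file of 512-byte sectors and that the geometry (cylinders,
heads, sectors per track) is already known; disk_signature and
match_drives are hypothetical names, not anything LILO provides:

```python
import hashlib

SECTOR_SIZE = 512


def disk_signature(dev, cylinders, heads, sectors_per_track):
    """MD5 over the first and last track of the first and last cylinder."""
    track_bytes = sectors_per_track * SECTOR_SIZE
    md5 = hashlib.md5()
    for cyl in (0, cylinders - 1):
        for head in (0, heads - 1):
            track_index = cyl * heads + head      # linear track number
            dev.seek(track_index * track_bytes)
            md5.update(dev.read(track_bytes))
    return md5.hexdigest()


def match_drives(bios_sigs, linux_sigs):
    """Map each BIOS drive number to the Linux device with the same signature."""
    by_sig = {sig: name for name, sig in linux_sigs.items()}
    return {bios: by_sig.get(sig) for bios, sig in bios_sigs.items()}
```

With the BIOS-side signatures collected at boot and the Linux-side ones
collected when the partition tables are scanned, match_drives() returns
something like {0x80: "sda"}, or None for a drive that found no match.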

The hard way involves running the BIOS in virtual-8086 mode to
trap IO accesses, then mapping to drivers by IO region later.

Neither way is 100% reliable, but the current guess is worse.



Re: [Re: gcc: internal compiler error: program cc1 got fatal signal 11]

2001-06-29 Thread Albert D. Cahalan

> Almost always ?
> It seems like gcc is THE ONLY program which gets
> signal 11
> Why the X server doesn't get signal 11 ?
> Why others programs don't get signal 11 ?
...
> Some time ago I installed Linux (Redhat 6.0) on my 
> pc (Cx486 8M RAM) and gcc had a lot of signal 11 (a
> couple every hour) I was upgrading
> the kernel every time there was a new kernel and
> from 2.2.12(or 14) no more signal 11 (very rare)
> Is this still a hardware problem ?

It could be. One possible way:

1. your system is clogged with dust
2. gcc runs the CPU hard, generating lots of heat
3. the heat causes crashes
4. you install a new Linux version that sets a Cyrix-specific power-saving mode
5. your heat problems go away, and so do the crashes

Another possible way:

1. you have buggy motherboard or disk hardware
2. when you swap, gcc gets corrupted by the hardware
3. you get a new Linux kernel that has a bug work-around
4. your problems go away

Yet another way:

1. your room is hot, your computer is near a huge motor...
2. you upgrade to Linux 2.2.12 and move your computer
3. soon you realize that the crashes are gone
4. you credit the kernel, but location was the problem



Re: [PATCH] User chroot

2001-06-28 Thread Albert D. Cahalan

Sean Hunter writes:
> On Wed, Jun 27, 2001 at 04:55:56PM -0400, Albert D. Cahalan wrote:

>> ln /dev/zero /tmp/zero
>> ln /dev/hda ~/hda
>> ln /dev/mem /var/tmp/README
>
> None of these (of course) work if you use mount options to
> restrict device nodes on those filesystems.

In which case, you can't boot. Think about it.

Never mind the method. One way or another, it is very often
possible for normal users to set up a chroot environment
with the device files that are needed. Maybe they do something
obscene with the admin. :-) So chroot() is useful for users.

In my case, I _am_ the admin and I just don't want to run
every damn little test program and hack as root.




Re: [PATCH] User chroot

2001-06-27 Thread Albert D. Cahalan

H. Peter Anvin writes:
> "Albert D. Cahalan" wrote:

>> BTW, it is way wrong that /dev/zero should be needed at all.
>> Such use is undocumented ("man zero", "man mmap") anyway, and
>> AFAIK one should use mmap() with MAP_ANON instead. Not that
>> the documentation on MAP_ANON is any good either, but at least
>> the mere existence of the flag is mentioned.
>
> RTFM(POSIX).

No manual entry for RTFM in section POSIX

Seriously:

1. both features ought to be documented in the man pages
   (I did submit a man page too, back in 1996)

2. it is slow and nasty to open /dev/zero for getting memory
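
For comparison, a minimal sketch of getting zero-filled memory without
touching /dev/zero. It uses Python's mmap module on Linux, where
mmap.MAP_ANONYMOUS is available; the helper name is made up:

```python
import mmap


def anonymous_buffer(length):
    """Map `length` bytes of zero-filled memory that no file backs."""
    # fileno -1 together with MAP_ANONYMOUS means there is nothing to open.
    return mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)


buf = anonymous_buffer(mmap.PAGESIZE)
buf[:5] = b"hello"
print(buf[:5])    # b'hello'
print(buf[5:8])   # b'\x00\x00\x00'
buf.close()
```

No open(), no file descriptor to leak, and it keeps working inside a
chroot that has no device nodes at all.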



Re: [PATCH] User chroot

2001-06-27 Thread Albert D. Cahalan

H. Peter Anvin writes:
> Albert D. Cahalan wrote:

>> Normal users can use an environment provided for them.
>>
>> While trying to figure out why the "heyu" program would not
>> work on a Red Hat box, I did just this. As root I set up all
>> the device files needed, along with Debian libraries and the heyu
>> executable itself. It was annoying that I couldn't try out
>> my chroot environment as a regular user.
>>
>> Creating the device files isn't a big deal. It wouldn't be
>> hard to write a setuid app to make the few needed devices.
>> If we had per-user limits, "mount --bind /dev/zero /foo/zero"
>> could be allowed. One way or another, devices can be provided.
>
> Hell no!  This would give the user a way to subvert root or other
> system-provided things by having device nodes or such appear where
> they aren't expected.  NOT GOOD.

On every normal (default Red Hat or Debian at least) system
this is already trivial:

ln /dev/zero /tmp/zero
ln /dev/hda ~/hda
ln /dev/mem /var/tmp/README

So the user often can provide device nodes. The above is _worse_
than allowing "mount --bind ..." because the admin has to search
the whole filesystem to find such links.
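
To give an idea of what that search involves, here is a rough sketch
that walks a tree and reports every character or block device it finds.
The function name is hypothetical, and it simply skips anything it
cannot stat:

```python
import os
import stat


def find_device_nodes(root):
    """Yield (path, kind) for each character or block device under `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode   # lstat: do not follow symlinks
            except OSError:
                continue                        # vanished or unreadable entry
            if stat.S_ISCHR(mode):
                yield path, "char"
            elif stat.S_ISBLK(mode):
                yield path, "block"
```

An admin would run it over /tmp, /var/tmp, /home and so on, and treat
any hit outside /dev as suspicious.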

Never mind that though; it doesn't matter how the devices are
created. Social engineering can work. Once the device problem
is taken care of, chroot() becomes useful for normal users.

BTW, it is way wrong that /dev/zero should be needed at all.
Such use is undocumented ("man zero", "man mmap") anyway, and
AFAIK one should use mmap() with MAP_ANON instead. Not that
the documentation on MAP_ANON is any good either, but at least
the mere existence of the flag is mentioned.





Re: [PATCH] User chroot

2001-06-26 Thread Albert D. Cahalan

H. Peter Anvin writes:
> [somebody]

>> Have you ever wondered why normal users are not allowed to chroot?
>>
>> I have. The reasons I can figure out are:
>>
>> * Changing root makes it trivial to trick suid/sgid binaries to do
>>   nasty things.
>>
>> * If root calls chroot and changes uid, he expects that the process
>>   can not escape to the old root by calling chroot again.
>>
>> If we only allow user chroots for processes that have never been
>> chrooted before, and if the suid/sgid bits won't have any effect under
>> the new root, it should be perfectly safe to allow any user to chroot.
>
> Safe, perhaps, but also completely useless: there is no way the user
> can set up a functional environment inside the chroot.  In other
> words, it's all pain, no gain.

Normal users can use an environment provided for them.

While trying to figure out why the "heyu" program would not
work on a Red Hat box, I did just this. As root I set up all
the device files needed, along with Debian libraries and the heyu
executable itself. It was annoying that I couldn't try out
my chroot environment as a regular user.

Creating the device files isn't a big deal. It wouldn't be
hard to write a setuid app to make the few needed devices.
If we had per-user limits, "mount --bind /dev/zero /foo/zero"
could be allowed. One way or another, devices can be provided.
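
A rough sketch of what such a helper would have to do. The device list
is an assumption for illustration (/dev/null and /dev/zero with their
usual Linux numbers 1,3 and 1,5), not the set heyu needs, and both
function names are made up:

```python
import os
import stat

# Linux character devices as (major, minor).
NEEDED_DEVICES = {"null": (1, 3), "zero": (1, 5)}


def plan_devices(chroot_dir, devices=NEEDED_DEVICES):
    """Return (path, mode, device number) triples for the nodes to create."""
    plan = []
    for name, (major, minor) in sorted(devices.items()):
        path = os.path.join(chroot_dir, "dev", name)
        plan.append((path, stat.S_IFCHR | 0o666, os.makedev(major, minor)))
    return plan


def create_devices(plan):
    """Create the planned nodes. Needs root, so it is not called here."""
    for path, mode, dev in plan:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        os.mknod(path, mode, dev)   # the final mode is still subject to umask


for path, mode, dev in plan_devices("/foo"):
    print(path, oct(mode), os.major(dev), os.minor(dev))
```

Keeping the list fixed inside the helper is the point: a setuid program
that creates whatever node the caller asks for would be a hole.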





Re: EXT2 Filesystem permissions (bug)?

2001-06-26 Thread Albert D. Cahalan

Kenneth Johansson writes:

> Does Linux even support the sticky bit (t)? I can't see a reason
> to use it; why would I want the file to be stored in the swap?

It is not currently supported. Swapping out executables would
be very nice when using an NFS or CD-ROM filesystem, because
swap space is much faster.

> Also I think S (setuid but no execute bit) has something to
> do with file locking but I'm not sure exactly how it works.

Yeah, if you mount with mandatory locking enabled, it marks the file
for mandatory locking (on Linux the marker is setgid without group
execute). It's a UNIX feature.
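
A rough way to test for these bits from a script, sketched in Python.
The helper names are hypothetical. One assumption worth flagging: as far
as I know, the combination Linux looks at for mandatory locking is the
setgid bit with group execute cleared (an "S" in the group column of
"ls -l"), so that is what the sketch checks:

```python
import stat


def is_mandatory_lock_candidate(mode):
    """True if setgid is set and group execute is clear."""
    return bool(mode & stat.S_ISGID) and not (mode & stat.S_IXGRP)


def has_sticky_bit(mode):
    """True if the sticky bit (t) is set."""
    return bool(mode & stat.S_ISVTX)


print(is_mandatory_lock_candidate(0o2644))   # True
print(is_mandatory_lock_candidate(0o2755))   # False
print(has_sticky_bit(0o1777))                # True
```

Feed it os.stat(path).st_mode to check a real file.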



Re: FAT32 superiority over ext2 :-)

2001-06-24 Thread Albert D. Cahalan

Daniel Phillips writes:
> On Monday 25 June 2001 00:54, Albert D. Cahalan wrote:

>> By dumb luck (?), FAT32 is compatible with the phase-tree algorithm
>> as seen in Tux2. This means it offers full data integrity.
>> Yep, it whips your typical journalling filesystem. Look at what
>> we have in the superblock (boot sector):
>>
>> __u32  fat32_length;   /* sectors/FAT */
>> __u16  flags;          /* bit 8: fat mirroring, low 4: active fat */
>> __u8   version[2];     /* major, minor filesystem version */
>> __u32  root_cluster;   /* first cluster in root directory */
>> __u16  info_sector;    /* filesystem info sector */
>>
>> All in one atomic write, one can...
>>
>> 1. change the active FAT
>> 2. change the root directory
>> 3. change the free space count
>>
>> That's enough to atomically move from one phase to the next.
>> You create new directories in the free space, and make FAT
>> changes to an inactive FAT copy. Then you write the superblock
>> to atomically transition to the next phase.
>
> Yes, FAT is what inspired me to go develop the algorithm.  However, two
> words: 'lost clusters'.  Now that may just be an implementation detail ;-)

What lost clusters?

Set bit 8 of "flags" (A_BF_BPBExtFlags to Microsoft) to disable
FAT mirroring. Then the low 4 bits are a 0-based value that
indicates which copy of the FAT should be used.
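
A small sketch of the bit fiddling. I am assuming here that "bit 8" is
counted from 1, which gives mask 0x0080; that agrees with my reading of
Microsoft's FAT documentation (bit 7 when counting from 0), but check it
before relying on it. The names are made up for the example:

```python
MIRROR_DISABLED = 0x0080    # assumption: "bit 8" counted from 1
ACTIVE_FAT_MASK = 0x000F    # low 4 bits: 0-based number of the active FAT


def decode_flags(flags):
    """Return (mirroring_enabled, active_fat) for a 16-bit flags value."""
    return not (flags & MIRROR_DISABLED), flags & ACTIVE_FAT_MASK


def select_single_fat(flags, fat_index):
    """Disable mirroring and make `fat_index` the only active FAT."""
    if not 0 <= fat_index <= ACTIVE_FAT_MASK:
        raise ValueError("active FAT number must fit in 4 bits")
    return (flags & ~ACTIVE_FAT_MASK & 0xFFFF) | MIRROR_DISABLED | fat_index


print(decode_flags(0x0000))                 # (True, 0)
print(hex(select_single_fat(0x0000, 1)))    # 0x81
print(decode_flags(0x0081))                 # (False, 1)
```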

Assume we have 2 copies of the FAT, as is (was?) common. I'll call
them X and Y. When we mount the filesystem, we disable FAT mirroring
and mark FAT X active.

Now we can make changes to FAT Y without affecting filesystem
integrity. Windows will not use FAT Y. As is usual with the
phase-tree algorithm, we use free space to create a new structure
beside the old one.

Time for a phase change:

We have FAT Y, currently inactive, updated on disk.
FAT X is active; it describes the current on-disk state.
We have a new root directory on disk, sitting in free space.
We have a new filesystem info sector on disk, sitting in free space.

We write one single sector, then:

FAT X becomes inactive, and will not be used by Windows.
FAT Y becomes active; it describes the new on-disk state.
The old root directory is marked free in FAT Y. Good!
The old filesystem info sector is marked free in FAT Y. Good!

Once the superblock goes to disk, FAT X may be written to.
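
The phase change itself can be sketched as a pure function that builds
the new boot sector image, which would then go to disk as one 512-byte
write. The byte offsets (flags at 40, root_cluster at 44, info_sector at
48) and the 0x0080 "mirroring disabled" mask are my assumptions from the
usual FAT32 boot sector layout, not something stated in this thread, and
the function name is hypothetical:

```python
import struct

FLAGS_OFF, ROOT_OFF, INFO_OFF = 40, 44, 48   # assumed boot sector offsets
MIRROR_DISABLED = 0x0080                     # assumed mask for "bit 8"
ACTIVE_FAT_MASK = 0x000F


def next_phase_sector(sector, new_fat, new_root_cluster, new_info_sector):
    """Return a copy of `sector` that points at the next phase."""
    if len(sector) != 512:
        raise ValueError("expected one 512-byte sector")
    if not 0 <= new_fat <= ACTIVE_FAT_MASK:
        raise ValueError("active FAT number must fit in 4 bits")
    out = bytearray(sector)
    (flags,) = struct.unpack_from("<H", out, FLAGS_OFF)
    flags = (flags & ~ACTIVE_FAT_MASK & 0xFFFF) | MIRROR_DISABLED | new_fat
    struct.pack_into("<H", out, FLAGS_OFF, flags)             # active FAT
    struct.pack_into("<I", out, ROOT_OFF, new_root_cluster)   # root directory
    struct.pack_into("<H", out, INFO_OFF, new_info_sector)    # free space count
    return bytes(out)
```

Everything else in the sector is copied unchanged, so the old and new
images differ only in the three fields that define the phase.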



FAT32 superiority over ext2 :-)

2001-06-24 Thread Albert D. Cahalan


By dumb luck (?), FAT32 is compatible with the phase-tree algorithm
as seen in Tux2. This means it offers full data integrity.
Yep, it whips your typical journalling filesystem. Look at what
we have in the superblock (boot sector):

__u32  fat32_length;   /* sectors/FAT */
__u16  flags;          /* bit 8: fat mirroring, low 4: active fat */
__u8   version[2];     /* major, minor filesystem version */
__u32  root_cluster;   /* first cluster in root directory */
__u16  info_sector;    /* filesystem info sector */

All in one atomic write, one can...

1. change the active FAT
2. change the root directory
3. change the free space count

That's enough to atomically move from one phase to the next.
You create new directories in the free space, and make FAT
changes to an inactive FAT copy. Then you write the superblock
to atomically transition to the next phase.




Re: Shared memory quantity not being reflected by /proc/meminfo

2001-06-23 Thread Albert D. Cahalan

Allan Duncan writes:

> Since the 2.4.x advent of shm as tmpfs or thereabouts,
> /proc/meminfo shows shared memory as 0.  It is in
> reality not zero, and is being allocated, and shows
> up in /proc/sysvipc/shm and /proc/sys/kernel/shmall
> etc..
> Neither 2.4.6-pre5 nor 2.4.5-ac17 have the correct
> display.

You misunderstood what 2.2.xx kernels were reporting.
The "shared" memory in /proc/meminfo refers to something
completely unrelated to SysV shared memory. This is no
longer calculated because the computation was too costly.




Re: For comment: draft BIOS use document for the kernel

2001-06-22 Thread Albert D. Cahalan

Alan Cox writes:
> [somebody]

>> I could not find any reference to BIOS int 0x15, function 0x87,
>> block-move, used to copy the kernel to above the 1 megabyte
>> real-mode boundary. I think this is still used.
>
> I don't think the kernel has ever used it. The path has always been to
> enter 32bit mode then relocate/uncompress the kernel, then run it

There are several non-kernel BIOS users:

lilo
grub
syslinux
XFree86 (using virtual-8086 to run a video BIOS for a second card?)
dosemu?
loadlin?
the boot block that reads ext2 (in 1 kB -- damn what a hack)




Re: [RFC][PATCH] cutting up struct kernel_stat into cpu_stat

2001-06-21 Thread Albert D. Cahalan

Zach Brown writes:

> The attached patch-in-progress removes the per-cpu statistics from
> struct kernel_stat and puts them in a cpu_stat structure, one per cpu,
> cacheline padded.  The data is still collated and presented through
> /proc/stat, but another file /proc/cpustat is also added.  The locking
> is as nonexistent as it was with kernel_stat, but who cares, they're
> just fuzzy stats to be eyeballed by system tuners :).

Hey! The lack of atomicity causes "top" to do one of 3 things
for the idle time report, depending on the version:

1. negative numbers
2. wrap-around (4200.00% idle)
3. truncate to zero (the numbers don't add up)

This is because top sees the idle time run backwards for a moment.
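
A rough sketch of how the three behaviours can arise from one backwards
step in the idle counter. The function names and the 32-bit counter width
are assumptions for illustration; this is not the actual top source.

```python
# Sketch: turning two idle-tick samples into a percentage, three ways.
# Names and the 32-bit width are assumptions, not taken from top itself.

def idle_percent_signed(prev_idle, cur_idle, total_ticks):
    """Naive signed delta: goes negative if idle ran backwards."""
    return 100.0 * (cur_idle - prev_idle) / total_ticks


def idle_percent_unsigned(prev_idle, cur_idle, total_ticks, bits=32):
    """Unsigned delta: a backwards step wraps around to a huge value."""
    delta = (cur_idle - prev_idle) % (1 << bits)
    return 100.0 * delta / total_ticks


def idle_percent_clamped(prev_idle, cur_idle, total_ticks):
    """Clamp at zero: no absurd values, but the columns stop adding up."""
    return 100.0 * max(cur_idle - prev_idle, 0) / total_ticks
```

Feeding all three the same pair of samples where the counter steps back
by two ticks (say 1000 then 998 over a 100-tick interval) gives a
negative number, an enormous one, and zero respectively.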



Re: [OT] Threads, inelegance, and Java

2001-06-21 Thread Albert D. Cahalan

Rob Landley writes:
> On Wednesday 20 June 2001 15:53, Martin Dalecki wrote:
>> Mike Harrold wrote:

>> super computing, hmm what about some PowerPC CPU variant - they're very
>> competitive in terms of cost and FPU performance! Transmeta isn't the
>> adequate choice here.
>
> You honestly think you can fit 142 PowerPC processors in a single 1U,
> air cooled?

That 142 would be what, a SHARC DSP system? It sure doesn't look
like Transmeta's Crusoe. The best I found was 6 and 8 per 1U:

"RLX has managed to tuck 24 servers into a 3U enclosure" --> 8/U
"WebBunker units can hold 12 processors [in 2U]" --> 6/U

For PowerPC I found 32/U to 40/U, in increments of 9U.
See www.mc.com for an example. The processor gets you 4 (four!)
floating-point fused multiply-add operations per cycle, typically
at 400 MHz. Being optimistic, that's a teraflop in 9U.
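
The arithmetic behind that estimate, as a quick check. Counting one fused
multiply-add as two floating-point operations is an assumption (it is the
usual convention for peak figures, but the mail does not say so), and the
names below are made up for the example.

```python
# Worked arithmetic for the 9U estimate. Treating one fused multiply-add
# as two floating-point operations is an assumption, not from the text.
FMA_PER_CYCLE = 4
FLOPS_PER_FMA = 2            # assumption: one multiply plus one add
CLOCK_HZ = 400_000_000       # 400 MHz
RACK_UNITS = 9


def peak_flops(processors_per_u: int) -> int:
    """Peak floating-point operations per second for a full 9U chassis."""
    per_cpu = FMA_PER_CYCLE * FLOPS_PER_FMA * CLOCK_HZ
    return per_cpu * processors_per_u * RACK_UNITS


low = peak_flops(32)    # 921_600_000_000, about 0.92 TFLOPS
high = peak_flops(40)   # 1_152_000_000_000, about 1.15 TFLOPS
```

So the teraflop figure holds at the 40/U density and falls a little short
at 32/U, and both are peak numbers rather than sustained ones.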

> Liquid air cooled, maybe...

Nope, plain old air or conduction.

If you're going to rant about off-topic junk, at least try to
throw in a few useful references so people can check facts and
maybe take advantage of whatever it is you're ranting about.
(yeah, yeah, sorry about the VGA console thing)



Re: Alan Cox quote? (was: Re: accounting for threads)

2001-06-20 Thread Albert D. Cahalan

Rob Landley writes:

> My only real gripe with Linux's threads right now [...] is
> that ps and top and such aren't thread aware and don't group them
> right.
>
> I'm told they added some kind of "threadgroup" field to processes
> that allows top and ps and such to get the display right.  I haven't
> noticed any upgrades, and haven't had time to go hunting myself.

There was a "threadgroup" added just before the 2.4 release.
Linus said he'd remove it if he didn't get comments on how
useful it was, examples of usage, etc. So I figured I'd look at
the code that weekend, but the patch was removed before then!

There is nothing that ps and top can do about this problem.
I've certainly looked into the matter; much of the code is mine.
BTW, the version in debian-unstable is the most stable. :-)

These options might help a little bit: --forest -H f

> (Ever tried to submit a patch to the FSF?  They want you to sign
> legal documents.  That's annoying.  I usually just send the bug
> reports to red hat and let THEM deal with it...)

Submit patches to me, under the LGPL please. The FSF isn't likely
to care. What, did you think this was the GNU system or something?

> Linus's job is to keep code OUT of the kernel.  He has veto power,
> nothing else.  I suspect he's pre-emptively vetoing some stuff to
> keep the flood down to a level he can deal with.  Maybe someday
> we'll convince him to use some variant of source control (not
> necessarily CVS, how about just a separate mailing list of the
> individual patches as he applies them?  One linus can post to and
> that is read-only to everybody else?  HE always wants patches
> separated down nicely into individual messages with explanations,
> but WE have to get pre2-pre3 as one big patch lump.  With a
> patches-from-linus mailing list that he forwarded posts to, we'd
> know exactly when a patch went in and who it was from without
> bothering Linus. :)

How about a filesystem filter to spit out patches, or a filesystem
interface to version control?



Re: very strange (semi-)lockups in 2.4.5

2001-06-18 Thread Albert D. Cahalan

Pozsar Balazs writes:

> I'm having ~2 lockups a day. The following happens:
>  If I was under X, i only can use the magic-key, but no other keyboard (eg
> numlock) or mouse response, the screen freezes, processes stop.
>  If i was using textmode:
>   numlock still works
>   cursor blinks
>   processes stop (eg, gpm doesn't work, outputs freeze)
>   i can still switch vt's.
>   BUT, i can only type into a few vt's, last time into 3,5,6,7,8, but not
> into 1,2 or 4!
> 
> I cannot give you any traces, as i dont have any.
> 
> Also note that magic-key works, and it says that it umounts filesystems if
> i press magic-u, but next time at mount i see that reiserfs is replaying
> transactions.
> 
> 
> Any ideas?
> 
> The machine is a P3-750, 512M ram, abit vp6 mb. No overclocking, and it
> passes memtest86.

I think I'm getting the same thing, but I don't have the magic-key
compiled in. I'm going to hook up a VT510 to the serial port, in case
this is just XFree86 crashing. For anyone collecting statistics:

kernels 2.4.4-pre6 (?) and now 2.4.6-pre3
plain Pentium MMX @ 200 MHz
Intel motherboard -- see below
stable since 1996, on a UPS, dust-free, and the fan works
one lockup per day with desktop usage

In case the serial console doesn't work, could someone post plans
for a safe NMI board? (both ISA and PCI) The best I found:
http://www.sandelman.ottawa.on.ca/linux-ipsec/html/2000/02/msg00425.html
http://www.sandelman.ottawa.on.ca/linux-ipsec/html/2000/02/msg00391.html
(for PCI you're supposed to assert SERR# on the clock -- how?)

00:00.0 Host bridge: Intel Corporation 430TX - 82439TX MTXC (rev 01)
00:07.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 01)
00:07.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 01)
00:11.0 Ethernet controller: Digital Equipment Corporation DECchip 21040 [Tulip] (rev 23)
00:13.0 Ethernet controller: Lite-On Communications Inc LNE100TX Fast Ethernet Adapter (rev 25)
00:14.0 VGA compatible controller: ATI Technologies Inc 3D Rage Pro 215GP (rev 5c)



Re: [patch] nonblinking VGA block cursor

2001-06-15 Thread Albert D. Cahalan

Daniel Phillips writes:
> On Friday 15 June 2001 21:21, Albert D. Cahalan wrote:

>> Non-blinking cursors are just wrong. You need to patch your brain.
>> You really fucked up, because now apps can't restore your cursor
>> to proper behavior as defined by IBM.
>
> Just one question Albert: why doesn't my mouse cursor blink? ;-)

1. confusion with the text cursor, which should blink
2. need for continuous pixel-to-pixel accuracy with the mouse
3. you can wiggle your mouse as needed to find the mouse cursor

Apps do funny things when you try to wiggle the text cursor
with the arrow keys, and movement tends to be harshly constrained.
So the blinking is important.



Re: [patch] nonblinking VGA block cursor

2001-06-15 Thread Albert D. Cahalan

Leon Breedt writes:

> Attached is a patch to enforce a non-blinking, FreeBSD-syscons like
> block cursor in console mode.
> 
> This is useful for laptop types, or people like me who really really
> detest a blinking cursor.
> 
> NOTE: It disables the softcursor escape codes 
>   (/usr/src/linux/Documentation/VGA-softcursor.txt), since I don't 
>   ever want anything to change my cursor shape/style :)

I've seen this 666 times too often.

Non-blinking cursors are just wrong. You need to patch your brain.
You really fucked up, because now apps can't restore your cursor
to proper behavior as defined by IBM.

The blinking cursor is implemented in your video hardware.
IBM knew what was right for you. Millions of people know that
the blinking cursor is good. It is so right that a proper GUI
will implement the blinking cursor even without hardware support.

Of course FreeBSD has a block cursor. It was easy to program,
and it seems nice to the pot-smoking hippies out in Berkeley.
FreeBSD doesn't define standards. FreeBSD breaks standards.
(zombie creation, "ps -ef", partition tables, pty allocation...)
Gee, kind of like Microsoft, except Microsoft got the cursor right!

Ever wonder why IBM supports Linux instead of FreeBSD? Hmmm?



Re: Client receives TCP packets but does not ACK

2001-06-15 Thread Albert D. Cahalan

Mike Black writes:

> I'm concerned that you're probably just overrunning your IP stack:
...
> TCP is NOT a guaranteed protocol -- you can't just blast data from one port
> to another and expect it to work.

Yes you can. This is why we have TCP in fact.

> a tcp-write is NOT guaranteed -- and as you've seen -- a recv() isn't either
> (that's why you need timeouts).
> You're probably overrunning the tcp buffer on your "print" statement and
> truncating a block.
> I don't see where you're checking for EAGAIN or EWOULDBLOCK (see man
> send).

You do have to check for partial writes due to the UNIX API.
Then check for EAGAIN and EINTR at least.
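
A minimal sketch of that loop, written in Python rather than C for
brevity. The helper name send_all is made up for this example. It assumes
a connected stream socket, and that on a non-blocking socket it is
acceptable to wait in select() until the kernel can take more data.

```python
import select
import socket


def send_all(sock: socket.socket, data: bytes) -> int:
    """Send every byte of `data`, coping with short writes and retries."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            # send() may accept only part of the buffer (a partial write).
            sent += sock.send(view[sent:])
        except InterruptedError:
            # EINTR: a signal arrived before any data went out; retry.
            continue
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK on a non-blocking socket: wait until
            # the socket is writable again, then retry.
            select.select([], [sock], [])
    return sent
```

Python's own socket.sendall() already does this for blocking sockets, and
recent Python versions retry EINTR internally, so the explicit handlers
are there mainly to show which cases the C-level loop has to cover. No
application-level acknowledgement is involved; TCP itself handles
delivery once the bytes are handed to the kernel.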

> You need a layer-7 protocol that will guarantee your transactions -- once
> your client acks/naks your server I'll bet everything works hunky-dory.
> If you're not familiar with the OSI model
> http://www.csihq.com/~mike/students/networking/iso/isomodel.html

You don't need that crap. TCP/IP doesn't even fit the OSI model,
and we're missing much of the OSI stack AFAIK. (Do we have that
thing with 10-byte addresses? I think not.)



Re: Going beyond 256 PCI buses

2001-06-14 Thread Albert D. Cahalan

David S. Miller writes:
> Jeff Garzik writes:

>> According to the PCI spec it is -impossible- to have more than 256
>> buses on a single "hose", so you simply have to implement multiple
>> hoses, just like Alpha (and Sparc64?) already do.  That's how the
>> hardware is forced to implement it...
>
> Right, what userspace had to become aware of are "PCI domains" which
> is just another fancy term for a "hose" or "controller".
>
> All you have to do is (right now, the kernel supports this fully)
> open up a /proc/bus/pci/${BUS}/${DEVICE} node and then go:
> 
>   domain = ioctl(fd, PCIIOC_CONTROLLER, 0);
>
> Viola.
>
> There are only two real issues:

No, three.

0) The API needs to be taken out and shot.

   You've added an ioctl. This isn't just any ioctl. It's a
   wicked nasty ioctl. It's an OH MY GOD YOU CAN'T BE SERIOUS
   ioctl by any standard.

   Consider the logical tree:
   hose -> bus -> slot -> function -> bar

   Well, the hose and bar are missing. You specify the middle
   three parts in the filename (with slot and function merged),
   then use an ioctl to specify the hose and bar.

   Doing the whole thing by filename would be better. Else
   why not just say "screw it", open /proc/pci, and do the
   whole thing by ioctl? Using ioctl for both the most and
   least significant parts of the path while using a path
   for the middle part is Wrong, Bad, Evil, and Broken.

   Fix:

   /proc/bus/PCI/0/0/3/0/config   config space
   /proc/bus/PCI/0/0/3/0/0        the first bar
   /proc/bus/PCI/0/0/3/0/1        the second bar
   /proc/bus/PCI/0/0/3/0/driver   info about the driver, if any
   /proc/bus/PCI/0/0/3/0/event    hot-plug, messages from driver...

   Then we have arch-specific MMU cruft. For example the PowerPC
   defines bits that affect caching, ordering, and merging policy.
   The chips from IBM also define an endianness bit. I don't think
   this ought to be an ioctl either. Maybe mmap() flags would be
   reasonable. This isn't just for PCI; one might do an anon mmap
   with pages locked and cache-incoherent for better performance.
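
To make the proposed naming concrete, here is a small sketch that builds
such paths. The layout is only the proposal from this mail, not an
existing kernel interface, and reading the four numeric components as
hose/bus/slot/function (following the logical tree above) is an
assumption. The helper name pci_node is hypothetical.

```python
# Sketch of the proposed (not existing) path layout. Mapping the numeric
# components to hose/bus/slot/function is an assumed reading of the
# example paths; pci_node is a made-up helper name.

def pci_node(hose: int, bus: int, slot: int, function: int, leaf) -> str:
    """Build a path in the proposed /proc/bus/PCI tree.

    `leaf` is either a BAR index (int) or one of the named files.
    """
    if isinstance(leaf, str) and leaf not in ("config", "driver", "event"):
        raise ValueError("unknown leaf: " + leaf)
    return f"/proc/bus/PCI/{hose}/{bus}/{slot}/{function}/{leaf}"
```

For example, pci_node(0, 0, 3, 0, "config") names the config space of
function 0 in slot 3, and pci_node(0, 0, 3, 0, 1) names its second bar,
so every level of the tree is addressed by filename and no ioctl is
needed to pick the hose or the bar.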

> 1) Extending the type bus numbers use inside the kernel.
...
> 2) Figure out what to do wrt. sys_pciconfig_{read,write}()
...



Re: Going beyond 256 PCI buses

2001-06-13 Thread Albert D. Cahalan

Tom Gall writes:

>   I was wondering if there are any other folks out there like me who
> have the 256 PCI bus limit looking at them straight in the face?

I might. The need to reserve bus numbers for hot-plug looks like
a quick way to waste all 256 bus numbers.

> each PHB has an
> additional id, then each PHB can have up to 256 buses.

Try not to think of him as a PHB with an extra id. Lots of people
have weird collections. If your boss wants to collect buses, well,
that's his business. Mine likes boats. It's not a big deal, really.

(Did you not mean your pointy-haired boss has mental problems?)





Re: IBM PPC 405 series little endian?

2001-06-11 Thread Albert D. Cahalan

Zehetbauer Thomas writes:

> Has someone experimented with running linux in little-endian mode on IBM
> PowerPC 405 (Walnut) yet?

I doubt it. You are at least the 3rd person to want little-endian.
Somebody at Matrox posted a patch for little-endian on the 74xx.
You need a bit more than that though; you need to change the way
page table bits get set and modify head_4xx.S IIRC.



Re: temperature standard - global config option?

2001-06-09 Thread Albert D. Cahalan

Michael H. Warfiel writes:
> On Fri, Jun 08, 2001 at 05:16:39PM -0400, Albert D. Cahalan wrote:

>> The bits are free; the API is hard to change.
>> Sensors might get better, at least on high-end systems.
>> Rounding gives a constant 0.15 degree error.
>> Only the truly stupid would assume accuracy from decimal places.
>> Again, the bits are free; the API is hard to change.
...
>   No...  The average person, NO, the vast majority of people,
> DO assume accuracy from decimal places and honestly do not know the
> difference between precision and accuracy.  I've had comments on this
> thread in private E-Mail the reinforce this impression.

I hope you don't think people would assume that a "float" always
has useful data in all 23 fraction bits. It is a similar case.

So here you go, a kernel-safe conversion from C to K. It works
from 0 to 238 degrees C. Print as hex, so user code can toss it
into a union or maybe abuse scanf. Adjust as needed for F to K
or for hardware with greater resolution.

/* unsigned int degrees C --> float degrees K (returned as IEEE 754 bits) */
unsigned ic_to_fk(unsigned c){
  unsigned exponent;
  unsigned tmp;

  /* Kelvin shifted 23 left: 273.15 * 2^23 is 0x88933333 */
  tmp = (c<<23) + 0x88933333;
  exponent = 127; /* IEEE floating-point bias */
  while(tmp&0xff000000){ /* shift down until the leading 1 sits at bit 23 */
    tmp >>= 1;
    exponent++;
  }
  tmp &= 0x007fffff; /* keep only the fraction */
  tmp |= exponent<<23;
  return tmp;
}
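
On the user side, the printed hex value can be turned back into a
float by copying the bits, which avoids both the union and the scanf
abuse. This is only a sketch: it assumes the kernel prints the 32-bit
pattern as hex digits and that float is IEEE 754 single precision.
The name hex_to_kelvin is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse hex text and reinterpret the bits as a float. */
float hex_to_kelvin(const char *hex)
{
    uint32_t bits = (uint32_t)strtoul(hex, NULL, 16);
    float k;

    assert(sizeof k == sizeof bits);  /* assumes a 32-bit IEEE 754 float */
    memcpy(&k, &bits, sizeof k);      /* bit copy, no numeric conversion */
    return k;
}
```

For example, hex_to_kelvin("43889333") comes out at roughly 273.15,
which is 0 degrees C.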



checker suggestion

2001-06-09 Thread Albert D. Cahalan

Struct padding is a problem. Really, there shouldn't be any
implicit padding. Implicit padding causes:

1. security leaks when such structs are copied to userspace
   (the implicit padding is uninitialized, and so may contain
   a chunk of somebody's private key or password)

2. bloat, when struct members could be reordered to eliminate
   the need for padding
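
Both points can be seen in a small sketch. The struct layouts and
function names here are invented for illustration, and the sizes in
the comments assume a typical ABI where int is 4 bytes with 4-byte
alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts, for illustration only. */
struct padded {      /* typically 12 bytes */
    char a;          /* usually followed by 3 bytes of implicit padding */
    int  b;
    char c;          /* usually followed by 3 bytes of tail padding */
};

struct reordered {   /* same members, typically 8 bytes */
    int  b;
    char a;
    char c;          /* usually followed by 2 bytes of tail padding */
};

/* Bytes in each struct that belong to no member. */
size_t padded_waste(void)
{
    return sizeof(struct padded) - (sizeof(char) + sizeof(int) + sizeof(char));
}

size_t reordered_waste(void)
{
    return sizeof(struct reordered) - (sizeof(int) + sizeof(char) + sizeof(char));
}

/* The usual defence against the leak: zero the whole object, padding
   included, before filling in the members and copying it out. */
void fill_padded(struct padded *p, char a, int b, char c)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->a = a;
    p->b = b;
    p->c = c;
}
```

A checker could flag any struct where the waste is nonzero, and
suggest a member order that makes it smaller.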



Re: temperature standard - global config option?

2001-06-08 Thread Albert D. Cahalan

Michael H. Warfiel writes:
> On Fri, Jun 08, 2001 at 05:16:39PM -0400, Albert D. Cahalan wrote:

>> The bits are free; the API is hard to change.
>> Sensors might get better, at least on high-end systems.
>> Rounding gives a constant 0.15 degree error.
>> Only the truly stupid would assume accuracy from decimal places.
>> Again, the bits are free; the API is hard to change.
...
>   No...  The average person, NO, the vast majority of people,
> DO assume accuracy from decimal places and honestly do not know the
> difference between precision and accuracy.  I've had comments on this
> thread in private E-Mail the reinforce this impression.

Fine. Most user apps can round to the nearest degree, or even
display the values "cool", "warm", "hot", and "BURNING!".
The kernel API should not be so limiting.

>   Even the rounding error vis-a-vis the .15 is silly and irrelevant!
> If the sensor is +- 1 degree, you can't even measure the rounding error,
> even if you HAVE two decimal places.  With that degree of accuracy, you
> are no better off than 273 with no decimal places.  Worrying about rounding
> error on .15 when the accuracy is in the units is exactly the kind of
> misinformed false precision that I worry about.  You actually though that
> the .15 was significant enough to worry about round error when, in fact,
> it will be impossible to measure with the equipment available in the
> environment of discourse.

The 0.15 may mean the difference between:

a.  less than 0.005 chance of exceeding 370 degrees
b.  less than 0.01 chance of exceeding 370 degrees

for a measurement that might be 365 degrees.

>> One might provide other numbers to specify accuracy and precision.
>
>   Now...  That I can agree with and it would make absolute sense.
> Especially if we were discussing lab grade or scientific grade measure
> equipment and measurements.  In fact, that would be a requirement for
> any validity to be attached to measurements of that level of precision.

No, at any level of precision. I'd sure want to know if the device
is specified as "resolution 8 degrees, standard deviation 23".

This information is fairly important. The user is responsible for
defining acceptable risk, and the app should be able to provide a
warning or shutdown based on this.

For typical PC hardware, one might assume that the device is a
cheap piece of junk 2 mm below the CPU. (with quite a bit of lag!)
The lag ought to be specified too of course.
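
As a rough sketch of what "other numbers" could look like, here is a
record that carries the sensor's specification along with the value,
and a check that uses it. Nothing here is an existing kernel
interface; the struct, its field names, the centikelvin units, and
should_warn are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record; field names and units are assumptions. */
struct temp_reading {
    uint32_t value_ck;       /* measured temperature, centikelvin */
    uint32_t resolution_ck;  /* smallest step the sensor reports */
    uint32_t stddev_ck;      /* specified standard deviation */
    uint32_t lag_ms;         /* how stale the reading may be */
};

/* Warn when the limit lies within `sigmas` standard deviations (plus
   one resolution step) of the reading. The user picks `sigmas` to
   express the acceptable risk. */
int should_warn(const struct temp_reading *r, uint32_t limit_ck,
                uint32_t sigmas)
{
    uint64_t worst;

    assert(r != 0);
    worst = (uint64_t)r->value_ck + r->resolution_ck
          + (uint64_t)sigmas * r->stddev_ck;
    return worst >= limit_ck;
}
```

With a reading of 365.00 K and a limit of 370.00 K, a sensor with a
1-degree standard deviation passes a 3-sigma check, while one
specified as "resolution 8 degrees, standard deviation 23" fails
even at 1 sigma.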



Re: temperature standard - global config option?

2001-06-08 Thread Albert D. Cahalan

John Chris Wren writes:

> coupling to the CPU that is about as bad as it can get.  You've got an epoxy
> housing of an inconsistent shape in contact with ceramic.  The actual
> contact point is miniscule.  There's no thermal paste, and often, I've seen
> the sensors not quite raised high enough to contact the chip (you should be
> able to rack a business card across the empty socket and feel a slight
> "bump" as you touch the sensor.  If not, you need to bend it up slightly, to
> give better physical contact to the CPU).
> 
> But in spite of all this, you're not really measure the critical
> temperature, which is junction tempature.  Yes, case tempature has *some*

There are processors with temperature measurement built right
into the silicon.

> For the record, in the course of a normal day, I see my temperatures
> fluctuate from 48C with the house A/C set to 73, to 56C when I open the
> doors, and let it get up to 76 in the house.  That's 8C (14.4F) over a 3F
> change in ambient.

This makes sense. Heat increases resistance, which generates heat.
At some point, a tiny increase will cause thermal run-away.




Re: temperature standard - global config option?

2001-06-08 Thread Albert D. Cahalan

L. K. writes:
> On Fri, 8 Jun 2001, Albert D. Cahalan wrote:

>> The bits are free; the API is hard to change.
>> Sensors might get better, at least on high-end systems.
>> Rounding gives a constant 0.15 degree error.
>> Only the truly stupid would assume accuracy from decimal places.
>> Again, the bits are free; the API is hard to change.
>>
>> One might provide other numbers to specify accuracy and precision.
>
> I really do not belive that for a CPU or a motherboard +- 1 degree would
> make any difference.
>
> If a CPU runs fine at, say, 37 degrees C, I do not belive it will have any
> problems running at 38 or 36 degrees. I support the ideea of having very
> good sensors for temperature monitoring, but CPU and motherboard
> temperature do not depend on the rise of the temperature of 1 degree, but
> when the temperature rises 10 or more degrees. I hope you understand what
> I want to say.

Of course I understand. Motorola offers 4-degree resolution,
with a random offset of up to 12 degrees. (calibration is possible)
You seem to need another reminder that THE BITS ARE FREE.

Why would you even consider trying to squeeze out a few bits?
You can't be absolutely sure that they will never be useful.



Re: temperature standard - global config option?

2001-06-08 Thread Albert D. Cahalan

Michael H. Warfiel writes:

> We don't have sensors that are accurate to 1/10 of a K and certainly not
> to 1/100 of a K.  Knowing the CPU temperature "precise" to .01 K when
> the accuracy of the best sensor we are likely to see is no better than
> +- 1 K is just about as relevant as negative absolute temperatures.
...
>   Even if we had or could, anticiplate, sensors with a +- .01 K,
> the relevance of knowing the CPU temperature to that precision is
> lost on me.  I see no sense in stuffing a field with meaningless
> bits just because the field will hold them.  In fact, this "false precision"
> quickly leads to the false impression of accuracy.  Based on several
> messages I have seen on this thread and in private E-Mail, there are a
> number of people who don't seem to grasp the fundamental difference
> between precision and accuracy and truely don't understand that adding
> meaningless precision like this adds nothing to the accuracy.
>
>   I can see maybe making it precise to .1 K.  But stuffing the bits
> in there to be precise to .01 K just because we have the bits and not
> because we have any realistic information to fill the bits in with, is
> just silly to me.  Just as silly as allowing for negative numbers in an
> absolute temperature field.  We have the bits to support it, but why?

The bits are free; the API is hard to change.
Sensors might get better, at least on high-end systems.
Rounding gives a constant 0.15 degree error.
Only the truly stupid would assume accuracy from decimal places.
Again, the bits are free; the API is hard to change.

One might provide other numbers to specify accuracy and precision.




Re: temperature standard - global config option?

2001-06-07 Thread Albert D. Cahalan

Chris Boot writes:

>> Kelvins good idea in general - it is always positive ;-)
>>
>> 0.01*K fits in 16 bits and gives reasonable range.
...
> OK, I think by now we've all agreed the following:
>  - The issue is NOT displaying temperatures to the user, but a userspace
>    program reading them from the kernel.  The userspace program itself can
>    do temperature conversions for the user if he/she wants.
>  - The most preferable units would be decikelvins, as the value can give a
>    relatively precise as well as wide range of numbers ranging from absolute
>    zero to about 6340 degrees Celsius ((65535 / 10) - 273) which is well
>    within anything that a computer can operate.  It also gives us a good
>    base for all sorts of other temperature sensing devices.
>
> Do we all agree on those now?

I nearly do.

There isn't any need to cram the data into 16 bits.
The offset to Celsius is 273.15 degrees.
So hundredths of a degree, in Kelvin, is a better choice.
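
To make the difference concrete, here is a small sketch comparing
the two units. The function and macro names are invented for this
example. In centikelvin the 273.15 offset is an exact integer; in
decikelvin it has to be rounded, so every converted value is off
by 0.05 degrees.

```c
#include <assert.h>
#include <stdint.h>

#define CK_OFFSET 27315  /* 273.15 K, exact in centikelvin */
#define DK_OFFSET 2732   /* 273.15 K rounded to decikelvin: 0.05 K high */

/* Centi-degrees Celsius -> centikelvin; the offset is exact. */
uint32_t ccelsius_to_ck(int32_t cc)
{
    assert(cc >= -CK_OFFSET);  /* nothing below absolute zero */
    return (uint32_t)(cc + CK_OFFSET);
}

/* Whole degrees Celsius -> decikelvin; carries the rounded offset. */
uint32_t celsius_to_dk(int32_t c)
{
    assert(c >= -273);
    return (uint32_t)(c * 10 + DK_OFFSET);
}
```

So 0 degrees C is 27315 cK exactly, but comes out as 2732 dK, which
reads back as 273.2 K.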



Re: CacheFS

2001-06-07 Thread Albert D. Cahalan

Jan Kasprzak writes:

> Another goal is to use the Linux filesystem
> as a backing store (as opposed to the block device or single large file
> used by CODA).
...
> - kernel module, implementing the filesystem of the type "cachefs"
>   and a character device /dev/cachefs
> - user-space daemon, which would communicate with the kernel
>   over /dev/cachefs and which would manage the backing store
>   in a given directory.
>
>   Every file on the front filesystem (NFS or so) volume will be cached
> in two local files by cachefsd: The first one would contain the (parts of)
...
> * Should the cachefsd be in user space (as it is in the prototype
> implementation) or should it be moved to the kernel space? The
> former allows probably better configuration (maybe a deeper
> directory structure in the backing store), but the later is
> faster as it avoids copying data between the user and kernel spaces.

I think that, if speed is your goal, you should have the kernel
code use swap space for the cache. Look at what tmpfs does, but
running over top of tmpfs leaves you with the overhead of running
two filesystems and a daemon. It is better to be direct.

Maybe this shouldn't even be a filesystem. You could have a general
way to flag a filesystem as being significantly slower than swap.




Re: temperature standard - global config option?

2001-06-07 Thread Albert D. Cahalan

L. K. writes:

> Why not make it in Celsius ? Is more easy to read it this way.

No, because then the software must handle negative numbers for
cooled computers. CentiKelvin is fine. Do C=cK/100-273.15 if you
really must... but you still have a number that is useless to
a human. Humans need a seconds-to-destruction value or an alarm.

Negative temperatures do not really exist.
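
If an app really must show Celsius, the sign handling stays in user
code where it belongs. A minimal sketch, with a made-up function
name, that does C=cK/100-273.15 in integer math and formats the
result with two decimals:

```c
#include <assert.h>
#include <stdio.h>

/* Format centikelvin as degrees Celsius, integer math only.
   Returns what snprintf returns (length of the formatted text). */
int ck_to_celsius_str(unsigned long ck, char *buf, size_t len)
{
    long cc = (long)ck - 27315L;  /* centi-degrees C, may be negative */
    unsigned long mag = cc < 0 ? (unsigned long)(-cc) : (unsigned long)cc;

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s%lu.%02lu",
                    cc < 0 ? "-" : "", mag / 100, mag % 100);
}
```

The kernel side never sees a negative number; only this display
code does.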








Re: Missing cache flush.

2001-06-06 Thread Albert D. Cahalan

David S. Miller writes:
> David Woodhouse writes:

>>> Call it flush_ecache_full() or something.
>>
>> Strange name. Why? How about __flush_cache_range()?
>
> How about flush_cache_range_force() instead?
>
> I want something in the name that tells the reader "this flushes
> the caches, even though under every other ordinary circumstance
> you would not need to".

"flush" means what to you?

write-back
write-back-and-invalidate
discard-and-invalidate

All 3 behaviors are useful to me, and a few more. I've been
using chunks of PowerPC assembly. Using PowerPC mnemonics...

dcba -- allocate a cache block with undefined content
dcbf -- write to RAM, then invalidate ("data cache block flush")
dcbi -- invalidate, discarding any data
dcbst -- initiate write if dirty
dcbt -- prefetch, hinting about future load instructions
dcbtst -- prefetch, hinting about future store instructions
dcbz -- allocate and zero a cache block (cacheable mem only!)

So dcbf_range() and dcbi_range() sound good to me. :-)
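
A dcbf_range() might look roughly like the sketch below. This is not
existing kernel code. The 32-byte block size is an assumption (real
code would take it from the CPU), and on anything that is not
PowerPC the per-block operation is a no-op stand-in so the loop can
still be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_BLOCK 32u  /* assumption: 32-byte cache blocks */

static void dcbf_block(const void *p)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ volatile ("dcbf 0,%0" : : "r"(p) : "memory");
#else
    (void)p;  /* no-op stand-in on other architectures */
#endif
}

/* Write back and invalidate every cache block touching
   [start, start+len). Returns the number of blocks visited. */
size_t dcbf_range(const void *start, size_t len)
{
    uintptr_t p = (uintptr_t)start & ~(uintptr_t)(CACHE_BLOCK - 1);
    uintptr_t end = (uintptr_t)start + len;
    size_t blocks = 0;

    if (len == 0)
        return 0;
    assert(end > (uintptr_t)start);  /* range must not wrap */
    for (; p < end; p += CACHE_BLOCK) {
        dcbf_block((const void *)p);
        blocks++;
    }
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ volatile ("sync" : : : "memory");  /* wait for the flushes */
#endif
    return blocks;
}
```

A dcbi_range() would have the same shape with dcbi in place of dcbf,
though dcbi is privileged and partial blocks at either end of the
range would lose neighbouring data.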



Re: Inconsistent "#ifdef __KERNEL__" on different architectures

2001-06-06 Thread Albert D. Cahalan

Paul Mackerras writes:

> The only valid reason for userspace programs to be including kernel
> headers is to get definitions that are part of the kernel API.  (And
> in fact others here will go further and assert that there are *no*
> valid reasons for userspace programs to include kernel headers.)
>
> If you want some atomic functions or whatever for your userspace
> program and the ones in the kernel look like they would be useful,
> then take a copy of the relevant kernel code if you like, but don't
> include the kernel headers directly.

Sure. That copy belongs in /usr/include/asm for all programs
to use, and it should match the libc that will be linked against.
(note: "copy", not a symlink)

Red Hat 7 gets this right:

$ ls -ldog /usr/include/asm /usr/include/linux
drwxr-xr-x    2 root         2048 Sep 28  2000 /usr/include/asm
drwxr-xr-x   10 root        10240 Sep 28  2000 /usr/include/linux

Debian's "unstable" is correct too:

$ ls -ldog /usr/include/asm /usr/include/linux
drwxr-xr-x    2 root         6144 Mar 12 15:57 /usr/include/asm
drwxr-xr-x   10 root        23552 Mar 12 15:57 /usr/include/linux

> This is why I added #ifdef __KERNEL__ around most of the contents
> of include/asm-ppc/*.h.  It was done deliberately to flush out those
> programs which are depending on kernel headers when they shouldn't.

What, is /usr/src/linux/asm/foo.h being used? I doubt it.

If /usr/include/asm is a link into /usr/src/linux, then you
have a problem with your Linux distribution. Don't blame the
apps for this problem.

Adding "#ifdef __KERNEL__" causes extra busywork for someone
trying to adapt kernel headers for userspace use. At least do
something easy to rip out. Three lines, all together at the top:

#ifndef __KERNEL__
#error Raw kernel headers may not be compatible with user code.
#endif



Re: symlink_prefix

2001-06-04 Thread Albert D. Cahalan

Alexander Viro writes:

> leaves ncp with its ioctls ugliness.

Authentication will be ugly. Joe mounts a filesystem, and does
not bother to authenticate. He gets world-accessible files.
Then Kevin authenticates as himself, and later as db_adm too.
Along comes Sue, who can authenticate the whole box as trusted.

The /fs/ext2 stuff is one of the nastiest hacks I've seen in
a long time, and it doesn't solve the authentication problem.

GUI users might like to see a dialog box pop up whenever they
hit restricted filesystem space. (example: an authentication tool
blocked on /dev/auth-notify or getting signals with info)




Re: Highmem Bigmem question

2001-06-01 Thread Albert D. Cahalan

[EMAIL PROTECTED] writes:

> This is probably an FAQ, but I read the FAQ and its not in there.

Odd.

> I have a machine with 2G of memory.  I compiled the kernel with the
> 4G memory option.  How much address space should each process be
> able to address?

3 GB for user stuff, or 3.5 GB with a patch

> Does this change if I use the 64G option?

No. Don't do that.

> I'm after 2.4 information.  Right now I am running on a 2.2 kernel
> and it looks like the user processes are limited to ~1G.

This is not a kernel problem. Try a libc upgrade, or use some
other way to allocate memory. At least sbrk() and mmap() can
be used.
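
For the mmap() route, a minimal sketch follows. The helper names big_alloc and big_free are made up for illustration, and it assumes a libc that exposes MAP_ANONYMOUS.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Ask the kernel directly for len bytes of zero-filled anonymous memory,
 * bypassing malloc(). Returns NULL on failure. */
static void *big_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give back memory obtained from big_alloc(); returns 0 on success. */
static int big_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

How much address space such calls can actually reach still depends on the kernel's user/kernel split and on where libc and the stack sit.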



Re: How to know HZ from userspace?

2001-05-30 Thread Albert D. Cahalan

Harald Welte writes:

> Is there any way to read out the compile-time HZ value of the kernel?
> 
> I had a brief look at /proc/* and didn't find anything.

Look again, this time with a sick mind. Got your barf bag?
Kubys made me do it.

//
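/***********************************************************************\
*   Copyright (C) 1992-1998 by Michael K. Johnson, [EMAIL PROTECTED]   *
*                                                                       *
*      This file is placed under the conditions of the GNU Library     *
*      General Public License, version 2, or any later version.        *
*      See file COPYING for information on distribution conditions.    *
\***********************************************************************/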

/* ...but Albert Cahalan wrote the really evil parts.
MKJ is only guilty for the macro */

/* Sets Hertz equal to the kernel's HZ, as seen in /proc. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#include <unistd.h>
#include <fcntl.h>

#ifndef HZ
#include <netinet/in.h>  /* htons */
#endif

long smp_num_cpus; /* number of CPUs */

#define BAD_OPEN_MESSAGE                                        \
"Error: /proc must be mounted\n"                                \
"  To mount /proc at boot you need an /etc/fstab line like:\n"  \
"  /proc   /proc   proc    defaults\n"                          \
"  In the meantime, mount /proc /proc -t proc\n"

#define STAT_FILE"/proc/stat"
static int stat_fd = -1;
#define UPTIME_FILE  "/proc/uptime"
static int uptime_fd = -1;
#define LOADAVG_FILE "/proc/loadavg"
static int loadavg_fd = -1;
#define MEMINFO_FILE "/proc/meminfo"
static int meminfo_fd = -1;

static char buf[1024];

/* This macro opens filename only if necessary and seeks to 0 so
 * that successive calls to the functions are more efficient.
 * It also reads the current contents of the file into the global buf.
 */
#define FILE_TO_BUF(filename, fd) do{                           \
    static int local_n;                                         \
    if (fd == -1 && (fd = open(filename, O_RDONLY)) == -1) {    \
        fprintf(stderr, BAD_OPEN_MESSAGE);                      \
        fflush(NULL);                                           \
        _exit(102);                                             \
    }                                                           \
    lseek(fd, 0L, SEEK_SET);                                    \
    if ((local_n = read(fd, buf, sizeof buf - 1)) < 0) {        \
        perror(filename);                                       \
        fflush(NULL);                                           \
        _exit(103);                                             \
    }                                                           \
    buf[local_n] = '\0';                                        \
}while(0)

unsigned long Hertz;
static void init_Hertz_value(void) __attribute__((constructor));
static void init_Hertz_value(void){
  unsigned long user_j, nice_j, sys_j, other_j;  /* jiffies (clock ticks) */
  double up_1, up_2, seconds;
  unsigned long jiffies, h;
  smp_num_cpus = sysconf(_SC_NPROCESSORS_CONF);
  if(smp_num_cpus==-1) smp_num_cpus=1;
  do{
    FILE_TO_BUF(UPTIME_FILE,uptime_fd);  sscanf(buf, "%lf", &up_1);
    /* uptime(&up_1, NULL); */
    FILE_TO_BUF(STAT_FILE,stat_fd);
    sscanf(buf, "cpu %lu %lu %lu %lu", &user_j, &nice_j, &sys_j, &other_j);
    FILE_TO_BUF(UPTIME_FILE,uptime_fd);  sscanf(buf, "%lf", &up_2);
    /* uptime(&up_2, NULL); */
  } while((long)( (up_2-up_1)*1000.0/up_1 )); /* want under 0.1% error */
  jiffies = user_j + nice_j + sys_j + other_j;
  seconds = (up_1 + up_2) / 2;
  h = (unsigned long)( (double)jiffies/seconds/smp_num_cpus );
  /* actual values used by 2.4 kernels: 32 64 100 128 1000 1024 1200 */
  switch(h){
  case   30 ...   34 :  Hertz =   32; break; /* ia64 emulator */
  case   48 ...   52 :  Hertz =   50; break;
  case   58 ...   62 :  Hertz =   60; break;
  case   63 ...   65 :  Hertz =   64; break; /* StrongARM /Shark */
  case   95 ...  105 :  Hertz =  100; break; /* normal Linux */
  case  124 ...  132 :  Hertz =  128; break; /* MIPS, ARM */
  case  195 ...  204 :  Hertz =  200; break; /* normal << 1 */
  case  253 ...  260 :  Hertz =  256; break;
  case  393 ...  408 :  Hertz =  400; break; /* normal << 2 */
  case  790 ...  808 :  Hertz =  800; break; /* normal << 3 */
  case  990 ... 1010 :  Hertz = 1000; break; /* ARM */
  case 1015 ... 1035 :  Hertz = 1024; break; /* Alpha, ia64 */
  case 1180 ... 1220 :  Hertz = 1200; break; /* Alpha */
  default:
#ifdef HZ
    Hertz = (unsigned long)HZ;    /* <asm/param.h> */
#else
    /* If 32-bit or big-endian (not Alpha or ia64), assume HZ is 100. */
    Hertz = (sizeof(long)==sizeof(int) || htons(999)==999) ? 100UL : 1024UL;
#endif
    fprintf(stderr, "Unknown HZ value! (%lu) Assume %lu.\n", h, Hertz);
  }
}
//
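
The arithmetic behind the hack, pulled out on its own: the four counters on the "cpu" line of /proc/stat add up to the jiffies accumulated by all CPUs since boot, so dividing by uptime and by the CPU count gives ticks per second. The function name estimate_hz below is made up for illustration. As a worked example, 2,000,000 total jiffies on a 2-CPU box that has been up 10,000 seconds works out to 100.

```c
/* Estimate ticks per second from totals read out of /proc.
 * total_jiffies  = user + nice + system + idle from the "cpu" line
 * uptime_seconds = first field of /proc/uptime
 * Returns 0 if the inputs make no sense. */
static unsigned long estimate_hz(unsigned long total_jiffies,
                                 double uptime_seconds, long ncpus)
{
    if (uptime_seconds <= 0.0 || ncpus < 1)
        return 0;
    return (unsigned long)((double)total_jiffies / uptime_seconds
                           / (double)ncpus);
}
```

The switch in the code above then snaps that rough estimate to the nearest known HZ value.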






Re: How to know HZ from userspace?

2001-05-30 Thread Albert D. Cahalan

Jonathan Lundell writes:
> At 5:07 PM -0700 2001-05-30, H. Peter Anvin wrote:

>>> If you now want to set those values from a userspace program / script in
>>>  a portable manner, you need to be able to find out of HZ of the currently
>>>  running kernel.
>>
>> Yes, but that's because the interfaces are broken.  The decision has
>> been that these values should be exported using the default HZ for the
>> architecture, and that it is the kernel's responsibility to scale them
>> when HZ != USER_HZ.  I don't know if any work has been done in this
>> area.

Nope.

HZ-derived values are not scaled in the /proc code.
The real value is not available to apps. (Linus said so)
People often change the HZ value.

Thus we have problems.

Maybe I'll post my disgusting hack. You _can_ get HZ out
of /proc if you know where to look. >:-)

> FWIW (perhaps not much in this context), the POSIX way is
> sysconf(_SC_CLK_TCK) POSIX sysconf is pretty useful for this
> kind of thing (not just HZ, either).

That does not report the real value. It reports the default.
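
For reference, the POSIX query under discussion is a one-liner. The wrapper name reported_clock_ticks is made up for illustration. As noted above, the number it returns is the default exported to userspace, which may differ from the HZ the running kernel was built with.

```c
#include <unistd.h>

/* Clock ticks per second as reported to userspace by libc.
 * Returns -1 if the value is unavailable. */
static long reported_clock_ticks(void)
{
    return sysconf(_SC_CLK_TCK);
}
```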

Re: [patch] severe softirq handling performance bug, fix, 2.4.5

2001-05-26 Thread Albert D. Cahalan

David S. Miller
> Ingo Molnar writes:

>> (unlike bottom halves, soft-IRQs do not preempt kernel code.)
> ...
>
> Since when do we have this rule? :-)
...
> You should check Softirqs on return from every single IRQ.
> In do_softirq() it will make sure that we won't run softirqs
> while already doing so or being already nested in a hard-IRQ.
> 
> Every port works this way, I don't know where you got this "soft-IRQs
> cannot run when returning to kernel code" rule, it simply doesn't
> exist.

After you two argue this out, please toss a note in Documentation.



Re: 2.4 freezes on VIA KT133

2001-05-24 Thread Albert D. Cahalan

Mark Hahn writes:

> contrary to the implication here, I don't believe there is any *general*
> problem with Linux/VIA/AMD stability.  there are well-known issues
> with specific items (VIA 686b, for instance), but VIA/AMD hardware
> is quite suitable for servers.

VIA hardware is not suitable for anything until we _know_ the
truth about what is wrong. VIA is hiding something big.

Simple fix:

0. get lawyer
1. start class-action lawsuit
2. do discovery
3. unseal court records
4. done -- you may drop the case if not settled already

Well, something like that... not a lawyer, etc.
If you have the time and money, go for it. Have fun.

Creative Labs ought to toast VIA over blaming the sound card. :-)




Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH]device

2001-05-24 Thread Albert D. Cahalan

Oliver Xymoron writes:

> The /dev dir should not be special. At least not to the kernel. I have
> device files in places other than /dev, and you probably do too (hint:
> anonymous FTP).

This is a horribly broken FTP server.



Re: alpha iommu fixes

2001-05-22 Thread Albert D. Cahalan

David S. Miller writes:

> What are these "devices", and what drivers "just program the cards to
> start the dma on those hundred mbyte of ram"?

Hmmm, I have a few cards that are used that way. They are used
for communication between nodes of a cluster.

One might put 16 cards in a system. The cards are quite happy to
do a 2 GB DMA transfer. Scatter-gather is possible, but it cuts
performance. Typically the driver would provide a huge chunk
of memory for an app to use, mapped using large pages on x86 or
using BAT registers on ppc. (reserved during boot of course)
The app would crunch numbers using the CPU (with AltiVec, VIS,
3dnow, etc.) and instruct the device to transfer data to/from
the memory region.

Remote nodes initiate DMA too, even supplying the PCI bus address
on both sides of the interconnect. :-) No IOMMU problems with
that one, eh? The other node may transfer data at will.









Re: LANANA: To Pending Device Number Registrants

2001-05-21 Thread Albert D. Cahalan

Guest section DW writes:
> On Thu, May 17, 2001 at 02:35:55AM -0400, Albert D. Cahalan wrote:

>> The PC partition table has such an ID. The LILO change log
>> mentions it. I think it's 6 random bytes, with some restriction
>> about being non-zero.
>
> You are confused. The partition table contains IDs, but these are
> the numbers like 83 for a Linux partition. No disk-identifying numbers.

Care to explain "duplicate MBR signature handling" in the GPT FAQ?
While describing the new-style partitions, Microsoft mentions that
Windows 2000 has a way to mark old-style ("MBR") partitions:

: 58. What happens if a duplicate Disk or Partition GUID is detected? 
: Windows Whistler will generate new GUIDs for any duplicate Disk GUID,
: MSR Partition GUID, or MSR basic data GUID upon detection. This is
: similar to the duplicate MBR signature handling in Windows 2000.
: Duplicate GUIDs on a dynamic container or database partition
: cause unpredictable results.

Well, the way to test this would be with Windows 2000, two disks,
and a Linux rescue disk that has "dd" on it. See what gets changed
when the "duplicate MBR signature handling in Windows 2000" runs.
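
Comparing the two dd dumps could be done with something like the sketch below. It assumes, without confirmation from this thread, that the ID sits in the 6 bytes just before the partition table, which starts at offset 446 of the first sector. That puts the ID at offsets 440 through 445. The function and macro names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MBR_SECTOR_BYTES 512
#define MBR_SIG_OFFSET   440  /* assumption: region just before the table at 446 */
#define MBR_SIG_BYTES    6

/* Copy the candidate ID bytes out of a raw first sector. */
static void mbr_signature(const uint8_t sector[MBR_SECTOR_BYTES],
                          uint8_t out[MBR_SIG_BYTES])
{
    memcpy(out, sector + MBR_SIG_OFFSET, MBR_SIG_BYTES);
}

/* Nonzero if two dumps of the first sector differ in that region. */
static int mbr_signature_changed(const uint8_t before[MBR_SECTOR_BYTES],
                                 const uint8_t after[MBR_SECTOR_BYTES])
{
    return memcmp(before + MBR_SIG_OFFSET, after + MBR_SIG_OFFSET,
                  MBR_SIG_BYTES) != 0;
}
```

Feed it the first 512 bytes of each disk, saved before and after Windows 2000 has seen both disks.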




Re: [PATCH] 2.4.5pre3 warning fixes

2001-05-17 Thread Albert D. Cahalan

Bingner Sam J. Con writes:

> Looks to me like it's adding { and } on each side of the
> "c->devices->prev=d;" statement... so changing from:
> 
> if (c->devices != NULL)
>   c->devices->prev=d;
> 
> to 
> 
> if (c->devices != NULL){
>   c->devices->prev=d;
> }
> 
> I assume the new compiler likes the if to have explicit
> brackets instead of using the next statement...

Maybe one of these will make it happy:

(void)(c->devices && (c->devices->prev=d));

!c->devices ?: (c->devices->prev=d);
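
To see that the short-circuit form does the same job as the if, here is a small self-contained sketch. The struct layouts and function names are made up for illustration; only the field names devices and prev come from the quoted patch. The second one-liner above relies on the GNU "?:" extension, so it is left out here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's real types. */
struct device     { struct device *prev; };
struct controller { struct device *devices; };

/* The patched form: explicit braces around the single statement. */
static void link_with_if(struct controller *c, struct device *d)
{
    if (c->devices != NULL) {
        c->devices->prev = d;
    }
}

/* The short-circuit form: the assignment is evaluated only when
 * c->devices is non-NULL, and the (void) cast discards the result. */
static void link_with_and(struct controller *c, struct device *d)
{
    (void)(c->devices && (c->devices->prev = d));
}
```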



Re: LANANA: To Pending Device Number Registrants

2001-05-17 Thread Albert D. Cahalan

Heinz J. Mauelshag writes:

> LVM does a similar thing storing UUIDs in its private metadata
> area on every device used by it.
>
> Problem is: neither MD nor LVM define a standard in Linux
> which *needs* to be used on every device!
>
> It is just up to the user to configure devices with them or not.
>
> BTW: in case we had a Linux standard it wouldn't solve the
> "different OS" situation mentioned in this thread either.
>
>
> Generally speaking:
> 
> It is not the problem to reserve some space to store a uuid or
> something at such and such location on a device.
>
> The problem is the lack of a standard which eventually
> could be implemented in all OSes at some point in time.

The PC partition table has such an ID. The LILO change log
mentions it. I think it's 6 random bytes, with some restriction
about being non-zero.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: LANANA: To Pending Device Number Registrants

2001-05-17 Thread Albert D. Cahalan

Heinz J. Mauelshag writes:

 LVM does a similar thing storing UUIDs in its private metadata
 area on every device used by it.

 Problem is: neither MD nor LVM define a standard in Linux
 which *needs* to be used on every device!

 It is just up to the user to configure devices with them or not.

 BTW: in case we had a Linux standard it wouldn't solve the
 different OS situation mentioned in this thread either.


 Generally speaking:
 
 It is not the problem to reserve some space to store a uuid or
 something at such and such location on a device.

 The problem is the lack of a standard which eventually
 could be implemented in all OSes at some point in time.

The PC partition table has such an ID. The LILO change log
mentions it. I think it's 6 random bytes, with some restriction
about being non-zero.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] 2.4.5pre3 warning fixes

2001-05-17 Thread Albert D. Cahalan

Bingner Sam J. Con writes:

> Looks to me like it's adding { and } on each side of the
> c->devices->prev=d; statement... so changing from:
>
> if (c->devices != NULL)
>   c->devices->prev=d;
>
> to
>
> if (c->devices != NULL){
>   c->devices->prev=d;
> }
>
> I assume the new compiler likes the if to have explicit
> brackets instead of using the next statement...

Maybe one of these will make it happy:

(void)(c->devices && (c->devices->prev=d));

!c->devices ?: (c->devices->prev=d);
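
For anyone who wants to try the forms side by side, here is a minimal user-space sketch. The struct dev and struct ctrl types are hypothetical stand-ins; only the member names devices and prev come from the quoted code. The ?: variant is left out because it relies on a GNU extension and mixes an int with a pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's real types. */
struct dev {
    struct dev *prev;
};

struct ctrl {
    struct dev *devices;
};

/* Braced form from the patch. */
void link_prev_braced(struct ctrl *c, struct dev *d)
{
    if (c->devices != NULL) {
        c->devices->prev = d;
    }
}

/* Short-circuit form: the assignment runs only when c->devices is non-NULL. */
void link_prev_shortcircuit(struct ctrl *c, struct dev *d)
{
    (void)(c->devices && (c->devices->prev = d));
}
```

Both functions leave an empty list untouched and set prev on the head otherwise, so the choice between them is purely about what the compiler warns on and what reads best.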



Re: ((struct pci_dev*)dev)->resource[...].start

2001-05-16 Thread Albert D. Cahalan

Jeff Garzik writes:
> "Khachaturov, Vassilii" wrote:

>> Can someone please confirm if my assumptions below are correct:
>> 1) Unless someone specifically tampered with my driver's device
>> since the OS bootup, the mapping of the PCI base address registers
>> to virtual memory will remain the same (just as seen in /proc/pci,
>> and as reflected in <subj>)? If not, is there a way to freeze it
>> for the time I want to access it?
>
> This is not a safe assumption, because the OS may reprogram the
> PCI BARs at certain times.  The rule is:  ALWAYS read from
> dev->resource[] unless you are a bus driver (PCI bridges, for
> example, need to assign resources).

Well, I have a bus driver. Just how do I get a bus number?
My hardware comes up as a regular device, then mutates into
a bridge when I flip a bit in a config register. The header
even changes from type 0 to type 1. The class code is always
the same, a bridge device, but not PCI-to-PCI. It's kind of
like hot-plug PCI over a network, with all sorts of extra
alignment restrictions on address space allocation.

So maybe this card is on bus 42. I need a secondary bus number,
plus a few more in case there are more bridges downstream.
I can't just grab 42..44 because they might be used elsewhere,
and I can't just grab 253..255 either because that upsets the
whole system of bus number assignment being done by carving up
the space granted to upstream bridges.
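
To make the "carving up" concrete, here is a rough sketch of handing out bus numbers from the range an upstream bridge owns. This is not the kernel's PCI code; struct bus_range and carve_bus_numbers are hypothetical names used only for illustration.

```c
/* Range of bus numbers a bridge may hand out: secondary..subordinate. */
struct bus_range {
    int secondary;    /* bus directly behind the bridge */
    int subordinate;  /* highest bus number behind the bridge */
    int next_free;    /* next unassigned bus number in the range */
};

/* Carve `count` consecutive bus numbers out of the parent's range.
 * Returns the first number granted, or -1 if not enough remain. */
int carve_bus_numbers(struct bus_range *parent, int count)
{
    if (count <= 0 || parent->next_free + count - 1 > parent->subordinate)
        return -1;

    int first = parent->next_free;
    parent->next_free += count;
    return first;
}
```

With a parent that owns 42..45 and has already used 42 for its own secondary bus, a request for three numbers gets 43..45 and any later request fails. That failure case is exactly the problem when a device turns into a bridge after the ranges were fixed.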

BTW, is there any reason why the primary bus register of a
bridge would have to be set correctly? I have to set mine equal
to the secondary bus register to keep the hardware happy.

> Further, access to PCI BARs -and- dev->resource[] in a driver is
> wrong until you have called pci_enable_device.  Resource and IRQ
> assignment potentially occurs at pci_enable_device time, so BAR
> is [potentially] undefined before then.

Hmmm. I can use device-specific config space registers to change
the size of a BAR. (or limit & base, whatever) Say I want to have
512 MB, but the bridge upstream only has 128 MB allotted to it.
How do I fix this?





Re: Getting FS access events

2001-05-15 Thread Albert D. Cahalan

H. Peter Anvin writes:

> This would leave no way (without introducing new interfaces) to write,
> for example, the boot block on an ext2 filesystem.  Note that the
> bootblock (defined as the first 1024 bytes) is not actually used by
> the filesystem, although depending on the block size it may share a
> block with the superblock (if blocksize > 1024).

The lack of coherency would screw this up anyway, wouldn't it?
You have a block device, soon to be in the page cache, and
a superblock, also soon to be in the page cache. LILO writes to
the block device, while the ext2 driver updates the superblock.
Whatever gets written out last wins, and the other is lost.
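
A toy user-space model of that lost update, assuming a 4096-byte block so the boot block and superblock share block 0. The cache structure and function names are hypothetical and have nothing to do with the real buffer or page cache code.

```c
#include <string.h>

#define BLOCK_SIZE        4096  /* assumed filesystem block size */
#define SUPERBLOCK_OFFSET 1024  /* superblock follows the 1024-byte boot block */

unsigned char disk_block[BLOCK_SIZE];  /* block 0 on the "disk" */

/* Each cache holds its own private copy of the whole block. */
struct cached_block {
    unsigned char data[BLOCK_SIZE];
};

/* The superblock lands in block 0 only when the block size exceeds 1024. */
int superblock_shares_boot_block(unsigned block_size)
{
    return SUPERBLOCK_OFFSET / block_size == 0;
}

void cache_read(struct cached_block *cb)
{
    memcpy(cb->data, disk_block, BLOCK_SIZE);
}

void cache_write_back(const struct cached_block *cb)
{
    memcpy(disk_block, cb->data, BLOCK_SIZE);
}
```

If one copy gets a new boot sector at byte 0, the other gets a superblock update at byte 1024, and both are written back whole, the second write-back overwrites the first one's change with stale bytes.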




Re: How VFS interacts with a file driver

2001-05-14 Thread Albert D. Cahalan

Daniel Phillips writes:
> On Monday 14 May 2001 07:29, Blesson Paul wrote:

>>I am trying to implement a distributed file system.

Me too!  :-)

>> For that I am writing a file driver. I want to know the following things:
>>
>> 1. If I am writing a new file system, is it necessary to modify the
>> existing structs, including the inode struct?

Nope. There is a generic pointer you can use. You just need to
figure out when to free it, assuming you don't want to leak
lots of memory. Student projects can leak -- lucky you!
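
Roughly, the pattern looks like the sketch below. It is a user-space mock, not the kernel's struct inode; mock_inode, fs_private, newfs_inode_info and both functions are hypothetical names, and the real hook points for attaching and freeing have to be looked up in the VFS source.

```c
#include <stdlib.h>

/* Mock of a VFS inode carrying an opaque per-filesystem pointer. */
struct mock_inode {
    unsigned long ino;
    void *fs_private;   /* owned by the filesystem driver */
};

/* Hypothetical per-inode state for the new filesystem. */
struct newfs_inode_info {
    int remote_node;
};

/* Attach private state when the inode is set up; returns 0 on success. */
int newfs_attach(struct mock_inode *inode, int remote_node)
{
    struct newfs_inode_info *info = malloc(sizeof(*info));

    if (info == NULL)
        return -1;
    info->remote_node = remote_node;
    inode->fs_private = info;
    return 0;
}

/* Release the state when the inode is torn down, so nothing leaks. */
void newfs_detach(struct mock_inode *inode)
{
    free(inode->fs_private);
    inode->fs_private = NULL;
}
```

The generic structs stay untouched; the only obligation is that every attach is matched by exactly one detach.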

>> 2. If it is not needed, is a simple registration of the file
>> system enough to mount the file system?
>> Moreover, I am new to this area. I am doing this as my
>> graduate project. I need someone's help to understand how VFS works.
>> Thanks in advance
>
> 1. In .config, change CONFIG_EXT2_FS to 'm'
> 2. change "ext2" to "newfs" at DECLARE_FSTYPE_DEV in super.c
> 3. make modules SUBDIRS=fs/ext2
> 4. insmod fs/ext2/ext2.o
> 
> Poof!  New filesystem.  (cat /proc/filesystems) Don't forget to change 
> ext2 in .config back to "y" before you build your next kernel.  You'll 
> need to study the kernel *hard* before you can expect to have half a 
> chance of having your filesystem work properly.

Gotta love d_delete the function and d_delete the function pointer
in a struct. Discovery was cause for inventing new curses.

Along the way I stumbled across a "retval" that is only set once
and an "if(de &&" or "if(!de ||" (I forget which) that is redundant.
Maybe in the proc or tmpfs code, just in case someone cares enough.



Re: Inodes

2001-05-14 Thread Albert D. Cahalan

Blesson Paul writes:

> This is another doubt related to VFS. I want to know
> whether all files are assigned their inode numbers at
> mount time, or whether inode numbers are assigned to
> files only when they are accessed.

That would depend on what type of filesystem you use.
For ext2, inode numbers are assigned at file creation.
For vfat, inode numbers are assigned as needed, and
forgotten when not needed.



Re: [reiserfs-dev] Re: reiserfs, xfs, ext2, ext3

2001-05-11 Thread Albert D. Cahalan

Hans Reiser writes:

> Tell us what to code for, and so long as it doesn't involve looking
> up files by their 32 bit inode numbers we'll probably be happy to
> code to it.  The Neil Brown stuff is already coded for though.

Next time around, when you update the on-disk format, how about
allowing for such a thing?

You could have a tree that maps from inode number to whatever
you need to find a file. This shouldn't affect much more than
file creation and deletion. Maybe it will allow for a more
robust fsck as well, helping to justify the cost.

It would be really nice to be able to find all filenames that
refer to a given inode number.
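
As an illustration of the reverse lookup, here is a small sketch that swaps the on-disk tree for a flat in-memory table and a linear scan. struct link_entry and names_for_inode are hypothetical names; nothing here reflects the reiserfs format.

```c
#include <stddef.h>

/* One directory entry: a name that refers to an inode number. */
struct link_entry {
    unsigned long ino;
    const char *name;
};

/* Store up to `max` names that refer to `ino` in `out`;
 * returns how many names were stored. */
size_t names_for_inode(const struct link_entry *table, size_t n,
                       unsigned long ino, const char **out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i < n && found < max; i++) {
        if (table[i].ino == ino)
            out[found++] = table[i].name;
    }
    return found;
}
```

A real implementation would key a tree on the inode number so the lookup does not scan every entry, and would update it only on link creation and removal.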



Re: Patch to make ymfpci legacy address 16 bits

2001-05-09 Thread Albert D. Cahalan

Jeff Garzik writes:
> Pavel Roskin wrote:

>> You may need to save some data in memory when the system goes
>> to suspend and restore them afterwards. I believe that the PCI
>> config space should be saved by BIOS. Everything else is the
>> responsibility of the driver.
>
> In ACPI land the kernel should save and restore the PCI device
> config space and the PCI bus config space.  It is probable that
> something similar is necessary under APM.

When you write "the kernel", do you mean the driver or generic
code? I hope you mean the driver, because I have this:

1. the device looks normal at power on
2. the driver pokes a device-specific config register
3. the config space header changes from type 0 to type 1

(The class code does NOT indicate PCI-to-PCI bridge.
You could say this is like CardBus but much weirder)

If the kernel saves type 1 header data, cuts power using
motherboard features, restores power, and then tries to
restore type 1 header data into a type 0 header... the
system will be well and truly screwed IMHO.
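
One guard generic code could apply is sketched below, with config space modelled as a plain byte array. The header type byte sits at offset 0x0e and bit 7 is the multi-function flag; the macro names, the 64-byte snapshot and restore_header itself are assumptions made for this example, not an existing kernel interface.

```c
#include <stdint.h>
#include <string.h>

#define HEADER_TYPE_OFFSET  0x0e  /* header type byte in config space */
#define HEADER_TYPE_MASK    0x7f  /* bit 7 is the multi-function flag */
#define CONFIG_HEADER_BYTES 64    /* size of the modelled snapshot */

/* Restore a saved header only if the device currently presents the same
 * header type it had when the snapshot was taken. Returns 0 on success,
 * -1 if the layouts differ and the driver must redo its own setup first. */
int restore_header(uint8_t *live, const uint8_t *saved)
{
    uint8_t live_type  = live[HEADER_TYPE_OFFSET] & HEADER_TYPE_MASK;
    uint8_t saved_type = saved[HEADER_TYPE_OFFSET] & HEADER_TYPE_MASK;

    if (live_type != saved_type)
        return -1;
    memcpy(live, saved, CONFIG_HEADER_BYTES);
    return 0;
}
```

In the scenario above the device comes back with a type 0 header while the snapshot is type 1, so the restore is refused until the driver has poked its device-specific register again.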



