Re: [PATCH] PPC64: large INITRD causes kernel not to boot [UPDATE]

2005-09-06 Thread Mark Bellon

Paul Mackerras wrote:


Mark Bellon writes:

 

Simply put the existing code has a fixed reservation (claim) address and 
once the kernel plus initrd image are large enough to pass this address 
all sorts of bad things occur. The fix is the dynamically establish the 
first claim address above the loaded kernel plus initrd (plus some 
"padding" and rounding). If PROG_START is defined this will be used as 
the minimum safe address - currently known to be 0x0140 for the 
firmwares tested so far.
   



The idea is fine, but I have some questions about the actual patch:

 


-void *claim(unsigned int, unsigned int, unsigned int);
+void *claim(unsigned long, unsigned long, unsigned long);
   



What was the motivation for this change?  Since the zImage wrapper is
a 32-bit executable, int and long are both 32 bits.  I would prefer to
leave the parameters as unsigned int to force people to realize that
the parameters are 32 bits (even if said people have been working on
64-bit programs recently).

 

The function, claim, is found in prom.c uses longs. The long is the 
usual idiom for hiding a pointer, not an int, so I fixed accordingly. 
I'm open to further discussion of course.


On a 64 bit machine long and int are different sizes. This would make 
things "proper" if things changed in the future.



+   claim_base = _ALIGN_UP((unsigned long)_end, ONE_MB);
+
+#if defined(PROG_START)
+   /*
+* Maintain a "magic" minimum address. This keeps some older
+* firmware platforms running.
+*/
+
+   if (claim_base < PROG_START)
+   claim_base = PROG_START;
+#endif
   



This appears to be the meat of the patch, the rest is "cleanup", right?
 



Correct. The preceding comment explains what is going on. Removing the 
magic numbers seemed like a good idea.


mark


Paul.
 



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] PPC64: large INITRD causes kernel not to boot [UPDATE]

2005-09-06 Thread Mark Bellon
In PPC64 there are number of problems in arch/ppc64/boot/main.c that 
prevent a kernel from making use of a large (greater than ~16MB) INITRD. 
This is 64 bit architecture and really large INITRD images should be 
possible.


Simply put the existing code has a fixed reservation (claim) address and 
once the kernel plus initrd image are large enough to pass this address 
all sorts of bad things occur. The fix is the dynamically establish the 
first claim address above the loaded kernel plus initrd (plus some 
"padding" and rounding). If PROG_START is defined this will be used as 
the minimum safe address - currently known to be 0x0140 for the 
firmwares tested so far.


We've talked about this in [EMAIL PROTECTED] and this is what 
seems to have settled out.


mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>

diff -Naur linux-2.6.13-orig/arch/ppc64/boot/main.c 
linux-2.6.13/arch/ppc64/boot/main.c
--- linux-2.6.13-orig/arch/ppc64/boot/main.c2005-08-28 16:41:01.0 
-0700
+++ linux-2.6.13/arch/ppc64/boot/main.c 2005-09-06 15:42:22.0 -0700
@@ -20,7 +20,7 @@
 extern void printf(const char *fmt, ...);
 extern int sprintf(char *buf, const char *fmt, ...);
 void gunzip(void *, int, unsigned char *, int *);
-void *claim(unsigned int, unsigned int, unsigned int);
+void *claim(unsigned long, unsigned long, unsigned long);
 void flush_cache(void *, unsigned long);
 void pause(void);
 extern void exit(void);
@@ -31,7 +31,8 @@
 
 /* Value picked to match that used by yaboot */
 #define PROG_START 0x0140
-#define RAM_END(256<<20) // Fixme: use OF */
+#define RAM_END(512<<20) // Fixme: use OF */
+#defineONE_MB  0x10
 
 char *avail_ram;
 char *begin_avail, *end_avail;
@@ -40,6 +41,7 @@
 unsigned int heap_max;
 
 extern char _start[];
+extern char _end[];
 extern char _vmlinux_start[];
 extern char _vmlinux_end[];
 extern char _initrd_start[];
@@ -73,13 +75,13 @@
 
 #undef DEBUG
 
-static unsigned long claim_base = PROG_START;
+static unsigned long claim_base;
 
 static unsigned long try_claim(unsigned long size)
 {
unsigned long addr = 0;
 
-   for(; claim_base < RAM_END; claim_base += 0x10) {
+   for(; claim_base < RAM_END; claim_base += ONE_MB) {
 #ifdef DEBUG
printf("trying: 0x%08lx\n\r", claim_base);
 #endif
@@ -110,7 +112,26 @@
if (getprop(chosen_handle, "stdin", , sizeof(stdin)) != 4)
exit();
 
-   printf("\n\rzImage starting: loaded at 0x%x\n\r", (unsigned)_start);
+   printf("\n\rzImage starting: loaded at 0x%lx\n\r", (unsigned long) 
_start);
+
+   /*
+* The first available claim_base must be above the end of the
+* the loaded kernel wrapper file (_start to _end includes the
+* initrd image if it is present) and rounded up to a nice
+* 1 MB boundary for good measure.
+*/
+
+   claim_base = _ALIGN_UP((unsigned long)_end, ONE_MB);
+
+#if defined(PROG_START)
+   /*
+* Maintain a "magic" minimum address. This keeps some older
+* firmware platforms running.
+*/
+
+   if (claim_base < PROG_START)
+   claim_base = PROG_START;
+#endif
 
/*
 * Now we try to claim some memory for the kernel itself
@@ -120,7 +141,7 @@
 * size... In practice we add 1Mb, that is enough, but we should really
 * consider fixing the Makefile to put a _raw_ kernel in there !
 */
-   vmlinux_memsize += 0x10;
+   vmlinux_memsize += ONE_MB;
printf("Allocating 0x%lx bytes for kernel ...\n\r", vmlinux_memsize);
vmlinux.addr = try_claim(vmlinux_memsize);
if (vmlinux.addr == 0) {


[PATCH] PPC64: large INITRD causes kernel not to boot [UPDATE]

2005-09-06 Thread Mark Bellon
In PPC64 there are number of problems in arch/ppc64/boot/main.c that 
prevent a kernel from making use of a large (greater than ~16MB) INITRD. 
This is 64 bit architecture and really large INITRD images should be 
possible.


Simply put the existing code has a fixed reservation (claim) address and 
once the kernel plus initrd image are large enough to pass this address 
all sorts of bad things occur. The fix is the dynamically establish the 
first claim address above the loaded kernel plus initrd (plus some 
padding and rounding). If PROG_START is defined this will be used as 
the minimum safe address - currently known to be 0x0140 for the 
firmwares tested so far.


We've talked about this in [EMAIL PROTECTED] and this is what 
seems to have settled out.


mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]

diff -Naur linux-2.6.13-orig/arch/ppc64/boot/main.c 
linux-2.6.13/arch/ppc64/boot/main.c
--- linux-2.6.13-orig/arch/ppc64/boot/main.c2005-08-28 16:41:01.0 
-0700
+++ linux-2.6.13/arch/ppc64/boot/main.c 2005-09-06 15:42:22.0 -0700
@@ -20,7 +20,7 @@
 extern void printf(const char *fmt, ...);
 extern int sprintf(char *buf, const char *fmt, ...);
 void gunzip(void *, int, unsigned char *, int *);
-void *claim(unsigned int, unsigned int, unsigned int);
+void *claim(unsigned long, unsigned long, unsigned long);
 void flush_cache(void *, unsigned long);
 void pause(void);
 extern void exit(void);
@@ -31,7 +31,8 @@
 
 /* Value picked to match that used by yaboot */
 #define PROG_START 0x0140
-#define RAM_END(25620) // Fixme: use OF */
+#define RAM_END(51220) // Fixme: use OF */
+#defineONE_MB  0x10
 
 char *avail_ram;
 char *begin_avail, *end_avail;
@@ -40,6 +41,7 @@
 unsigned int heap_max;
 
 extern char _start[];
+extern char _end[];
 extern char _vmlinux_start[];
 extern char _vmlinux_end[];
 extern char _initrd_start[];
@@ -73,13 +75,13 @@
 
 #undef DEBUG
 
-static unsigned long claim_base = PROG_START;
+static unsigned long claim_base;
 
 static unsigned long try_claim(unsigned long size)
 {
unsigned long addr = 0;
 
-   for(; claim_base  RAM_END; claim_base += 0x10) {
+   for(; claim_base  RAM_END; claim_base += ONE_MB) {
 #ifdef DEBUG
printf(trying: 0x%08lx\n\r, claim_base);
 #endif
@@ -110,7 +112,26 @@
if (getprop(chosen_handle, stdin, stdin, sizeof(stdin)) != 4)
exit();
 
-   printf(\n\rzImage starting: loaded at 0x%x\n\r, (unsigned)_start);
+   printf(\n\rzImage starting: loaded at 0x%lx\n\r, (unsigned long) 
_start);
+
+   /*
+* The first available claim_base must be above the end of the
+* the loaded kernel wrapper file (_start to _end includes the
+* initrd image if it is present) and rounded up to a nice
+* 1 MB boundary for good measure.
+*/
+
+   claim_base = _ALIGN_UP((unsigned long)_end, ONE_MB);
+
+#if defined(PROG_START)
+   /*
+* Maintain a magic minimum address. This keeps some older
+* firmware platforms running.
+*/
+
+   if (claim_base  PROG_START)
+   claim_base = PROG_START;
+#endif
 
/*
 * Now we try to claim some memory for the kernel itself
@@ -120,7 +141,7 @@
 * size... In practice we add 1Mb, that is enough, but we should really
 * consider fixing the Makefile to put a _raw_ kernel in there !
 */
-   vmlinux_memsize += 0x10;
+   vmlinux_memsize += ONE_MB;
printf(Allocating 0x%lx bytes for kernel ...\n\r, vmlinux_memsize);
vmlinux.addr = try_claim(vmlinux_memsize);
if (vmlinux.addr == 0) {


Re: [PATCH] PPC64: large INITRD causes kernel not to boot [UPDATE]

2005-09-06 Thread Mark Bellon

Paul Mackerras wrote:


Mark Bellon writes:

 

Simply put the existing code has a fixed reservation (claim) address and 
once the kernel plus initrd image are large enough to pass this address 
all sorts of bad things occur. The fix is the dynamically establish the 
first claim address above the loaded kernel plus initrd (plus some 
padding and rounding). If PROG_START is defined this will be used as 
the minimum safe address - currently known to be 0x0140 for the 
firmwares tested so far.
   



The idea is fine, but I have some questions about the actual patch:

 


-void *claim(unsigned int, unsigned int, unsigned int);
+void *claim(unsigned long, unsigned long, unsigned long);
   



What was the motivation for this change?  Since the zImage wrapper is
a 32-bit executable, int and long are both 32 bits.  I would prefer to
leave the parameters as unsigned int to force people to realize that
the parameters are 32 bits (even if said people have been working on
64-bit programs recently).

 

The function, claim, is found in prom.c uses longs. The long is the 
usual idiom for hiding a pointer, not an int, so I fixed accordingly. 
I'm open to further discussion of course.


On a 64 bit machine long and int are different sizes. This would make 
things proper if things changed in the future.



+   claim_base = _ALIGN_UP((unsigned long)_end, ONE_MB);
+
+#if defined(PROG_START)
+   /*
+* Maintain a magic minimum address. This keeps some older
+* firmware platforms running.
+*/
+
+   if (claim_base  PROG_START)
+   claim_base = PROG_START;
+#endif
   



This appears to be the meat of the patch, the rest is cleanup, right?
 



Correct. The preceding comment explains what is going on. Removing the 
magic numbers seemed like a good idea.


mark


Paul.
 



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] PPC64: large INITRD causes kernel not to boot

2005-08-08 Thread Mark Bellon
In PPC64 there are number of problems in arch/ppc64/boot/main.c that 
prevent a kernel from making use of a large (greater than ~16MB) INITRD. 
This is 64 bit architecture and really large INITRD images should be 
possible.


Simply put the existing code has a fixed reservation (claim) address and 
once the kernel plus initrd image are large enough to pass this address 
all sorts of bad things occur. The fix is the dynamically establish the 
first claim address above the loaded kernel plus initrd (plus some 
"padding" and rounding)


mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>

Index: linux-2.6.12.3/arch/ppc64/boot/main.c
===
--- linux-2.6.12.3.orig/arch/ppc64/boot/main.c
+++ linux-2.6.12.3/arch/ppc64/boot/main.c
@@ -22,7 +22,7 @@
 extern void printf(const char *fmt, ...);
 extern int sprintf(char *buf, const char *fmt, ...);
 void gunzip(void *, int, unsigned char *, int *);
-void *claim(unsigned int, unsigned int, unsigned int);
+void *claim(unsigned long, unsigned long, unsigned long);
 void flush_cache(void *, unsigned long);
 void pause(void);
 extern void exit(void);
@@ -31,9 +31,8 @@
 void *memmove(void *dest, const void *src, unsigned long n);
 void *memcpy(void *dest, const void *src, unsigned long n);
 
-/* Value picked to match that used by yaboot */
-#define PROG_START	0x0140
-#define RAM_END		(256<<20) // Fixme: use OF */
+#define	ONE_MB		0x10
+#define RAM_END		(512<<20) // Fixme: use OF */
 
 char *avail_ram;
 char *begin_avail, *end_avail;
@@ -75,13 +74,13 @@
 
 #define DEBUG
 
-static unsigned long claim_base = PROG_START;
+static unsigned long claim_base;
 
 static unsigned long try_claim(unsigned long size)
 {
 	unsigned long addr = 0;
 
-	for(; claim_base < RAM_END; claim_base += 0x10) {
+	for(; claim_base < RAM_END; claim_base += ONE_MB) {
 #ifdef DEBUG
 		printf("trying: 0x%08lx\n\r", claim_base);
 #endif
@@ -112,7 +111,23 @@
 	if (getprop(chosen_handle, "stdin", , sizeof(stdin)) != 4)
 		exit();
 
-	printf("zImage starting: loaded at 0x%x\n\r", (unsigned)_start);
+	printf("zImage starting: loaded at 0x%lx\n\r", (unsigned long)_start);
+
+	/*
+	 * The first available claim_base must be "out of the way" -
+	 * well above _start + kernel_size + initrd + overhead.
+	 */
+
+	claim_base = ((unsigned long) _start) +
+((unsigned long) vmlinux_filesize) +
+(unsigned long)(_initrd_end - _initrd_start) +
+ONE_MB;
+
+	/*
+	 * Now round up the claim_base to a nice 1 MB boundary.
+	 */
+
+	claim_base = ((claim_base + ONE_MB - 1) / ONE_MB) * ONE_MB;
 
 	/*
 	 * Now we try to claim some memory for the kernel itself
@@ -122,7 +137,7 @@
 	 * size... In practice we add 1Mb, that is enough, but we should really
 	 * consider fixing the Makefile to put a _raw_ kernel in there !
 	 */
-	vmlinux_memsize += 0x10;
+	vmlinux_memsize += ONE_MB;
 	printf("Allocating 0x%lx bytes for kernel ...\n\r", vmlinux_memsize);
 	vmlinux.addr = try_claim(vmlinux_memsize);
 	if (vmlinux.addr == 0) {


[PATCH] PPC64: large INITRD causes kernel not to boot

2005-08-08 Thread Mark Bellon
In PPC64 there are number of problems in arch/ppc64/boot/main.c that 
prevent a kernel from making use of a large (greater than ~16MB) INITRD. 
This is 64 bit architecture and really large INITRD images should be 
possible.


Simply put the existing code has a fixed reservation (claim) address and 
once the kernel plus initrd image are large enough to pass this address 
all sorts of bad things occur. The fix is the dynamically establish the 
first claim address above the loaded kernel plus initrd (plus some 
padding and rounding)


mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]

Index: linux-2.6.12.3/arch/ppc64/boot/main.c
===
--- linux-2.6.12.3.orig/arch/ppc64/boot/main.c
+++ linux-2.6.12.3/arch/ppc64/boot/main.c
@@ -22,7 +22,7 @@
 extern void printf(const char *fmt, ...);
 extern int sprintf(char *buf, const char *fmt, ...);
 void gunzip(void *, int, unsigned char *, int *);
-void *claim(unsigned int, unsigned int, unsigned int);
+void *claim(unsigned long, unsigned long, unsigned long);
 void flush_cache(void *, unsigned long);
 void pause(void);
 extern void exit(void);
@@ -31,9 +31,8 @@
 void *memmove(void *dest, const void *src, unsigned long n);
 void *memcpy(void *dest, const void *src, unsigned long n);
 
-/* Value picked to match that used by yaboot */
-#define PROG_START	0x0140
-#define RAM_END		(25620) // Fixme: use OF */
+#define	ONE_MB		0x10
+#define RAM_END		(51220) // Fixme: use OF */
 
 char *avail_ram;
 char *begin_avail, *end_avail;
@@ -75,13 +74,13 @@
 
 #define DEBUG
 
-static unsigned long claim_base = PROG_START;
+static unsigned long claim_base;
 
 static unsigned long try_claim(unsigned long size)
 {
 	unsigned long addr = 0;
 
-	for(; claim_base  RAM_END; claim_base += 0x10) {
+	for(; claim_base  RAM_END; claim_base += ONE_MB) {
 #ifdef DEBUG
 		printf(trying: 0x%08lx\n\r, claim_base);
 #endif
@@ -112,7 +111,23 @@
 	if (getprop(chosen_handle, stdin, stdin, sizeof(stdin)) != 4)
 		exit();
 
-	printf(zImage starting: loaded at 0x%x\n\r, (unsigned)_start);
+	printf(zImage starting: loaded at 0x%lx\n\r, (unsigned long)_start);
+
+	/*
+	 * The first available claim_base must be out of the way -
+	 * well above _start + kernel_size + initrd + overhead.
+	 */
+
+	claim_base = ((unsigned long) _start) +
+((unsigned long) vmlinux_filesize) +
+(unsigned long)(_initrd_end - _initrd_start) +
+ONE_MB;
+
+	/*
+	 * Now round up the claim_base to a nice 1 MB boundary.
+	 */
+
+	claim_base = ((claim_base + ONE_MB - 1) / ONE_MB) * ONE_MB;
 
 	/*
 	 * Now we try to claim some memory for the kernel itself
@@ -122,7 +137,7 @@
 	 * size... In practice we add 1Mb, that is enough, but we should really
 	 * consider fixing the Makefile to put a _raw_ kernel in there !
 	 */
-	vmlinux_memsize += 0x10;
+	vmlinux_memsize += ONE_MB;
 	printf(Allocating 0x%lx bytes for kernel ...\n\r, vmlinux_memsize);
 	vmlinux.addr = try_claim(vmlinux_memsize);
 	if (vmlinux.addr == 0) {


Re: [PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-04 Thread Mark Bellon

Jan Engelhardt wrote:


Simple: do not use BIOS values.
[ Yes, there should be some warning from kernel. ]
   



On that matter, I get a warning from LILO wrt cyls and stuff:

07:47 spectre:~ # cat /proc/ide/hda/geometry 
physical 16383/16/63

logical  65535/16/63
07:58 spectre:~ # lilo
Warning: Kernel & BIOS return differing head/sector geometries for device 0x80
   Kernel: 65535 cylinders, 16 heads, 63 sectors
 BIOS: 1023 cylinders, 255 heads, 63 sectors
Added linux *
07:59 spectre:~ # fdisk -l

Disk /dev/hda: 40.9 GB, 40982151168 bytes
255 heads, 63 sectors/track, 4982 cylinders


All of these numbers are virtual, since CHS is not really used anymore, as 
we know. But, which of these fake CHS values (16383/16/63 | 65535/16/63 | 
1023/255/63) is the right one? 255/63/4982 is another matter, since it 
[almost] matches the actual size of the disk while the other three are just 
"for the bios".
 

This is exactly the case that my patch was attempting to fix (and 
apparently didn't get quite right).


Certain drive returns cause strange numbers to slip through often when 
LBA 28 is involved.


mark

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-04 Thread Mark Bellon

Jan Engelhardt wrote:


Simple: do not use BIOS values.
[ Yes, there should be some warning from kernel. ]
   



On that matter, I get a warning from LILO wrt cyls and stuff:

07:47 spectre:~ # cat /proc/ide/hda/geometry 
physical 16383/16/63

logical  65535/16/63
07:58 spectre:~ # lilo
Warning: Kernel  BIOS return differing head/sector geometries for device 0x80
   Kernel: 65535 cylinders, 16 heads, 63 sectors
 BIOS: 1023 cylinders, 255 heads, 63 sectors
Added linux *
07:59 spectre:~ # fdisk -l

Disk /dev/hda: 40.9 GB, 40982151168 bytes
255 heads, 63 sectors/track, 4982 cylinders


All of these numbers are virtual, since CHS is not really used anymore, as 
we know. But, which of these fake CHS values (16383/16/63 | 65535/16/63 | 
1023/255/63) is the right one? 255/63/4982 is another matter, since it 
[almost] matches the actual size of the disk while the other three are just 
for the bios.
 

This is exactly the case that my patch was attempting to fix (and 
apparently didn't get quite right).


Certain drive returns cause strange numbers to slip through often when 
LBA 28 is involved.


mark

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-03 Thread Mark Bellon

Bartlomiej Zolnierkiewicz wrote:


On 8/3/05, Mark Bellon <[EMAIL PROTECTED]> wrote:
 


Bartlomiej Zolnierkiewicz wrote:

   


Hi,

The topic was discussed to death on linux-kernel.

Mark, you need to fix your applications and stop using /proc/ide/hd*/geometry
or/and HDIO_GET_GEO ioctl (which BTW your patch also affects).


 


Fixing the applications I can understand but the patch still seems
necessary, to me, so the
HDIO_GET_GEO returns "rational" values.

I tested  HDIO_GET_GEO and it returns the same broken values as go into
/proc (no surprises) without my patch.
   



Please explain.
 

Let me explain more below. Thanks for the response as now I can 
understand better the issue (which was what I was fishing for).


 


If a drive is in LBA mode (28 or 48 bit) the existing code doesn't
always "fix up" the geometry properly for some value returns. It only
tries with 48 bit mode and it fails there for some values.  My patch
forces a complete geometry and appears (to me) to preserve the side
efefcts of the existing code.

Am I missing something?
   



diff -Naur linux-2.6.13-rc3-git9-orig/drivers/ide/ide-disk.c
linux-2.6.13-rc3-git9/drivers/ide/ide-disk.c
--- linux-2.6.13-rc3-git9-orig/drivers/ide/ide-disk.c   2005-08-01
13:48:21.0 -0700
+++ linux-2.6.13-rc3-git9/drivers/ide/ide-disk.c2005-08-02
12:14:43.0 -0700
@@ -884,10 +884,17 @@
ide_add_setting(drive,  "max_failures",   SETTING_RW,  
   -1, -1, TYPE_INT,   0,  
65535,  1,  1,  >max_failures,NULL);
}

+static uint32_t do_div64_32 (__u64 n, uint32_t d)
+{
+   do_div(n, d);
+
+   return n;
+}
+
static void idedisk_setup (ide_drive_t *drive)
{
struct hd_driveid *id = drive->id;
-   unsigned long long capacity;
+   __u64 capacity;
int barrier;

idedisk_add_settings(drive);
@@ -949,27 +956,32 @@
 */
capacity = idedisk_capacity (drive);
if (!drive->forced_geom) {
+   uint32_t cylsz, cyl;

-   if (idedisk_supports_lba48(drive->id)) {
-   /* compatibility */
-   drive->bios_sect = 63;
-   drive->bios_head = 255;
+   /*
+* In LBA mode the geometry isn't used to talk to the drive
+* so always create a "rational" geometry from the capacity.
+*/
+   if (drive->select.b.lba) {

why are you forcing drive->bios_sect and drive->bios_head values for LBA28 
drives?
 

Any LBA 28 drive can return fixed geometry (regardless of size) just as 
well as an LBA 48 drive can. I've got drives all over the place that do 
just that. In any event there are machines which do not have have an 
BIOS, like a PPC target, and the BIOS values can be "fixed up".


The ioctl returns the BIOS values and these are totally wrong on some 
targets. Comments?



+   drive->bios_sect = drive->sect = 63;
+   drive->bios_head = drive->head = 255;

changing drive->sect and drive->head is just wrong,
these are values reported by the device - do not touch them
 

If drive->cyl, drive-head amd drive->sect are to be left alone in all 
cases I have no problem.



+
+   cylsz = drive->bios_sect * drive->bios_head;
+   cyl = do_div64_32(capacity, cylsz);

division by zero now is possible (yes, 0/0/0 is possible)
 


Then the original code is prefered.


+
+   /* "fake out" works up to ~500 GB */
+   cyl = (cyl > 65535) ? 65535 : cyl;
+   drive->bios_cyl = drive->cyl = cyl;

drive->cyl shouldn't be changed
 


Same comment as above. Only change BIOS values.


}

if (drive->bios_sect && drive->bios_head) {
-   unsigned int cap0 = capacity; /* truncate to 32 bits */
-   unsigned int cylsz, cyl;
+   cylsz = drive->bios_sect * drive->bios_head;
+   cyl = do_div64_32(capacity, cylsz);

-   if (cap0 != capacity)
-   drive->bios_cyl = 65535;

cap0 != capacity => set the max possible drive->bios_cyl,
no need for adjusting it

-   else {

cap0 == capacity => no need for do_div(), only lower 32-bits are used,
adjust drive->bios_cyl

-   cylsz = drive->bios_sect * drive->bios_head;
-   cyl = cap0 / cylsz;
-   if (cyl > 65535)
-   cyl = 65535;
-   if (cyl > drive-&g

Re: [PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-03 Thread Mark Bellon

Bartlomiej Zolnierkiewicz wrote:


Hi,

The topic was discussed to death on linux-kernel.

Mark, you need to fix your applications and stop using /proc/ide/hd*/geometry
or/and HDIO_GET_GEO ioctl (which BTW your patch also affects).
 

Fixing the applications I can understand but the patch still seems 
necessary, to me, so the

HDIO_GET_GEO returns "rational" values.

I tested  HDIO_GET_GEO and it returns the same broken values as go into 
/proc (no surprises) without my patch.


If a drive is in LBA mode (28 or 48 bit) the existing code doesn't 
always "fix up" the geometry properly for some value returns. It only 
tries with 48 bit mode and it fails there for some values.  My patch 
forces a complete geometry and appears (to me) to preserve the side 
efefcts of the existing code.


Am I missing something?

mark


Bartlomiej

On 8/3/05, Andre Hedrick <[EMAIL PROTECTED]> wrote:
 


Did you read ATA-1 through ATA-7 to understand all the variations?

On Tue, 2 Aug 2005, Mark Bellon wrote:

   


The ATA specification tells large disk drives to return C/H/S data of
16383/16/63 regardless of their actual size (other variations on this
return include 15 heads and/or 4092 cylinders). Unfortunately these CHS
data confuse the existing IDE code and cause it to report invalid
geometries in /proc when the disk runs in LBA mode.

The invalid geometries can cause failures in the partitioning tools;
partitioning may be impossible or illogical size limitations occur. This
also leads to various forms of human confusion.

I attach a patch that fixes this problem while strongly attempting to
not break any existing side effects and await any comments.

mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>


 


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

   



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-03 Thread Mark Bellon

Andre Hedrick wrote:


Did you read ATA-1 through ATA-7 to understand all the variations?
 

Regardless of all of the geometry returns by the drives and their ATA 
compliance, the existing code will fail for some drives and return 
values. For instance, the existing code attempts to "fix up" LBA 48 
fails to handle LBA 28. In both cases the "fix up" code appears errant - 
it doesn't create a complete, valid geometry.


My patch attempts to preserve the flow and side effects of the existing 
code while handling all of the boundary cases. Given the way the 
original code appears to read one should be able to "fix up" things 
without regard for the ATA compliance of a drive.


It might help to read the code before and after my patch is applied. The 
explaination and patch alone don't make it easy to see what I think is a 
simple fix.


mark


On Tue, 2 Aug 2005, Mark Bellon wrote:

 

The ATA specification tells large disk drives to return C/H/S data of 
16383/16/63 regardless of their actual size (other variations on this 
return include 15 heads and/or 4092 cylinders). Unfortunately these CHS 
data confuse the existing IDE code and cause it to report invalid 
geometries in /proc when the disk runs in LBA mode.


The invalid geometries can cause failures in the partitioning tools; 
partitioning may be impossible or illogical size limitations occur. This 
also leads to various forms of human confusion.


I attach a patch that fixes this problem while strongly attempting to 
not break any existing side effects and await any comments.


mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>


   



 



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-03 Thread Mark Bellon

Andre Hedrick wrote:


Did you read ATA-1 through ATA-7 to understand all the variations?
 

Regardless of all of the geometry returns by the drives and their ATA 
compliance, the existing code will fail for some drives and return 
values. For instance, the existing code attempts to fix up LBA 48 
fails to handle LBA 28. In both cases the fix up code appears errant - 
it doesn't create a complete, valid geometry.


My patch attempts to preserve the flow and side effects of the existing 
code while handling all of the boundary cases. Given the way the 
original code appears to read one should be able to fix up things 
without regard for the ATA compliance of a drive.


It might help to read the code before and after my patch is applied. The 
explaination and patch alone don't make it easy to see what I think is a 
simple fix.


mark


On Tue, 2 Aug 2005, Mark Bellon wrote:

 

The ATA specification tells large disk drives to return C/H/S data of 
16383/16/63 regardless of their actual size (other variations on this 
return include 15 heads and/or 4092 cylinders). Unfortunately these CHS 
data confuse the existing IDE code and cause it to report invalid 
geometries in /proc when the disk runs in LBA mode.


The invalid geometries can cause failures in the partitioning tools; 
partitioning may be impossible or illogical size limitations occur. This 
also leads to various forms of human confusion.


I attach a patch that fixes this problem while strongly attempting to 
not break any existing side effects and await any comments.


mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]


   



 



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-03 Thread Mark Bellon

Bartlomiej Zolnierkiewicz wrote:


Hi,

The topic was discussed to death on linux-kernel.

Mark, you need to fix your applications and stop using /proc/ide/hd*/geometry
or/and HDIO_GET_GEO ioctl (which BTW your patch also affects).
 

Fixing the applications I can understand but the patch still seems 
necessary, to me, so the

HDIO_GET_GEO returns rational values.

I tested  HDIO_GET_GEO and it returns the same broken values as go into 
/proc (no surprises) without my patch.


If a drive is in LBA mode (28 or 48 bit) the existing code doesn't 
always fix up the geometry properly for some value returns. It only 
tries with 48 bit mode and it fails there for some values.  My patch 
forces a complete geometry and appears (to me) to preserve the side 
efefcts of the existing code.


Am I missing something?

mark


Bartlomiej

On 8/3/05, Andre Hedrick [EMAIL PROTECTED] wrote:
 


Did you read ATA-1 through ATA-7 to understand all the variations?

On Tue, 2 Aug 2005, Mark Bellon wrote:

   


The ATA specification tells large disk drives to return C/H/S data of
16383/16/63 regardless of their actual size (other variations on this
return include 15 heads and/or 4092 cylinders). Unfortunately these CHS
data confuse the existing IDE code and cause it to report invalid
geometries in /proc when the disk runs in LBA mode.

The invalid geometries can cause failures in the partitioning tools;
partitioning may be impossible or illogical size limitations occur. This
also leads to various forms of human confusion.

I attach a patch that fixes this problem while strongly attempting to
not break any existing side effects and await any comments.

mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]


 


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

   



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-03 Thread Mark Bellon

Bartlomiej Zolnierkiewicz wrote:


On 8/3/05, Mark Bellon [EMAIL PROTECTED] wrote:
 


Bartlomiej Zolnierkiewicz wrote:

   


Hi,

The topic was discussed to death on linux-kernel.

Mark, you need to fix your applications and stop using /proc/ide/hd*/geometry
or/and HDIO_GET_GEO ioctl (which BTW your patch also affects).


 


Fixing the applications I can understand but the patch still seems
necessary, to me, so the
HDIO_GET_GEO returns rational values.

I tested  HDIO_GET_GEO and it returns the same broken values as go into
/proc (no surprises) without my patch.
   



Please explain.
 

Let me explain more below. Thanks for the response as now I can 
understand better the issue (which was what I was fishing for).


 


If a drive is in LBA mode (28 or 48 bit) the existing code doesn't
always fix up the geometry properly for some value returns. It only
tries with 48 bit mode and it fails there for some values.  My patch
forces a complete geometry and appears (to me) to preserve the side
efefcts of the existing code.

Am I missing something?
   



diff -Naur linux-2.6.13-rc3-git9-orig/drivers/ide/ide-disk.c
linux-2.6.13-rc3-git9/drivers/ide/ide-disk.c
--- linux-2.6.13-rc3-git9-orig/drivers/ide/ide-disk.c   2005-08-01
13:48:21.0 -0700
+++ linux-2.6.13-rc3-git9/drivers/ide/ide-disk.c2005-08-02
12:14:43.0 -0700
@@ -884,10 +884,17 @@
ide_add_setting(drive,  max_failures,   SETTING_RW,  
   -1, -1, TYPE_INT,   0,  
65535,  1,  1,  drive-max_failures,NULL);
}

+static uint32_t do_div64_32 (__u64 n, uint32_t d)
+{
+   do_div(n, d);
+
+   return n;
+}
+
static void idedisk_setup (ide_drive_t *drive)
{
struct hd_driveid *id = drive-id;
-   unsigned long long capacity;
+   __u64 capacity;
int barrier;

idedisk_add_settings(drive);
@@ -949,27 +956,32 @@
 */
capacity = idedisk_capacity (drive);
if (!drive-forced_geom) {
+   uint32_t cylsz, cyl;

-   if (idedisk_supports_lba48(drive-id)) {
-   /* compatibility */
-   drive-bios_sect = 63;
-   drive-bios_head = 255;
+   /*
+* In LBA mode the geometry isn't used to talk to the drive
+* so always create a rational geometry from the capacity.
+*/
+   if (drive-select.b.lba) {

why are you forcing drive-bios_sect and drive-bios_head values for LBA28 
drives?
 

Any LBA 28 drive can return fixed geometry (regardless of size) just as 
well as an LBA 48 drive can. I've got drives all over the place that do 
just that. In any event there are machines which do not have have an 
BIOS, like a PPC target, and the BIOS values can be fixed up.


The ioctl returns the BIOS values and these are totally wrong on some 
targets. Comments?



+   drive-bios_sect = drive-sect = 63;
+   drive-bios_head = drive-head = 255;

changing drive-sect and drive-head is just wrong,
these are values reported by the device - do not touch them
 

If drive-cyl, drive-head amd drive-sect are to be left alone in all 
cases I have no problem.



+
+   cylsz = drive-bios_sect * drive-bios_head;
+   cyl = do_div64_32(capacity, cylsz);

division by zero now is possible (yes, 0/0/0 is possible)
 


Then the original code is prefered.


+
+   /* fake out works up to ~500 GB */
+   cyl = (cyl  65535) ? 65535 : cyl;
+   drive-bios_cyl = drive-cyl = cyl;

drive-cyl shouldn't be changed
 


Same comment as above. Only change BIOS values.


}

if (drive-bios_sect  drive-bios_head) {
-   unsigned int cap0 = capacity; /* truncate to 32 bits */
-   unsigned int cylsz, cyl;
+   cylsz = drive-bios_sect * drive-bios_head;
+   cyl = do_div64_32(capacity, cylsz);

-   if (cap0 != capacity)
-   drive-bios_cyl = 65535;

cap0 != capacity = set the max possible drive-bios_cyl,
no need for adjusting it

-   else {

cap0 == capacity = no need for do_div(), only lower 32-bits are used,
adjust drive-bios_cyl

-   cylsz = drive-bios_sect * drive-bios_head;
-   cyl = cap0 / cylsz;
-   if (cyl  65535)
-   cyl = 65535;
-   if (cyl  drive-bios_cyl)
-   drive-bios_cyl = cyl;
-   }
+   if (cyl  65535)
+   cyl = 65535;
+   if (cyl  drive-bios_cyl

[PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-02 Thread Mark Bellon
The ATA specification tells large disk drives to return C/H/S data of 
16383/16/63 regardless of their actual size (other variations on this 
return include 15 heads and/or 4092 cylinders). Unfortunately these CHS 
data confuse the existing IDE code and cause it to report invalid 
geometries in /proc when the disk runs in LBA mode.


The invalid geometries can cause failures in the partitioning tools; 
partitioning may be impossible or illogical size limitations occur. This 
also leads to various forms of human confusion.


I attach a patch that fixes this problem while strongly attempting to 
not break any existing side effects and await any comments.


mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>

diff -Naur linux-2.6.13-rc3-git9-orig/drivers/ide/ide-disk.c 
linux-2.6.13-rc3-git9/drivers/ide/ide-disk.c
--- linux-2.6.13-rc3-git9-orig/drivers/ide/ide-disk.c   2005-08-01 
13:48:21.0 -0700
+++ linux-2.6.13-rc3-git9/drivers/ide/ide-disk.c2005-08-02 
12:14:43.0 -0700
@@ -884,10 +884,17 @@
ide_add_setting(drive,  "max_failures", SETTING_RW, 
-1, -1, 
TYPE_INT,   0,  65535,  1,  1,  
>max_failures,   NULL);
 }
 
+static uint32_t do_div64_32 (__u64 n, uint32_t d)
+{
+   do_div(n, d);
+
+   return n;
+}
+
 static void idedisk_setup (ide_drive_t *drive)
 {
struct hd_driveid *id = drive->id;
-   unsigned long long capacity;
+   __u64 capacity;
int barrier;
 
idedisk_add_settings(drive);
@@ -949,27 +956,32 @@
 */
capacity = idedisk_capacity (drive);
if (!drive->forced_geom) {
+   uint32_t cylsz, cyl;
 
-   if (idedisk_supports_lba48(drive->id)) {
-   /* compatibility */
-   drive->bios_sect = 63;
-   drive->bios_head = 255;
+   /*
+* In LBA mode the geometry isn't used to talk to the drive
+* so always create a "rational" geometry from the capacity.
+*/
+   if (drive->select.b.lba) {
+   drive->bios_sect = drive->sect = 63;
+   drive->bios_head = drive->head = 255;
+
+   cylsz = drive->bios_sect * drive->bios_head;
+   cyl = do_div64_32(capacity, cylsz);
+
+   /* "fake out" works up to ~500 GB */
+   cyl = (cyl > 65535) ? 65535 : cyl;
+   drive->bios_cyl = drive->cyl = cyl;
}
 
if (drive->bios_sect && drive->bios_head) {
-   unsigned int cap0 = capacity; /* truncate to 32 bits */
-   unsigned int cylsz, cyl;
+   cylsz = drive->bios_sect * drive->bios_head;
+   cyl = do_div64_32(capacity, cylsz);
 
-   if (cap0 != capacity)
-   drive->bios_cyl = 65535;
-   else {
-   cylsz = drive->bios_sect * drive->bios_head;
-   cyl = cap0 / cylsz;
-   if (cyl > 65535)
-   cyl = 65535;
-   if (cyl > drive->bios_cyl)
-   drive->bios_cyl = cyl;
-   }
+   if (cyl > 65535)
+   cyl = 65535;
+   if (cyl > drive->bios_cyl)
+   drive->bios_cyl = cyl;
}
}
printk(KERN_INFO "%s: %llu sectors (%llu MB)",


[PATCH] IDE disks show invalid geometries in /proc/ide/hd*/geometry

2005-08-02 Thread Mark Bellon
The ATA specification tells large disk drives to return C/H/S data of 
16383/16/63 regardless of their actual size (other variations on this 
return include 15 heads and/or 4092 cylinders). Unfortunately these CHS 
data confuse the existing IDE code and cause it to report invalid 
geometries in /proc when the disk runs in LBA mode.


The invalid geometries can cause failures in the partitioning tools; 
partitioning may be impossible or illogical size limitations occur. This 
also leads to various forms of human confusion.


I attach a patch that fixes this problem while strongly attempting to 
not break any existing side effects and await any comments.


mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]

diff -Naur linux-2.6.13-rc3-git9-orig/drivers/ide/ide-disk.c 
linux-2.6.13-rc3-git9/drivers/ide/ide-disk.c
--- linux-2.6.13-rc3-git9-orig/drivers/ide/ide-disk.c   2005-08-01 
13:48:21.0 -0700
+++ linux-2.6.13-rc3-git9/drivers/ide/ide-disk.c2005-08-02 
12:14:43.0 -0700
@@ -884,10 +884,17 @@
ide_add_setting(drive,  max_failures, SETTING_RW, 
-1, -1, 
TYPE_INT,   0,  65535,  1,  1,  
drive-max_failures,   NULL);
 }
 
+static uint32_t do_div64_32 (__u64 n, uint32_t d)
+{
+   do_div(n, d);
+
+   return n;
+}
+
 static void idedisk_setup (ide_drive_t *drive)
 {
struct hd_driveid *id = drive-id;
-   unsigned long long capacity;
+   __u64 capacity;
int barrier;
 
idedisk_add_settings(drive);
@@ -949,27 +956,32 @@
 */
capacity = idedisk_capacity (drive);
if (!drive-forced_geom) {
+   uint32_t cylsz, cyl;
 
-   if (idedisk_supports_lba48(drive-id)) {
-   /* compatibility */
-   drive-bios_sect = 63;
-   drive-bios_head = 255;
+   /*
+* In LBA mode the geometry isn't used to talk to the drive
+* so always create a rational geometry from the capacity.
+*/
+   if (drive-select.b.lba) {
+   drive-bios_sect = drive-sect = 63;
+   drive-bios_head = drive-head = 255;
+
+   cylsz = drive-bios_sect * drive-bios_head;
+   cyl = do_div64_32(capacity, cylsz);
+
+   /* fake out works up to ~500 GB */
+   cyl = (cyl  65535) ? 65535 : cyl;
+   drive-bios_cyl = drive-cyl = cyl;
}
 
if (drive-bios_sect  drive-bios_head) {
-   unsigned int cap0 = capacity; /* truncate to 32 bits */
-   unsigned int cylsz, cyl;
+   cylsz = drive-bios_sect * drive-bios_head;
+   cyl = do_div64_32(capacity, cylsz);
 
-   if (cap0 != capacity)
-   drive-bios_cyl = 65535;
-   else {
-   cylsz = drive-bios_sect * drive-bios_head;
-   cyl = cap0 / cylsz;
-   if (cyl  65535)
-   cyl = 65535;
-   if (cyl  drive-bios_cyl)
-   drive-bios_cyl = cyl;
-   }
+   if (cyl  65535)
+   cyl = 65535;
+   if (cyl  drive-bios_cyl)
+   drive-bios_cyl = cyl;
}
}
printk(KERN_INFO %s: %llu sectors (%llu MB),


Re: [PATCH] (TAKE 3) disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-29 Thread Mark Bellon
If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a "good thing" in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.


The attached patchs modify the EXT[2|3] and JFS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


Jan Kara also noted the difficulty in moving these changes above the FS 
codes responding similarly to myself to Andrew's comment about possible 
VFS migration. Issue summary:

  - FS codes have to process the entire string of options anyway.
  - Only FS codes that use quotas must have a show_options function 
(for quotas to

work properly) however quotas are only used in a small number of FS.
  - Since most of the quota using FS support other options these FS 
codes should have
the a show_options function to show those options - and the quota 
echoing becomes

virtually negligible.

Based on feedback I have modified my patches from the original:
  JFS a missing patch has been restored to the posting
  EXT[2|3] and JFS always use the show_options function
  - Each FS has at least one FS specific option displayed
  - QUOTA output is under a CONFIG_QUOTA ifdef
  - a follow-on patch will add a multitude of options for each FS
  EXT[2|3] and JFS "quota" is treated as "usrquota"
  EXT3 journalled data check for journalled quota removed
  EXT[2|3] mount when quota specified but not compiled in
 - no changes from my original patch. I tested the patch and the 
codes warn but
 - still mount. With all due respection I believe the comments 
otherwise were a

 - misread of the patch. Please reread/test and comment.
  XFS patch removed - the XFS team already made the necessary changes
  EXT3 mixing old and new quotas are handled differently (not purely 
exclusive)
 - if old and new quotas for the same type are used together the 
old type
   is silently depricated for compatability (e.g. usrquota and 
usrjquota)
 - mixing of old and new quotas is an error (e.g. usrjquota and 
grpquota)


mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>

diff -Naur linux-2.6.13-rc3-git9-orig/fs/ext2/super.c 
linux-2.6.13-rc3-git9/fs/ext2/super.c
--- linux-2.6.13-rc3-git9-orig/fs/ext2/super.c  2005-07-28 15:12:51.0 
-0700
+++ linux-2.6.13-rc3-git9/fs/ext2/super.c   2005-07-29 14:49:45.0 
-0700
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -27,6 +28,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include "ext2.h"
 #include "xattr.h"
@@ -201,6 +204,26 @@
 #endif
 }
 
+static int ext2_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct ext2_sb_info *sbi = EXT2_SB(vfs->mnt_sb);
+
+   if (sbi->s_mount_opt & EXT2_MOUNT_GRPID)
+   seq_puts(seq, ",grpid");
+   else
+   seq_puts(seq, ",nogrpid");
+
+#if defined(CONFIG_QUOTA)
+   if (sbi->s_mount_opt & EXT2_MOUNT_USRQUOTA)
+   seq_puts(seq, ",usrquota");
+
+   if (sbi->s_mount_opt & EXT2_MOUNT_GRPQUOTA)
+   seq_puts(seq, ",grpquota");
+#endif
+
+   return 0;
+}
+
 #ifdef CONFIG_QUOTA
 static ssize_t ext2_quota_read(struct super_block *sb, int type, char *data, 
size_t len, loff_t off);
 static ssize_t ext2_quota_write(struct super_block *sb, int type, const char 
*data, size_t len, loff_t off);
@@ -218,6 +241,7 @@
.statfs = ext2_statfs,
.remount_fs = ext2_remount,
.clear_inode= ext2_clear_inode,
+   .show_options   = ext2_show_options,
 #ifdef CONFIG_QUOTA
.quota_read = ext2_quota_read,
.quota_write= ext2_quota_write,
@@ -256,10 +280,11 @@
 
 enum {
Opt_bsd_df, Opt_minix_df, Opt_grpid, Opt_nogrpid,
-   Opt_resgid, Opt_resuid, Opt_sb, Opt_err_cont, Opt_err_panic, Opt_err_ro,
-   Opt_nouid32, Opt_check, Opt_nocheck, Opt_debug, Opt_oldalloc, 
Opt_orlov, Opt_nobh,
-   Opt_user_xattr, Opt_nouser_xattr, Opt_acl, Opt_noacl, Opt_xip,
-   Opt_ignore, Opt_err,
+   Opt_resgid, Opt_resuid, Opt_sb, Opt_err_cont, Opt_err_panic,
+   Opt_err_ro, Opt_nouid32, Opt_check, Opt_nocheck, Opt_debug,
+   Opt_oldalloc, Opt_orlov, Opt_nobh, Opt_user_xattr, Opt_nouser_xattr,
+   Opt

[PATCH] (REPOST/REVISION) disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-29 Thread Mark Bellon
If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a "good thing" in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.


The attached patchs modify the EXT[2|3] and JFS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


Jan Kara also noted the difficulty in moving these changes above the FS 
codes responding similarly to myself to Andrew's comment about possible 
VFS migration. Issue summary:

   - FS codes have to process the entire string of options anyway.
   - Only FS codes that use quotas must have a show_options function 
(for quotas to work properly) however quotas are only 
used in a small number of FS.
   - Since most of the quota using FS support other options these FS 
codes should have the a show_options function to show those 
options - and the quota echoing becomes virtually

   negligible.

Based on feedback I have modified my patches from the original:
   EXT3 always uses the show_options function even when QUOTA are not 
compiled in

   EXT[2|3] and JFS "quota" is treated as "usrquota"
   EXT3 journalled data check for journalled quota removed
   EXT[2|3] mount when quota specified but not compiled in
  - no changes from my original patch. I tested the patch and the 
codes warn but still
  - mount. With all due respection I believe the comments otherwise 
were a misread

  - of the patch. Please reread/test and comment.
   XFS patch removed - the XFS team already made the necessary changes
   EXT3 mixing old and new quotas are handled differently (not purely 
exclusive)
  - if old and new are present together old is silently depricated 
(compatability)
  - mixing of old and new quotas is an error (e.g. usrjquota and 
grpquota)


mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>
diff -Naur linux-2.6.13-rc3-git9-orig/include/linux/ext3_fs.h 
linux-2.6.13-rc3-git9/include/linux/ext3_fs.h
--- linux-2.6.13-rc3-git9-orig/include/linux/ext3_fs.h  2005-07-28 
15:12:52.0 -0700
+++ linux-2.6.13-rc3-git9/include/linux/ext3_fs.h   2005-07-28 
16:18:24.0 -0700
@@ -373,6 +373,8 @@
 #define EXT3_MOUNT_BARRIER 0x2 /* Use block barriers */
 #define EXT3_MOUNT_NOBH0x4 /* No bufferheads */
 #define EXT3_MOUNT_QUOTA   0x8 /* Some quota option set */
+#define EXT3_MOUNT_USRQUOTA0x10 /* "old" user quota */
+#define EXT3_MOUNT_GRPQUOTA0x20 /* "old" group quota */
 
 /* Compatibility, for having both ext2_fs.h and ext3_fs.h included at once */
 #ifndef _LINUX_EXT2_FS_H
diff -Naur linux-2.6.13-rc3-git9-orig/fs/ext3/super.c 
linux-2.6.13-rc3-git9/fs/ext3/super.c
--- linux-2.6.13-rc3-git9-orig/fs/ext3/super.c  2005-07-28 15:12:51.0 
-0700
+++ linux-2.6.13-rc3-git9/fs/ext3/super.c   2005-07-29 10:00:38.0 
-0700
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "xattr.h"
 #include "acl.h"
@@ -509,8 +510,41 @@
kfree(rsv);
 }
 
-#ifdef CONFIG_QUOTA
+static int ext3_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct ext3_sb_info *sbi = EXT3_SB(vfs->mnt_sb);
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_JOURNAL_DATA)
+   seq_puts(seq, ",data=journal");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_ORDERED_DATA)
+   seq_puts(seq, ",data=ordered");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_WRITEBACK_DATA)
+   seq_puts(seq, ",data=writeback");
+
+#if defined(CONFIG_QUOTA)
+   if (sbi->s_jquota_fmt)
+   seq_printf(seq, ",jqfmt=%s",
+   (sbi->s_jquota_fmt == QFMT_VFS_OLD) ? "vfsold": "vfsv0");
+
+   if (sbi->s_qf_names[USRQUOTA])
+   seq_printf(seq, ",usrjquota=%s", sbi->s_qf_names[USRQUOTA]);
+
+   if (sbi->s_qf_names[GRPQUOTA])
+   seq_printf(seq, ",grpjquota=%s", sbi->s_qf_names[GRPQUOTA]);
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_USRQUOTA)
+   seq_puts(seq, ",usrquota");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_GRPQUOTA)
+   seq_puts(seq, ",grpquota&

Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-29 Thread Mark Bellon

Jan Kara wrote:


 Hello,

 

If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a "good thing" in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.
   


 Yes, I agree that it's a good think to have quota options in
/proc/mounts. Doing it per filesystem is not nice but I don't know a
nice way of doing it a VFS level either...
 

I'll have a new patch which includes your comments and a few I thought 
up soon. I'll repost the new patch.


mark

 

The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


The EXT3 has added error checking and has two minor changes:
  The "quota" option is considered part of the older style quotas
  Journalled quotas and older style quotas are mutually exclusive.
  - both discussable topics
   


 Ack.

 


mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>

   


 Thanks for the patch - I have some comments below...

 
 


#ifdef CONFIG_QUOTA
+static int ext3_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct ext3_sb_info *sbi = EXT3_SB(vfs->mnt_sb);
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_JOURNAL_DATA)
+   seq_puts(seq, ",data=journal");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_ORDERED_DATA)
+   seq_puts(seq, ",data=ordered");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_WRITEBACK_DATA)
+   seq_puts(seq, ",data=writeback");
   


 Showing 'data' option only when quota is compile is ugly... Please move
CONFIG_QUOTA inside only around quota specific stuff.

 


+
+   if (sbi->s_jquota_fmt)
+   seq_printf(seq, ",jqfmt=%s",
+   (sbi->s_jquota_fmt == QFMT_VFS_OLD) ? "vfsold": "vfsv0");
+
+   if (sbi->s_qf_names[USRQUOTA])
+   seq_printf(seq, ",usrjquota=%s", sbi->s_qf_names[USRQUOTA]);
+
+   if (sbi->s_qf_names[GRPQUOTA])
+   seq_printf(seq, ",grpjquota=%s", sbi->s_qf_names[GRPQUOTA]);
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_USRQUOTA)
+   seq_puts(seq, ",usrquota");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_GRPQUOTA)
+   seq_puts(seq, ",grpquota");
+
+   return 0;
+}

#define QTYPE2NAME(t) ((t)==USRQUOTA?"user":"group")
#define QTYPE2MOPT(on, t) ((t)==USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
@@ -572,6 +604,7 @@
#ifdef CONFIG_QUOTA
.quota_read = ext3_quota_read,
.quota_write= ext3_quota_write,
+   .show_options   = ext3_show_options,
   


 Probably set this function everytime...



 


+   case Opt_quota:
+   case Opt_usrquota:
+   case Opt_grpquota:
case Opt_usrjquota:
case Opt_grpjquota:
case Opt_offusrjquota:
@@ -924,7 +973,6 @@
"EXT3-fs: journalled quota options not "
"supported.\n");
break;
-   case Opt_quota:
case Opt_noquota:
break;
   


 I'm not sure with the above change.. Previously if you mounted a
filesystem with 'usrquota' option without a kernel quota support the mount
succeeded. With your patch it will fail. I agree that that makes more
sense but I'm not sure it's correct to change a behaviour so suddently.
Maybe just issue a warning but let the mount succeed.

 


#endif
@@ -962,11 +1010,36 @@
}
}
#ifdef CONFIG_QUOTA
-   if (!sbi->s_jquota_fmt && (sbi->s_qf_names[USRQUOTA] ||
-   sbi->s_qf_names[GRPQUOTA])) {
-   printk(KERN_ERR
+   if (sbi->s_qf_names[USRQUOTA] || sbi->s_qf_names[GRPQUOTA]) {
+		if ((sbi->s_mount_opt & EXT3_MOUNT_USRQUOTA) || 
+		(sbi->s_mount_opt & EXT3_MOUNT_GRPQUOTA)) {

+   printk(KERN_ERR
+   "EXT3-fs: only one type of quotas allowed.\n");
+
+   return 0;
+   }
+
+   if (!sbi->s_jquota_fmt) {
+   printk(KERN_ERR
"EXT3-fs: journalled quota format not specifie

Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-29 Thread Mark Bellon

Jan Kara wrote:


 Hello,

 

If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a good thing in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.
   


 Yes, I agree that it's a good think to have quota options in
/proc/mounts. Doing it per filesystem is not nice but I don't know a
nice way of doing it a VFS level either...
 

I'll have a new patch which includes your comments and a few I thought 
up soon. I'll repost the new patch.


mark

 

The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


The EXT3 has added error checking and has two minor changes:
  The quota option is considered part of the older style quotas
  Journalled quotas and older style quotas are mutually exclusive.
  - both discussable topics
   


 Ack.

 


mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]

   


 Thanks for the patch - I have some comments below...

 snip
 


#ifdef CONFIG_QUOTA
+static int ext3_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct ext3_sb_info *sbi = EXT3_SB(vfs-mnt_sb);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_JOURNAL_DATA)
+   seq_puts(seq, ,data=journal);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_ORDERED_DATA)
+   seq_puts(seq, ,data=ordered);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_WRITEBACK_DATA)
+   seq_puts(seq, ,data=writeback);
   


 Showing 'data' option only when quota is compile is ugly... Please move
CONFIG_QUOTA inside only around quota specific stuff.

 


+
+   if (sbi-s_jquota_fmt)
+   seq_printf(seq, ,jqfmt=%s,
+   (sbi-s_jquota_fmt == QFMT_VFS_OLD) ? vfsold: vfsv0);
+
+   if (sbi-s_qf_names[USRQUOTA])
+   seq_printf(seq, ,usrjquota=%s, sbi-s_qf_names[USRQUOTA]);
+
+   if (sbi-s_qf_names[GRPQUOTA])
+   seq_printf(seq, ,grpjquota=%s, sbi-s_qf_names[GRPQUOTA]);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_USRQUOTA)
+   seq_puts(seq, ,usrquota);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_GRPQUOTA)
+   seq_puts(seq, ,grpquota);
+
+   return 0;
+}

#define QTYPE2NAME(t) ((t)==USRQUOTA?user:group)
#define QTYPE2MOPT(on, t) ((t)==USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
@@ -572,6 +604,7 @@
#ifdef CONFIG_QUOTA
.quota_read = ext3_quota_read,
.quota_write= ext3_quota_write,
+   .show_options   = ext3_show_options,
   


 Probably set this function everytime...

snip

 


+   case Opt_quota:
+   case Opt_usrquota:
+   case Opt_grpquota:
case Opt_usrjquota:
case Opt_grpjquota:
case Opt_offusrjquota:
@@ -924,7 +973,6 @@
EXT3-fs: journalled quota options not 
supported.\n);
break;
-   case Opt_quota:
case Opt_noquota:
break;
   


 I'm not sure with the above change.. Previously if you mounted a
filesystem with 'usrquota' option without a kernel quota support the mount
succeeded. With your patch it will fail. I agree that that makes more
sense but I'm not sure it's correct to change a behaviour so suddently.
Maybe just issue a warning but let the mount succeed.

 


#endif
@@ -962,11 +1010,36 @@
}
}
#ifdef CONFIG_QUOTA
-   if (!sbi-s_jquota_fmt  (sbi-s_qf_names[USRQUOTA] ||
-   sbi-s_qf_names[GRPQUOTA])) {
-   printk(KERN_ERR
+   if (sbi-s_qf_names[USRQUOTA] || sbi-s_qf_names[GRPQUOTA]) {
+		if ((sbi-s_mount_opt  EXT3_MOUNT_USRQUOTA) || 
+		(sbi-s_mount_opt  EXT3_MOUNT_GRPQUOTA)) {

+   printk(KERN_ERR
+   EXT3-fs: only one type of quotas allowed.\n);
+
+   return 0;
+   }
+
+   if (!sbi-s_jquota_fmt) {
+   printk(KERN_ERR
EXT3-fs: journalled quota format not specified.\n);
-   return 0;
+
+   return 0;
+   }
+
+   if ((sbi-s_mount_opt  EXT3_MOUNT_JOURNAL_DATA) == 0) {
   


 This test does not make sense - journaled quota in recent kernels
works with arbitrary journaling data mode.

 


+   printk(KERN_ERR
+   EXT3-fs

[PATCH] (REPOST/REVISION) disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-29 Thread Mark Bellon
If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a good thing in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.


The attached patchs modify the EXT[2|3] and JFS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


Jan Kara also noted the difficulty in moving these changes above the FS 
codes responding similarly to myself to Andrew's comment about possible 
VFS migration. Issue summary:

   - FS codes have to process the entire string of options anyway.
   - Only FS codes that use quotas must have a show_options function 
(for quotas to work properly) however quotas are only 
used in a small number of FS.
   - Since most of the quota using FS support other options these FS 
codes should have the a show_options function to show those 
options - and the quota echoing becomes virtually

   negligible.

Based on feedback I have modified my patches from the original:
   EXT3 always uses the show_options function even when QUOTA are not 
compiled in

   EXT[2|3] and JFS quota is treated as usrquota
   EXT3 journalled data check for journalled quota removed
   EXT[2|3] mount when quota specified but not compiled in
  - no changes from my original patch. I tested the patch and the 
codes warn but still
  - mount. With all due respection I believe the comments otherwise 
were a misread

  - of the patch. Please reread/test and comment.
   XFS patch removed - the XFS team already made the necessary changes
   EXT3 mixing old and new quotas are handled differently (not purely 
exclusive)
  - if old and new are present together old is silently depricated 
(compatability)
  - mixing of old and new quotas is an error (e.g. usrjquota and 
grpquota)


mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]
diff -Naur linux-2.6.13-rc3-git9-orig/include/linux/ext3_fs.h 
linux-2.6.13-rc3-git9/include/linux/ext3_fs.h
--- linux-2.6.13-rc3-git9-orig/include/linux/ext3_fs.h  2005-07-28 
15:12:52.0 -0700
+++ linux-2.6.13-rc3-git9/include/linux/ext3_fs.h   2005-07-28 
16:18:24.0 -0700
@@ -373,6 +373,8 @@
 #define EXT3_MOUNT_BARRIER 0x2 /* Use block barriers */
 #define EXT3_MOUNT_NOBH0x4 /* No bufferheads */
 #define EXT3_MOUNT_QUOTA   0x8 /* Some quota option set */
+#define EXT3_MOUNT_USRQUOTA0x10 /* old user quota */
+#define EXT3_MOUNT_GRPQUOTA0x20 /* old group quota */
 
 /* Compatibility, for having both ext2_fs.h and ext3_fs.h included at once */
 #ifndef _LINUX_EXT2_FS_H
diff -Naur linux-2.6.13-rc3-git9-orig/fs/ext3/super.c 
linux-2.6.13-rc3-git9/fs/ext3/super.c
--- linux-2.6.13-rc3-git9-orig/fs/ext3/super.c  2005-07-28 15:12:51.0 
-0700
+++ linux-2.6.13-rc3-git9/fs/ext3/super.c   2005-07-29 10:00:38.0 
-0700
@@ -35,6 +35,7 @@
 #include linux/mount.h
 #include linux/namei.h
 #include linux/quotaops.h
+#include linux/seq_file.h
 #include asm/uaccess.h
 #include xattr.h
 #include acl.h
@@ -509,8 +510,41 @@
kfree(rsv);
 }
 
-#ifdef CONFIG_QUOTA
+static int ext3_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct ext3_sb_info *sbi = EXT3_SB(vfs-mnt_sb);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_JOURNAL_DATA)
+   seq_puts(seq, ,data=journal);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_ORDERED_DATA)
+   seq_puts(seq, ,data=ordered);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_WRITEBACK_DATA)
+   seq_puts(seq, ,data=writeback);
+
+#if defined(CONFIG_QUOTA)
+   if (sbi-s_jquota_fmt)
+   seq_printf(seq, ,jqfmt=%s,
+   (sbi-s_jquota_fmt == QFMT_VFS_OLD) ? vfsold: vfsv0);
+
+   if (sbi-s_qf_names[USRQUOTA])
+   seq_printf(seq, ,usrjquota=%s, sbi-s_qf_names[USRQUOTA]);
+
+   if (sbi-s_qf_names[GRPQUOTA])
+   seq_printf(seq, ,grpjquota=%s, sbi-s_qf_names[GRPQUOTA]);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_USRQUOTA)
+   seq_puts(seq, ,usrquota);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_GRPQUOTA)
+   seq_puts(seq, ,grpquota);
+#endif
 
+   return 0;
+}
+
+#ifdef CONFIG_QUOTA
 #define QTYPE2NAME(t) ((t)==USRQUOTA?user:group)
 #define QTYPE2MOPT(on, t) ((t)==USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
 
@@ -569,6 +603,7 @@
.statfs

Re: [PATCH] (TAKE 3) disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-29 Thread Mark Bellon
If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a good thing in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.


The attached patchs modify the EXT[2|3] and JFS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


Jan Kara also noted the difficulty in moving these changes above the FS 
codes responding similarly to myself to Andrew's comment about possible 
VFS migration. Issue summary:

  - FS codes have to process the entire string of options anyway.
  - Only FS codes that use quotas must have a show_options function 
(for quotas to

work properly) however quotas are only used in a small number of FS.
  - Since most of the quota using FS support other options these FS 
codes should have
the a show_options function to show those options - and the quota 
echoing becomes

virtually negligible.

Based on feedback I have modified my patches from the original:
  JFS a missing patch has been restored to the posting
  EXT[2|3] and JFS always use the show_options function
  - Each FS has at least one FS specific option displayed
  - QUOTA output is under a CONFIG_QUOTA ifdef
  - a follow-on patch will add a multitude of options for each FS
  EXT[2|3] and JFS quota is treated as usrquota
  EXT3 journalled data check for journalled quota removed
  EXT[2|3] mount when quota specified but not compiled in
 - no changes from my original patch. I tested the patch and the 
codes warn but
 - still mount. With all due respection I believe the comments 
otherwise were a

 - misread of the patch. Please reread/test and comment.
  XFS patch removed - the XFS team already made the necessary changes
  EXT3 mixing old and new quotas are handled differently (not purely 
exclusive)
 - if old and new quotas for the same type are used together the 
old type
   is silently depricated for compatability (e.g. usrquota and 
usrjquota)
 - mixing of old and new quotas is an error (e.g. usrjquota and 
grpquota)


mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]

diff -Naur linux-2.6.13-rc3-git9-orig/fs/ext2/super.c 
linux-2.6.13-rc3-git9/fs/ext2/super.c
--- linux-2.6.13-rc3-git9-orig/fs/ext2/super.c  2005-07-28 15:12:51.0 
-0700
+++ linux-2.6.13-rc3-git9/fs/ext2/super.c   2005-07-29 14:49:45.0 
-0700
@@ -19,6 +19,7 @@
 #include linux/config.h
 #include linux/module.h
 #include linux/string.h
+#include linux/fs.h
 #include linux/slab.h
 #include linux/init.h
 #include linux/blkdev.h
@@ -27,6 +28,8 @@
 #include linux/buffer_head.h
 #include linux/smp_lock.h
 #include linux/vfs.h
+#include linux/seq_file.h
+#include linux/mount.h
 #include asm/uaccess.h
 #include ext2.h
 #include xattr.h
@@ -201,6 +204,26 @@
 #endif
 }
 
+static int ext2_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct ext2_sb_info *sbi = EXT2_SB(vfs-mnt_sb);
+
+   if (sbi-s_mount_opt  EXT2_MOUNT_GRPID)
+   seq_puts(seq, ,grpid);
+   else
+   seq_puts(seq, ,nogrpid);
+
+#if defined(CONFIG_QUOTA)
+   if (sbi-s_mount_opt  EXT2_MOUNT_USRQUOTA)
+   seq_puts(seq, ,usrquota);
+
+   if (sbi-s_mount_opt  EXT2_MOUNT_GRPQUOTA)
+   seq_puts(seq, ,grpquota);
+#endif
+
+   return 0;
+}
+
 #ifdef CONFIG_QUOTA
 static ssize_t ext2_quota_read(struct super_block *sb, int type, char *data, 
size_t len, loff_t off);
 static ssize_t ext2_quota_write(struct super_block *sb, int type, const char 
*data, size_t len, loff_t off);
@@ -218,6 +241,7 @@
.statfs = ext2_statfs,
.remount_fs = ext2_remount,
.clear_inode= ext2_clear_inode,
+   .show_options   = ext2_show_options,
 #ifdef CONFIG_QUOTA
.quota_read = ext2_quota_read,
.quota_write= ext2_quota_write,
@@ -256,10 +280,11 @@
 
 enum {
Opt_bsd_df, Opt_minix_df, Opt_grpid, Opt_nogrpid,
-   Opt_resgid, Opt_resuid, Opt_sb, Opt_err_cont, Opt_err_panic, Opt_err_ro,
-   Opt_nouid32, Opt_check, Opt_nocheck, Opt_debug, Opt_oldalloc, 
Opt_orlov, Opt_nobh,
-   Opt_user_xattr, Opt_nouser_xattr, Opt_acl, Opt_noacl, Opt_xip,
-   Opt_ignore, Opt_err,
+   Opt_resgid, Opt_resuid, Opt_sb, Opt_err_cont, Opt_err_panic,
+   Opt_err_ro, Opt_nouid32, Opt_check, Opt_nocheck, Opt_debug,
+   Opt_oldalloc, Opt_orlov, Opt_nobh, Opt_user_xattr

Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon

Mark Bellon wrote:


Andrew Morton wrote:


Mark Bellon <[EMAIL PROTECTED]> wrote:
 

If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota 
tools look there for the quota strings for their operation. If, 
however, /etc/mtab is a symlink to /proc/mounts (a "good thing" in 
some environments)  the tools don't write anything - they assume the 
kernel will take care of things.


While the quota options are sent down to the kernel via the mount 
system call and the file system codes handle them properly 
unfortunately there is no code to echo the quota strings into 
/proc/mounts and the quota tools fail in the symlink case.
  



hm.  Perhaps others with operational experience in that area can 
comment.
 


OK.

 

The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
necessary hooks. The show_options function of each file system in 
these patches currently deal with only those things that seemed 
related to quotas; especially in the EXT3 case more can be done 
(later?).
  



It seems sad to do it in each filesystem.  Is there no way in which 
we can

do this for all filesystems, in a single place in the VFS?
 

Each file system must be able to echo it's own FS specific options, 
hence the show_options hook (XFS is a good example). EXT3 has it's own 
form of journalled quota file options, hence the need for the specific 
hook.


The "older style" (e.g. "usrquota", "grpquota", "quota") could be done 
generically but a FS might want any number of quota options. The few 
lines of code in each file system didn't seem so bad especially if the 
show_function start echoing more.


Followup comment/through...

If we want /proc/mounts to really replace /etc/mtab in the general case, 
always using it as a symlink, the file system codes will all need the 
show_options hook - they will need to echo all of their private options 
duplicating, as closely as possible, what would have been written to the 
/etc/mtab regular file.


mark

mark

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon

Andrew Morton wrote:


Mark Bellon <[EMAIL PROTECTED]> wrote:
 

If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a "good thing" in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.
   



hm.  Perhaps others with operational experience in that area can comment.
 


OK.

 

The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).
   



It seems sad to do it in each filesystem.  Is there no way in which we can
do this for all filesystems, in a single place in the VFS?
 

Each file system must be able to echo it's own FS specific options, 
hence the show_options hook (XFS is a good example). EXT3 has it's own 
form of journalled quota file options, hence the need for the specific hook.


The "older style" (e.g. "usrquota", "grpquota", "quota") could be done 
generically but a FS might want any number of quota options. The few 
lines of code in each file system didn't seem so bad especially if the 
show_function start echoing more.


mark
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon
If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a "good thing" in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.


The attached patchs modify the EXT[2|3] and JFS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


The original XFS change has been dropped as per Nathan Scott 
<[EMAIL PROTECTED]>. The functionality had already been added.


The EXT3 has added error checking and has two minor changes:
  The "quota" option is considered part of the older style quotas
  Journalled quotas and older style quotas are mutually exclusive.
  - both discussable topics

mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon

Nathan Scott wrote:


On Thu, Jul 28, 2005 at 05:03:02PM -0700, Mark Bellon wrote:
 

The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
   



The XFS component is incorrect, we're already doing this elsewhere
(over in fs/xfs/quota/xfs_qm_bhv.c), so please drop this last part
from your patch...
 


No problem! New patch coming up.

mark

 


diff -Naur linux-2.6.13-rc3-git9-orig/fs/xfs/xfs_vfsops.c 
linux-2.6.13-rc3-git9/fs/xfs/xfs_vfsops.c
...
+   { XFS_UQUOTA_ACTIVE,"," "usrquota" },
+   { XFS_GQUOTA_ACTIVE,"," "grpquota" },
   



thanks.

 



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon
If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a "good thing" in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.


The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


The EXT3 has added error checking and has two minor changes:
   The "quota" option is considered part of the older style quotas
   Journalled quotas and older style quotas are mutually exclusive.
   - both discussable topics

mark

Signed-off-by: Mark Bellon <[EMAIL PROTECTED]>

diff -Naur linux-2.6.13-rc3-git9-orig/include/linux/ext3_fs.h 
linux-2.6.13-rc3-git9/include/linux/ext3_fs.h
--- linux-2.6.13-rc3-git9-orig/include/linux/ext3_fs.h  2005-07-28 
15:12:52.0 -0700
+++ linux-2.6.13-rc3-git9/include/linux/ext3_fs.h   2005-07-28 
16:18:24.0 -0700
@@ -373,6 +373,8 @@
 #define EXT3_MOUNT_BARRIER 0x2 /* Use block barriers */
 #define EXT3_MOUNT_NOBH0x4 /* No bufferheads */
 #define EXT3_MOUNT_QUOTA   0x8 /* Some quota option set */
+#define EXT3_MOUNT_USRQUOTA0x10 /* "old" user quota */
+#define EXT3_MOUNT_GRPQUOTA0x20 /* "old" group quota */
 
 /* Compatibility, for having both ext2_fs.h and ext3_fs.h included at once */
 #ifndef _LINUX_EXT2_FS_H
diff -Naur linux-2.6.13-rc3-git9-orig/fs/ext3/super.c 
linux-2.6.13-rc3-git9/fs/ext3/super.c
--- linux-2.6.13-rc3-git9-orig/fs/ext3/super.c  2005-07-28 15:12:51.0 
-0700
+++ linux-2.6.13-rc3-git9/fs/ext3/super.c   2005-07-28 16:01:15.0 
-0700
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "xattr.h"
 #include "acl.h"
@@ -510,6 +511,37 @@
 }
 
 #ifdef CONFIG_QUOTA
+static int ext3_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct ext3_sb_info *sbi = EXT3_SB(vfs->mnt_sb);
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_JOURNAL_DATA)
+   seq_puts(seq, ",data=journal");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_ORDERED_DATA)
+   seq_puts(seq, ",data=ordered");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_WRITEBACK_DATA)
+   seq_puts(seq, ",data=writeback");
+
+   if (sbi->s_jquota_fmt)
+   seq_printf(seq, ",jqfmt=%s",
+   (sbi->s_jquota_fmt == QFMT_VFS_OLD) ? "vfsold": "vfsv0");
+
+   if (sbi->s_qf_names[USRQUOTA])
+   seq_printf(seq, ",usrjquota=%s", sbi->s_qf_names[USRQUOTA]);
+
+   if (sbi->s_qf_names[GRPQUOTA])
+   seq_printf(seq, ",grpjquota=%s", sbi->s_qf_names[GRPQUOTA]);
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_USRQUOTA)
+   seq_puts(seq, ",usrquota");
+
+   if (sbi->s_mount_opt & EXT3_MOUNT_GRPQUOTA)
+   seq_puts(seq, ",grpquota");
+
+   return 0;
+}
 
 #define QTYPE2NAME(t) ((t)==USRQUOTA?"user":"group")
 #define QTYPE2MOPT(on, t) ((t)==USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
@@ -572,6 +604,7 @@
 #ifdef CONFIG_QUOTA
.quota_read = ext3_quota_read,
.quota_write= ext3_quota_write,
+   .show_options   = ext3_show_options,
 #endif
 };
 
@@ -590,7 +623,8 @@
Opt_abort, Opt_data_journal, Opt_data_ordered, Opt_data_writeback,
Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_quota, Opt_noquota,
-   Opt_ignore, Opt_barrier, Opt_err, Opt_resize,
+   Opt_ignore, Opt_barrier, Opt_err, Opt_resize, Opt_usrquota,
+   Opt_grpquota
 };
 
 static match_table_t tokens = {
@@ -634,10 +668,10 @@
{Opt_grpjquota, "grpjquota=%s"},
{Opt_jqfmt_vfsold, "jqfmt=vfsold"},
{Opt_jqfmt_vfsv0, "jqfmt=vfsv0"},
-   {Opt_quota, "grpquota"},
+   {Opt_grpquota, "grpquota"},
{Opt_noquota, "noquota"},
{Opt_quota, "quota"},
-   {Opt_quota, "usrquota"},
+   {Opt_usrquota, "usrquota"},
{Opt_barrier, "barrier=%u"},
{Opt_err, NULL},
{Opt_resize, "resize"},

[PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon
If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a good thing in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.


The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


The EXT3 has added error checking and has two minor changes:
   The quota option is considered part of the older style quotas
   Journalled quotas and older style quotas are mutually exclusive.
   - both discussable topics

mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]

diff -Naur linux-2.6.13-rc3-git9-orig/include/linux/ext3_fs.h 
linux-2.6.13-rc3-git9/include/linux/ext3_fs.h
--- linux-2.6.13-rc3-git9-orig/include/linux/ext3_fs.h  2005-07-28 
15:12:52.0 -0700
+++ linux-2.6.13-rc3-git9/include/linux/ext3_fs.h   2005-07-28 
16:18:24.0 -0700
@@ -373,6 +373,8 @@
 #define EXT3_MOUNT_BARRIER 0x2 /* Use block barriers */
 #define EXT3_MOUNT_NOBH0x4 /* No bufferheads */
 #define EXT3_MOUNT_QUOTA   0x8 /* Some quota option set */
+#define EXT3_MOUNT_USRQUOTA0x10 /* old user quota */
+#define EXT3_MOUNT_GRPQUOTA0x20 /* old group quota */
 
 /* Compatibility, for having both ext2_fs.h and ext3_fs.h included at once */
 #ifndef _LINUX_EXT2_FS_H
diff -Naur linux-2.6.13-rc3-git9-orig/fs/ext3/super.c 
linux-2.6.13-rc3-git9/fs/ext3/super.c
--- linux-2.6.13-rc3-git9-orig/fs/ext3/super.c  2005-07-28 15:12:51.0 
-0700
+++ linux-2.6.13-rc3-git9/fs/ext3/super.c   2005-07-28 16:01:15.0 
-0700
@@ -35,6 +35,7 @@
 #include linux/mount.h
 #include linux/namei.h
 #include linux/quotaops.h
+#include linux/seq_file.h
 #include asm/uaccess.h
 #include xattr.h
 #include acl.h
@@ -510,6 +511,37 @@
 }
 
 #ifdef CONFIG_QUOTA
+static int ext3_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct ext3_sb_info *sbi = EXT3_SB(vfs-mnt_sb);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_JOURNAL_DATA)
+   seq_puts(seq, ,data=journal);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_ORDERED_DATA)
+   seq_puts(seq, ,data=ordered);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_WRITEBACK_DATA)
+   seq_puts(seq, ,data=writeback);
+
+   if (sbi-s_jquota_fmt)
+   seq_printf(seq, ,jqfmt=%s,
+   (sbi-s_jquota_fmt == QFMT_VFS_OLD) ? vfsold: vfsv0);
+
+   if (sbi-s_qf_names[USRQUOTA])
+   seq_printf(seq, ,usrjquota=%s, sbi-s_qf_names[USRQUOTA]);
+
+   if (sbi-s_qf_names[GRPQUOTA])
+   seq_printf(seq, ,grpjquota=%s, sbi-s_qf_names[GRPQUOTA]);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_USRQUOTA)
+   seq_puts(seq, ,usrquota);
+
+   if (sbi-s_mount_opt  EXT3_MOUNT_GRPQUOTA)
+   seq_puts(seq, ,grpquota);
+
+   return 0;
+}
 
 #define QTYPE2NAME(t) ((t)==USRQUOTA?user:group)
 #define QTYPE2MOPT(on, t) ((t)==USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
@@ -572,6 +604,7 @@
 #ifdef CONFIG_QUOTA
.quota_read = ext3_quota_read,
.quota_write= ext3_quota_write,
+   .show_options   = ext3_show_options,
 #endif
 };
 
@@ -590,7 +623,8 @@
Opt_abort, Opt_data_journal, Opt_data_ordered, Opt_data_writeback,
Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_quota, Opt_noquota,
-   Opt_ignore, Opt_barrier, Opt_err, Opt_resize,
+   Opt_ignore, Opt_barrier, Opt_err, Opt_resize, Opt_usrquota,
+   Opt_grpquota
 };
 
 static match_table_t tokens = {
@@ -634,10 +668,10 @@
{Opt_grpjquota, grpjquota=%s},
{Opt_jqfmt_vfsold, jqfmt=vfsold},
{Opt_jqfmt_vfsv0, jqfmt=vfsv0},
-   {Opt_quota, grpquota},
+   {Opt_grpquota, grpquota},
{Opt_noquota, noquota},
{Opt_quota, quota},
-   {Opt_quota, usrquota},
+   {Opt_usrquota, usrquota},
{Opt_barrier, barrier=%u},
{Opt_err, NULL},
{Opt_resize, resize},
@@ -902,8 +936,18 @@
case Opt_jqfmt_vfsv0:
sbi-s_jquota_fmt = QFMT_VFS_V0;
break;
+   case Opt_usrquota:
+   set_opt(sbi-s_mount_opt, QUOTA);
+   set_opt(sbi-s_mount_opt, USRQUOTA);
+   break;
+   case

Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon

Nathan Scott wrote:


On Thu, Jul 28, 2005 at 05:03:02PM -0700, Mark Bellon wrote:
 

The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
   



The XFS component is incorrect, we're already doing this elsewhere
(over in fs/xfs/quota/xfs_qm_bhv.c), so please drop this last part
from your patch...
 


No problem! New patch coming up.

mark

 


diff -Naur linux-2.6.13-rc3-git9-orig/fs/xfs/xfs_vfsops.c 
linux-2.6.13-rc3-git9/fs/xfs/xfs_vfsops.c
...
+   { XFS_UQUOTA_ACTIVE,, usrquota },
+   { XFS_GQUOTA_ACTIVE,, grpquota },
   



thanks.

 



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon
If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a good thing in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.


The attached patchs modify the EXT[2|3] and JFS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).


The original XFS change has been dropped as per Nathan Scott 
[EMAIL PROTECTED]. The functionality had already been added.


The EXT3 has added error checking and has two minor changes:
  The quota option is considered part of the older style quotas
  Journalled quotas and older style quotas are mutually exclusive.
  - both discussable topics

mark

Signed-off-by: Mark Bellon [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon

Andrew Morton wrote:


Mark Bellon [EMAIL PROTECTED] wrote:
 

If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota tools 
look there for the quota strings for their operation. If, however, 
/etc/mtab is a symlink to /proc/mounts (a good thing in some 
environments)  the tools don't write anything - they assume the kernel 
will take care of things.


While the quota options are sent down to the kernel via the mount system 
call and the file system codes handle them properly unfortunately there 
is no code to echo the quota strings into /proc/mounts and the quota 
tools fail in the symlink case.
   



hm.  Perhaps others with operational experience in that area can comment.
 


OK.

 

The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
necessary hooks. The show_options function of each file system in these 
patches currently deal with only those things that seemed related to 
quotas; especially in the EXT3 case more can be done (later?).
   



It seems sad to do it in each filesystem.  Is there no way in which we can
do this for all filesystems, in a single place in the VFS?
 

Each file system must be able to echo it's own FS specific options, 
hence the show_options hook (XFS is a good example). EXT3 has it's own 
form of journalled quota file options, hence the need for the specific hook.


The older style (e.g. usrquota, grpquota, quota) could be done 
generically but a FS might want any number of quota options. The few 
lines of code in each file system didn't seem so bad especially if the 
show_function start echoing more.


mark
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] disk quotas fail when /etc/mtab is symlinked to /proc/mounts

2005-07-28 Thread Mark Bellon

Mark Bellon wrote:


Andrew Morton wrote:


Mark Bellon [EMAIL PROTECTED] wrote:
 

If /etc/mtab is a regular file all of the mount options (of a file 
system) are written to /etc/mtab by the mount command. The quota 
tools look there for the quota strings for their operation. If, 
however, /etc/mtab is a symlink to /proc/mounts (a good thing in 
some environments)  the tools don't write anything - they assume the 
kernel will take care of things.


While the quota options are sent down to the kernel via the mount 
system call and the file system codes handle them properly 
unfortunately there is no code to echo the quota strings into 
/proc/mounts and the quota tools fail in the symlink case.
  



hm.  Perhaps others with operational experience in that area can 
comment.
 


OK.

 

The attached patchs modify the EXT[2|3] and [X|J]FS codes to add the 
necessary hooks. The show_options function of each file system in 
these patches currently deal with only those things that seemed 
related to quotas; especially in the EXT3 case more can be done 
(later?).
  



It seems sad to do it in each filesystem.  Is there no way in which 
we can

do this for all filesystems, in a single place in the VFS?
 

Each file system must be able to echo it's own FS specific options, 
hence the show_options hook (XFS is a good example). EXT3 has it's own 
form of journalled quota file options, hence the need for the specific 
hook.


The older style (e.g. usrquota, grpquota, quota) could be done 
generically but a FS might want any number of quota options. The few 
lines of code in each file system didn't seem so bad especially if the 
show_function start echoing more.


Followup comment/through...

If we want /proc/mounts to really replace /etc/mtab in the general case, 
always using it as a symlink, the file system codes will all need the 
show_options hook - they will need to echo all of their private options 
duplicating, as closely as possible, what would have been written to the 
/etc/mtab regular file.


mark

mark

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/