Re: Bugfix in dquot_transfer()
> How about the following: > * dquot_{alloc,free}_block() _never_ blocks. > * we have 3 inlined helper functions - alloc_block(), free_block() and > change_xid(). They get exclusion (BKL, spinlock, whatever) and update both > quota and i_blocks. > > Consequences: > * quota for filesystems without ->i_blocks is history. It doesn't > work anyway - quota for minixfs is so easy to screw that it's not even > funny. > * we can't print any messages from the dquot_{alloc,free}_block(). > Let the helper thread do it - we would just add a request to queue and let > it pick the thing. > BTW, use of global buffer for creating the messages is > extremely bad idea - TTY output can block and you've got no protection > around print_warning(). Seems I'll have to read that part of kernel... I didn't expect writing to console can block (especially as this code was there for ages). > * we have to be careful in {read,write}_dquot(). Frankly, I would > prefer to use the pagecache for quota file rather than messing with > ->read() and ->write(). Then we can get an exclusion between updating > dquot and copying it to/from page without blocking. Incidentially, we kill > the set_fs() crap that way. I think it's reasonable (I wanted to move quota to pagecache anyway...). > BTW, changing ->dq_op looks nasty - AFAICS you can easily oops on > access to the methods, since the thing may become NULL between the check > and dereferencing. True.. I'll think off how to fix it. Honza - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
How about the following: * dquot_{alloc,free}_block() _never_ blocks. * we have 3 inlined helper functions - alloc_block(), free_block() and change_xid(). They get exclusion (BKL, spinlock, whatever) and update both quota and i_blocks. Consequences: * quota for filesystems without -i_blocks is history. It doesn't work anyway - quota for minixfs is so easy to screw that it's not even funny. * we can't print any messages from the dquot_{alloc,free}_block(). Let the helper thread do it - we would just add a request to queue and let it pick the thing. BTW, use of global buffer for creating the messages is extremely bad idea - TTY output can block and you've got no protection around print_warning(). Seems I'll have to read that part of kernel... I didn't expect writing to console can block (especially as this code was there for ages). * we have to be careful in {read,write}_dquot(). Frankly, I would prefer to use the pagecache for quota file rather than messing with -read() and -write(). Then we can get an exclusion between updating dquot and copying it to/from page without blocking. Incidentially, we kill the set_fs() crap that way. I think it's reasonable (I wanted to move quota to pagecache anyway...). BTW, changing -dq_op looks nasty - AFAICS you can easily oops on access to the methods, since the thing may become NULL between the check and dereferencing. True.. I'll think off how to fix it. Honza - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
On Thu, 14 Sep 2000, Jan Kara wrote: > > > of blocks after notify_change() once more all the quota will be counted > > > properly. The only problem is that quota can be exceeded this way. We have to >check > > > > Nope. You've just shifted the race window (and inverted the > > effect) - think what happens if you've got new allocations after the UID > > change but before the return from notify_change(). > Quota will be still acounted to old user - it's independent of i_uid contents - > so when we switch pointers to dquot structures and move allocated space without > blocking there shouldn't be problem... You still have a window when i_blocks is out of sync with dquot. If your nice, non-blocking, etc. switch happens in that window - we are toast, right? It means that we can't change i_blocks before dquot - currently dquot_{alloc,free}_block() can block. Now, consider the following: dquot_transfer() switched dqput() BLOCKED dquot_alloc_block() dqduplicate() dquot_lock() BLOCKED dquot_unlock() dqput() finished updated the thing *** WINDOW OPENED dquot_unlock() dqput() BLOCKED another dquot_transfer() switched finished finished updated i_blocks *** WINDOW CLOSED The second switch happens in the window. QED. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
> > of blocks after notify_change() once more all the quota will be counted > > properly. The only problem is that quota can be exceeded this way. We have to check > > Nope. You've just shifted the race window (and inverted the > effect) - think what happens if you've got new allocations after the UID > change but before the return from notify_change(). Quota will be still acounted to old user - it's independent of i_uid contents - so when we switch pointers to dquot structures and move allocated space without blocking there shouldn't be problem... > > exceeding before notify_change() because later there is no way to undo what > > notify_change() did. > > > Currently I'm thinking about change which would make sence to me (at least at > > the first sight): notify_change() will call dquot_transfer() (currently > > dquot_transfer() calls notify_change()). > > Umm... I don't think that it will help anything. > > How about the following: > * dquot_{alloc,free}_block() _never_ blocks. > * we have 3 inlined helper functions - alloc_block(), free_block() and > change_xid(). They get exclusion (BKL, spinlock, whatever) and update both > quota and i_blocks. > > Consequences: > * quota for filesystems without ->i_blocks is history. It doesn't > work anyway - quota for minixfs is so easy to screw that it's not even > funny. > * we can't print any messages from the dquot_{alloc,free}_block(). > Let the helper thread do it - we would just add a request to queue and let > it pick the thing. BTW, use of global buffer for creating the messages is > extremely bad idea - TTY output can block and you've got no protection > around print_warning(). > * we have to be careful in {read,write}_dquot(). Frankly, I would > prefer to use the pagecache for quota file rather than messing with > ->read() and ->write(). Then we can get an exclusion between updating > dquot and copying it to/from page without blocking. Incidentially, we kill > the set_fs() crap that way. > > BTW, changing ->dq_op looks nasty - AFAICS you can easily oops on > access to the methods, since the thing may become NULL between the check > and dereferencing. > > Comments? I'll think about it... Now I have to go... Honza - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
On Thu, 14 Sep 2000, Jan Kara wrote: > Yes I agree that if notify_change() blocks we still can account imprecisely. > I think I didn't understand your proposal. The pointers to structures where > quota should be charged are already in inode. And if we count current number Sorry, it was a brainfart ;-/ > of blocks after notify_change() once more all the quota will be counted > properly. The only problem is that quota can be exceeded this way. We have to check Nope. You've just shifted the race window (and inverted the effect) - think what happens if you've got new allocations after the UID change but before the return from notify_change(). > exceeding before notify_change() because later there is no way to undo what > notify_change() did. > Currently I'm thinking about change which would make sence to me (at least at > the first sight): notify_change() will call dquot_transfer() (currently > dquot_transfer() calls notify_change()). Umm... I don't think that it will help anything. How about the following: * dquot_{alloc,free}_block() _never_ blocks. * we have 3 inlined helper functions - alloc_block(), free_block() and change_xid(). They get exclusion (BKL, spinlock, whatever) and update both quota and i_blocks. Consequences: * quota for filesystems without ->i_blocks is history. It doesn't work anyway - quota for minixfs is so easy to screw that it's not even funny. * we can't print any messages from the dquot_{alloc,free}_block(). Let the helper thread do it - we would just add a request to queue and let it pick the thing. BTW, use of global buffer for creating the messages is extremely bad idea - TTY output can block and you've got no protection around print_warning(). * we have to be careful in {read,write}_dquot(). Frankly, I would prefer to use the pagecache for quota file rather than messing with ->read() and ->write(). Then we can get an exclusion between updating dquot and copying it to/from page without blocking. Incidentially, we kill the set_fs() crap that way. BTW, changing ->dq_op looks nasty - AFAICS you can easily oops on access to the methods, since the thing may become NULL between the check and dereferencing. Comments? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
> On Mon, 4 Sep 2000, Jan Kara wrote: > > > Hello. > > > > Following patch fixes bug in dquot_transfer() - while we were sleeping > > i_blocks might change and so number quota was miscounted. Patches are > > against 2.2.16 and 2.4.0-test6 (but should apply well on newer versions). > > Umm... It still has a hole: pageout during notify_change() can > allocate new blocks. Window is narrow, but it's there. Fsck knows what > to do with that. Could we just keep the pointer to current quota elements > (i.e. places where we should charge) in the inode? Then dquot_transfer() > would transfer the charge after notify_change() and switch these pointers > at the same time. Under BKL - nothing can block in that place. How about > such strategy? Then DQUOT_INIT would have set these pointers, but it's not > too horrible, IMO. I'm sorry I didn't replied to your mail but I had a bug in my .procmailrc and mails got sorted to bad folders... Yes I agree that if notify_change() blocks we still can account imprecisely. I think I didn't understand your proposal. The pointers to structures where quota should be charged are already in inode. And if we count current number of blocks after notify_change() once more all the quota will be counted properly. The only problem is that quota can be exceeded this way. We have to check exceeding before notify_change() because later there is no way to undo what notify_change() did. Currently I'm thinking about change which would make sence to me (at least at the first sight): notify_change() will call dquot_transfer() (currently dquot_transfer() calls notify_change()). Honza - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
On Mon, 4 Sep 2000, Jan Kara wrote: Hello. Following patch fixes bug in dquot_transfer() - while we were sleeping i_blocks might change and so number quota was miscounted. Patches are against 2.2.16 and 2.4.0-test6 (but should apply well on newer versions). Umm... It still has a hole: pageout during notify_change() can allocate new blocks. Window is narrow, but it's there. Fsck knows what to do with that. Could we just keep the pointer to current quota elements (i.e. places where we should charge) in the inode? Then dquot_transfer() would transfer the charge after notify_change() and switch these pointers at the same time. Under BKL - nothing can block in that place. How about such strategy? Then DQUOT_INIT would have set these pointers, but it's not too horrible, IMO. I'm sorry I didn't replied to your mail but I had a bug in my .procmailrc and mails got sorted to bad folders... Yes I agree that if notify_change() blocks we still can account imprecisely. I think I didn't understand your proposal. The pointers to structures where quota should be charged are already in inode. And if we count current number of blocks after notify_change() once more all the quota will be counted properly. The only problem is that quota can be exceeded this way. We have to check exceeding before notify_change() because later there is no way to undo what notify_change() did. Currently I'm thinking about change which would make sence to me (at least at the first sight): notify_change() will call dquot_transfer() (currently dquot_transfer() calls notify_change()). Honza - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
of blocks after notify_change() once more all the quota will be counted properly. The only problem is that quota can be exceeded this way. We have to check Nope. You've just shifted the race window (and inverted the effect) - think what happens if you've got new allocations after the UID change but before the return from notify_change(). Quota will be still acounted to old user - it's independent of i_uid contents - so when we switch pointers to dquot structures and move allocated space without blocking there shouldn't be problem... exceeding before notify_change() because later there is no way to undo what notify_change() did. Currently I'm thinking about change which would make sence to me (at least at the first sight): notify_change() will call dquot_transfer() (currently dquot_transfer() calls notify_change()). Umm... I don't think that it will help anything. How about the following: * dquot_{alloc,free}_block() _never_ blocks. * we have 3 inlined helper functions - alloc_block(), free_block() and change_xid(). They get exclusion (BKL, spinlock, whatever) and update both quota and i_blocks. Consequences: * quota for filesystems without -i_blocks is history. It doesn't work anyway - quota for minixfs is so easy to screw that it's not even funny. * we can't print any messages from the dquot_{alloc,free}_block(). Let the helper thread do it - we would just add a request to queue and let it pick the thing. BTW, use of global buffer for creating the messages is extremely bad idea - TTY output can block and you've got no protection around print_warning(). * we have to be careful in {read,write}_dquot(). Frankly, I would prefer to use the pagecache for quota file rather than messing with -read() and -write(). Then we can get an exclusion between updating dquot and copying it to/from page without blocking. Incidentially, we kill the set_fs() crap that way. BTW, changing -dq_op looks nasty - AFAICS you can easily oops on access to the methods, since the thing may become NULL between the check and dereferencing. Comments? I'll think about it... Now I have to go... Honza - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
Jan Kara ([EMAIL PROTECTED]) wrote on 4 September 2000 08:25: > Following patch fixes bug in dquot_transfer() - while we were sleeping >i_blocks might change and so number quota was miscounted. Patches are >against 2.2.16 and 2.4.0-test6 (but should apply well on newer versions). I seem to have problems with this patch. Our main/central/everything server crashed 3 times apparently because of it (runs fine without it). A simple dpkg-scanpackages or named -u nobody gets a segfault, and the kernel oopses: kernel: Unable to handle kernel NULL pointer dereference at virtual address 002c kernel: current->tss.cr3 = 32bca000, %cr3 = 32bca000 kernel: *pde = kernel: Oops: kernel: CPU:0 I was keen to apply it because I'm having occasional problems with quotas being above the real usage, mostly because of users running jobs in other machines and accessing the disk via NFS. This is with 2.2.17 patched with the below plus raid, ide, the latest knfs, openwall and Andrea's VM-global-2.2.18pre2-6.bz2. >--- linux/fs/dquot.c Tue Aug 29 11:01:00 2000 >+++ linux/fs/dquot.c Sat Sep 2 23:01:04 2000 >@@ -1260,14 +1260,6 @@ > if (!inode) > return -ENOENT; > /* >- * Find out if this filesystem uses i_blocks. >- */ >- if (inode->i_blksize == 0) >- blocks = isize_to_blocks(inode->i_size, BLOCK_SIZE); >- else >- blocks = (inode->i_blocks / 2); >- >- /* > * Build the transfer_from and transfer_to lists and check quotas to see > * if operation is permitted. > */ >@@ -1326,13 +1318,25 @@ > * dqget() could block and so the first structure might got > * invalidated or locked... > */ >- if (!transfer_to[cnt]->dq_mnt || !transfer_from[cnt]->dq_mnt || >- check_idq(transfer_to[cnt], cnt, 1, initiator, tty) == NO_QUOTA || >- check_bdq(transfer_to[cnt], cnt, blocks, initiator, tty, 0) == >NO_QUOTA) { >+ if (!transfer_to[cnt]->dq_mnt || !transfer_from[cnt]->dq_mnt) { > cnt++; > goto put_all; > } > } >+ >+ /* >+ * Find out if this filesystem uses i_blocks. >+ */ >+ if (inode->i_blksize == 0) >+ blocks = isize_to_blocks(inode->i_size, BLOCK_SIZE); >+ else >+ blocks = (inode->i_blocks / 2); >+ for (cnt = 0; cnt < MAXQUOTAS; cnt++) >+ if (check_idq(transfer_to[cnt], cnt, 1, initiator, tty) == NO_QUOTA || >+ check_bdq(transfer_to[cnt], cnt, blocks, initiator, tty, 0) == >NO_QUOTA) { >+ cnt = MAXQUOTAS; >+ goto put_all; >+ } > > if ((error = notify_change(dentry, iattr))) > goto put_all; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Bugfix in dquot_transfer()
Jan Kara ([EMAIL PROTECTED]) wrote on 4 September 2000 08:25: Following patch fixes bug in dquot_transfer() - while we were sleeping i_blocks might change and so number quota was miscounted. Patches are against 2.2.16 and 2.4.0-test6 (but should apply well on newer versions). I seem to have problems with this patch. Our main/central/everything server crashed 3 times apparently because of it (runs fine without it). A simple dpkg-scanpackages or named -u nobody gets a segfault, and the kernel oopses: kernel: Unable to handle kernel NULL pointer dereference at virtual address 002c kernel: current-tss.cr3 = 32bca000, %cr3 = 32bca000 kernel: *pde = kernel: Oops: kernel: CPU:0 I was keen to apply it because I'm having occasional problems with quotas being above the real usage, mostly because of users running jobs in other machines and accessing the disk via NFS. This is with 2.2.17 patched with the below plus raid, ide, the latest knfs, openwall and Andrea's VM-global-2.2.18pre2-6.bz2. --- linux/fs/dquot.c Tue Aug 29 11:01:00 2000 +++ linux/fs/dquot.c Sat Sep 2 23:01:04 2000 @@ -1260,14 +1260,6 @@ if (!inode) return -ENOENT; /* - * Find out if this filesystem uses i_blocks. - */ - if (inode-i_blksize == 0) - blocks = isize_to_blocks(inode-i_size, BLOCK_SIZE); - else - blocks = (inode-i_blocks / 2); - - /* * Build the transfer_from and transfer_to lists and check quotas to see * if operation is permitted. */ @@ -1326,13 +1318,25 @@ * dqget() could block and so the first structure might got * invalidated or locked... */ - if (!transfer_to[cnt]-dq_mnt || !transfer_from[cnt]-dq_mnt || - check_idq(transfer_to[cnt], cnt, 1, initiator, tty) == NO_QUOTA || - check_bdq(transfer_to[cnt], cnt, blocks, initiator, tty, 0) == NO_QUOTA) { + if (!transfer_to[cnt]-dq_mnt || !transfer_from[cnt]-dq_mnt) { cnt++; goto put_all; } } + + /* + * Find out if this filesystem uses i_blocks. + */ + if (inode-i_blksize == 0) + blocks = isize_to_blocks(inode-i_size, BLOCK_SIZE); + else + blocks = (inode-i_blocks / 2); + for (cnt = 0; cnt MAXQUOTAS; cnt++) + if (check_idq(transfer_to[cnt], cnt, 1, initiator, tty) == NO_QUOTA || + check_bdq(transfer_to[cnt], cnt, blocks, initiator, tty, 0) == NO_QUOTA) { + cnt = MAXQUOTAS; + goto put_all; + } if ((error = notify_change(dentry, iattr))) goto put_all; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/