Re: [PATCH net] cosa: Add missing kfree in error path of cosa_write
Wang Hai wrote: : If memory allocation for 'kbuf' succeed, cosa_write() doesn't have a : corresponding kfree() in exception handling. Thus add kfree() for this : function implementation. Acked-By: Jan "Yenya" Kasprzak Looks correct, thanks. That said, COSA is an ancient ISA bus device designed in late 1990s, I doubt anybody is still using it. I still do have one or two of these cards myself, but no computer with ISA bus to use them. -Yenya : : Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") : Reported-by: Hulk Robot : Signed-off-by: Wang Hai : --- : drivers/net/wan/cosa.c | 1 + : 1 file changed, 1 insertion(+) : : diff --git a/drivers/net/wan/cosa.c b/drivers/net/wan/cosa.c : index f8aed0696d77..2369ca250cd6 100644 : --- a/drivers/net/wan/cosa.c : +++ b/drivers/net/wan/cosa.c : @@ -889,6 +889,7 @@ static ssize_t cosa_write(struct file *file, : chan->tx_status = 1; : spin_unlock_irqrestore(>lock, flags); : up(>wsem); : + kfree(kbuf); : return -ERESTARTSYS; : } : } : -- : 2.17.1 -- | Jan "Yenya" Kasprzak | | http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 | We all agree on the necessity of compromise. We just can't agree on when it's necessary to compromise. --Larry Wall
Re: wan-cosa: Use memdup_user() rather than duplicating its implementation
SF Markus Elfring wrote: : > What about the GFP_DMA attribute, which your patch deletes? : > The buffer in question has to be ISA DMA-able. : : Thanks for your constructive feedback. : : Would you be interested in using a variant of the function "memdup_…" : with which the corresponding memory allocation option can be preserved? I am not sure that extending an in-kernel API just for one legacy driver is what we want. As I said, I would prefer the driver unchanged, if possible. Maybe it is the time for gradually phasing out ISA DMA support and all the legacy drivers which use it? Sincerely, -Yenya -- | Jan "Yenya" Kasprzak | | http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 | Like most things in Windows, on the surface it looks great. -- Jeremy Allison, A Tale of Two Standards
Re: wan-cosa: Use memdup_user() rather than duplicating its implementation
SF Markus Elfring wrote: : > What about the GFP_DMA attribute, which your patch deletes? : > The buffer in question has to be ISA DMA-able. : : Thanks for your constructive feedback. : : Would you be interested in using a variant of the function "memdup_…" : with which the corresponding memory allocation option can be preserved? I am not sure that extending an in-kernel API just for one legacy driver is what we want. As I said, I would prefer the driver unchanged, if possible. Maybe it is the time for gradually phasing out ISA DMA support and all the legacy drivers which use it? Sincerely, -Yenya -- | Jan "Yenya" Kasprzak | | http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 | Like most things in Windows, on the surface it looks great. -- Jeremy Allison, A Tale of Two Standards
Re: [PATCH] wan-cosa: Use memdup_user() rather than duplicating its implementation
Hello, SF Markus Elfring wrote: : From: Markus Elfring: Date: Sat, 20 Aug 2016 10:10:12 +0200 : : Reuse existing functionality from memdup_user() instead of keeping : duplicate source code. : : This issue was detected by using the Coccinelle software. What about the GFP_DMA attribute, which your patch deletes? The buffer in question has to be ISA DMA-able. Anyway, COSA is an ancient ISA device, which is out of production for at least 10 years. I don't have a box with ISA bus anymore, so I can't even test the modifications of the wan/cosa.c (I do have a COSA board itself, though :-). I suggest keeping the driver intact if possible, in case somebody still uses it. Sincerely, -Yenya : Signed-off-by: Markus Elfring : --- : drivers/net/wan/cosa.c | 12 +++- : 1 file changed, 3 insertions(+), 9 deletions(-) : : diff --git a/drivers/net/wan/cosa.c b/drivers/net/wan/cosa.c : index b87fe0a..02f5809 100644 : --- a/drivers/net/wan/cosa.c : +++ b/drivers/net/wan/cosa.c : @@ -875,16 +875,10 @@ static ssize_t cosa_write(struct file *file, : if (count > COSA_MTU) : count = COSA_MTU; : : - /* Allocate the buffer */ : - kbuf = kmalloc(count, GFP_KERNEL|GFP_DMA); : - if (kbuf == NULL) { : + kbuf = memdup_user(buf, count); : + if (IS_ERR(kbuf)) { : up(>wsem); : - return -ENOMEM; : - } : - if (copy_from_user(kbuf, buf, count)) { : - up(>wsem); : - kfree(kbuf); : - return -EFAULT; : + return PTR_ERR(kbuf); : } : chan->tx_status=0; : cosa_start_tx(chan, kbuf, count); : -- : 2.9.3 -- | Jan "Yenya" Kasprzak | | http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 | Like most things in Windows, on the surface it looks great. -- Jeremy Allison, A Tale of Two Standards
Re: [PATCH] wan-cosa: Use memdup_user() rather than duplicating its implementation
Hello, SF Markus Elfring wrote: : From: Markus Elfring : Date: Sat, 20 Aug 2016 10:10:12 +0200 : : Reuse existing functionality from memdup_user() instead of keeping : duplicate source code. : : This issue was detected by using the Coccinelle software. What about the GFP_DMA attribute, which your patch deletes? The buffer in question has to be ISA DMA-able. Anyway, COSA is an ancient ISA device, which is out of production for at least 10 years. I don't have a box with ISA bus anymore, so I can't even test the modifications of the wan/cosa.c (I do have a COSA board itself, though :-). I suggest keeping the driver intact if possible, in case somebody still uses it. Sincerely, -Yenya : Signed-off-by: Markus Elfring : --- : drivers/net/wan/cosa.c | 12 +++- : 1 file changed, 3 insertions(+), 9 deletions(-) : : diff --git a/drivers/net/wan/cosa.c b/drivers/net/wan/cosa.c : index b87fe0a..02f5809 100644 : --- a/drivers/net/wan/cosa.c : +++ b/drivers/net/wan/cosa.c : @@ -875,16 +875,10 @@ static ssize_t cosa_write(struct file *file, : if (count > COSA_MTU) : count = COSA_MTU; : : - /* Allocate the buffer */ : - kbuf = kmalloc(count, GFP_KERNEL|GFP_DMA); : - if (kbuf == NULL) { : + kbuf = memdup_user(buf, count); : + if (IS_ERR(kbuf)) { : up(>wsem); : - return -ENOMEM; : - } : - if (copy_from_user(kbuf, buf, count)) { : - up(>wsem); : - kfree(kbuf); : - return -EFAULT; : + return PTR_ERR(kbuf); : } : chan->tx_status=0; : cosa_start_tx(chan, kbuf, count); : -- : 2.9.3 -- | Jan "Yenya" Kasprzak | | http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 | Like most things in Windows, on the surface it looks great. -- Jeremy Allison, A Tale of Two Standards
Re: [PATCH 2/2] cosa.c : Array index 'i' is used before limits check.
Sergei Shtylyov wrote: : Hello. : : On 02/24/2015 10:52 PM, Ameen Ali wrote: : : >avoid out-of-bounds-read by checking count before indexing. : : >Signed-off-by: Ameen Ali : >--- : > drivers/net/wan/cosa.c | 2 +- : > 1 file changed, 1 insertion(+), 1 deletion(-) : : >diff --git a/drivers/net/wan/cosa.c b/drivers/net/wan/cosa.c : >index 83c39e2..5252e21 100644 : >--- a/drivers/net/wan/cosa.c : >+++ b/drivers/net/wan/cosa.c : >@@ -376,7 +376,7 @@ static int __init cosa_init(void) : > } : > for (i=0; i cosa_cards[i].num = -1; : >-for (i=0; io[i] != 0 && i < MAX_CARDS; i++) : >+for (i=0; (i < MAX_CARDS) && (io[i] != 0) ; i++) : :Parens you've added aren't necessary. Yes, but I am OK with both variants. :I suggest to add spaces after and before = in the first expression. In that case, also the upper for() cycle should be modified to match this. I don't think this is necessary. -Yenya -- | Jan "Yenya" Kasprzak| | New GPG 4096R/A45477D5 -- see http://www.fi.muni.cz/~kas/pgp-rollover.txt | | http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ | ||| "New and improved" is only really improved if it also takes backwards ||| ||| compatibility into account, rather than saying "now everybody must do ||| ||| things the new and improved - and different - way" --Linus Torvalds ||| -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] cosa.c : Array index 'i' is used before limits check.
Sergei Shtylyov wrote: : Hello. : : On 02/24/2015 10:52 PM, Ameen Ali wrote: : : avoid out-of-bounds-read by checking count before indexing. : : Signed-off-by: Ameen Ali ameenali...@gmail.com : --- : drivers/net/wan/cosa.c | 2 +- : 1 file changed, 1 insertion(+), 1 deletion(-) : : diff --git a/drivers/net/wan/cosa.c b/drivers/net/wan/cosa.c : index 83c39e2..5252e21 100644 : --- a/drivers/net/wan/cosa.c : +++ b/drivers/net/wan/cosa.c : @@ -376,7 +376,7 @@ static int __init cosa_init(void) : } : for (i=0; iMAX_CARDS; i++) : cosa_cards[i].num = -1; : -for (i=0; io[i] != 0 i MAX_CARDS; i++) : +for (i=0; (i MAX_CARDS) (io[i] != 0) ; i++) : :Parens you've added aren't necessary. Yes, but I am OK with both variants. :I suggest to add spaces after and before = in the first expression. In that case, also the upper for() cycle should be modified to match this. I don't think this is necessary. -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | New GPG 4096R/A45477D5 -- see http://www.fi.muni.cz/~kas/pgp-rollover.txt | | http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ | ||| New and improved is only really improved if it also takes backwards ||| ||| compatibility into account, rather than saying now everybody must do ||| ||| things the new and improved - and different - way --Linus Torvalds ||| -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH: mvsas: add support for Supermicro AOC-SAS2LP-MV8
Christoph Hellwig wrote: : On Fri, May 16, 2014 at 02:06:42PM +0200, Jan Kasprzak wrote: : > any news with this patch? Will it be acked by you and submitted upstream? : > Thanks! : : Give me an Acked-by and I'll pull it in. Acked-By: Jan "Yenya" Kasprzak Not sure whether I should ack my own patch, though. But you may apply it to the original one, which is identical to what I did. http://marc.info/?l=linux-scsi=139277202005675=2 Thanks! -Y. -- | Jan "Yenya" Kasprzak | | New GPG 4096R/A45477D5 - see http://www.fi.muni.cz/~kas/pgp-rollover.txt | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | There's clearly a balance between "octopus merges are fine" and "Christ, that's not an octopus, that's a Cthulhu merge". --Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH: mvsas: add support for Supermicro AOC-SAS2LP-MV8
Christoph Hellwig wrote: : On Fri, May 16, 2014 at 02:06:42PM +0200, Jan Kasprzak wrote: : any news with this patch? Will it be acked by you and submitted upstream? : Thanks! : : Give me an Acked-by and I'll pull it in. Acked-By: Jan Yenya Kasprzak k...@fi.muni.cz Not sure whether I should ack my own patch, though. But you may apply it to the original one, which is identical to what I did. http://marc.info/?l=linux-scsim=139277202005675w=2 Thanks! -Y. -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | New GPG 4096R/A45477D5 - see http://www.fi.muni.cz/~kas/pgp-rollover.txt | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | There's clearly a balance between octopus merges are fine and Christ, that's not an octopus, that's a Cthulhu merge. --Linus Torvalds -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH: mvsas: add support for Supermicro AOC-SAS2LP-MV8
Hello, Xiangliang Yu wrote: : > Ben Hutchings already submitted a patch for this twice, which I cc'd you : > on: : > : > http://marc.info/?t=13927720393 : > : > will you ack it? : I can't find this mail in my mail box. any news with this patch? Will it be acked by you and submitted upstream? Thanks! -Jan Kasprzak -- | Jan "Yenya" Kasprzak | | New GPG 4096R/A45477D5 - see http://www.fi.muni.cz/~kas/pgp-rollover.txt | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | There's clearly a balance between "octopus merges are fine" and "Christ, that's not an octopus, that's a Cthulhu merge". --Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH: mvsas: add support for Supermicro AOC-SAS2LP-MV8
Hello, Xiangliang Yu wrote: : Ben Hutchings already submitted a patch for this twice, which I cc'd you : on: : : http://marc.info/?t=13927720393 : : will you ack it? : I can't find this mail in my mail box. any news with this patch? Will it be acked by you and submitted upstream? Thanks! -Jan Kasprzak -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | New GPG 4096R/A45477D5 - see http://www.fi.muni.cz/~kas/pgp-rollover.txt | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | There's clearly a balance between octopus merges are fine and Christ, that's not an octopus, that's a Cthulhu merge. --Linus Torvalds -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
PATCH: mvsas: add support for Supermicro AOC-SAS2LP-MV8
Add support for the AOC-SAS2LP-MV8 SAS-2 controller from SuperMicro. This controller has subdevice id 0x9485 instead of 0x9480, and apparently this simple patch is the only thing needed to make it work. # lspci -vn [...] 03:00.0 0104: 1b4b:9485 (rev 03) Subsystem: 1b4b:9485 Flags: bus master, fast devsel, latency 0, IRQ 24 Memory at feba (64-bit, non-prefetchable) [size=128K] Memory at febc (64-bit, non-prefetchable) [size=256K] Expansion ROM at feb9 [disabled] [size=64K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+ Capabilities: [70] Express Endpoint, MSI 00 Capabilities: [100] Advanced Error Reporting Capabilities: [140] Virtual Channel Kernel driver in use: mvsas Kernel modules: mvsas Signed-off-by: Jan Kasprzak diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c index 5ff978b..eacee48 100644 --- a/drivers/scsi/mvsas/mv_init.c +++ b/drivers/scsi/mvsas/mv_init.c @@ -728,6 +728,15 @@ static struct pci_device_id mvs_pci_table[] = { .class_mask = 0, .driver_data= chip_9485, }, + { + .vendor = PCI_VENDOR_ID_MARVELL_EXT, + .device = 0x9485, + .subvendor = PCI_ANY_ID, + .subdevice = 0x9485, + .class = 0, + .class_mask = 0, + .driver_data= chip_9485, + }, { PCI_VDEVICE(OCZ, 0x1021), chip_9485}, /* OCZ RevoDrive3 */ { PCI_VDEVICE(OCZ, 0x1022), chip_9485}, /* OCZ RevoDrive3/zDriveR4 (exact model unknown) */ { PCI_VDEVICE(OCZ, 0x1040), chip_9485}, /* OCZ RevoDrive3/zDriveR4 (exact model unknown) */ -- | Jan "Yenya" Kasprzak | | New GPG 4096R/A45477D5 - see http://www.fi.muni.cz/~kas/pgp-rollover.txt | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | There's clearly a balance between "octopus merges are fine" and "Christ, that's not an octopus, that's a Cthulhu merge". --Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
PATCH: mvsas: add support for Supermicro AOC-SAS2LP-MV8
Add support for the AOC-SAS2LP-MV8 SAS-2 controller from SuperMicro. This controller has subdevice id 0x9485 instead of 0x9480, and apparently this simple patch is the only thing needed to make it work. # lspci -vn [...] 03:00.0 0104: 1b4b:9485 (rev 03) Subsystem: 1b4b:9485 Flags: bus master, fast devsel, latency 0, IRQ 24 Memory at feba (64-bit, non-prefetchable) [size=128K] Memory at febc (64-bit, non-prefetchable) [size=256K] Expansion ROM at feb9 [disabled] [size=64K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+ Capabilities: [70] Express Endpoint, MSI 00 Capabilities: [100] Advanced Error Reporting Capabilities: [140] Virtual Channel Kernel driver in use: mvsas Kernel modules: mvsas Signed-off-by: Jan Kasprzak k...@fi.muni.cz diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c index 5ff978b..eacee48 100644 --- a/drivers/scsi/mvsas/mv_init.c +++ b/drivers/scsi/mvsas/mv_init.c @@ -728,6 +728,15 @@ static struct pci_device_id mvs_pci_table[] = { .class_mask = 0, .driver_data= chip_9485, }, + { + .vendor = PCI_VENDOR_ID_MARVELL_EXT, + .device = 0x9485, + .subvendor = PCI_ANY_ID, + .subdevice = 0x9485, + .class = 0, + .class_mask = 0, + .driver_data= chip_9485, + }, { PCI_VDEVICE(OCZ, 0x1021), chip_9485}, /* OCZ RevoDrive3 */ { PCI_VDEVICE(OCZ, 0x1022), chip_9485}, /* OCZ RevoDrive3/zDriveR4 (exact model unknown) */ { PCI_VDEVICE(OCZ, 0x1040), chip_9485}, /* OCZ RevoDrive3/zDriveR4 (exact model unknown) */ -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | New GPG 4096R/A45477D5 - see http://www.fi.muni.cz/~kas/pgp-rollover.txt | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | There's clearly a balance between octopus merges are fine and Christ, that's not an octopus, that's a Cthulhu merge. --Linus Torvalds -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 05/14] Convert from class_device to device for cosa sync driver
Tony Jones wrote: : On Mon, Aug 20, 2007 at 03:48:11PM -0700, [EMAIL PROTECTED] wrote: : > -- : > Content-Disposition: inline; filename=wan.patch : > : > Convert from class_device to device for drivers/net/wan/cosa. This is part of : > the work to eliminate struct class_device. : : Signed-off-by: Tony Jones <[EMAIL PROTECTED]> Acked-By: Jan "Yenya" Kasprzak <[EMAIL PROTECTED]> -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | == Pruning mailbox after being ~10 days offline. == Sorry for the possibly delayed reply. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 05/14] Convert from class_device to device for cosa sync driver
Tony Jones wrote: : On Mon, Aug 20, 2007 at 03:48:11PM -0700, [EMAIL PROTECTED] wrote: : -- : Content-Disposition: inline; filename=wan.patch : : Convert from class_device to device for drivers/net/wan/cosa. This is part of : the work to eliminate struct class_device. : : Signed-off-by: Tony Jones [EMAIL PROTECTED] Acked-By: Jan Yenya Kasprzak [EMAIL PROTECTED] -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | == Pruning mailbox after being ~10 days offline. == Sorry for the possibly delayed reply. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: MOXA: mxser_new lockup
Jiri Slaby wrote: : I went through the locking again just now, but I can't see anything : wrong. Could you please enable CONFIG_PROVE_LOCKING and CONFIG_DEBUG_LOCKDEP and : retest? : : And also post stty -a -F /dev/ttyMIXX or a setup passed to : tcsetattr/ioctl(TCSETA/S) if available. What do you use for communication? I use cu(1) from the uucp package. This MOXA card is used as a serial console for 8 other linux boxes (mgetty is on the other side). I have recompiled the kernel with debugging options you have mentioned, connected to the remote serial console using "cu -l ttyMI0 -s 38400", logged to the remote system, and ran "find / -print" to generate some serial traffic. It went for a minute or so, but as soon as I have pressed enter on my keyboard (generating traffic in an opposite direction), I've got the recursive locking message (I have tried it twice, two dumps are attached). Looking at the code, maybe the >slock in mxser_interrupt() should be dropped before calling mxser_receive_chars() (and probably also befor calling other mxser_* functions too). A proof-of-concept patch attached - I am not able to reproduce the lockup with this patch. Another approach would be to use in_interrupt() all over the code to determine whether the lock should be re-aquired or not. -Yenya [ now the two "recursive locking" dumps, and the patch itself ] = [ INFO: possible recursive locking detected ] 2.6.21-rc6 #8 - cu/1793 is trying to acquire lock: (>slock){++..}, at: [] mxser_flush_chars+0x4d/0x83 [mxser_new] but task is already holding lock: (>slock){++..}, at: [] mxser_interrupt+0xc2/0x237 [mxser_new] other info that might help us debug this: 2 locks held by cu/1793: #0: (>atomic_write_lock){--..}, at: [] tty_write+0x9e/0x211 #1: (>slock){++..}, at: [] mxser_interrupt+0xc2/0x237 [mxser_new] stack backtrace: Call Trace: [] __lock_acquire+0x155/0xbe1 [] :mxser_new:mxser_flush_chars+0x4d/0x83 [] lock_acquire+0x7b/0x9f [] :mxser_new:mxser_flush_chars+0x4d/0x83 [] _spin_lock_irqsave+0x2d/0x3e [] :mxser_new:mxser_flush_chars+0x4d/0x83 [] n_tty_receive_buf+0xd26/0xddc [] __lock_acquire+0xb4c/0xbe1 [] flush_to_ldisc+0x3d/0x191 [] flush_to_ldisc+0x11a/0x191 [] flush_to_ldisc+0x131/0x191 [] :mxser_new:mxser_receive_chars+0x25f/0x26e [] :mxser_new:mxser_interrupt+0x198/0x237 [] handle_IRQ_event+0x1e/0x51 [] handle_fasteoi_irq+0x9a/0xd9 [] do_IRQ+0x6f/0xd6 [] ret_from_intr+0x0/0xf [] _spin_unlock_irqrestore+0x40/0x44 [] write_chan+0x2e0/0x2ff [] default_wake_function+0x0/0xe [] tty_write+0x177/0x211 [] write_chan+0x0/0x2ff [] vfs_write+0xce/0x157 [] sys_write+0x45/0x6e [] system_call+0x7e/0x83 BUG: spinlock lockup on CPU#1, cu/1793, 880082a8 Call Trace: [] _raw_spin_lock+0xcc/0xf3 [] _spin_lock_irqsave+0x35/0x3e [] :mxser_new:mxser_flush_chars+0x4d/0x83 [] n_tty_receive_buf+0xd26/0xddc [] __lock_acquire+0xb4c/0xbe1 [] flush_to_ldisc+0x3d/0x191 [] flush_to_ldisc+0x11a/0x191 [] flush_to_ldisc+0x131/0x191 [] :mxser_new:mxser_receive_chars+0x25f/0x26e [] :mxser_new:mxser_interrupt+0x198/0x237 [] handle_IRQ_event+0x1e/0x51 [] handle_fasteoi_irq+0x9a/0xd9 [] do_IRQ+0x6f/0xd6 [] ret_from_intr+0x0/0xf [] _spin_unlock_irqrestore+0x40/0x44 [] write_chan+0x2e0/0x2ff [] default_wake_function+0x0/0xe [] tty_write+0x177/0x211 [] write_chan+0x0/0x2ff [] vfs_write+0xce/0x157 [] sys_write+0x45/0x6e [] system_call+0x7e/0x83 --- Another one: = [ INFO: possible recursive locking detected ] ӡ���2.6.21-rc6 #8 - swapper/0 is trying to acquire lock: (>slock){++..}, at: [] mxser_stop+0x1c/0x43 [mxser_new] but task is already holding lock: (>slock){++..}, at: [] mxser_interrupt+0xc2/0x237 [mxser_new] other info that might help us debug this: 1 lock held by swapper/0: #0: (>slock){++..}, at: [] mxser_interrupt+0xc2/0x237 [mxser_new] stack backtrace: Call Trace: [] __lock_acquire+0x155/0xbe1 [] :mxser_new:mxser_stop+0x1c/0x43 [] lock_acquire+0x7b/0x9f [] :mxser_new:mxser_stop+0x1c/0x43 [] _spin_lock_irqsave+0x2d/0x3e [] :mxser_new:mxser_stop+0x1c/0x43 [] n_tty_receive_buf+0x305/0xddc [] __lock_acquire+0xb4c/0xbe1 [] flush_to_ldisc+0x3d/0x191 [] flush_to_ldisc+0x11a/0x191 [] flush_to_ldisc+0x131/0x191 [] :mxser_new:mxser_receive_chars+0x25f/0x26e [] :mxser_new:mxser_interrupt+0x198/0x237 [] handle_IRQ_event+0x1e/0x51 [] handle_fasteoi_irq+0x9a/0xd9 [] do_IRQ+0x6f/0xd6 [] default_idle+0x35/0x51 [] default_idle+0x0/0x51 [] ret_from_intr+0x0/0xf [] default_idle+0x0/0x51 [] default_idle+0x35/0x51 [] default_idle+0x37/0x51 [] default_idle+0x35/0x51 [] cpu_idle+0x57/0x76 BUG: spinlock lockup on CPU#1, swapper/0, 880082a8 Call Trace: []
Re: MOXA: mxser_new lockup
Jiri Slaby wrote: : I went through the locking again just now, but I can't see anything : wrong. Could you please enable CONFIG_PROVE_LOCKING and CONFIG_DEBUG_LOCKDEP and : retest? : : And also post stty -a -F /dev/ttyMIXX or a setup passed to : tcsetattr/ioctl(TCSETA/S) if available. What do you use for communication? I use cu(1) from the uucp package. This MOXA card is used as a serial console for 8 other linux boxes (mgetty is on the other side). I have recompiled the kernel with debugging options you have mentioned, connected to the remote serial console using cu -l ttyMI0 -s 38400, logged to the remote system, and ran find / -print to generate some serial traffic. It went for a minute or so, but as soon as I have pressed enter on my keyboard (generating traffic in an opposite direction), I've got the recursive locking message (I have tried it twice, two dumps are attached). Looking at the code, maybe the port-slock in mxser_interrupt() should be dropped before calling mxser_receive_chars() (and probably also befor calling other mxser_* functions too). A proof-of-concept patch attached - I am not able to reproduce the lockup with this patch. Another approach would be to use in_interrupt() all over the code to determine whether the lock should be re-aquired or not. -Yenya [ now the two recursive locking dumps, and the patch itself ] = [ INFO: possible recursive locking detected ] 2.6.21-rc6 #8 - cu/1793 is trying to acquire lock: (info-slock){++..}, at: [88003afd] mxser_flush_chars+0x4d/0x83 [mxser_new] but task is already holding lock: (info-slock){++..}, at: [88003ead] mxser_interrupt+0xc2/0x237 [mxser_new] other info that might help us debug this: 2 locks held by cu/1793: #0: (tty-atomic_write_lock){--..}, at: [802252dd] tty_write+0x9e/0x211 #1: (info-slock){++..}, at: [88003ead] mxser_interrupt+0xc2/0x237 [mxser_new] stack backtrace: Call Trace: IRQ [8028c6e4] __lock_acquire+0x155/0xbe1 [88003afd] :mxser_new:mxser_flush_chars+0x4d/0x83 [8028d1eb] lock_acquire+0x7b/0x9f [88003afd] :mxser_new:mxser_flush_chars+0x4d/0x83 [8025947c] _spin_lock_irqsave+0x2d/0x3e [88003afd] :mxser_new:mxser_flush_chars+0x4d/0x83 [80212a9b] n_tty_receive_buf+0xd26/0xddc [8028d0db] __lock_acquire+0xb4c/0xbe1 [803377b1] flush_to_ldisc+0x3d/0x191 [8033788e] flush_to_ldisc+0x11a/0x191 [803378a5] flush_to_ldisc+0x131/0x191 [880025aa] :mxser_new:mxser_receive_chars+0x25f/0x26e [88003f83] :mxser_new:mxser_interrupt+0x198/0x237 [8020ed84] handle_IRQ_event+0x1e/0x51 [8029ccd4] handle_fasteoi_irq+0x9a/0xd9 [80260ccd] do_IRQ+0x6f/0xd6 [80253aa6] ret_from_intr+0x0/0xf EOI [80259406] _spin_unlock_irqrestore+0x40/0x44 [802175dc] write_chan+0x2e0/0x2ff [80275d26] default_wake_function+0x0/0xe [802253b6] tty_write+0x177/0x211 [802172fc] write_chan+0x0/0x2ff [802142a3] vfs_write+0xce/0x157 [80214baa] sys_write+0x45/0x6e [8025355e] system_call+0x7e/0x83 BUG: spinlock lockup on CPU#1, cu/1793, 880082a8 Call Trace: IRQ [80207632] _raw_spin_lock+0xcc/0xf3 [80259484] _spin_lock_irqsave+0x35/0x3e [88003afd] :mxser_new:mxser_flush_chars+0x4d/0x83 [80212a9b] n_tty_receive_buf+0xd26/0xddc [8028d0db] __lock_acquire+0xb4c/0xbe1 [803377b1] flush_to_ldisc+0x3d/0x191 [8033788e] flush_to_ldisc+0x11a/0x191 [803378a5] flush_to_ldisc+0x131/0x191 [880025aa] :mxser_new:mxser_receive_chars+0x25f/0x26e [88003f83] :mxser_new:mxser_interrupt+0x198/0x237 [8020ed84] handle_IRQ_event+0x1e/0x51 [8029ccd4] handle_fasteoi_irq+0x9a/0xd9 [80260ccd] do_IRQ+0x6f/0xd6 [80253aa6] ret_from_intr+0x0/0xf EOI [80259406] _spin_unlock_irqrestore+0x40/0x44 [802175dc] write_chan+0x2e0/0x2ff [80275d26] default_wake_function+0x0/0xe [802253b6] tty_write+0x177/0x211 [802172fc] write_chan+0x0/0x2ff [802142a3] vfs_write+0xce/0x157 [80214baa] sys_write+0x45/0x6e [8025355e] system_call+0x7e/0x83 --- Another one: = [ INFO: possible recursive locking detected ] ӡ���2.6.21-rc6 #8 - swapper/0 is trying to acquire lock: (info-slock){++..}, at: [88003c6d] mxser_stop+0x1c/0x43 [mxser_new] but task is already holding lock: (info-slock){++..}, at: [88003ead] mxser_interrupt+0xc2/0x237 [mxser_new] other info that might help us debug this: 1 lock held by swapper/0: #0: (info-slock){++..}, at: [88003ead]
MOXA: mxser_new lockup
Hello, I have a MOXA C168H card, and after the upgrade from 2.6.19 to 2.6.21-rc6 (and from moxa.c to mxser_new.c) I get a system lockup after few seconds of communicating over the MOXA serial line. Noting is printed on the serial console at all. The system is SMP x86_64 (Fedora 5). I may be able to do a limited testing on request, but the system is in production use most of the time. For now, I will probably move back to the older driver. Are you aware of any SMP or 64-bit issues in mxser_new.c? Thanks, -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | > I will never go to meetings again because I think face to face meetings < > are the biggest waste of time you can ever have.--Linus Torvalds < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
MOXA: mxser_new lockup
Hello, I have a MOXA C168H card, and after the upgrade from 2.6.19 to 2.6.21-rc6 (and from moxa.c to mxser_new.c) I get a system lockup after few seconds of communicating over the MOXA serial line. Noting is printed on the serial console at all. The system is SMP x86_64 (Fedora 5). I may be able to do a limited testing on request, but the system is in production use most of the time. For now, I will probably move back to the older driver. Are you aware of any SMP or 64-bit issues in mxser_new.c? Thanks, -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | I will never go to meetings again because I think face to face meetings are the biggest waste of time you can ever have.--Linus Torvalds - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: No mptable found (Tyan h1000E)
Len Brown wrote: : This board appears to have no APIC (MADT) table -- which is what Linux uses to enumerate processors : in ACPI mode. (doesn't have MPS either, but in ACPI mode you wouldn't use it anyway). : : Are there any BIOS SETUP settings for enabling SMP or ACPI features? : In the ACPI submenu there are following options: ACPI version features (1.0, 2.0, 3.0 - I have set it to 3.0, but I think I have tried 2.0 as well) ACPI APIC support (I have Enabled) ACPI SRAT table (Enabled) ACPI OEMB table (Enabled) Headless mode (Disabled) Then there is a MPS option in a different submenu (I have 1.4, but I have tried 1.1 as well). Another APIC related function is in the southbridge setup: Hide XIOAPIC PCI function (I think I have it disabled[*]) [*] I had to put this box to production use (even with one CPU visible), but I will have an identical box available in two weeks or so, so I can do more testing then. : I guess that /proc/interrupts shows this system running in PIC mode? Yes, of course. : Please open a bugzilla here and attach the output from acpidump. : http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI : Component: Config-Processors : Done, http://bugzilla.kernel.org/show_bug.cgi?id=7908 BTW, when trying to attach the dmesg output to the bugzilla entry, I've got the following message: > Attachment #10239 to Bug #7908 Created > /home/www/dead.letter... Saved message in /home/www/dead.letter > Email sent to: [EMAIL PROTECTED], > acpi-bugzilla@lists.sourceforge.net > Excluding: [EMAIL PROTECTED] I think the dead.letter message should not be there :-) Apparently, this bugzilla does not have a "bugzilla" category. Where can I report this? : Any chance you can see if Windows finds multiple processors on this board with this version of the BIOS? : I can try this in two weeks or so. : ps. after you get the acpidump for the failing system, check for a BIOS update. In the pervious mail I wrote: [...] : > the latest Fedora kernel (2.6.19-1.2895.fc6). I have the latest BIOS : > available for this board, and the BIOS can see all four cores. So AFAIK there is no BIOS update available. -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | > I will never go to meetings again because I think face to face meetings < > are the biggest waste of time you can ever have.--Linus Torvalds < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: No mptable found (Tyan h1000E)
Len Brown wrote: : This board appears to have no APIC (MADT) table -- which is what Linux uses to enumerate processors : in ACPI mode. (doesn't have MPS either, but in ACPI mode you wouldn't use it anyway). : : Are there any BIOS SETUP settings for enabling SMP or ACPI features? : In the ACPI submenu there are following options: ACPI version features (1.0, 2.0, 3.0 - I have set it to 3.0, but I think I have tried 2.0 as well) ACPI APIC support (I have Enabled) ACPI SRAT table (Enabled) ACPI OEMB table (Enabled) Headless mode (Disabled) Then there is a MPS option in a different submenu (I have 1.4, but I have tried 1.1 as well). Another APIC related function is in the southbridge setup: Hide XIOAPIC PCI function (I think I have it disabled[*]) [*] I had to put this box to production use (even with one CPU visible), but I will have an identical box available in two weeks or so, so I can do more testing then. : I guess that /proc/interrupts shows this system running in PIC mode? Yes, of course. : Please open a bugzilla here and attach the output from acpidump. : http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI : Component: Config-Processors : Done, http://bugzilla.kernel.org/show_bug.cgi?id=7908 BTW, when trying to attach the dmesg output to the bugzilla entry, I've got the following message: Attachment #10239 to Bug #7908 Created /home/www/dead.letter... Saved message in /home/www/dead.letter Email sent to: [EMAIL PROTECTED], acpi-bugzilla@lists.sourceforge.net Excluding: [EMAIL PROTECTED] I think the dead.letter message should not be there :-) Apparently, this bugzilla does not have a bugzilla category. Where can I report this? : Any chance you can see if Windows finds multiple processors on this board with this version of the BIOS? : I can try this in two weeks or so. : ps. after you get the acpidump for the failing system, check for a BIOS update. In the pervious mail I wrote: [...] : the latest Fedora kernel (2.6.19-1.2895.fc6). I have the latest BIOS : available for this board, and the BIOS can see all four cores. So AFAIK there is no BIOS update available. -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | I will never go to meetings again because I think face to face meetings are the biggest waste of time you can ever have.--Linus Torvalds - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
No mptable found (Tyan h1000E)
Hello, I have a Tyan h1000E (S3970) dual-socket board, with two dual-core AMD Athlon 2210 CPUs (4 cores total). The problem is that the kernel apparently cannot detect the SMP configuration (after boot, /proc/cpuinfo lists only one processor). The full dmesg output is available at http://www.fi.muni.cz/~kas/tmp/dmesg-h1000E.txt The most interesting parts of it are probably these (with my comments inline marked by "---"): [...] SRAT: PXM 0 -> APIC 0 -> Node 0 SRAT: PXM 0 -> APIC 1 -> Node 0 SRAT: PXM 1 -> APIC 2 -> Node 1 SRAT: PXM 1 -> APIC 3 -> Node 1 --- So the kernel can see all four APICs. [...] Bootmem setup node 0 -bfff No mptable found. --- the above does not depend on MPS 1.1 or 1.4 settings in the BIOS [...] Kernel command line: ro root=/dev/md0 console=ttyS0,38400n8 Initializing CPU#0 PID hash table entries: 4096 (order: 12, 32768 bytes) Console: colour VGA+ 80x25 Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes) Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes) Checking aperture... CPU 0: aperture @ e800 size 128 MB CPU 1: aperture @ e800 size 128 MB --- the kernel knows something about CPU1 (presumably the second core of CPU0). [...] CPU 0/0 -> Node 0 CPU: Physical Processor ID: 0 CPU: Processor Core ID: 0 SMP alternatives: switching to UP code Freeing SMP alternatives: 32k freed ACPI: Core revision 20060707 ACPI: setting ELCR to 0200 (from 0e20) weird, boot CPU (#0) not listed by the BIOS. SMP motherboard not detected. Using local APIC timer interrupts. result 12469270 Detected 12.469 MHz APIC timer. testing NMI watchdog ... OK. SMP disabled --- hmm, no SMP configuration detected after all. Brought up 1 CPUs testing NMI watchdog ... OK. ^^^ here it waits for few seconds before printing "OK." I have tested it with vanilla 2.6.19.2, 2.6.20-rc6, and the latest Fedora kernel (2.6.19-1.2895.fc6). I have the latest BIOS available for this board, and the BIOS can see all four cores. How can I make all four cores visible by the Linux kernel? Thanks, -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | > I will never go to meetings again because I think face to face meetings < > are the biggest waste of time you can ever have.--Linus Torvalds < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
No mptable found (Tyan h1000E)
Hello, I have a Tyan h1000E (S3970) dual-socket board, with two dual-core AMD Athlon 2210 CPUs (4 cores total). The problem is that the kernel apparently cannot detect the SMP configuration (after boot, /proc/cpuinfo lists only one processor). The full dmesg output is available at http://www.fi.muni.cz/~kas/tmp/dmesg-h1000E.txt The most interesting parts of it are probably these (with my comments inline marked by ---): [...] SRAT: PXM 0 - APIC 0 - Node 0 SRAT: PXM 0 - APIC 1 - Node 0 SRAT: PXM 1 - APIC 2 - Node 1 SRAT: PXM 1 - APIC 3 - Node 1 --- So the kernel can see all four APICs. [...] Bootmem setup node 0 -bfff No mptable found. --- the above does not depend on MPS 1.1 or 1.4 settings in the BIOS [...] Kernel command line: ro root=/dev/md0 console=ttyS0,38400n8 Initializing CPU#0 PID hash table entries: 4096 (order: 12, 32768 bytes) Console: colour VGA+ 80x25 Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes) Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes) Checking aperture... CPU 0: aperture @ e800 size 128 MB CPU 1: aperture @ e800 size 128 MB --- the kernel knows something about CPU1 (presumably the second core of CPU0). [...] CPU 0/0 - Node 0 CPU: Physical Processor ID: 0 CPU: Processor Core ID: 0 SMP alternatives: switching to UP code Freeing SMP alternatives: 32k freed ACPI: Core revision 20060707 ACPI: setting ELCR to 0200 (from 0e20) weird, boot CPU (#0) not listed by the BIOS. SMP motherboard not detected. Using local APIC timer interrupts. result 12469270 Detected 12.469 MHz APIC timer. testing NMI watchdog ... OK. SMP disabled --- hmm, no SMP configuration detected after all. Brought up 1 CPUs testing NMI watchdog ... OK. ^^^ here it waits for few seconds before printing OK. I have tested it with vanilla 2.6.19.2, 2.6.20-rc6, and the latest Fedora kernel (2.6.19-1.2895.fc6). I have the latest BIOS available for this board, and the BIOS can see all four cores. How can I make all four cores visible by the Linux kernel? Thanks, -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | I will never go to meetings again because I think face to face meetings are the biggest waste of time you can ever have.--Linus Torvalds - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
TCP segmentation offload performance
Hello, world! I tried to find out whether the TCP segmentation offload can perform better on my server than no TSO at all. My server is dual Opteron 244 with Tyan S2882 board with the following NIC: eth0: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:27:de:17 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] eth0: dma_rwctrl[769f4000] The server runs ProFTPd with sendfile(2) enabled (and I have verified that it is being used with strace(8)). The kernel is 2.6.12.2. I have found that according to ethtool -k eth0 the TSO is switched off by default. So I tried to switch it on (altough I wondered why it is not switched on by default, provided that the hardware supports this feature). I tried to measure the difference by downloading an ISO image of FC4 i386 CD1 (665434112 bytes) from two hosts connected to the same switch. I did 10 transfers of the same file with each settings, and took the average and maximum of the last five transfers only (to avoid any start-up temporary conditions). The client Alpha was dual Opteron 248 with Tyan S2882 board, and the client Beta was quad Opteron 848 on HP DL-585 board. Client TSO Average speed Max speed Alpha off 108.7 MB/s 110.5 MB/s Alpha on 100.9 MB/s 101.2 MB/s Betaoff 102.1 MB/s 102.4 MB/s Betaon 93.2 MB/s 95.5 MB/s Surprisingly enough, the tests without TSO were faster than with TSO enabled. Looking at tcpdump it seems that the system with TSO enabled sends only a 15 KB-sized frames to the NIC instead of full 64 KB-sized ones: 18:45:38.993150 IP odysseus.ftp-data > alpha.33125: P 127424:143352(15928) ack 1 win 1460 18:45:38.993203 IP odysseus.ftp-data > alpha.33125: P 143352:159280(15928) ack 1 win 1460 So I wonder what is wrong with TSO on my hardware and whether the TSO is expected to be faster than generating MTU-sized packets in the TCP stack. I did not measure the CPU usage on the server, only the network speed. Thanks! -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | >>> $ cd my-kernel-tree-2.6 <<< >>> $ dotest /path/to/mbox # yes, Linus has no taste in naming scripts <<< - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
TCP segmentation offload performance
Hello, world! I tried to find out whether the TCP segmentation offload can perform better on my server than no TSO at all. My server is dual Opteron 244 with Tyan S2882 board with the following NIC: eth0: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:27:de:17 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] eth0: dma_rwctrl[769f4000] The server runs ProFTPd with sendfile(2) enabled (and I have verified that it is being used with strace(8)). The kernel is 2.6.12.2. I have found that according to ethtool -k eth0 the TSO is switched off by default. So I tried to switch it on (altough I wondered why it is not switched on by default, provided that the hardware supports this feature). I tried to measure the difference by downloading an ISO image of FC4 i386 CD1 (665434112 bytes) from two hosts connected to the same switch. I did 10 transfers of the same file with each settings, and took the average and maximum of the last five transfers only (to avoid any start-up temporary conditions). The client Alpha was dual Opteron 248 with Tyan S2882 board, and the client Beta was quad Opteron 848 on HP DL-585 board. Client TSO Average speed Max speed Alpha off 108.7 MB/s 110.5 MB/s Alpha on 100.9 MB/s 101.2 MB/s Betaoff 102.1 MB/s 102.4 MB/s Betaon 93.2 MB/s 95.5 MB/s Surprisingly enough, the tests without TSO were faster than with TSO enabled. Looking at tcpdump it seems that the system with TSO enabled sends only a 15 KB-sized frames to the NIC instead of full 64 KB-sized ones: 18:45:38.993150 IP odysseus.ftp-data alpha.33125: P 127424:143352(15928) ack 1 win 1460 nop,nop,timestamp 2408404 698142942 18:45:38.993203 IP odysseus.ftp-data alpha.33125: P 143352:159280(15928) ack 1 win 1460 nop,nop,timestamp 2408404 698142942 So I wonder what is wrong with TSO on my hardware and whether the TSO is expected to be faster than generating MTU-sized packets in the TCP stack. I did not measure the CPU usage on the server, only the network speed. Thanks! -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/Journal: http://www.fi.muni.cz/~kas/blog/ | $ cd my-kernel-tree-2.6 $ dotest /path/to/mbox # yes, Linus has no taste in naming scripts - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
2.6.11.3 slab error crash
Hi all, yesterday our HP DL-585 quad opteron server running 2.6.11.3 crashed with the attached messages. The main load is NFS serving and running postfix as well as some other userland processes. The server uses XFS on top of LVM for NFS-exported volumes. More details available at request. Thanks, -Yenya Apr 7 16:25:11 opteron kernel: slab error in cache_free_debugcheck(): cache `size-1024': double free, or memory outside object was overwritten Apr 7 16:25:11 opteron kernel: Apr 7 16:25:11 opteron kernel: Call Trace:{cache_free_debugcheck+371} {kfree+213} Apr 7 16:25:11 opteron kernel:{nlm4svc_callback+150} {nlm4svc_proc_unlock_msg+112} Apr 7 16:25:11 opteron kernel: {nlm4svc_decode_unlockargs+48} {svc_process+853} Apr 7 16:25:11 opteron kernel:{lockd+359} {lockd+0} Apr 7 16:25:11 opteron kernel:{child_rip+8} {lockd+0} Apr 7 16:25:11 opteron kernel:{lockd+0} {child_rip+0} Apr 7 16:25:11 opteron kernel: Apr 7 16:25:11 opteron kernel: 81050ffacae8: redzone 1: 0x5a2cf071, redzone 2: 0x5a2cf071. Apr 7 16:25:11 opteron kernel: --- [cut here ] - [please bite here ] - Apr 7 16:25:11 opteron kernel: Kernel BUG at host:250 Apr 7 16:25:11 opteron kernel: invalid operand: [1] SMP Apr 7 16:25:11 opteron kernel: CPU 2 Apr 7 16:25:11 opteron kernel: Modules linked in: ide_cd cdrom Apr 7 16:25:12 opteron kernel: Pid: 18003, comm: rpciod/2 Not tainted 2.6.11.3 Apr 7 16:25:12 opteron kernel: RIP: 0010:[] {nlm_release_host+47} Apr 7 16:25:12 opteron kernel: RSP: :810207177da8 EFLAGS: 00010286 Apr 7 16:25:12 opteron kernel: RAX: RBX: 8103cb093248 RCX: 810064b23470 Apr 7 16:25:12 opteron kernel: RDX: fffb RSI: 0296 RDI: 8103cb093248 Apr 7 16:25:12 opteron kernel: RBP: 810142def688 R08: R09: 810142def9a8 Apr 7 16:25:12 opteron kernel: R10: 80582a80 R11: 2000 R12: 8100bfc33678 Apr 7 16:25:12 opteron kernel: R13: 80149380 R14: R15: 80149380 Apr 7 16:25:12 opteron kernel: FS: 2ae054c0() GS:80582b80() knlGS:556b7080 Apr 7 16:25:12 opteron kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b Apr 7 16:25:12 opteron kernel: CR2: 2b1a5970 CR3: 82e3 CR4: 06e0 Apr 7 16:25:12 opteron kernel: Process rpciod/2 (pid: 18003, threadinfo 810207176000, task 810207281210) Apr 7 16:25:12 opteron kernel: Stack: 8101467b8658 8021ec79 803fcc47 Apr 7 16:25:12 opteron kernel:810207177e58 8040e71a 2000 007627d8 Apr 7 16:25:12 opteron kernel:0003 8100bfc336b8 Apr 7 16:25:12 opteron kernel: Call Trace:{nlm4svc_callback_exit+57} {__rpc_execute+855} Apr 7 16:25:12 opteron kernel:{thread_return+42} {__wake_up+67} Apr 7 16:25:12 opteron kernel:{rpc_async_schedule+0} {worker_thread+476} Apr 7 16:25:12 opteron kernel: {default_wake_function+0} {default_wake_function+0} Apr 7 16:25:12 opteron kernel: {keventd_create_kthread+0} {worker_thread+0} Apr 7 16:25:12 opteron kernel: {keventd_create_kthread+0} {kthread+146} Apr 7 16:25:12 opteron kernel:{child_rip+8} {keventd_create_kthread+0} Apr 7 16:25:12 opteron kernel:{kthread+0} {child_rip+0} Apr 7 16:25:12 opteron kernel: Apr 7 16:25:12 opteron kernel: Apr 7 16:25:12 opteron kernel: Code: 0f 0b 7b 1d 45 80 ff ff ff ff fa 00 66 66 90 66 90 5b c3 66 -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
2.6.11.3 slab error crash
Hi all, yesterday our HP DL-585 quad opteron server running 2.6.11.3 crashed with the attached messages. The main load is NFS serving and running postfix as well as some other userland processes. The server uses XFS on top of LVM for NFS-exported volumes. More details available at request. Thanks, -Yenya Apr 7 16:25:11 opteron kernel: slab error in cache_free_debugcheck(): cache `size-1024': double free, or memory outside object was overwritten Apr 7 16:25:11 opteron kernel: Apr 7 16:25:11 opteron kernel: Call Trace:8015b493{cache_free_debugcheck+371} 8015c615{kfree+213} Apr 7 16:25:11 opteron kernel:8021ec16{nlm4svc_callback+150} 8021e6f0{nlm4svc_proc_unlock_msg+112} Apr 7 16:25:11 opteron kernel: 8021d7d0{nlm4svc_decode_unlockargs+48} 803ff445{svc_process+853} Apr 7 16:25:11 opteron kernel:80218987{lockd+359} 80218820{lockd+0} Apr 7 16:25:11 opteron kernel:8010ddbf{child_rip+8} 80218820{lockd+0} Apr 7 16:25:11 opteron kernel:80218820{lockd+0} 8010ddb7{child_rip+0} Apr 7 16:25:11 opteron kernel: Apr 7 16:25:11 opteron kernel: 81050ffacae8: redzone 1: 0x5a2cf071, redzone 2: 0x5a2cf071. Apr 7 16:25:11 opteron kernel: --- [cut here ] - [please bite here ] - Apr 7 16:25:11 opteron kernel: Kernel BUG at host:250 Apr 7 16:25:11 opteron kernel: invalid operand: [1] SMP Apr 7 16:25:11 opteron kernel: CPU 2 Apr 7 16:25:11 opteron kernel: Modules linked in: ide_cd cdrom Apr 7 16:25:12 opteron kernel: Pid: 18003, comm: rpciod/2 Not tainted 2.6.11.3 Apr 7 16:25:12 opteron kernel: RIP: 0010:[8021841f] 8021841f{nlm_release_host+47} Apr 7 16:25:12 opteron kernel: RSP: :810207177da8 EFLAGS: 00010286 Apr 7 16:25:12 opteron kernel: RAX: RBX: 8103cb093248 RCX: 810064b23470 Apr 7 16:25:12 opteron kernel: RDX: fffb RSI: 0296 RDI: 8103cb093248 Apr 7 16:25:12 opteron kernel: RBP: 810142def688 R08: R09: 810142def9a8 Apr 7 16:25:12 opteron kernel: R10: 80582a80 R11: 2000 R12: 8100bfc33678 Apr 7 16:25:12 opteron kernel: R13: 80149380 R14: R15: 80149380 Apr 7 16:25:12 opteron kernel: FS: 2ae054c0() GS:80582b80() knlGS:556b7080 Apr 7 16:25:12 opteron kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b Apr 7 16:25:12 opteron kernel: CR2: 2b1a5970 CR3: 82e3 CR4: 06e0 Apr 7 16:25:12 opteron kernel: Process rpciod/2 (pid: 18003, threadinfo 810207176000, task 810207281210) Apr 7 16:25:12 opteron kernel: Stack: 8101467b8658 8021ec79 803fcc47 Apr 7 16:25:12 opteron kernel:810207177e58 8040e71a 2000 007627d8 Apr 7 16:25:12 opteron kernel:0003 8100bfc336b8 Apr 7 16:25:12 opteron kernel: Call Trace:8021ec79{nlm4svc_callback_exit+57} 803fcc47{__rpc_execute+855} Apr 7 16:25:12 opteron kernel:8040e71a{thread_return+42} 8012f483{__wake_up+67} Apr 7 16:25:12 opteron kernel:803fcd20{rpc_async_schedule+0} 8014477c{worker_thread+476} Apr 7 16:25:12 opteron kernel: 8012f3c0{default_wake_function+0} 8012f3c0{default_wake_function+0} Apr 7 16:25:12 opteron kernel: 80148d80{keventd_create_kthread+0} 801445a0{worker_thread+0} Apr 7 16:25:12 opteron kernel: 80148d80{keventd_create_kthread+0} 80148d42{kthread+146} Apr 7 16:25:12 opteron kernel:8010ddbf{child_rip+8} 80148d80{keventd_create_kthread+0} Apr 7 16:25:12 opteron kernel:80148cb0{kthread+0} 8010ddb7{child_rip+0} Apr 7 16:25:12 opteron kernel: Apr 7 16:25:12 opteron kernel: Apr 7 16:25:12 opteron kernel: Code: 0f 0b 7b 1d 45 80 ff ff ff ff fa 00 66 66 90 66 90 5b c3 66 -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: drivers/block/ub.c and multiple LUNs
Pete Zaitcev wrote: : On Mon, 21 Mar 2005 12:42:45 +0100 Jan Kasprzak <[EMAIL PROTECTED]> wrote: : : > is the ub.c USB storage driver supposed to work with multi-LUN(?) devices? : : There's no reason why it won't, see the attached patch. I just need to : send the patch to Greg Kroah one day. OK, thanks - it works for me with Tungsten T5. Sorry for the delayed reply. : > PS.: There is no MAINTAINERS file entry for the ub.c driver, and Pete Zaitcev : > is not even listed in the CREDITS file. : : I didn't think it was important, but I see now that if it helps to prevent : the ub traffic from going to Matt, it ought to be done. I'll send a patch. Thanks! -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: drivers/block/ub.c and multiple LUNs
Pete Zaitcev wrote: : On Mon, 21 Mar 2005 12:42:45 +0100 Jan Kasprzak [EMAIL PROTECTED] wrote: : : is the ub.c USB storage driver supposed to work with multi-LUN(?) devices? : : There's no reason why it won't, see the attached patch. I just need to : send the patch to Greg Kroah one day. OK, thanks - it works for me with Tungsten T5. Sorry for the delayed reply. : PS.: There is no MAINTAINERS file entry for the ub.c driver, and Pete Zaitcev : is not even listed in the CREDITS file. : : I didn't think it was important, but I see now that if it helps to prevent : the ub traffic from going to Matt, it ought to be done. I'll send a patch. Thanks! -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
drivers/block/ub.c and multiple LUNs
Hello, is the ub.c USB storage driver supposed to work with multi-LUN(?) devices? I have a Palm Tungsten T5, which in drive mode looks like two USB storage devices (the internal flash is one LUN and the SD/MMC card in the slot is another one). With usb_storage driver I can see both devices, with ub driver I can see the internal flash only. I have moved back to usb_storage driver for now, but I am willing to provide more info if somebody is interested in making ub.c driver working with both devices on Tungsten T5. Thanks, -Yenya PS.: There is no MAINTAINERS file entry for the ub.c driver, and Pete Zaitcev is not even listed in the CREDITS file. -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
drivers/block/ub.c and multiple LUNs
Hello, is the ub.c USB storage driver supposed to work with multi-LUN(?) devices? I have a Palm Tungsten T5, which in drive mode looks like two USB storage devices (the internal flash is one LUN and the SD/MMC card in the slot is another one). With usb_storage driver I can see both devices, with ub driver I can see the internal flash only. I have moved back to usb_storage driver for now, but I am willing to provide more info if somebody is interested in making ub.c driver working with both devices on Tungsten T5. Thanks, -Yenya PS.: There is no MAINTAINERS file entry for the ub.c driver, and Pete Zaitcev is not even listed in the CREDITS file. -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ACPI] inappropriate use of in_atomic()
Andrew Morton wrote: : : in_atomic() is not a reliable indication of whether it is currently safe : to call schedule(). : : This is because the lockdepth beancounting which in_atomic() uses is only : accumulated if CONFIG_PREEMPT=y. in_atomic() will return false inside : spinlocks if CONFIG_PREEMPT=n. : : Consequently the use of in_atomic() in the below files is probably : deadlocky if CONFIG_PREEMPT=n: [...] : drivers/acpi/osl.c [...] This may be the cause of http://bugme.osdl.org/show_bug.cgi?id=4150 - I have recently verified that the problem described in bug #4150 disappears when CONFIG_PREEMPT=y is used. -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ACPI] inappropriate use of in_atomic()
Andrew Morton wrote: : : in_atomic() is not a reliable indication of whether it is currently safe : to call schedule(). : : This is because the lockdepth beancounting which in_atomic() uses is only : accumulated if CONFIG_PREEMPT=y. in_atomic() will return false inside : spinlocks if CONFIG_PREEMPT=n. : : Consequently the use of in_atomic() in the below files is probably : deadlocky if CONFIG_PREEMPT=n: [...] : drivers/acpi/osl.c [...] This may be the cause of http://bugme.osdl.org/show_bug.cgi?id=4150 - I have recently verified that the problem described in bug #4150 disappears when CONFIG_PREEMPT=y is used. -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sysfs/kobject update breaks ACPI?
Jan Kasprzak wrote: : my laptop (Asus M6R, http://www.fi.muni.cz/~kas/m6r/) has problems with : ACPI with newer kernels - most of ACPI operations fail [...] : However, the patch does not touch anything related to ACPI : (I think). It is a sysfs and kobject update. So I don't see how this : can break ACPI for my laptop. Hmm, it seems to be some kind of race condition or what - with CONFIG_ACPI_DEBUG my ACPI does not work even with 2.6.8.1 and 2.6.9, without it it is OK up to 2.6.10-rc1-bk11. A pretty Heisenbergish behaviour, isn't it? Works only if you do not try to debug it :-) -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
sysfs/kobject update breaks ACPI?
Hi all, my laptop (Asus M6R, http://www.fi.muni.cz/~kas/m6r/) has problems with ACPI with newer kernels - most of ACPI operations fail - here is the sample of error messages printed during boot. All other error messages contain the "AE_TIME" line as well: Initializing Cryptographic API ACPI-0405: *** Error: Handler for [EmbeddedControl] returned AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.SBRG.EC0_.RDC3] (Node ddf09840), AE_TIME ACPI-1138: *** Error: Method execution failed [\ECIO] (Node ddf09380), AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.SBRG.EC0_.ACPS] (Node ddf078e0), AE_TIME ACPI-1138: *** Error: Method execution failed [\ACPS] (Node ddf03220), AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.AC0_._PSR] (Node ddf07160), AE_TIME ACPI-0405: *** Error: Handler for [EmbeddedControl] returned AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.SBRG.EC0_.SMBR] (Node ddf09540), AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.SBRG.EC0_.BIF0] (Node ddf07c40), AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.BAT0._BIF] (Node ddf07e20), AE_TIME I cannot read battery status etc. In 2.6.9 ACPI works, in 2.6.10 (and later) it doesn't. I've tried to do a binary search in order to find an exact change which breaks ACPI for me, and found the following: 2.6.10-rc1-bk11 is OK, 2.6.10-rc1-bk12 is not. So I've diffed the -bk11 and -bk12 trees and tried to trim the patch to an absolute minimum. Now I have a patch (http://bugme.osdl.org/attachment.cgi?id=4534=view) which, when applied to 2.6.10-rc1-bk11, breaks ACPI on my laptop, and, when subtracted from 2.6.10-rc1-bk12 (using patch -p1 -R), makes ACPI working in 2.6.10-rc1-bk12. However, the patch does not touch anything related to ACPI (I think). It is a sysfs and kobject update. So I don't see how this can break ACPI for my laptop. It seems the failing methods are always \_SB_.PCI0.SBRG.EC0_., and they all try to acquire an internal mutex using the following code (from my DSDT): If (LEqual (Acquire (MUEC, 0x), 0x00)) Do you have any idea on what is wrong here? More information, my source and binary DSDT are attached to the following kernel bugzilla entry: http://bugme.osdl.org/show_bug.cgi?id=4150 Thanks, -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
sysfs/kobject update breaks ACPI?
Hi all, my laptop (Asus M6R, http://www.fi.muni.cz/~kas/m6r/) has problems with ACPI with newer kernels - most of ACPI operations fail - here is the sample of error messages printed during boot. All other error messages contain the AE_TIME line as well: Initializing Cryptographic API ACPI-0405: *** Error: Handler for [EmbeddedControl] returned AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.SBRG.EC0_.RDC3] (Node ddf09840), AE_TIME ACPI-1138: *** Error: Method execution failed [\ECIO] (Node ddf09380), AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.SBRG.EC0_.ACPS] (Node ddf078e0), AE_TIME ACPI-1138: *** Error: Method execution failed [\ACPS] (Node ddf03220), AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.AC0_._PSR] (Node ddf07160), AE_TIME ACPI-0405: *** Error: Handler for [EmbeddedControl] returned AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.SBRG.EC0_.SMBR] (Node ddf09540), AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.SBRG.EC0_.BIF0] (Node ddf07c40), AE_TIME ACPI-1138: *** Error: Method execution failed [\_SB_.PCI0.BAT0._BIF] (Node ddf07e20), AE_TIME I cannot read battery status etc. In 2.6.9 ACPI works, in 2.6.10 (and later) it doesn't. I've tried to do a binary search in order to find an exact change which breaks ACPI for me, and found the following: 2.6.10-rc1-bk11 is OK, 2.6.10-rc1-bk12 is not. So I've diffed the -bk11 and -bk12 trees and tried to trim the patch to an absolute minimum. Now I have a patch (http://bugme.osdl.org/attachment.cgi?id=4534action=view) which, when applied to 2.6.10-rc1-bk11, breaks ACPI on my laptop, and, when subtracted from 2.6.10-rc1-bk12 (using patch -p1 -R), makes ACPI working in 2.6.10-rc1-bk12. However, the patch does not touch anything related to ACPI (I think). It is a sysfs and kobject update. So I don't see how this can break ACPI for my laptop. It seems the failing methods are always \_SB_.PCI0.SBRG.EC0_.something, and they all try to acquire an internal mutex using the following code (from my DSDT): If (LEqual (Acquire (MUEC, 0x), 0x00)) Do you have any idea on what is wrong here? More information, my source and binary DSDT are attached to the following kernel bugzilla entry: http://bugme.osdl.org/show_bug.cgi?id=4150 Thanks, -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sysfs/kobject update breaks ACPI?
Jan Kasprzak wrote: : my laptop (Asus M6R, http://www.fi.muni.cz/~kas/m6r/) has problems with : ACPI with newer kernels - most of ACPI operations fail [...] : However, the patch does not touch anything related to ACPI : (I think). It is a sysfs and kobject update. So I don't see how this : can break ACPI for my laptop. Hmm, it seems to be some kind of race condition or what - with CONFIG_ACPI_DEBUG my ACPI does not work even with 2.6.8.1 and 2.6.9, without it it is OK up to 2.6.10-rc1-bk11. A pretty Heisenbergish behaviour, isn't it? Works only if you do not try to debug it :-) -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Memory leak in 2.6.11-rc1?
Jan Kasprzak wrote: : I think I have been running 2.6.10-rc3 before. I've copied : the fs/bio.c from 2.6.10-rc3 to my 2.6.11-rc2 sources and booted the : resulting kernel. I hope it will not eat my filesystems :-) I will send : my /proc/slabinfo in a few days. Hmm, after 3h35min of uptime I have biovec-1 92157 92250 16 2251 : tunables 120 608 : slabdata410410 60 bio92163 92163128 311 : tunables 120 608 : slabdata 2973 2973 60 so it is probably still leaking - about half an hour ago it was biovec-1 77685 77850 16 2251 : tunables 120 608 : slabdata346346 0 bio77841 77841128 311 : tunables 120 608 : slabdata 2511 2511180 -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Memory leak in 2.6.11-rc1?
[EMAIL PROTECTED] wrote: : My guess would be the clone change, if raid was not leaking before. I : cannot lookup any patches at the moment, as I'm still at the hospital : taking care of my new born baby and wife :) Congratulations! : But try and reverse the patches to fs/bio.c that mention corruption due to : bio_clone and bio->bi_io_vec and see if that cures it. If it does, I know : where to look. When did you notice this started to leak? I think I have been running 2.6.10-rc3 before. I've copied the fs/bio.c from 2.6.10-rc3 to my 2.6.11-rc2 sources and booted the resulting kernel. I hope it will not eat my filesystems :-) I will send my /proc/slabinfo in a few days. -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Memory leak in 2.6.11-rc1?
Linus Torvalds wrote: : Jan - can you give Jens a bit of an idea of what drivers and/or schedulers : you're using? I have a Tyan S2882 dual Opteron, network is on-board tg3, there are 8 P-ATA HDDs hooked on 3ware 7506-8 controller (no HW RAID there, but the drives are partitioned and partition grouped to form software RAID-0, 1, 5, and 10 volumes - the main fileserving traffic is on a RAID-5 volume, and /var is on RAID-10 volume. Filesystems are XFS for that RAID-5 volume, ext3 for the rest of the system. I have compiled-in the following I/O schedulers (according to my /var/log/dmesg :-) io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered I have not changed the scheduler by hand, so I suppose the anticipatory is the default. No X, just serial console. The server does FTP serving mostly (ProFTPd with sendfile() compiled in), sending mail via qmail (cca 100-200k mails a day), and bits of other work (rsync, Apache, ...). Fedora core 3 with all relevant updates. My fstab (physical devices only): /dev/md0/ ext3defaults1 1 /dev/md1/home ext3defaults1 2 /dev/md6/varext3defaults1 2 /dev/md4/fastraid xfs noatime 1 3 /dev/md5/export xfs noatime 1 4 /dev/sde4 swapswappri=10 0 0 /dev/sdf4 swapswappri=10 0 0 /dev/sdg4 swapswappri=10 0 0 /dev/sdh4 swapswappri=10 0 0 My mdstat: Personalities : [raid0] [raid1] [raid5] md6 : active raid0 md3[0] md2[1] 19550720 blocks 64k chunks md1 : active raid1 sdd1[1] sdc1[0] 14659200 blocks [2/2] [UU] md2 : active raid1 sdf1[1] sde1[0] 9775424 blocks [2/2] [UU] md3 : active raid1 sdh1[1] sdg1[0] 9775424 blocks [2/2] [UU] md4 : active raid0 sdh2[7] sdg2[6] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0] 39133184 blocks 256k chunks md5 : active raid5 sdh3[7] sdg3[6] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0] 1572512256 blocks level 5, 256k chunk, algorithm 2 [8/8] [] md0 : active raid1 sdb1[1] sda1[0] 14659200 blocks [2/2] [UU] unused devices: Anything else you want to know? Thanks, -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Memory leak in 2.6.11-rc1?
: I've been running 2.6.11-rc1 on my dual opteron Fedora Core 3 box for a week : now, and I think there is a memory leak somewhere. I am measuring the : size of active and inactive pages (from /proc/meminfo), and it seems : that the count of sum (active+inactive) pages is decreasing. Please : take look at the graphs at : : http://www.linux.cz/stats/mrtg-rrd/vm_active.html Well, with Linus' patch to fs/pipe.c the situation seems to improve a bit, but some leak is still there (look at the "monthly" graph at the above URL). The server has been running 2.6.11-rc2 + patch to fs/pipe.c for last 8 days. I am letting it run for a few more days in case you want some debugging info from a live system. I am attaching my /proc/meminfo and /proc/slabinfo. -Yenya # cat /proc/meminfo MemTotal: 4045168 kB MemFree: 59396 kB Buffers: 17812 kB Cached:2861648 kB SwapCached: 0 kB Active: 827700 kB Inactive: 2239752 kB HighTotal: 0 kB HighFree:0 kB LowTotal: 4045168 kB LowFree: 59396 kB SwapTotal:14651256 kB SwapFree: 14650584 kB Dirty:1616 kB Writeback: 0 kB Mapped: 206540 kB Slab: 861176 kB CommitLimit: 16673840 kB Committed_AS: 565684 kB PageTables: 20812 kB VmallocTotal: 34359738367 kB VmallocUsed: 7400 kB VmallocChunk: 34359730867 kB # cat /proc/slabinfo slabinfo - version: 2.1 # name : tunables: slabdata raid5/md5256260 141652 : tunables 24 128 : slabdata 52 52 0 rpc_buffers8 8 204821 : tunables 24 128 : slabdata 4 4 0 rpc_tasks 12 20384 101 : tunables 54 278 : slabdata 2 2 0 rpc_inode_cache8 1076851 : tunables 54 278 : slabdata 2 2 0 fib6_nodes27 61 64 611 : tunables 120 608 : slabdata 1 1 0 ip6_dst_cache 17 36320 121 : tunables 54 278 : slabdata 3 3 0 ndisc_cache2 30256 151 : tunables 120 608 : slabdata 2 2 0 rawv6_sock 4 4 102441 : tunables 54 278 : slabdata 1 1 0 udpv6_sock 1 496041 : tunables 54 278 : slabdata 1 1 0 tcpv6_sock 8 8 166442 : tunables 24 128 : slabdata 2 2 0 unix_sock56765076851 : tunables 54 278 : slabdata130130 0 tcp_tw_bucket445920192 201 : tunables 120 608 : slabdata 46 46 0 tcp_bind_bucket 389 2261 32 1191 : tunables 120 608 : slabdata 19 19 0 tcp_open_request 135310128 311 : tunables 120 608 : slabdata 10 10 0 inet_peer_cache 32 62128 311 : tunables 120 608 : slabdata 2 2 0 ip_fib_alias 20119 32 1191 : tunables 120 608 : slabdata 1 1 0 ip_fib_hash 18 61 64 611 : tunables 120 608 : slabdata 1 1 0 ip_dst_cache1738 2060384 101 : tunables 54 278 : slabdata206206 0 arp_cache 8 30256 151 : tunables 120 608 : slabdata 2 2 0 raw_sock 3 983292 : tunables 54 278 : slabdata 1 1 0 udp_sock 45 4583292 : tunables 54 278 : slabdata 5 5 0 tcp_sock 431600 147252 : tunables 24 128 : slabdata120120 0 flow_cache 0 0128 311 : tunables 120 608 : slabdata 0 0 0 dm_tio 0 0 24 1561 : tunables 120 608 : slabdata 0 0 0 dm_io 0 0 32 1191 : tunables 120 608 : slabdata 0 0 0 scsi_cmd_cache 26131551271 : tunables 54 278 : slabdata 45 45216 cfq_ioc_pool 0 0 48 811 : tunables 120 608 : slabdata 0 0 0 cfq_pool 0 0176 221 : tunables 120 608 : slabdata 0 0 0 crq_pool 0 0104 381 : tunables 120 608 : slabdata 0 0 0 deadline_drq 0 0 96 411 : tunables 120 608 : slabdata 0 0 0 as_arq 580700112 351 : tunables 120 608 : slabdata 20 20432 xfs_acl0 0304 131 : tunables 54 278 : slabdata 0 0 0 xfs_chashlist380 4879 32 1191
Re: Memory leak in 2.6.11-rc1?
: I've been running 2.6.11-rc1 on my dual opteron Fedora Core 3 box for a week : now, and I think there is a memory leak somewhere. I am measuring the : size of active and inactive pages (from /proc/meminfo), and it seems : that the count of sum (active+inactive) pages is decreasing. Please : take look at the graphs at : : http://www.linux.cz/stats/mrtg-rrd/vm_active.html Well, with Linus' patch to fs/pipe.c the situation seems to improve a bit, but some leak is still there (look at the monthly graph at the above URL). The server has been running 2.6.11-rc2 + patch to fs/pipe.c for last 8 days. I am letting it run for a few more days in case you want some debugging info from a live system. I am attaching my /proc/meminfo and /proc/slabinfo. -Yenya # cat /proc/meminfo MemTotal: 4045168 kB MemFree: 59396 kB Buffers: 17812 kB Cached:2861648 kB SwapCached: 0 kB Active: 827700 kB Inactive: 2239752 kB HighTotal: 0 kB HighFree:0 kB LowTotal: 4045168 kB LowFree: 59396 kB SwapTotal:14651256 kB SwapFree: 14650584 kB Dirty:1616 kB Writeback: 0 kB Mapped: 206540 kB Slab: 861176 kB CommitLimit: 16673840 kB Committed_AS: 565684 kB PageTables: 20812 kB VmallocTotal: 34359738367 kB VmallocUsed: 7400 kB VmallocChunk: 34359730867 kB # cat /proc/slabinfo slabinfo - version: 2.1 # nameactive_objs num_objs objsize objperslab pagesperslab : tunables batchcount limit sharedfactor : slabdata active_slabs num_slabs sharedavail raid5/md5256260 141652 : tunables 24 128 : slabdata 52 52 0 rpc_buffers8 8 204821 : tunables 24 128 : slabdata 4 4 0 rpc_tasks 12 20384 101 : tunables 54 278 : slabdata 2 2 0 rpc_inode_cache8 1076851 : tunables 54 278 : slabdata 2 2 0 fib6_nodes27 61 64 611 : tunables 120 608 : slabdata 1 1 0 ip6_dst_cache 17 36320 121 : tunables 54 278 : slabdata 3 3 0 ndisc_cache2 30256 151 : tunables 120 608 : slabdata 2 2 0 rawv6_sock 4 4 102441 : tunables 54 278 : slabdata 1 1 0 udpv6_sock 1 496041 : tunables 54 278 : slabdata 1 1 0 tcpv6_sock 8 8 166442 : tunables 24 128 : slabdata 2 2 0 unix_sock56765076851 : tunables 54 278 : slabdata130130 0 tcp_tw_bucket445920192 201 : tunables 120 608 : slabdata 46 46 0 tcp_bind_bucket 389 2261 32 1191 : tunables 120 608 : slabdata 19 19 0 tcp_open_request 135310128 311 : tunables 120 608 : slabdata 10 10 0 inet_peer_cache 32 62128 311 : tunables 120 608 : slabdata 2 2 0 ip_fib_alias 20119 32 1191 : tunables 120 608 : slabdata 1 1 0 ip_fib_hash 18 61 64 611 : tunables 120 608 : slabdata 1 1 0 ip_dst_cache1738 2060384 101 : tunables 54 278 : slabdata206206 0 arp_cache 8 30256 151 : tunables 120 608 : slabdata 2 2 0 raw_sock 3 983292 : tunables 54 278 : slabdata 1 1 0 udp_sock 45 4583292 : tunables 54 278 : slabdata 5 5 0 tcp_sock 431600 147252 : tunables 24 128 : slabdata120120 0 flow_cache 0 0128 311 : tunables 120 608 : slabdata 0 0 0 dm_tio 0 0 24 1561 : tunables 120 608 : slabdata 0 0 0 dm_io 0 0 32 1191 : tunables 120 608 : slabdata 0 0 0 scsi_cmd_cache 26131551271 : tunables 54 278 : slabdata 45 45216 cfq_ioc_pool 0 0 48 811 : tunables 120 608 : slabdata 0 0 0 cfq_pool 0 0176 221 : tunables 120 608 : slabdata 0 0 0 crq_pool 0 0104 381 : tunables 120 608 : slabdata 0 0 0 deadline_drq 0 0 96 411 : tunables 120 608 : slabdata 0 0 0 as_arq 580700112 351 : tunables 120 608 : slabdata 20 20432 xfs_acl0 0304 131 :
Re: Memory leak in 2.6.11-rc1?
Linus Torvalds wrote: : Jan - can you give Jens a bit of an idea of what drivers and/or schedulers : you're using? I have a Tyan S2882 dual Opteron, network is on-board tg3, there are 8 P-ATA HDDs hooked on 3ware 7506-8 controller (no HW RAID there, but the drives are partitioned and partition grouped to form software RAID-0, 1, 5, and 10 volumes - the main fileserving traffic is on a RAID-5 volume, and /var is on RAID-10 volume. Filesystems are XFS for that RAID-5 volume, ext3 for the rest of the system. I have compiled-in the following I/O schedulers (according to my /var/log/dmesg :-) io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered I have not changed the scheduler by hand, so I suppose the anticipatory is the default. No X, just serial console. The server does FTP serving mostly (ProFTPd with sendfile() compiled in), sending mail via qmail (cca 100-200k mails a day), and bits of other work (rsync, Apache, ...). Fedora core 3 with all relevant updates. My fstab (physical devices only): /dev/md0/ ext3defaults1 1 /dev/md1/home ext3defaults1 2 /dev/md6/varext3defaults1 2 /dev/md4/fastraid xfs noatime 1 3 /dev/md5/export xfs noatime 1 4 /dev/sde4 swapswappri=10 0 0 /dev/sdf4 swapswappri=10 0 0 /dev/sdg4 swapswappri=10 0 0 /dev/sdh4 swapswappri=10 0 0 My mdstat: Personalities : [raid0] [raid1] [raid5] md6 : active raid0 md3[0] md2[1] 19550720 blocks 64k chunks md1 : active raid1 sdd1[1] sdc1[0] 14659200 blocks [2/2] [UU] md2 : active raid1 sdf1[1] sde1[0] 9775424 blocks [2/2] [UU] md3 : active raid1 sdh1[1] sdg1[0] 9775424 blocks [2/2] [UU] md4 : active raid0 sdh2[7] sdg2[6] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0] 39133184 blocks 256k chunks md5 : active raid5 sdh3[7] sdg3[6] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0] 1572512256 blocks level 5, 256k chunk, algorithm 2 [8/8] [] md0 : active raid1 sdb1[1] sda1[0] 14659200 blocks [2/2] [UU] unused devices: none Anything else you want to know? Thanks, -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Memory leak in 2.6.11-rc1?
[EMAIL PROTECTED] wrote: : My guess would be the clone change, if raid was not leaking before. I : cannot lookup any patches at the moment, as I'm still at the hospital : taking care of my new born baby and wife :) Congratulations! : But try and reverse the patches to fs/bio.c that mention corruption due to : bio_clone and bio-bi_io_vec and see if that cures it. If it does, I know : where to look. When did you notice this started to leak? I think I have been running 2.6.10-rc3 before. I've copied the fs/bio.c from 2.6.10-rc3 to my 2.6.11-rc2 sources and booted the resulting kernel. I hope it will not eat my filesystems :-) I will send my /proc/slabinfo in a few days. -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Memory leak in 2.6.11-rc1?
Jan Kasprzak wrote: : I think I have been running 2.6.10-rc3 before. I've copied : the fs/bio.c from 2.6.10-rc3 to my 2.6.11-rc2 sources and booted the : resulting kernel. I hope it will not eat my filesystems :-) I will send : my /proc/slabinfo in a few days. Hmm, after 3h35min of uptime I have biovec-1 92157 92250 16 2251 : tunables 120 608 : slabdata410410 60 bio92163 92163128 311 : tunables 120 608 : slabdata 2973 2973 60 so it is probably still leaking - about half an hour ago it was biovec-1 77685 77850 16 2251 : tunables 120 608 : slabdata346346 0 bio77841 77841128 311 : tunables 120 608 : slabdata 2511 2511180 -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.11-rc1-bk9 crash in mdadm
Lars Marowsky-Bree wrote: : On 2005-01-21T17:12:30, Jan Kasprzak <[EMAIL PROTECTED]> wrote: : : > Just FWIW, I've got the following crash when trying to boot a 2.6.11-rc1-bk9 : > kernel on my dual opteron Fedora Core 3 box. I will try -bk8 now. : : Attached is a likely candidate for a fix. : : (It's been discussed on linux-raid already.) Yes, it makes 2.6.11-rc1-bk9 boot correctly on my box. Thanks! -Y. -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Memory leak in 2.6.11-rc1?
Hi all, I've been running 2.6.11-rc1 on my dual opteron Fedora Core 3 box for a week now, and I think there is a memory leak somewhere. I am measuring the size of active and inactive pages (from /proc/meminfo), and it seems that the count of sum (active+inactive) pages is decreasing. Please take look at the graphs at http://www.linux.cz/stats/mrtg-rrd/vm_active.html (especially the "monthly" graph) - I've booted 2.6.11-rc1 last Friday, and since then the size of "inactive" pages is decreasing almost constantly, while "active" is not increasing. The active+inactive sum has been steady before, as you can see from both the monthly and yearly graphs. Now I am playing with 2.6.11-rc1-bk snapshots to see what happens. I have been running 2.6.10-rc3 before. More info is available, please ask me. The box runs 3ware 7506-8 controller with SW RAID-0, 1, and 5 volumes, Tigon3 network card. The main load is FTP server, and there is also a HTTP server and Qmail. Thanks, -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
2.6.11-rc1-bk9 crash in mdadm
Hi, Just FWIW, I've got the following crash when trying to boot a 2.6.11-rc1-bk9 kernel on my dual opteron Fedora Core 3 box. I will try -bk8 now. -Yenya Starting up RAID devices: md5 Unable to handle kernel NULL pointer dereference at 0008 RIP: {__make_request+61} PGD f9f0f067 PUD fa01b067 PMD 0 Oops: [1] SMP CPU 1 Modules linked in: floppy Pid: 1632, comm: mdadm Not tainted 2.6.11-rc1-bk9 RIP: 0010:[] {__make_request+61} RSP: 0018:8100f9f2b7f8 EFLAGS: 00010212 RAX: RBX: RCX: 810002d779c0 RDX: RSI: 8100f9f2b808 RDI: 81007ff8b2a8 RBP: 81007ff8b2a8 R08: 8100fae9b540 R09: R10: 012a5200 R11: R12: R13: 81007f4e4180 R14: 0008 R15: 8100f9f2bab8 FS: 2aab6b00() GS:804e9240() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 0008 CR3: 7f599000 CR4: 06e0 Process mdadm (pid: 1632, threadinfo 8100f9f2a000, task 81007f4036e0) Stack: 012a523f 80151ca4 81007f4e4180 8100fae9b550 81007ff8b2a8 8100f9f2b878 81007f4e4180 81007feb6300 8100f9f2bab8 802e2a92 Call Trace:{mempool_alloc+164} {generic_make_request+546} {autoremove_wake_function+0} {autoremove_wake_function+0} {make_request+529} {autoremove_wake_function+9} {__wake_up_common+64} {generic_make_request+546} {autoremove_wake_function+0} {md_open+112} utoremove_wake_function+0} {submit_bio+280} {sync_page_io+203} {bi_complete+0} {read_disk_sb+83} {super_90_load+85} {md_import_device+459} {page_cache_readahead+490} {radix_tree_delete+347} {md_ioctl+3608} {__pagevec_free+37} {find_get_pages+107} {blkdev_ioctl+1834} {flat_send_IPI_allbutself+20} {__smp_call_function+110} {invalidate_bh_lru+0} {bdev_clear_inode+28} {wake_up_bit+24} {do_ioctl+116} {sys_ioctl+881} {system_call+126} Code: 44 8b 7c 10 08 41 83 e4 01 e8 e5 9d e7 ff 48 8b bd f0 01 00 RIP {__make_request+61} RSP CR2: 0008 /etc/rc.d/rc.sysinit: line 556: 1632 Killed /sbin/mdadm -Ac partitions $i -m dev Pid: 1632, comm: mdadm Not tainted 2.6.11-rc1-bk9 RIP: 0010:[] {__make_request+61} RSP: 0018:8100f9f2b7f8 EFLAGS: 00010212 RAX: RBX: RCX: 810002d779c0 RDX: RSI: 8100f9f2b808 RDI: 81007ff8b2a8 RBP: 81007ff8b2a8 R08: 8100fae9b540 R09: R10: 012a5200 R11: R12: R13: 81007f4e4180 R14: 0008 R15: 8100f9f2bab8 FS: 2aab6b00() GS:804e9240() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 0008 CR3: 7f599000 CR4: 06e0 Process mdadm (pid: 1632, threadinfo 8100f9f2a000, task 81007f4036e0) Stack: 012a523f 80151ca4 81007f4e4180 8100fae9b550 81007ff8b2a8 8100f9f2b878 81007f4e4180 81007feb6300 8100f9f2bab8 802e2a92 Call Trace:{mempool_alloc+164} {generic_make_request+546} {autoremove_wake_function+0} {autoremove_wake_function+0} {make_request+529} {autoremove_wake_function+9} {__wake_up_common+64} {generic_make_request+546} {autoremove_wake_function+0} {md_open+112} {autoremove_wake_function+0} {submit_bio+280} {sync_page_io+203} {bi_complete+0} {read_disk_sb+83} {super_90_load+85} {md_import_device+459} {page_cache_readahead+490} {radix_tree_delete+347} {md_ioctl+3608} {__pagevec_free+37} {find_get_pages+107} {blkdev_ioctl+1834} {flat_send_IPI_allbutself+20} {__smp_call_function+110} {invalidate_bh_lru+0} {bdev_clear_inode+28} {wake_up_bit+24} {do_ioctl+116} {sys_ioctl+881} {system_call+126} Code: 44 8b 7c 10 08 41 83 e4 01 e8 e5 9d e7 ff 48 8b bd f0 01 00 RIP {__make_request+61} RSP CR2: 0008 /etc/rc.d/rc.sysinit: line 556: 1632 Killed /sbin/mdadm -Ac partitions $i -m dev -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
2.6.11-rc1-bk9 crash in mdadm
Hi, Just FWIW, I've got the following crash when trying to boot a 2.6.11-rc1-bk9 kernel on my dual opteron Fedora Core 3 box. I will try -bk8 now. -Yenya Starting up RAID devices: md5 Unable to handle kernel NULL pointer dereference at 0008 RIP: 802e219d{__make_request+61} PGD f9f0f067 PUD fa01b067 PMD 0 Oops: [1] SMP CPU 1 Modules linked in: floppy Pid: 1632, comm: mdadm Not tainted 2.6.11-rc1-bk9 RIP: 0010:[802e219d] 802e219d{__make_request+61} RSP: 0018:8100f9f2b7f8 EFLAGS: 00010212 RAX: RBX: RCX: 810002d779c0 RDX: RSI: 8100f9f2b808 RDI: 81007ff8b2a8 RBP: 81007ff8b2a8 R08: 8100fae9b540 R09: R10: 012a5200 R11: R12: R13: 81007f4e4180 R14: 0008 R15: 8100f9f2bab8 FS: 2aab6b00() GS:804e9240() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 0008 CR3: 7f599000 CR4: 06e0 Process mdadm (pid: 1632, threadinfo 8100f9f2a000, task 81007f4036e0) Stack: 012a523f 80151ca4 81007f4e4180 8100fae9b550 81007ff8b2a8 8100f9f2b878 81007f4e4180 81007feb6300 8100f9f2bab8 802e2a92 Call Trace:80151ca4{mempool_alloc+164} 802e2a92{generic_make_request+546} 80146050{autoremove_wake_function+0} 80146050{autoremove_wake_function+0} 803187e1{make_request+529} 80146059{autoremove_wake_function+9} 8012cd10{__wake_up_common+64} 802e2a92{generic_make_request+546} 80146050{autoremove_wake_function+0} 80320fc0{md_open+112} utoremove_wake_function+0} 802e2bc8{submit_bio+280} 8031ea2b{sync_page_io+203} 8031e930{bi_complete+0} 8031f863{read_disk_sb+83} 8031fbf5{super_90_load+85} 8032060b{md_import_device+459} 80155bfa{page_cache_readahead+490} 8028cccb{radix_tree_delete+347} 803225d8{md_ioctl+3608} 80152ae5{__pagevec_free+37} 8014edfb{find_get_pages+107} 802e429a{blkdev_ioctl+1834} 80119434{flat_send_IPI_allbutself+20} 8011734e{__smp_call_function+110} 80173c90{invalidate_bh_lru+0} 80178fdc{bdev_clear_inode+28} 80146208{wake_up_bit+24} 801845d4{do_ioctl+116} 80184971{sys_ioctl+881} 8010d29e{system_call+126} Code: 44 8b 7c 10 08 41 83 e4 01 e8 e5 9d e7 ff 48 8b bd f0 01 00 RIP 802e219d{__make_request+61} RSP 8100f9f2b7f8 CR2: 0008 /etc/rc.d/rc.sysinit: line 556: 1632 Killed /sbin/mdadm -Ac partitions $i -m dev Pid: 1632, comm: mdadm Not tainted 2.6.11-rc1-bk9 RIP: 0010:[802e219d] 802e219d{__make_request+61} RSP: 0018:8100f9f2b7f8 EFLAGS: 00010212 RAX: RBX: RCX: 810002d779c0 RDX: RSI: 8100f9f2b808 RDI: 81007ff8b2a8 RBP: 81007ff8b2a8 R08: 8100fae9b540 R09: R10: 012a5200 R11: R12: R13: 81007f4e4180 R14: 0008 R15: 8100f9f2bab8 FS: 2aab6b00() GS:804e9240() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 0008 CR3: 7f599000 CR4: 06e0 Process mdadm (pid: 1632, threadinfo 8100f9f2a000, task 81007f4036e0) Stack: 012a523f 80151ca4 81007f4e4180 8100fae9b550 81007ff8b2a8 8100f9f2b878 81007f4e4180 81007feb6300 8100f9f2bab8 802e2a92 Call Trace:80151ca4{mempool_alloc+164} 802e2a92{generic_make_request+546} 80146050{autoremove_wake_function+0} 80146050{autoremove_wake_function+0} 803187e1{make_request+529} 80146059{autoremove_wake_function+9} 8012cd10{__wake_up_common+64} 802e2a92{generic_make_request+546} 80146050{autoremove_wake_function+0} 80320fc0{md_open+112} 80146050{autoremove_wake_function+0} 802e2bc8{submit_bio+280} 8031ea2b{sync_page_io+203} 8031e930{bi_complete+0} 8031f863{read_disk_sb+83} 8031fbf5{super_90_load+85} 8032060b{md_import_device+459} 80155bfa{page_cache_readahead+490} 8028cccb{radix_tree_delete+347} 803225d8{md_ioctl+3608} 80152ae5{__pagevec_free+37} 8014edfb{find_get_pages+107} 802e429a{blkdev_ioctl+1834} 80119434{flat_send_IPI_allbutself+20} 8011734e{__smp_call_function+110} 80173c90{invalidate_bh_lru+0} 80178fdc{bdev_clear_inode+28}
Memory leak in 2.6.11-rc1?
Hi all, I've been running 2.6.11-rc1 on my dual opteron Fedora Core 3 box for a week now, and I think there is a memory leak somewhere. I am measuring the size of active and inactive pages (from /proc/meminfo), and it seems that the count of sum (active+inactive) pages is decreasing. Please take look at the graphs at http://www.linux.cz/stats/mrtg-rrd/vm_active.html (especially the monthly graph) - I've booted 2.6.11-rc1 last Friday, and since then the size of inactive pages is decreasing almost constantly, while active is not increasing. The active+inactive sum has been steady before, as you can see from both the monthly and yearly graphs. Now I am playing with 2.6.11-rc1-bk snapshots to see what happens. I have been running 2.6.10-rc3 before. More info is available, please ask me. The box runs 3ware 7506-8 controller with SW RAID-0, 1, and 5 volumes, Tigon3 network card. The main load is FTP server, and there is also a HTTP server and Qmail. Thanks, -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.11-rc1-bk9 crash in mdadm
Lars Marowsky-Bree wrote: : On 2005-01-21T17:12:30, Jan Kasprzak [EMAIL PROTECTED] wrote: : : Just FWIW, I've got the following crash when trying to boot a 2.6.11-rc1-bk9 : kernel on my dual opteron Fedora Core 3 box. I will try -bk8 now. : : Attached is a likely candidate for a fix. : : (It's been discussed on linux-raid already.) Yes, it makes 2.6.11-rc1-bk9 boot correctly on my box. Thanks! -Y. -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: inode with st_mode == 0
Christoph Hellwig wrote: : I have a better patch than the one I gave you (attached below). If you : send me a mail with steps to reproduce your remaining problems I'll put : this very high on my TODO list after christmas. Btw, any chance you could : try XFS CVS (which is at 2.6.9) + the patch below instead of plain 2.6.9, : there have been various other fixes in the last months. : Just FWIW, this patch (applied to 2.6.10) seems to fix the problem for me. I was not able to reproduce it by running my test script for ~24 hours. Thanks! -Yenya -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | > Whatever the Java applications and desktop dances may lead to, Unix will < > still be pushing the packets around for a quite a while. --Rob Pike < - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: inode with st_mode == 0
Christoph Hellwig wrote: : I have a better patch than the one I gave you (attached below). If you : send me a mail with steps to reproduce your remaining problems I'll put : this very high on my TODO list after christmas. Btw, any chance you could : try XFS CVS (which is at 2.6.9) + the patch below instead of plain 2.6.9, : there have been various other fixes in the last months. : Just FWIW, this patch (applied to 2.6.10) seems to fix the problem for me. I was not able to reproduce it by running my test script for ~24 hours. Thanks! -Yenya -- | Jan Yenya Kasprzak kas at {fi.muni.cz - work | yenya.net - private} | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ | Whatever the Java applications and desktop dances may lead to, Unix will still be pushing the packets around for a quite a while. --Rob Pike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: AIC7xxx kernel driver; ATTN Mr. Justin T. Gibbs
Alan Cox wrote: : Except for an obscure bug under very high memory load I'm not aware of any : outstanding bugs in the AIC7xxx driver, certainly not like you describe. There : is however always a first time for any bug 8) Well, AIC7xxx crashes on me with stock 2.4.5 kernel (Athlon TB 850, ASUS A7V). I run 2.4.3 + zerocopy patches, and when I tried to upgrade to 2.4.5, it crashes during boot while initializing the AIC7xxx driver. I have written down the Oops numbers by hand, but I had to reboot the server back to 2.4.3+zerocopy. Here are relevant parts of dmesg on 2.4.3 (just to identify the hardware): SCSI subsystem driver Revision: 1.00 request_module[scsi_hostadapter]: Root fs not mounted PCI: Found IRQ 12 for device 00:0a.0 scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.1.5 aic7880: Wide Channel A, SCSI Id=7, 16/255 SCBs Vendor: SEAGATE Model: ST118273W Rev: 5698 Type: Direct-Access ANSI SCSI revision: 02 Detected scsi disk sda at scsi0, channel 0, id 0, lun 0 (scsi0:A:0): 40.000MB/s transfers (20.000MHz, offset 8, 16bit) scsi0:0:0:0: Tagged Queuing enabled. Depth 8 SCSI device sda: 35566480 512-byte hdwr sectors (18210 MB) sda: sda1 sda2 Now the Oops tracing: EIP was ahc_match_scb + 0x19 Call trace: ahc_search_qinfifo + 0x187 ahc_abort_scbs + 0x6a __udelay + 0x27 ahc_reset_current_channel + 0x27c ahc_pci_config + 0x4e2 pci_read_config_byte + 0x1c ... -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// It is a very bad idea to feed negative numbers to memcpy. --Alan Cox - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: AIC7xxx kernel driver; ATTN Mr. Justin T. Gibbs
Alan Cox wrote: : Except for an obscure bug under very high memory load I'm not aware of any : outstanding bugs in the AIC7xxx driver, certainly not like you describe. There : is however always a first time for any bug 8) Well, AIC7xxx crashes on me with stock 2.4.5 kernel (Athlon TB 850, ASUS A7V). I run 2.4.3 + zerocopy patches, and when I tried to upgrade to 2.4.5, it crashes during boot while initializing the AIC7xxx driver. I have written down the Oops numbers by hand, but I had to reboot the server back to 2.4.3+zerocopy. Here are relevant parts of dmesg on 2.4.3 (just to identify the hardware): SCSI subsystem driver Revision: 1.00 request_module[scsi_hostadapter]: Root fs not mounted PCI: Found IRQ 12 for device 00:0a.0 scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.1.5 Adaptec 2940 Ultra SCSI adapter aic7880: Wide Channel A, SCSI Id=7, 16/255 SCBs Vendor: SEAGATE Model: ST118273W Rev: 5698 Type: Direct-Access ANSI SCSI revision: 02 Detected scsi disk sda at scsi0, channel 0, id 0, lun 0 (scsi0:A:0): 40.000MB/s transfers (20.000MHz, offset 8, 16bit) scsi0:0:0:0: Tagged Queuing enabled. Depth 8 SCSI device sda: 35566480 512-byte hdwr sectors (18210 MB) sda: sda1 sda2 Now the Oops tracing: EIP was ahc_match_scb + 0x19 Call trace: ahc_search_qinfifo + 0x187 ahc_abort_scbs + 0x6a __udelay + 0x27 ahc_reset_current_channel + 0x27c ahc_pci_config + 0x4e2 pci_read_config_byte + 0x1c ... -Yenya -- \ Jan Yenya Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// It is a very bad idea to feed negative numbers to memcpy. --Alan Cox - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
CacheFS
Hello, a friend of mine has developed the CacheFS for Linux. His work is a prototype read-only implementation for Linux 2.2.11 or so. I am thinking about adapting (or partly rewriting) his work for Linux 2.4. But before I'll start working, I'd like to ask you for comments on the proposed CacheFS architecture. The goal is to speed-up reading of potentially slow filesystems (NFS, maybe even CD-based ones) by the local on-disk cache in the same way IRIX or Solaris CacheFS works. I would expect this to be used on clusters of computers or university computer labs with NFS-mounted /usr or some other read-only filesystems. Another goal is to use the Linux filesystem as a backing store (as opposed to the block device or single large file used by CODA). The CacheFS architecture would consist in two components: - kernel module, implementing the filesystem of the type "cachefs" and a character device /dev/cachefs - user-space daemon, which would communicate with the kernel over /dev/cachefs and which would manage the backing store in a given directory. Every file on the front filesystem (NFS or so) volume will be cached in two local files by cachefsd: The first one would contain the (parts of) real file content, and the second one would contain file's metadata and the bitmap of valid blocks (or pages) of the first file. All files in cachefsd's backing store would be in a per-volume directory, and will be numbered by the inode number from the front filesystem. Now here are some questions: * Should the cachefsd be in user space (as it is in the prototype implementation) or should it be moved to the kernel space? The former allows probably better configuration (maybe a deeper directory structure in the backing store), but the later is faster as it avoids copying data between the user and kernel spaces. * Would you suggest to have only one instance of cachefsd, or one per volume, or a multithreading implementation with configurable number of threads? * Is the communication over the character device appropriate? Another alternative is probably a new syscall, /proc file, or maybe even an ioctl on the root directory of the filesystem (ugh!). * Can the kernel part of CODA can be used for this? Thanks, -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// It is a very bad idea to feed negative numbers to memcpy. --Alan Cox - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
CacheFS
Hello, a friend of mine has developed the CacheFS for Linux. His work is a prototype read-only implementation for Linux 2.2.11 or so. I am thinking about adapting (or partly rewriting) his work for Linux 2.4. But before I'll start working, I'd like to ask you for comments on the proposed CacheFS architecture. The goal is to speed-up reading of potentially slow filesystems (NFS, maybe even CD-based ones) by the local on-disk cache in the same way IRIX or Solaris CacheFS works. I would expect this to be used on clusters of computers or university computer labs with NFS-mounted /usr or some other read-only filesystems. Another goal is to use the Linux filesystem as a backing store (as opposed to the block device or single large file used by CODA). The CacheFS architecture would consist in two components: - kernel module, implementing the filesystem of the type cachefs and a character device /dev/cachefs - user-space daemon, which would communicate with the kernel over /dev/cachefs and which would manage the backing store in a given directory. Every file on the front filesystem (NFS or so) volume will be cached in two local files by cachefsd: The first one would contain the (parts of) real file content, and the second one would contain file's metadata and the bitmap of valid blocks (or pages) of the first file. All files in cachefsd's backing store would be in a per-volume directory, and will be numbered by the inode number from the front filesystem. Now here are some questions: * Should the cachefsd be in user space (as it is in the prototype implementation) or should it be moved to the kernel space? The former allows probably better configuration (maybe a deeper directory structure in the backing store), but the later is faster as it avoids copying data between the user and kernel spaces. * Would you suggest to have only one instance of cachefsd, or one per volume, or a multithreading implementation with configurable number of threads? * Is the communication over the character device appropriate? Another alternative is probably a new syscall, /proc file, or maybe even an ioctl on the root directory of the filesystem (ugh!). * Can the kernel part of CODA can be used for this? Thanks, -Yenya -- \ Jan Yenya Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// It is a very bad idea to feed negative numbers to memcpy. --Alan Cox - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Wolfgang Rohdewald wrote: : On Tuesday 17 April 2001 22:36, Jan Kasprzak wrote: : > + if (len == -1 || len > 0 && len < count) { : : are you sure there are no missing () ? : : if ((len == -1) || (len > 0) && (len < count)) { : : assumig that && has precedence over || (I believe so) Yes, but the precedence of ==, <, and > is even higher. However, I've found a problem with the previous patch: The first chunk should read: -if((len = sendfile(session.d->outf->fd, retr_fd, offset, count)) == -1) { +len = sendfile(session.d->outf->fd, retr_fd, offset, count); +if (len == -1 || len > 0 && len < count) { + if (len != -1) + errno = EINTR; i.e. we should not overwrite errno, when it is valid. -Yenya PS.: You can find the C operators precedence for example at http://www.howstuffworks.com/c14.htm (found by Google). -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Jesse S Sipprell wrote: : After cursory examination of proftpd, it appears that there is a misuse of the : sendfile() call under Linux, which may be responsible for the corruption. The : code was originally based on BSD semantics. Under Linux, the offset argument : is not being used correctly to determine how much data has been sent in the : case of EINTR. : : A patch will be coming out soon, as it is a fairly trivial fix. : FWIW, I've fixed ProFTPd on my server with the following patch. Sorry for making noise @ linux-kernel list, it was totally unrelated to the Linux kernel: --- proftpd-1.2.2rc1/src/data.c.sendfileThu Feb 15 15:24:53 2001 +++ proftpd-1.2.2rc1/src/data.c Tue Apr 17 21:35:24 2001 @@ -760,7 +760,9 @@ * * ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count) */ -if((len = sendfile(session.d->outf->fd, retr_fd, offset, count)) == -1) { +len = sendfile(session.d->outf->fd, retr_fd, offset, count); +if (len == -1 || len > 0 && len < count) { + errno = EINTR; #elif defined(HAVE_BSD_SENDFILE) /* BSD semantics for sendfile are flexible...it'd be nice if we could * standardize on something like it. The semantics are: @@ -797,7 +799,9 @@ if((count -= len) <= 0) break; +#if !defined(HAVE_LINUX_SENDFILE) *offset += len; +#endif if(TimeoutStalled) reset_timer(TIMER_STALLED, ANY_MODULE); -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Jan Kasprzak wrote: : $ cmp -cl seawolf-sendfile.iso seawolf-i386-SRPMS.iso [...] : : Which simply means, that at 160628609 it started to send : the CD image from the beginning. Well, I did strace of proftpd, and it _may_ be a mis-interpretation of the sendfile(2) semantics on the proftpd side. The relevant part of strace follows: gettimeofday({987527927, 46167}, NULL) = 0 fcntl64(12, F_GETFL)= 0x802 (flags O_RDWR|O_NONBLOCK) fcntl64(12, F_SETFL, O_RDWR)= 0 sendfile(12, 9, [0], 678244352) = 138133872 --- SIGALRM (Alarm clock) --- rt_sigaction(SIGALRM, {0x804f520, [], SA_INTERRUPT|0x400}, NULL, 8) = 0 rt_sigaction(SIGALRM, NULL, {0x804f520, [], SA_INTERRUPT|0x400}, 8) = 0 rt_sigaction(SIGALRM, {0x804f520, [], SA_INTERRUPT|0x400}, NULL, 8) = 0 alarm(300) = 0 sigreturn() = ? (mask now []) fcntl64(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0 alarm(0)= 300 alarm(300) = 0 alarm(0)= 300 alarm(300) = 0 getpid()= 24482 geteuid32() = 14 getegid32() = 50 flock(6, LOCK_EX) = 0 lseek(6, 644, SEEK_SET) = 644 read(6, "\242_\0\0\16\0\0\0002\0\0\0\0\0\0\0I\10\0\0\0\0\0\0ftp"..., 644) = 644 lseek(6, 644, SEEK_SET) = 644 write(6, "\242_\0\0\16\0\0\0002\0\0\0\0\0\0\0I\10\0\0\0\0\0\0ftp"..., 644) = 644 flock(6, LOCK_UN) = 0 fcntl64(12, F_GETFL)= 0x802 (flags O_RDWR|O_NONBLOCK) fcntl64(12, F_SETFL, O_RDWR)= 0 sendfile(12, 9, [0], 540110480) = 103469424 Now the fd 6 is the control connection, fd 9 is the file on disk, and fd 12 is the data connection. The ProFTPd seems to set alarm to 300 seconds (to detect stalled clients), but when interrupted, something strange happens: either sendfile does not update the offset in its third parameter, or it fails to update the offset in the filedescriptor, or something like that. Maybe ProFTPd should pass the non-zero value (actual offset?) to sendfile() second time? What is the expected semantics of sendfile() wrt. restarting transfers and being interrupted by SIGALRM? -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Andi Kleen wrote: : I guess to debug this problem it would be useful to get some idea about the : nature of the corruption. Could you enable sendfile() again, and when a : user complains ask to download it again and provide a : cmp -cl fileA fileB | head -500 listing of their differences? Well, here it is: $ cmp -cl seawolf-sendfile.iso seawolf-i386-SRPMS.iso 160628609 0 ^@ 276 M-> 160628610 0 ^@32 ^Z 160628611 0 ^@14 ^L 160628612 0 ^@55 - 160628613 0 ^@ 116 N 160628614 0 ^@ 300 M-@ 160628615 0 ^@ 150 h 160628616 0 ^@ 210 M-^H 160628617 0 ^@ 271 M-9 160628618 0 ^@ 307 M-G 160628619 0 ^@ 377 M-^? [ all bytes in sendfile()d image changed to zero until: ] 160661374 0 ^@ 376 M-~ 160661375 0 ^@ 231 M-^Y 160661376 0 ^@ 205 M-^E 160661377 1 ^A 364 M-t 160661378 103 C277 M-? 160661379 104 D 13 ^K 160661380 60 0 50 ( 160661381 60 0360 M-p 160661382 61 1 77 ? 160661383 1 ^A 304 M-D 160661384 0 ^@ 133 [ 160661385 114 L131 Y 160661386 111 I377 M-^? 160661387 116 N123 S 160661388 125 U234 M-^\ 160661389 130 X250 M-( Which simply means, that at 160628609 it started to send the CD image from the beginning. Yes, the original image contains 0x8000 zeros, and then the text "\001CD001\001\000LINUX". So it has probably nothing to do with 3c59x driver, but with sendfile() or ProFTPd's use of sendfile(). If anybody wants to test it, I've left running ProFTPd with sendfile() enabled at ftp.linux.cz, port 2121. Thanks, -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'.Alan (yes I programmed in B)\\\ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Alan Cox wrote: : > : but once a fixed BIOS is out for your board that would be a good first step. : > : If it still does it then, its worth digging for kernel naughties : > : : > I don't think I have 686b southbridge. I have 686 (without "b"): : : Ok. What revision of 3c90x card do you have ? : PCI: Found IRQ 11 for device 00:0c.0 3c59x.c:LK1.1.13 27 Jan 2001 Donald Becker and others. http://www.scyld.com/network/vortex.html See Documentation/networking/vortex.txt eth0: 3Com PCI 3c905C Tornado at 0xa000, 00:50:da:06:95:21, IRQ 11 product code 5957 rev 00.13 date 07-17-99 8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface. MII transceiver found at address 24, status 782d. Enabling bus-master transmits and whole-frame receives. eth0: scatter/gather enabled. h/w checksums enabled Some more progress: I now downgraded to proftpd without sendfile(). The CPU usage is now nearly 100% (with ~170 FTP users; with sendfile() it was under 50% with >320 FTP users). But nevertheless, the downloaded images now seem to be OK. Should I try the stock 2.4.3 without zero-copy patches? -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'.Alan (yes I programmed in B)\\\ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Andi Kleen wrote: : On Tue, Apr 17, 2001 at 03:10:07PM +0200, Jan Kasprzak wrote: : > 00:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74) : : IIRC the problem came up earlier. Some versions of 3com NICs seem to make : problems with the hardware checksum. There were some fixes in the driver : later; could you try it with 2.4.4pre3 (which includes zerocopy) ? : I was not able to boot 2.4.4pre3 at all: It panicked when initializing aic7xxx. So I've changed the config to old_aic7xxx, but it locked up on starting up RAID arrays. BTW, patch-2.4.4pre3 does not contain any significant change to 3c59x.c (the only change is adding some #include file). Now I am back to 2.4.3 and I'll try to run proftpd without sendfile(). -Y. -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'.Alan (yes I programmed in B)\\\ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Alan Cox wrote: : > The long story: My server is Athlon 850 on ASUS A7V, 256M RAM. : > Seven IDE discs, one SCSI disc. The controllers and NIC are as follows : > (output of lspci): : : See the VIA chipset report on www.theregister.co.uk about corruption problems : with VIA chipsets. The cases seen on Linux included short and also sometimes : stale/corrupted DMA transfers. : : Nothing in your report says it is or isnt going to be a VIA chipset problem : but once a fixed BIOS is out for your board that would be a good first step. : If it still does it then, its worth digging for kernel naughties : I don't think I have 686b southbridge. I have 686 (without "b"): 00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02) 00:01.0 PCI bridge: VIA Technologies, Inc.: Unknown device 8305 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22) 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) 00:04.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) 00:04.3 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) 00:04.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30 [...] -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'.Alan (yes I programmed in B)\\\ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Possible problem with zero-copy TCP and sendfile()
Hello, I have discovered a possible problem on my host. The short story is: When downloading ISO images from this host (which runs 2.4.3 + zerocopy and ProFTPd with sendfile()), the image is sometimes corrupted (MD5 checksum of the downloaded file does not match). The long story: My server is Athlon 850 on ASUS A7V, 256M RAM. Seven IDE discs, one SCSI disc. The controllers and NIC are as follows (output of lspci): 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) 00:0a.0 SCSI storage controller: Adaptec AIC-7881U 00:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74) 00:11.0 Unknown mass storage controller: Promise Technology, Inc.: Unknown device 0d30 (rev 02) The server runs Linux 2.4.3 with zero-copy patches and ProFTPd 1.2.2rc1 compiled with --enable-sendfile. The FTP area is on RAID-1 volume, which is created over two LVM partitions (each LV spans three physical disks). I hope RAID-1 can speed things up for multiple simultaneous users. Yesterday the Red Hat Linux 7.1 has been released, and from that time the server has about 220 anonymous FTP users and was pushing data at almost full 100 Mbps ethernet speed (currently the 2hour average is 89.7 Mbps according to MRTG). Today I've got about three complains about corrupted ISO images. When I run md5sum on the server itself, the MD5 checksums, of course, perfectly match. I've tried to download the files from another machine on the same net, and MD5 sums were correct. However, I have one report of corrupted download even from the same physical network. In the last 24 hours the server pushed out about 660 gigabytes of Red Hat 7.1. Is this amount (i.e. three reports out of 660 gigabytes) a serious problem? Also note that I have no corrupted download report for rsync. But I think rsyncd does not use sendfile(), and of course vast majority of people use FTP, not rsync, for downloading. -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Possible problem with zero-copy TCP and sendfile()
Hello, I have discovered a possible problem on my host. The short story is: When downloading ISO images from this host (which runs 2.4.3 + zerocopy and ProFTPd with sendfile()), the image is sometimes corrupted (MD5 checksum of the downloaded file does not match). The long story: My server is Athlon 850 on ASUS A7V, 256M RAM. Seven IDE discs, one SCSI disc. The controllers and NIC are as follows (output of lspci): 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) 00:0a.0 SCSI storage controller: Adaptec AIC-7881U 00:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74) 00:11.0 Unknown mass storage controller: Promise Technology, Inc.: Unknown device 0d30 (rev 02) The server runs Linux 2.4.3 with zero-copy patches and ProFTPd 1.2.2rc1 compiled with --enable-sendfile. The FTP area is on RAID-1 volume, which is created over two LVM partitions (each LV spans three physical disks). I hope RAID-1 can speed things up for multiple simultaneous users. Yesterday the Red Hat Linux 7.1 has been released, and from that time the server has about 220 anonymous FTP users and was pushing data at almost full 100 Mbps ethernet speed (currently the 2hour average is 89.7 Mbps according to MRTG). Today I've got about three complains about corrupted ISO images. When I run md5sum on the server itself, the MD5 checksums, of course, perfectly match. I've tried to download the files from another machine on the same net, and MD5 sums were correct. However, I have one report of corrupted download even from the same physical network. In the last 24 hours the server pushed out about 660 gigabytes of Red Hat 7.1. Is this amount (i.e. three reports out of 660 gigabytes) a serious problem? Also note that I have no corrupted download report for rsync. But I think rsyncd does not use sendfile(), and of course vast majority of people use FTP, not rsync, for downloading. -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Alan Cox wrote: : The long story: My server is Athlon 850 on ASUS A7V, 256M RAM. : Seven IDE discs, one SCSI disc. The controllers and NIC are as follows : (output of lspci): : : See the VIA chipset report on www.theregister.co.uk about corruption problems : with VIA chipsets. The cases seen on Linux included short and also sometimes : stale/corrupted DMA transfers. : : Nothing in your report says it is or isnt going to be a VIA chipset problem : but once a fixed BIOS is out for your board that would be a good first step. : If it still does it then, its worth digging for kernel naughties : I don't think I have 686b southbridge. I have 686 (without "b"): 00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02) 00:01.0 PCI bridge: VIA Technologies, Inc.: Unknown device 8305 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22) 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) 00:04.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) 00:04.3 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) 00:04.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30 [...] -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'.Alan (yes I programmed in B)\\\ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Andi Kleen wrote: : On Tue, Apr 17, 2001 at 03:10:07PM +0200, Jan Kasprzak wrote: : 00:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74) : : IIRC the problem came up earlier. Some versions of 3com NICs seem to make : problems with the hardware checksum. There were some fixes in the driver : later; could you try it with 2.4.4pre3 (which includes zerocopy) ? : I was not able to boot 2.4.4pre3 at all: It panicked when initializing aic7xxx. So I've changed the config to old_aic7xxx, but it locked up on starting up RAID arrays. BTW, patch-2.4.4pre3 does not contain any significant change to 3c59x.c (the only change is adding some #include file). Now I am back to 2.4.3 and I'll try to run proftpd without sendfile(). -Y. -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'.Alan (yes I programmed in B)\\\ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Alan Cox wrote: : : but once a fixed BIOS is out for your board that would be a good first step. : : If it still does it then, its worth digging for kernel naughties : : : I don't think I have 686b southbridge. I have 686 (without "b"): : : Ok. What revision of 3c90x card do you have ? : PCI: Found IRQ 11 for device 00:0c.0 3c59x.c:LK1.1.13 27 Jan 2001 Donald Becker and others. http://www.scyld.com/network/vortex.html See Documentation/networking/vortex.txt eth0: 3Com PCI 3c905C Tornado at 0xa000, 00:50:da:06:95:21, IRQ 11 product code 5957 rev 00.13 date 07-17-99 8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface. MII transceiver found at address 24, status 782d. Enabling bus-master transmits and whole-frame receives. eth0: scatter/gather enabled. h/w checksums enabled Some more progress: I now downgraded to proftpd without sendfile(). The CPU usage is now nearly 100% (with ~170 FTP users; with sendfile() it was under 50% with 320 FTP users). But nevertheless, the downloaded images now seem to be OK. Should I try the stock 2.4.3 without zero-copy patches? -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'.Alan (yes I programmed in B)\\\ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Jesse S Sipprell wrote: : After cursory examination of proftpd, it appears that there is a misuse of the : sendfile() call under Linux, which may be responsible for the corruption. The : code was originally based on BSD semantics. Under Linux, the offset argument : is not being used correctly to determine how much data has been sent in the : case of EINTR. : : A patch will be coming out soon, as it is a fairly trivial fix. : FWIW, I've fixed ProFTPd on my server with the following patch. Sorry for making noise @ linux-kernel list, it was totally unrelated to the Linux kernel: --- proftpd-1.2.2rc1/src/data.c.sendfileThu Feb 15 15:24:53 2001 +++ proftpd-1.2.2rc1/src/data.c Tue Apr 17 21:35:24 2001 @@ -760,7 +760,9 @@ * * ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count) */ -if((len = sendfile(session.d-outf-fd, retr_fd, offset, count)) == -1) { +len = sendfile(session.d-outf-fd, retr_fd, offset, count); +if (len == -1 || len 0 len count) { + errno = EINTR; #elif defined(HAVE_BSD_SENDFILE) /* BSD semantics for sendfile are flexible...it'd be nice if we could * standardize on something like it. The semantics are: @@ -797,7 +799,9 @@ if((count -= len) = 0) break; +#if !defined(HAVE_LINUX_SENDFILE) *offset += len; +#endif if(TimeoutStalled) reset_timer(TIMER_STALLED, ANY_MODULE); -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Jan Kasprzak wrote: : $ cmp -cl seawolf-sendfile.iso seawolf-i386-SRPMS.iso [...] : : Which simply means, that at 160628609 it started to send : the CD image from the beginning. Well, I did strace of proftpd, and it _may_ be a mis-interpretation of the sendfile(2) semantics on the proftpd side. The relevant part of strace follows: gettimeofday({987527927, 46167}, NULL) = 0 fcntl64(12, F_GETFL)= 0x802 (flags O_RDWR|O_NONBLOCK) fcntl64(12, F_SETFL, O_RDWR)= 0 sendfile(12, 9, [0], 678244352) = 138133872 --- SIGALRM (Alarm clock) --- rt_sigaction(SIGALRM, {0x804f520, [], SA_INTERRUPT|0x400}, NULL, 8) = 0 rt_sigaction(SIGALRM, NULL, {0x804f520, [], SA_INTERRUPT|0x400}, 8) = 0 rt_sigaction(SIGALRM, {0x804f520, [], SA_INTERRUPT|0x400}, NULL, 8) = 0 alarm(300) = 0 sigreturn() = ? (mask now []) fcntl64(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0 alarm(0)= 300 alarm(300) = 0 alarm(0)= 300 alarm(300) = 0 getpid()= 24482 geteuid32() = 14 getegid32() = 50 flock(6, LOCK_EX) = 0 lseek(6, 644, SEEK_SET) = 644 read(6, "\242_\0\0\16\0\0\0002\0\0\0\0\0\0\0I\10\0\0\0\0\0\0ftp"..., 644) = 644 lseek(6, 644, SEEK_SET) = 644 write(6, "\242_\0\0\16\0\0\0002\0\0\0\0\0\0\0I\10\0\0\0\0\0\0ftp"..., 644) = 644 flock(6, LOCK_UN) = 0 fcntl64(12, F_GETFL)= 0x802 (flags O_RDWR|O_NONBLOCK) fcntl64(12, F_SETFL, O_RDWR)= 0 sendfile(12, 9, [0], 540110480) = 103469424 Now the fd 6 is the control connection, fd 9 is the file on disk, and fd 12 is the data connection. The ProFTPd seems to set alarm to 300 seconds (to detect stalled clients), but when interrupted, something strange happens: either sendfile does not update the offset in its third parameter, or it fails to update the offset in the filedescriptor, or something like that. Maybe ProFTPd should pass the non-zero value (actual offset?) to sendfile() second time? What is the expected semantics of sendfile() wrt. restarting transfers and being interrupted by SIGALRM? -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Andi Kleen wrote: : I guess to debug this problem it would be useful to get some idea about the : nature of the corruption. Could you enable sendfile() again, and when a : user complains ask to download it again and provide a : cmp -cl fileA fileB | head -500 listing of their differences? Well, here it is: $ cmp -cl seawolf-sendfile.iso seawolf-i386-SRPMS.iso 160628609 0 ^@ 276 M- 160628610 0 ^@32 ^Z 160628611 0 ^@14 ^L 160628612 0 ^@55 - 160628613 0 ^@ 116 N 160628614 0 ^@ 300 M-@ 160628615 0 ^@ 150 h 160628616 0 ^@ 210 M-^H 160628617 0 ^@ 271 M-9 160628618 0 ^@ 307 M-G 160628619 0 ^@ 377 M-^? [ all bytes in sendfile()d image changed to zero until: ] 160661374 0 ^@ 376 M-~ 160661375 0 ^@ 231 M-^Y 160661376 0 ^@ 205 M-^E 160661377 1 ^A 364 M-t 160661378 103 C277 M-? 160661379 104 D 13 ^K 160661380 60 0 50 ( 160661381 60 0360 M-p 160661382 61 1 77 ? 160661383 1 ^A 304 M-D 160661384 0 ^@ 133 [ 160661385 114 L131 Y 160661386 111 I377 M-^? 160661387 116 N123 S 160661388 125 U234 M-^\ 160661389 130 X250 M-( Which simply means, that at 160628609 it started to send the CD image from the beginning. Yes, the original image contains 0x8000 zeros, and then the text "\001CD001\001\000LINUX". So it has probably nothing to do with 3c59x driver, but with sendfile() or ProFTPd's use of sendfile(). If anybody wants to test it, I've left running ProFTPd with sendfile() enabled at ftp.linux.cz, port 2121. Thanks, -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'.Alan (yes I programmed in B)\\\ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible problem with zero-copy TCP and sendfile()
Wolfgang Rohdewald wrote: : On Tuesday 17 April 2001 22:36, Jan Kasprzak wrote: : + if (len == -1 || len 0 len count) { : : are you sure there are no missing () ? : : if ((len == -1) || (len 0) (len count)) { : : assumig that has precedence over || (I believe so) Yes, but the precedence of ==, , and is even higher. However, I've found a problem with the previous patch: The first chunk should read: -if((len = sendfile(session.d-outf-fd, retr_fd, offset, count)) == -1) { +len = sendfile(session.d-outf-fd, retr_fd, offset, count); +if (len == -1 || len 0 len count) { + if (len != -1) + errno = EINTR; i.e. we should not overwrite errno, when it is valid. -Yenya PS.: You can find the C operators precedence for example at http://www.howstuffworks.com/c14.htm (found by Google). -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: drivers/block/loop.c:max_loop
Jens Axboe wrote: : On Mon, Apr 16 2001, Jan Kasprzak wrote: : > Hello, : > : > I run a relatively large FTP server, and I've just reached : > the max_loop limit of loop devices here (I use loopback mount of ISO 9660 : > images of Linux distros here). Is there any reason for keeping : > the max_loop variable in loop.c set to 8? : : Memory requirements -- nothing prevents you from loading it with a : bigger max count though... : I would suggest to make the limit configurable by /proc or ioctl() in run-time. Or even better, to make all allocations on first /dev/loopN open. Should I try to implement something like that? -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
drivers/block/loop.c:max_loop
Hello, I run a relatively large FTP server, and I've just reached the max_loop limit of loop devices here (I use loopback mount of ISO 9660 images of Linux distros here). Is there any reason for keeping the max_loop variable in loop.c set to 8? Thanks, -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
drivers/block/loop.c:max_loop
Hello, I run a relatively large FTP server, and I've just reached the max_loop limit of loop devices here (I use loopback mount of ISO 9660 images of Linux distros here). Is there any reason for keeping the max_loop variable in loop.c set to 8? Thanks, -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: drivers/block/loop.c:max_loop
Jens Axboe wrote: : On Mon, Apr 16 2001, Jan Kasprzak wrote: : Hello, : : I run a relatively large FTP server, and I've just reached : the max_loop limit of loop devices here (I use loopback mount of ISO 9660 : images of Linux distros here). Is there any reason for keeping : the max_loop variable in loop.c set to 8? : : Memory requirements -- nothing prevents you from loading it with a : bigger max count though... : I would suggest to make the limit configurable by /proc or ioctl() in run-time. Or even better, to make all allocations on first /dev/loopN open. Should I try to implement something like that? -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [reiserfs-list] ReiserFS Oops (2.4.1, deterministic, symlink related)
Jan Kasprzak wrote: : : : : 2.96-69 should be ok (thats the one I've been using without trouble). The : : original one with RH 7.0 off the CD does miscompile a few kernel things. : : It is the original one. I'll try with the -69: : With 2.96-69 the reiserfs seems to work well. Sorry for the confusion, I forgot to upgrade the gcc on my machine. -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// > Is there anything else I can contribute? -- The latitude and longtitude of the bios writers current position, and a ballistic missile. (Alan Cox) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [reiserfs-list] ReiserFS Oops (2.4.1, deterministic, symlink related)
Alan Cox wrote: : > Hans Reiser wrote: : >: This is why our next patch will detect the use of gcc 2.96, and complain, in the : >: reiserfs Makefile. : >: : > OK, thanks. It works with older compiler (altough I use gcc 2.96 : > for a long time for compiling various 2.[34] kernels without problem). : : Ok which 2.96 compiler do you have. I need to get this one chased down since : its probably also going to be in the current gcc CVS branches heading for 3.0 : : 2.96-69 should be ok (thats the one I've been using without trouble). The : original one with RH 7.0 off the CD does miscompile a few kernel things. It is the original one. I'll try with the -69: $ rpm -q gcc gcc -gcc-2.96-54 $ gcc -v Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs gcc version 2.96 2731 (Red Hat Linux 7.0) -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// > Is there anything else I can contribute? -- The latitude and longtitude of the bios writers current position, and a ballistic missile. (Alan Cox) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [reiserfs-list] ReiserFS Oops (2.4.1, deterministic, symlink related)
Hans Reiser wrote: : This is why our next patch will detect the use of gcc 2.96, and complain, in the : reiserfs Makefile. : OK, thanks. It works with older compiler (altough I use gcc 2.96 for a long time for compiling various 2.[34] kernels without problem). -Yenya -- \ Jan "Yenya" Kasprzakhttp://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// > Is there anything else I can contribute? -- The latitude and longtitude of the bios writers current position, and a ballistic missile. (Alan Cox) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
ReiserFS Oops (2.4.1, deterministic, symlink related)
Hello, with ReiserFS support in 2.4.1 I have decided to give it a try. I created a filesystem on a spare partition, mounted it as /mnt, and tried to use it. The kernel crashed - I am able to reproduce it with the following steps: - boot linux with init=/bin/bash - [optional] /sbin/mkreiserfs /dev/hdd1 (it can be reproduced even on freshly created FS) - mount -t reiserfs /dev/hdd1 /mnt - cp -arv /usr /mnt I am attaching the details, feel free to ask for more information, if you want it. Please Cc: me in any reply. Oops is a NULL pointer dereference at address 0010, EIP is c017c7c3 (in check_leaf), EFLAGS is 00010292, Process is "cp", Code: 8b 52 10 ff d2 59 5b 8b 54 24 14 8b 42 34 89 c7 0f b7 47 02, Call trace is the following: c015f459 (in do_balance) c0179466 (in fix_nodes) c0179476 (also in fix_nodes) c018612c (in reiserfs_insert_item) c0173cb4 (near the end of reiserfs_new_symlink) c0174170 (in reiserfs_new_inode) c0170cbd (in reiserfs_symlink) c0142a45 (in d_alloc) c013c825 (in vfs_symlink) c013c8de (in sys_symlink) c0109023 (in system_call) All numbers are written by hand from the screen, so there may be a minor mistakes. Looking at the cp output, it seems it crashed while copying the symlink "/usr/bin/sgml2xml -> osx" to /mnt/bin. My computer is almost generic Red Hat 7.0 with all updates. Hardware is K6-2 @523 MHz, 128M RAM, VIA VT82C598 north bridge. I tried to create ext2 filesystem on /dev/hdd1, and then cp -arv /usr /mnt worked fine. The kernel config (grep '=[ym]' /usr/src/linux/.config) is the following (no modules were loadaed, though): CONFIG_X86=y CONFIG_ISA=y CONFIG_UID16=y CONFIG_EXPERIMENTAL=y CONFIG_MODULES=y CONFIG_KMOD=y CONFIG_MK6=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_CMPXCHG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_ALIGNMENT_16=y CONFIG_X86_TSC=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_NOHIGHMEM=y CONFIG_MTRR=y CONFIG_NET=y CONFIG_PCI=y CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_HOTPLUG=y CONFIG_PCMCIA=y CONFIG_CARDBUS=y CONFIG_SYSVIPC=y CONFIG_SYSCTL=y CONFIG_KCORE_ELF=y CONFIG_BINFMT_AOUT=m CONFIG_BINFMT_ELF=y CONFIG_BINFMT_MISC=y CONFIG_PARPORT=m CONFIG_PARPORT_PC=m CONFIG_PARPORT_PC_FIFO=y CONFIG_PARPORT_PC_SUPERIO=y CONFIG_PARPORT_1284=y CONFIG_BLK_DEV_FD=m CONFIG_BLK_DEV_LOOP=m CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_NETLINK=y CONFIG_RTNETLINK=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_INET_ECN=y CONFIG_IPV6=m CONFIG_IPV6_EUI64=y CONFIG_IDE=y CONFIG_BLK_DEV_IDE=y CONFIG_BLK_DEV_IDEDISK=y CONFIG_IDEDISK_MULTI_MODE=y CONFIG_BLK_DEV_IDECS=m CONFIG_BLK_DEV_IDECD=m CONFIG_BLK_DEV_IDEPCI=y CONFIG_IDEPCI_SHARE_IRQ=y CONFIG_BLK_DEV_IDEDMA_PCI=y CONFIG_IDEDMA_PCI_AUTO=y CONFIG_BLK_DEV_IDEDMA=y CONFIG_IDEDMA_PCI_WIP=y CONFIG_IDEDMA_NEW_DRIVE_LISTINGS=y CONFIG_BLK_DEV_VIA82CXXX=y CONFIG_IDEDMA_AUTO=y CONFIG_BLK_DEV_IDE_MODES=y CONFIG_NETDEVICES=y CONFIG_NET_ETHERNET=y CONFIG_NET_VENDOR_3COM=y CONFIG_VORTEX=y CONFIG_HAMACHI=m CONFIG_PPP=m CONFIG_PPP_ASYNC=m CONFIG_PPP_DEFLATE=m CONFIG_PPP_BSDCOMP=m CONFIG_WAN=y CONFIG_COSA=m CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_SERIAL=y CONFIG_UNIX98_PTYS=y CONFIG_PRINTER=m CONFIG_MOUSE=y CONFIG_PSMOUSE=y CONFIG_NVRAM=m CONFIG_RTC=m CONFIG_AGP=y CONFIG_AGP_VIA=y CONFIG_DRM=y CONFIG_DRM_MGA=y CONFIG_PCMCIA_SERIAL=y CONFIG_AUTOFS4_FS=y CONFIG_REISERFS_FS=y CONFIG_REISERFS_CHECK=y CONFIG_ISO9660_FS=m CONFIG_PROC_FS=y CONFIG_DEVPTS_FS=y CONFIG_EXT2_FS=y CONFIG_CODA_FS=m CONFIG_NFS_FS=y CONFIG_NFS_V3=y CONFIG_NFSD=m CONFIG_NFSD_V3=y CONFIG_SUNRPC=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_MSDOS_PARTITION=y CONFIG_VGA_CONSOLE=y CONFIG_VIDEO_SELECT=y CONFIG_SOUND=y CONFIG_SOUND_ES1371=y CONFIG_USB=m CONFIG_USB_DEVICEFS=y CONFIG_USB_UHCI=m CONFIG_USB_AUDIO=m CONFIG_USB_SCANNER=m The dmesg output: Linux version 2.4.1 ([EMAIL PROTECTED]) (gcc version 2.96 2731 (Red Hat Linux 7.0)) #2 Fri Feb 2 11:46:21 CET 2001 BIOS-provided physical RAM map: BIOS-e820: 0009fc00 @ (usable) BIOS-e820: 0400 @ 0009fc00 (usable) BIOS-e820: 0001 @ 000f (reserved) BIOS-e820: 0001 @ (reserved) BIOS-e820: 07ef @ 0010 (usable) BIOS-e820: d000 @ 07ff3000 (ACPI data) BIOS-e820: 3000 @ 07ff (ACPI NVS) On node 0 totalpages: 32752 zone(0): 4096 pages. zone(1): 28656 pages. zone(2): 0 pages. Kernel command line: auto BOOT_IMAGE=linux ro root=301 BOOT_FILE=/boot/linux no-hlt Initializing CPU#0 Detected 524.100 MHz processor. Console: colour VGA+ 80x50 Calibrating delay loop... 1045.29 BogoMIPS Memory: 126608k/131008k available (1153k kernel code, 4012k reserved, 396k data, 64k init, 0k highmem) Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes) Buffer-cache hash table entries: 4096
ReiserFS Oops (2.4.1, deterministic, symlink related)
Hello, with ReiserFS support in 2.4.1 I have decided to give it a try. I created a filesystem on a spare partition, mounted it as /mnt, and tried to use it. The kernel crashed - I am able to reproduce it with the following steps: - boot linux with init=/bin/bash - [optional] /sbin/mkreiserfs /dev/hdd1 (it can be reproduced even on freshly created FS) - mount -t reiserfs /dev/hdd1 /mnt - cp -arv /usr /mnt I am attaching the details, feel free to ask for more information, if you want it. Please Cc: me in any reply. Oops is a NULL pointer dereference at address 0010, EIP is c017c7c3 (in check_leaf), EFLAGS is 00010292, Process is "cp", Code: 8b 52 10 ff d2 59 5b 8b 54 24 14 8b 42 34 89 c7 0f b7 47 02, Call trace is the following: c015f459 (in do_balance) c0179466 (in fix_nodes) c0179476 (also in fix_nodes) c018612c (in reiserfs_insert_item) c0173cb4 (near the end of reiserfs_new_symlink) c0174170 (in reiserfs_new_inode) c0170cbd (in reiserfs_symlink) c0142a45 (in d_alloc) c013c825 (in vfs_symlink) c013c8de (in sys_symlink) c0109023 (in system_call) All numbers are written by hand from the screen, so there may be a minor mistakes. Looking at the cp output, it seems it crashed while copying the symlink "/usr/bin/sgml2xml - osx" to /mnt/bin. My computer is almost generic Red Hat 7.0 with all updates. Hardware is K6-2 @523 MHz, 128M RAM, VIA VT82C598 north bridge. I tried to create ext2 filesystem on /dev/hdd1, and then cp -arv /usr /mnt worked fine. The kernel config (grep '=[ym]' /usr/src/linux/.config) is the following (no modules were loadaed, though): CONFIG_X86=y CONFIG_ISA=y CONFIG_UID16=y CONFIG_EXPERIMENTAL=y CONFIG_MODULES=y CONFIG_KMOD=y CONFIG_MK6=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_CMPXCHG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_ALIGNMENT_16=y CONFIG_X86_TSC=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_NOHIGHMEM=y CONFIG_MTRR=y CONFIG_NET=y CONFIG_PCI=y CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_HOTPLUG=y CONFIG_PCMCIA=y CONFIG_CARDBUS=y CONFIG_SYSVIPC=y CONFIG_SYSCTL=y CONFIG_KCORE_ELF=y CONFIG_BINFMT_AOUT=m CONFIG_BINFMT_ELF=y CONFIG_BINFMT_MISC=y CONFIG_PARPORT=m CONFIG_PARPORT_PC=m CONFIG_PARPORT_PC_FIFO=y CONFIG_PARPORT_PC_SUPERIO=y CONFIG_PARPORT_1284=y CONFIG_BLK_DEV_FD=m CONFIG_BLK_DEV_LOOP=m CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_NETLINK=y CONFIG_RTNETLINK=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_INET_ECN=y CONFIG_IPV6=m CONFIG_IPV6_EUI64=y CONFIG_IDE=y CONFIG_BLK_DEV_IDE=y CONFIG_BLK_DEV_IDEDISK=y CONFIG_IDEDISK_MULTI_MODE=y CONFIG_BLK_DEV_IDECS=m CONFIG_BLK_DEV_IDECD=m CONFIG_BLK_DEV_IDEPCI=y CONFIG_IDEPCI_SHARE_IRQ=y CONFIG_BLK_DEV_IDEDMA_PCI=y CONFIG_IDEDMA_PCI_AUTO=y CONFIG_BLK_DEV_IDEDMA=y CONFIG_IDEDMA_PCI_WIP=y CONFIG_IDEDMA_NEW_DRIVE_LISTINGS=y CONFIG_BLK_DEV_VIA82CXXX=y CONFIG_IDEDMA_AUTO=y CONFIG_BLK_DEV_IDE_MODES=y CONFIG_NETDEVICES=y CONFIG_NET_ETHERNET=y CONFIG_NET_VENDOR_3COM=y CONFIG_VORTEX=y CONFIG_HAMACHI=m CONFIG_PPP=m CONFIG_PPP_ASYNC=m CONFIG_PPP_DEFLATE=m CONFIG_PPP_BSDCOMP=m CONFIG_WAN=y CONFIG_COSA=m CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_SERIAL=y CONFIG_UNIX98_PTYS=y CONFIG_PRINTER=m CONFIG_MOUSE=y CONFIG_PSMOUSE=y CONFIG_NVRAM=m CONFIG_RTC=m CONFIG_AGP=y CONFIG_AGP_VIA=y CONFIG_DRM=y CONFIG_DRM_MGA=y CONFIG_PCMCIA_SERIAL=y CONFIG_AUTOFS4_FS=y CONFIG_REISERFS_FS=y CONFIG_REISERFS_CHECK=y CONFIG_ISO9660_FS=m CONFIG_PROC_FS=y CONFIG_DEVPTS_FS=y CONFIG_EXT2_FS=y CONFIG_CODA_FS=m CONFIG_NFS_FS=y CONFIG_NFS_V3=y CONFIG_NFSD=m CONFIG_NFSD_V3=y CONFIG_SUNRPC=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_MSDOS_PARTITION=y CONFIG_VGA_CONSOLE=y CONFIG_VIDEO_SELECT=y CONFIG_SOUND=y CONFIG_SOUND_ES1371=y CONFIG_USB=m CONFIG_USB_DEVICEFS=y CONFIG_USB_UHCI=m CONFIG_USB_AUDIO=m CONFIG_USB_SCANNER=m The dmesg output: Linux version 2.4.1 ([EMAIL PROTECTED]) (gcc version 2.96 2731 (Red Hat Linux 7.0)) #2 Fri Feb 2 11:46:21 CET 2001 BIOS-provided physical RAM map: BIOS-e820: 0009fc00 @ (usable) BIOS-e820: 0400 @ 0009fc00 (usable) BIOS-e820: 0001 @ 000f (reserved) BIOS-e820: 0001 @ (reserved) BIOS-e820: 07ef @ 0010 (usable) BIOS-e820: d000 @ 07ff3000 (ACPI data) BIOS-e820: 3000 @ 07ff (ACPI NVS) On node 0 totalpages: 32752 zone(0): 4096 pages. zone(1): 28656 pages. zone(2): 0 pages. Kernel command line: auto BOOT_IMAGE=linux ro root=301 BOOT_FILE=/boot/linux no-hlt Initializing CPU#0 Detected 524.100 MHz processor. Console: colour VGA+ 80x50 Calibrating delay loop... 1045.29 BogoMIPS Memory: 126608k/131008k available (1153k kernel code, 4012k reserved, 396k data, 64k init, 0k highmem) Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes) Buffer-cache hash table entries: 4096
Re: [reiserfs-list] ReiserFS Oops (2.4.1, deterministic, symlink related)
Hans Reiser wrote: : This is why our next patch will detect the use of gcc 2.96, and complain, in the : reiserfs Makefile. : OK, thanks. It works with older compiler (altough I use gcc 2.96 for a long time for compiling various 2.[34] kernels without problem). -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Is there anything else I can contribute? -- The latitude and longtitude of the bios writers current position, and a ballistic missile. (Alan Cox) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [reiserfs-list] ReiserFS Oops (2.4.1, deterministic, symlink related)
Alan Cox wrote: : Hans Reiser wrote: : : This is why our next patch will detect the use of gcc 2.96, and complain, in the : : reiserfs Makefile. : : : OK, thanks. It works with older compiler (altough I use gcc 2.96 : for a long time for compiling various 2.[34] kernels without problem). : : Ok which 2.96 compiler do you have. I need to get this one chased down since : its probably also going to be in the current gcc CVS branches heading for 3.0 : : 2.96-69 should be ok (thats the one I've been using without trouble). The : original one with RH 7.0 off the CD does miscompile a few kernel things. It is the original one. I'll try with the -69: $ rpm -q gcc gcc -gcc-2.96-54 $ gcc -v Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs gcc version 2.96 2731 (Red Hat Linux 7.0) -Yenya -- \ Jan "Yenya" Kasprzak kas at fi.muni.cz http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// Is there anything else I can contribute? -- The latitude and longtitude of the bios writers current position, and a ballistic missile. (Alan Cox) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/