Re: sata_nv ADMA controller lockup investigation

2007-03-20 Thread Neil Schemenauer
Not sure if this helps.  I'm getting this reset with 2.6.21-rc4.
After the reset the controller seems to work again.

sata_nv :00:07.0: version 3.3
ACPI: PCI Interrupt Link [APSI] enabled at IRQ 22
ACPI: PCI Interrupt :00:07.0[A] -> Link [APSI] -> GSI 22 (level, low) -> 
IRQ 22
sata_nv :00:07.0: Using ADMA mode
PCI: Setting latency timer of device :00:07.0 to 64
ata1: SATA max UDMA/133 cmd 0xc201e480 ctl 0xc201e4a0 bmdma 
0x0001cc00 irq 22
ata2: SATA max UDMA/133 cmd 0xc201e580 ctl 0xc201e5a0 bmdma 
0x0001cc08 irq 22
scsi0 : sata_nv
ata1: SATA link down (SStatus 0 SControl 300)
scsi1 : sata_nv
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-7: Maxtor 6V300F0, VA111630, max UDMA/133
ata2.00: 586114704 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
scsi 1:0:0:0: Direct-Access ATA  Maxtor 6V300F0   VA11 PQ: 0 ANSI: 5
ata2: bounce limit 0x, segment boundary 0x, hw segs 61
SCSI device sda: 586114704 512-byte hdwr sectors (300091 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA
SCSI device sda: 586114704 512-byte hdwr sectors (300091 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 >
sd 1:0:0:0: Attached scsi disk sda
sd 1:0:0:0: Attached scsi generic sg0 type 0
ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 21
ACPI: PCI Interrupt :00:08.0[A] -> Link [APSJ] -> GSI 21 (level, low) -> 
IRQ 21
sata_nv :00:08.0: Using ADMA mode
PCI: Setting latency timer of device :00:08.0 to 64
ata3: SATA max UDMA/133 cmd 0xc2020480 ctl 0xc20204a0 bmdma 
0x0001b800 irq 21
ata4: SATA max UDMA/133 cmd 0xc2020580 ctl 0xc20205a0 bmdma 
0x0001b808 irq 21
scsi2 : sata_nv
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : sata_nv
ata4: SATA link down (SStatus 0 SControl 300)
pata_amd :00:06.0: version 0.2.8
PCI: Setting latency timer of device :00:06.0 to 64
ata5: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 bmdma 
0x0001e000 irq 14
ata6: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 bmdma 
0x0001e008 irq 15
scsi4 : pata_amd
ata5.00: ATAPI, max UDMA/33
ata5.01: ATAPI, max MWDMA2
ata5.00: configured for UDMA/33
ata5.01: configured for MWDMA2
scsi5 : pata_amd
ATA: abnormal status 0x8 on port 0x00010177
scsi 4:0:0:0: CD-ROM            PLEXTOR  DVDR   PX-504A   1.02 PQ: 0 ANSI: 5
sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 4:0:0:0: Attached scsi CD-ROM sr0
sr 4:0:0:0: Attached scsi generic sg1 type 5
scsi 4:0:1:0: CD-ROM            COMPAQ   SC-140S  SE04 PQ: 0 ANSI: 5
sr1: scsi3-mmc drive: 1x/40x cd/rw xa/form2 cdda tray
sr 4:0:1:0: Attached scsi CD-ROM sr1
sr 4:0:1:0: Attached scsi generic sg2 type 5

[...]

ata2: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 
0x400 next cpb count 0x0 next cpb idx 0x0
ata2: CPB 1: ctl_flags 0x1f, resp_flags 0x2
ata2: timeout waiting for ADMA IDLE, stat=0x400
ata2: timeout waiting for ADMA LEGACY, stat=0x400
ata2.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x2 frozen
ata2.00: cmd 61/00:08:72:44:22/02:00:21:00:00/40 tag 1 cdb 0x0 data 262144 out
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: configured for UDMA/133
ata2: EH complete
SCSI device sda: 586114704 512-byte hdwr sectors (300091 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA
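
For anyone trying to read the "cmd" line in the exception above, here is a
rough decoder.  It is only a sketch: it assumes the twelve bytes are printed
in the order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:
hob_lbal:hob_lbam:hob_lbah/device, and that for an NCQ command (0x61 is
WRITE FPDMA QUEUED) the sector count sits in the feature registers and the
tag in the upper five bits of nsect.  Those assumptions are at least
consistent with the "tag 1" and "data 262144 out" fields in the log.  The
struct and the demo_* function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the twelve bytes printed after "cmd". */
struct demo_tf {
    uint8_t command, feature, nsect, lbal, lbam, lbah;
    uint8_t hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
    uint8_t device;
};

/* Parse e.g. "61/00:08:72:44:22/02:00:21:00:00/40"; returns 0 on success. */
static int demo_tf_parse(const char *s, struct demo_tf *tf)
{
    unsigned int v[12];

    assert(s != NULL && tf != NULL);
    if (sscanf(s, "%2x/%2x:%2x:%2x:%2x:%2x/%2x:%2x:%2x:%2x:%2x/%2x",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5],
               &v[6], &v[7], &v[8], &v[9], &v[10], &v[11]) != 12)
        return -1;

    tf->command     = (uint8_t)v[0];
    tf->feature     = (uint8_t)v[1];
    tf->nsect       = (uint8_t)v[2];
    tf->lbal        = (uint8_t)v[3];
    tf->lbam        = (uint8_t)v[4];
    tf->lbah        = (uint8_t)v[5];
    tf->hob_feature = (uint8_t)v[6];
    tf->hob_nsect   = (uint8_t)v[7];
    tf->hob_lbal    = (uint8_t)v[8];
    tf->hob_lbam    = (uint8_t)v[9];
    tf->hob_lbah    = (uint8_t)v[10];
    tf->device      = (uint8_t)v[11];
    return 0;
}

/* 48-bit LBA: the three "hob" bytes are the high half. */
static uint64_t demo_tf_lba48(const struct demo_tf *tf)
{
    return ((uint64_t)tf->hob_lbah << 40) | ((uint64_t)tf->hob_lbam << 32) |
           ((uint64_t)tf->hob_lbal << 24) | ((uint64_t)tf->lbah << 16) |
           ((uint64_t)tf->lbam << 8)      |  (uint64_t)tf->lbal;
}

/* NCQ only: sector count lives in the feature registers; 0 means 65536. */
static uint32_t demo_tf_ncq_sectors(const struct demo_tf *tf)
{
    uint32_t n = ((uint32_t)tf->hob_feature << 8) | tf->feature;

    return n ? n : 65536u;
}

/* NCQ only: the tag is carried in bits 7:3 of the sector count register. */
static unsigned int demo_tf_ncq_tag(const struct demo_tf *tf)
{
    return tf->nsect >> 3;
}
```

Fed the string from the log, demo_tf_ncq_sectors() gives 512 sectors, which
at 512 bytes each is the 262144 bytes reported, and demo_tf_ncq_tag() gives 1.
So the command that timed out looks like a single 256 KiB queued write.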


$ lspci
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:04.0 Multimedia audio controller: nVidia Corporation CK804 AC'97 Audio 
Controller (rev a2)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0

Re: sata_nv - ADMA issues with 2.6.20

2007-02-11 Thread Neil Schemenauer
> David R wrote:
 Feb  9 18:40:29 server kernel: ata7: Resetting port
 Feb  9 18:40:29 server kernel: ata7.00: exception Emask 0x0 SAct 0x1 SErr 
 0x0 action 0x2 frozen
 Feb  9 18:40:29 server kernel: ata7.00: cmd 
 61/08:00:1f:e4:50/00:00:09:00:00/40 tag 0 cdb 0x0 data 4096 out
 Feb  9 18:40:29 server kernel:  res 
 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

I'm getting similar errors on my machine (AMD64 CPU, MSI board with
NForce chipset).  Linux 2.6.19.2 seems to be okay.

  Neil

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Page aging for 2.4.0-test8

2000-09-11 Thread Neil Schemenauer

On Mon, Sep 11, 2000 at 01:12:32PM -0300, Rik van Riel wrote:
> Your idea /heavily/ penalises libc and executable pages by aging them
> more often than anonymous pages...

I don't think I age anonymous pages any more than any other type of
page.  Perhaps you are saying that shared pages should receive some
bonus?  That is a different issue and it is handled naturally with my
patch.  If shared pages are actually used then PageTouch() will be
called on them more often.  This should work better than the current
PG_referenced bit.

Perhaps I am missing your point.  Can you explain in more detail how
these pages are aged more often?

  Neil



[PATCH] Page aging for 2.4.0-test8

2000-09-10 Thread Neil Schemenauer

This patch adds page aging similar to what was in 2.0.  The patch
is quite straightforward but I've had one lockup that I have
been unable to reproduce.  I don't know if the lockup was caused
by my patch or was a test8 bug.

This patch is supposed to improve interactive performance,
especially during heavy IO.  The page referenced bit has been
removed and is replaced by an integer page age (can someone
explain how to cache align this?).  Newly mapped pages get an age
of 2 (younger pages have higher ages).  Whenever a page is
referenced the age is increased, up to a maximum of 5.  Each time
a page is examined in the shrink_mmap loop its age is decreased.
Pages with ages greater than zero are not paged out.

The idea is that during heavy IO, pages used only once for IO
will have an age of 2.  Hopefully the X server, your MP3 player
and other useful goodies have pages with ages greater than 2 and
will not be paged out.
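
To make the scheme concrete, here is a small userspace sketch of the aging
rules described above.  It is not the patch itself: struct demo_page and the
demo_* functions are made-up stand-ins for struct page, PageAgeInit() and
PageTouch(), and the order of "check, then decrement" in the scan function
is my assumption about how the shrink_mmap loop would treat a page.

```c
#include <assert.h>
#include <stddef.h>

#define PG_AGE_INITIAL 2   /* age given to a page when it is first mapped */
#define PG_AGE_YOUNG   5   /* ceiling for pages that keep being referenced */

/* Hypothetical stand-in for struct page; only the age field matters here. */
struct demo_page {
    int age;
};

/* Newly mapped pages start at PG_AGE_INITIAL. */
static void demo_page_age_init(struct demo_page *p)
{
    assert(p != NULL);
    p->age = PG_AGE_INITIAL;
}

/* Each reference bumps the age, saturating at PG_AGE_YOUNG. */
static void demo_page_touch(struct demo_page *p)
{
    assert(p != NULL);
    if (p->age < PG_AGE_YOUNG)
        p->age++;
}

/*
 * One visit from the reclaim scan.  A page whose age is still above zero
 * is aged by one and kept (returns 0); a page that has already reached
 * zero is a candidate for paging out (returns 1).
 * Assumption: check first, then decrement.
 */
static int demo_page_scan(struct demo_page *p)
{
    assert(p != NULL);
    if (p->age > 0) {
        p->age--;
        return 0;
    }
    return 1;
}
```

Under these assumptions a page that is mapped and never touched again
survives two scans and becomes a candidate on the third, while a page that
has been touched up to PG_AGE_YOUNG survives five.  Plain functions are used
here rather than a macro that expands to a bare if statement, since such a
macro can pair up with a following else in the caller.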

Interactive performance during Bonnie tests seems to be quite
good (although stock test8 is not bad either).  I think there
still may be an issue with elevator starvation.  Has there been
any more work on this front?  The discussion seems to have died
out.

-- 
Neil Schemenauer <[EMAIL PROTECTED]> http://www.enme.ucalgary.ca/~nascheme/

diff -ur linux-2.4/Makefile linux-age/Makefile
--- linux-2.4/Makefile  Sun Sep 10 10:15:27 2000
+++ linux-age/Makefile  Sun Sep 10 15:06:14 2000
@@ -1,7 +1,7 @@
 VERSION = 2
 PATCHLEVEL = 4
 SUBLEVEL = 0
-EXTRAVERSION = -test8
+EXTRAVERSION = -test8-age
 
 KERNELRELEASE=$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
 
diff -ur linux-2.4/fs/buffer.c linux-age/fs/buffer.c
--- linux-2.4/fs/buffer.c   Sun Sep 10 10:15:53 2000
+++ linux-age/fs/buffer.c   Sun Sep 10 11:45:04 2000
@@ -2182,7 +2182,7 @@
    spin_unlock(&free_list[isize].lock);
 
    page->buffers = bh;
-   page->flags &= ~(1 << PG_referenced);
+   page->age = 0;
    lru_cache_add(page);
    atomic_inc(&buffermem_pages);
    return 1;
diff -ur linux-2.4/include/linux/fs.h linux-age/include/linux/fs.h
--- linux-2.4/include/linux/fs.h    Sun Sep 10 10:16:08 2000
+++ linux-age/include/linux/fs.h    Sun Sep 10 11:49:39 2000
@@ -260,7 +260,7 @@
 
 extern void set_bh_page(struct buffer_head *bh, struct page *page, unsigned long offset);
 
-#define touch_buffer(bh)   SetPageReferenced(bh->b_page)
+#define touch_buffer(bh)   PageTouch(bh->b_page)
 
 
 #include 
diff -ur linux-2.4/include/linux/mm.h linux-age/include/linux/mm.h
--- linux-2.4/include/linux/mm.h    Sun Sep 10 10:16:09 2000
+++ linux-age/include/linux/mm.h    Sun Sep 10 14:33:52 2000
@@ -154,8 +154,15 @@
    struct buffer_head * buffers;
    void *virtual; /* non-NULL if kmapped */
    struct zone_struct *zone;
+   int age;
 } mem_map_t;
 
+#define PG_AGE_INITIAL  2 /* age for pages when mapped */
+#define PG_AGE_YOUNG    5 /* age for pages recently used */
+
+#define PageAgeInit(p)  (p->age = PG_AGE_INITIAL)
+#define PageTouch(p)    if (p->age < PG_AGE_YOUNG) p->age++;
+
 #define get_page(p)            atomic_inc(&(p)->count)
 #define put_page(p)            __free_page(p)
 #define put_page_testzero(p)   atomic_dec_and_test(&(p)->count)
@@ -165,7 +172,7 @@
 /* Page flag bit values */
 #define PG_locked       0
 #define PG_error        1
-#define PG_referenced   2
+#define PG_unused_00    2
 #define PG_uptodate     3
 #define PG_dirty        4
 #define PG_decr_after   5
@@ -197,9 +204,6 @@
 #define PageError(page)        test_bit(PG_error, &(page)->flags)
 #define SetPageError(page)     set_bit(PG_error, &(page)->flags)
 #define ClearPageError(page)   clear_bit(PG_error, &(page)->flags)
-#define PageReferenced(page)   test_bit(PG_referenced, &(page)->flags)
-#define SetPageReferenced(page)    set_bit(PG_referenced, &(page)->flags)
-#define PageTestandClearReferenced(page)   test_and_clear_bit(PG_referenced, &(page)->flags)
 #define PageDecrAfter(page)    test_bit(PG_decr_after, &(page)->flags)
 #define SetPageDecrAfter(page) set_bit(PG_decr_after, &(page)->flags)
 #define PageTestandClearDecrAfter(page)    test_and_clear_bit(PG_decr_after, &(page)->flags)
@@ -293,9 +297,9 @@
  * When a read completes, the page becomes uptodate, unless a disk I/O
  * error happened.
  *
- * For choosing which pages to swap out, inode pages carry a
- * PG_referenced bit, which is set any time the system accesses
- * that page through the (inode,offset) hash table.
+ * For choosing which pages to swap out, inode pages carry a page age
+ * which is increased (up to PG_AGE_YOUNG) any time the system
+ * accesses that page through the (inode,offset) hash table.
  *
  * PG_skip is used on sparc/sparc64 architectures to "skip" certain
  * parts of the address space.