Prolonging a drive's life
One of the four drives in my system has been timing out frequently of late, although the operation succeeds on the second attempt:

    Aug 14 13:51:59 aldan kernel: (ada4:ahcich5:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
    Aug 14 13:51:59 aldan kernel: (ada4:ahcich5:0:0:0): CAM status: Command timeout
    Aug 14 13:51:59 aldan kernel: (ada4:ahcich5:0:0:0): Retrying command
    Aug 14 13:59:12 aldan kernel: (ada4:ahcich5:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
    Aug 14 13:59:12 aldan kernel: (ada4:ahcich5:0:0:0): CAM status: Command timeout
    Aug 14 13:59:12 aldan kernel: (ada4:ahcich5:0:0:0): Retrying command

While I'm waiting for a replacement, can I use camcontrol to somehow lower the operating system's expectations of the drive? For example, "camcontrol negotiate" reports the following about it:

    Current parameters:
    (pass5:ahcich5:0:0:0): SATA revision: 2.x
    (pass5:ahcich5:0:0:0): ATA mode: UDMA6
    (pass5:ahcich5:0:0:0): ATAPI packet length: 0
    (pass5:ahcich5:0:0:0): PIO transaction length: 8192
    (pass5:ahcich5:0:0:0): PMP presence: 0
    (pass5:ahcich5:0:0:0): Number of tags: 32
    (pass5:ahcich5:0:0:0): SATA capabilities: 0030
    (pass5:ahcich5:0:0:0): tagged queueing: enabled

Is there anything I can tweak so that it keeps working, even if at lower speeds?

Also, years ago, some BIOSes had a feature that would "verify" a drive -- is there something similar I can trigger with camcontrol or smartctl?

Thanks!

-mi
___
freebsd-hardware@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hardware
To unsubscribe, send any mail to "freebsd-hardware-unsubscr...@freebsd.org"
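To put a number on how often the drive times out, the kernel messages can be tallied per device. What follows is only a sketch in Python: the function name count_timeouts and the regular expression are illustrative assumptions, and the sample lines are the ones quoted in the post. On a real system the lines would presumably come from the system log instead.

```python
import re
from collections import Counter

# Matches kernel lines of the form
#   ... kernel: (ada4:ahcich5:0:0:0): CAM status: Command timeout
# and captures the peripheral name (ada4) and the status text.
CAM_STATUS = re.compile(
    r"\((?P<dev>[a-z]+\d+):[^)]*\): CAM status: (?P<status>.+?)\s*$"
)

def count_timeouts(lines):
    """Return a Counter mapping device name -> number of command timeouts."""
    counts = Counter()
    for line in lines:
        m = CAM_STATUS.search(line)
        if m and m.group("status") == "Command timeout":
            counts[m.group("dev")] += 1
    return counts

# Sample input: the log excerpt from the post.
sample = [
    "Aug 14 13:51:59 aldan kernel: (ada4:ahcich5:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00",
    "Aug 14 13:51:59 aldan kernel: (ada4:ahcich5:0:0:0): CAM status: Command timeout",
    "Aug 14 13:51:59 aldan kernel: (ada4:ahcich5:0:0:0): Retrying command",
    "Aug 14 13:59:12 aldan kernel: (ada4:ahcich5:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00",
    "Aug 14 13:59:12 aldan kernel: (ada4:ahcich5:0:0:0): CAM status: Command timeout",
    "Aug 14 13:59:12 aldan kernel: (ada4:ahcich5:0:0:0): Retrying command",
]

if __name__ == "__main__":
    for dev, n in count_timeouts(sample).items():
        print(f"{dev}: {n} command timeouts")
```

Run periodically, a tally like this would show whether the timeouts are becoming more frequent while the replacement is on its way.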
ada vs. da?
The four AHCI drives in my system each appear as both adaX and daX:

    at scbus2 target 0 lun 0 (ada1,pass2)
    at scbus3 target 0 lun 0 (ada2,pass3)
    at scbus5 target 0 lun 0 (ada3,pass4)
    at scbus6 target 0 lun 0 (ada4,pass5)
    at scbus7 target 0 lun 0 (da0,pass6)
    at scbus7 target 0 lun 1 (da1,pass7)
    at scbus7 target 0 lun 2 (da2,pass8)
    at scbus7 target 0 lun 3 (da3,pass9)

Each one is listed in /var/run/dmesg.boot like this:

    ada2: ATA8-ACS SATA 3.x device
    ada2: Serial Number Z1F1E8NK
    ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
    ada2: Command Queueing enabled
    ada2: 2861588MB (5860533168 512 byte sectors)
    ada2: quirks=0x1<4K>
    ada2: Previously was known as ad8
    da2: Removable Direct Access SPC-3 SCSI device
    da2: Serial Number 0195
    da2: 40.000MB/s transfers
    da2: Attempt to query device size failed: NOT READY, Medium not present
    da2: quirks=0x3

What am I supposed to make of it? Can the drives be accessed through either name? What are the advantages of each? If ada is always the better choice, how do I make the da ones disappear -- for example, from systat's output?

Thanks!

-mi
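One way to read the device list is to group its entries by bus. The Python sketch below does only that; the function name group_by_bus and the regular expression are assumptions made for illustration, and the input is the list exactly as quoted in the post. The grouping puts each adaX on a bus of its own and all four daX entries on scbus7 as LUNs 0-3 of a single target, which at least suggests checking whether the daX entries are the same physical drives at all.

```python
import re
from collections import defaultdict

# One entry per peripheral, e.g. "at scbus7 target 0 lun 1 (da1,pass7)".
ENTRY = re.compile(
    r"at scbus(?P<bus>\d+) target (?P<target>\d+) lun (?P<lun>\d+) "
    r"\((?P<names>[^)]*)\)"
)

def group_by_bus(text):
    """Map bus number -> list of (lun, [device names]) in listing order."""
    buses = defaultdict(list)
    for m in ENTRY.finditer(text):
        # Drop the pass-through device (passN); keep the disk name.
        names = [n for n in m.group("names").split(",")
                 if not n.startswith("pass")]
        buses[int(m.group("bus"))].append((int(m.group("lun")), names))
    return dict(buses)

# Sample input: the device list from the post.
devlist = """
at scbus2 target 0 lun 0 (ada1,pass2)
at scbus3 target 0 lun 0 (ada2,pass3)
at scbus5 target 0 lun 0 (ada3,pass4)
at scbus6 target 0 lun 0 (ada4,pass5)
at scbus7 target 0 lun 0 (da0,pass6)
at scbus7 target 0 lun 1 (da1,pass7)
at scbus7 target 0 lun 2 (da2,pass8)
at scbus7 target 0 lun 3 (da3,pass9)
"""

if __name__ == "__main__":
    for bus, entries in sorted(group_by_bus(devlist).items()):
        print(f"scbus{bus}: {entries}")
```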
Do I need SAS drives?..
My server has 8 "hot-plug" slots that can accept both SATA and SAS drives. SATA drives tend to be cheaper for the same features (like cache sizes), so what am I getting for the extra money spent on SAS? I'm asking specifically about the protocol differences...

It would seem, for example, that SATA cannot be hot-plugged as easily, but with camcontrol(8) that should not be a problem, right? What else?

Thank you!

--
Sent from mobile device, please, pardon shorthand.
Re: monitoring hardware temperatures
On 06.12.2010 18:19, Andriy Gapon wrote:
> Another possibility is that a driver that should be able to handle your
> hardware just doesn't know the particular IDs.
> pciconf -lv output could shed some light.

Attached -- it is a "vanilla" PowerEdge 2900 with just one add-on card -- audio...

Thanks! Yours,

-mi

hos...@pci0:0:0:0: class=0x06 card=0x80868086 chip=0x25c08086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000X Chipset Memory Controller Hub'
    class    = bridge
    subclass = HOST-PCI
pc...@pci0:0:2:0: class=0x060400 card=0x chip=0x25e28086 rev=0x12 hdr=0x01
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset PCIe x4 Port 2'
    class    = bridge
    subclass = PCI-PCI
pc...@pci0:0:3:0: class=0x060400 card=0x chip=0x25e38086 rev=0x12 hdr=0x01
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset PCIe x4 Port 3'
    class    = bridge
    subclass = PCI-PCI
pc...@pci0:0:4:0: class=0x060400 card=0x chip=0x25e48086 rev=0x12 hdr=0x01
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset PCIe x4 Port 4'
    class    = bridge
    subclass = PCI-PCI
pci...@pci0:0:5:0: class=0x060400 card=0x chip=0x25e58086 rev=0x12 hdr=0x01
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset PCIe x4 Port 5'
    class    = bridge
    subclass = PCI-PCI
pci...@pci0:0:6:0: class=0x060400 card=0x chip=0x25f98086 rev=0x12 hdr=0x01
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset PCIe x8 Port 6-7'
    class    = bridge
    subclass = PCI-PCI
pci...@pci0:0:7:0: class=0x060400 card=0x chip=0x25e78086 rev=0x12 hdr=0x01
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset PCIe x4 Port 7'
    class    = bridge
    subclass = PCI-PCI
no...@pci0:0:8:0: class=0x088000 card=0x80868086 chip=0x1a388086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset DMA Engine (5000P)'
    class    = base peripheral
hos...@pci0:0:16:0: class=0x06 card=0x01b11028 chip=0x25f08086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset Error Reporting Registers'
    class    = bridge
    subclass = HOST-PCI
hos...@pci0:0:16:1: class=0x06 card=0x01b11028 chip=0x25f08086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset Error Reporting Registers'
    class    = bridge
    subclass = HOST-PCI
hos...@pci0:0:16:2: class=0x06 card=0x01b11028 chip=0x25f08086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset Error Reporting Registers'
    class    = bridge
    subclass = HOST-PCI
hos...@pci0:0:17:0: class=0x06 card=0x80868086 chip=0x25f18086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset Reserved Registers'
    class    = bridge
    subclass = HOST-PCI
hos...@pci0:0:19:0: class=0x06 card=0x80868086 chip=0x25f38086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset Reserved Registers'
    class    = bridge
    subclass = HOST-PCI
hos...@pci0:0:21:0: class=0x06 card=0x80868086 chip=0x25f58086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset FBD Registers'
    class    = bridge
    subclass = HOST-PCI
hos...@pci0:0:22:0: class=0x06 card=0x80868086 chip=0x25f68086 rev=0x12 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '5000 Series Chipset FBD Registers'
    class    = bridge
    subclass = HOST-PCI
pci...@pci0:0:28:0: class=0x060400 card=0x01b11028 chip=0x26908086 rev=0x09 hdr=0x01
    vendor   = 'Intel Corporation'
    device   = '631xESB/632xESB/3100 PCIe Root Port 1'
    class    = bridge
    subclass = PCI-PCI
uh...@pci0:0:29:0: class=0x0c0300 card=0x01b11028 chip=0x26888086 rev=0x09 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '631xESB/632xESB/3100 Chipset USB Universal Host Controller *1'
    class    = serial bus
    subclass = USB
uh...@pci0:0:29:1: class=0x0c0300 card=0x01b11028 chip=0x26898086 rev=0x09 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '631xESB/632xESB/3100 Chipset USB Universal Host Controller *2'
    class    = serial bus
    subclass = USB
uh...@pci0:0:29:2: class=0x0c0300 card=0x01b11028 chip=0x268a8086 rev=0x09 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '631xESB/632xESB/3100 Chipset USB Universal Host Controller *3'
    class    = serial bus
    subclass = USB
uh...@pci0:0:29:3: class=0x0c0300 card=0x01b11028 chip=0x268b8086 rev=0x09 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '631xESB/632xESB/3100 Chipset USB Universal Host
Re: monitoring hardware temperatures
On 06.12.2010 18:02, Andriy Gapon wrote:
> BTW, you could probably write a simple script employing smbmsg(1) to query
> the DIMMs based on logic in the sdtemp driver.

From OpenBSD's sdtemp man page, it would seem the driver uses the iic framework (if that's the right word, khmm...) And on this server I can't get /dev/iic* (or /dev/smb*) to appear despite loading everything I could think of (even viapm):

     3  1 0x80c23000 d22  iic.ko
     4  4 0x80c24000 10e7 iicbus.ko
     5  1 0x80c26000 f16  iicsmb.ko
     6  5 0x80c27000 819  smbus.ko
     7  1 0x80c28000 c02  smb.ko
     8  3 0x80c29000 114f iicbb.ko
     9  1 0x80c2b000 1df3 ichsmb.ko
    10  1 0x80c2d000 1aed intpm.ko
    11  1 0x80c2f000 e38  pcf.ko
    12  1 0x80c3 b83  lpbb.ko
    13  1 0x80c31000 368b ppbus.ko
    14  1 0x80c35000 262a viapm.ko

Could it be that the motherboard simply does not have the iic circuitry and that some other method has to be used?

Thanks! Yours,

-mi
Re: monitoring hardware temperatures
On 06.12.2010 14:51, Michael Fuckner wrote:
> did you try to read the data via IPMI?
>
> kldload ipmi;ipmitool sdr

Interestingly, I was doing just that when your e-mail arrived... ipmitool was impressive enough, and I'm building openipmi to take a look at that too. I don't see information on each DIMM (yet?), but the other information is quite useful...

One of the fans, for example, was listed as "cr" (rather than "ok") -- which was, apparently, causing all the other fans to run at maximum speed (*very* noisy fans in the PowerEdge 2900). I reset it (by pulling it out and pushing it back in), and now the box is quieting back down...

The sensors patches did not add any new entries under the hw.sensors hierarchy :(

Unfortunately, coretemp(4) stopped functioning... Before, when I simply kldload-ed it, it reported reasonable temperatures. Now that I have the sensors patch merged in, I see nonsense like:

    hw.sensors.cpu0.temp0: -1282,97 degC
    hw.sensors.cpu1.temp0: -1272,97 degC
    hw.sensors.cpu2.temp0: -1282,97 degC
    hw.sensors.cpu3.temp0: -1262,97 degC

It seems like some kind of calibration issue -- the numbers differ from each other and change with time... I think I'll back the patch out, as it did not give me any new information -- the it and lm devices aren't found on this box :-(

Anyway, sdtemp(4) -- or an equivalent -- is something I'd like to have...

Thanks! Yours,

-mi
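A fan stuck in the "cr" state is easy to miss in a long sensor listing, so the non-"ok" rows are worth filtering out automatically. The Python sketch below assumes that "ipmitool sdr" prints one sensor per line as three "|"-separated columns (name, reading, status). The function name non_ok_sensors is made up for illustration, and the sample rows are hypothetical, not output captured from this server.

```python
def non_ok_sensors(sdr_text):
    """Return (name, reading, status) for every sensor whose status is not 'ok'.

    Assumes three '|'-separated columns per line; lines with any other
    shape are skipped. Note that sensors without a reading (status 'ns')
    are reported too, since they are not 'ok' either.
    """
    flagged = []
    for line in sdr_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 3:
            continue
        name, reading, status = fields
        if status != "ok":
            flagged.append((name, reading, status))
    return flagged

# Hypothetical sample rows, invented to illustrate the assumed format.
sample_sdr = """\
FAN 1 RPM        | 3600 RPM          | ok
FAN 2 RPM        | 0 RPM             | cr
Ambient Temp     | 23 degrees C      | ok
"""

if __name__ == "__main__":
    for name, reading, status in non_ok_sensors(sample_sdr):
        print(f"{name}: {reading} [{status}]")
```

In practice the text would come from running ipmitool rather than from a string literal; the filtering step stays the same.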
Re: monitoring hardware temperatures
On 06.12.2010 07:44, Andriy Gapon wrote:
> Well, that code has support only for a few types of hardware monitoring
> chips (Super I/Os with hardware monitoring function).

Damn, I wish I had known earlier... The machine I'm retiring now -- but which was my primary workhorse 3 years ago -- has "Super I/O" :-(

> So, it greatly depends on exact kind of hardware and sensors that you have.
> First thing you should do is to discover what kind of hardware is used for
> monitoring in your server. In your case that data might be provided via IPMI.

Thanks, I'll explore that pointer...

> Especially I am not sure about monitoring DIMM temperature - greatly depends
> on the way that it is actually done. Perhaps it's reported via SMBus by the
> DIMMs themselves, not sure...

Both NetBSD and OpenBSD (and, likely, DragonFly too) have something called sdtemp(4):

http://fxr.watson.org/fxr/source/dev/i2c/sdtemp.c?v=NETBSD

I thought that driver would be part of the unfortunate "basic support for a few sensors"... Anyway, I'll try merging http://people.freebsd.org/~avg/sensors9.diff and see what gives... Isn't it just like Linux that one needs to get patches from here and there to get going :-\ ?

-mi
monitoring hardware temperatures
Hello!

I have a server (Dell PowerEdge 2900) that's loaded with sensors. While it was running Windows, a utility was able to tell me not only the temperature of each CPU core, but also that of every DIMM!.. One of them was running far hotter than the others, and I'd like to keep an eye on it now that the box runs FreeBSD.

In FreeBSD there is coretemp(4), which is nice, but nothing else... There is no hw.acpi.thermal hierarchy on this box either... Yet the box has 6 fans, two power supplies, plus DIMMs -- all of them with sensors that I can't read...

It seems that, in 2007, there was an attempt to introduce OpenBSD's sensor framework:

http://kerneltrap.org/OpenBSD/BSDCan_2008_Hardware_Sensors_Framework

but it was backed out after being declared a "pile of crap" and "festering junkpile" by our most mirthful contributor:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=193129+0+archive/2007/cvs-all/20071021.cvs-all

"until a proper architectural solution has been found". Has that happened in the three years that have passed since that lovely discussion? Or are we still waiting for someone to design and implement it not merely "adequately", but "perfectly"?

If the three other BSD cousins have had this for a while (NetBSD -- for 10 years, apparently), continuing to insist on some future perfection seems wrong -- we should have this "adequate but imperfect" method, if only for cross-BSD compatibility.

Is there, perhaps, a set of patches still secretly maintained by some die-hard? I'd love to try it here, and will be very thankful if it gives me the monitoring that I cannot obtain otherwise...

Thanks! Yours,

-mi
After a disk's disappearance, ar0 (raid5) hung...
All of a sudden, one of the three members (ad8) of my ar0 threw a fit. As one might expect, the OS logged the event and told me that the array is now in degraded mode.

Unfortunately, all I/O on the array is hanging. The machine is otherwise responsive, but processes trying to access the array hang in either "biord" or "getblk".

I'm pretty sure that, if I reboot, things will get back to normal. But I was hoping to buy some redundancy by using RAID5...

If anybody is interested in any diagnostics, let me know -- I'll hold off rebooting for 12 hours.

Yours,

-mi
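For reference, the redundancy expected of a three-member RAID5 comes from XOR parity: each stripe holds two data blocks plus their XOR, so any one missing block can be recomputed from the other two. The Python sketch below illustrates only that principle. It is not the ar(4) driver's code, and the block contents and names (xor_blocks, d0, d1) are made up.

```python
def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

# One stripe across three members: two data blocks and one parity block.
d0 = b"backup-0"
d1 = b"backup-1"
parity = xor_blocks(d0, d1)

# Degraded mode: the member holding d1 is gone, but its contents can be
# recomputed from the surviving data block and the parity block.
rebuilt = xor_blocks(d0, parity)

if __name__ == "__main__":
    print(rebuilt == d1)
```

So in principle a degraded array should keep serving reads and writes, only more slowly, which is why the hang described above looks like a problem in handling the failure rather than a limitation of RAID5 itself.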
Very slow writing to SATA disk
Hi!

I've got a new Hitachi drive with 16MB of cache:

    ad8: 476940MB at ata4-master SATA150

and am trying to use it to store backups online. Unfortunately, writing to the disk is painfully slow (by today's standards) -- it can barely sustain 7MB/second, and my other (SCSI) disks run circles around it.

I tried simply cp-ing a huge file from a SCSI disk to this one, without any other activity. 7MB/second is the best it could do :-( For bulk writing, undisturbed by other access, I'd expect at least 30MB/sec...

Is it the drive, the controller:

    atapci1: port 0xac00-0xac07,0xa480-0xa483,0xa400-0xa407,0xa080-0xa083,0xa000-0xa00f mem 0xbe6fbc00-0xbe6fbfff irq 25 at device 5.0 on pci3

the driver (6.0/amd64), or a misconfiguration of some sort?

According to smartctl, the drive runs at 56C during the copying. Its idle temperature seems to be 54C.

Thanks for any hints!

-mi
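To see what the difference between the observed and the expected rate means for backups, it helps to work out how long filling the whole drive would take at each. The Python sketch below assumes that the rates in the post are megabytes per second and that the write is one uninterrupted sequential pass over the 476940MB reported for ad8. The function name hours_to_write is made up for illustration.

```python
def hours_to_write(size_mb, rate_mb_per_s):
    """Hours needed to write size_mb megabytes at a constant rate."""
    return size_mb / rate_mb_per_s / 3600

DRIVE_MB = 476940  # capacity reported for ad8 in the post

if __name__ == "__main__":
    for rate in (7, 30):  # observed rate vs. the rate hoped for
        print(f"{rate} MB/s: {hours_to_write(DRIVE_MB, rate):.1f} h")
```

At 7MB/s a full pass takes roughly 19 hours, against about 4.4 hours at 30MB/s.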