Re: mcelog?
jnem...@cue.bc.ca (John Nemeth) writes: > On that note, I really wish that our ipmi(4) was a lot >more useful (all it does is get info for envstat, it can't be used >to configure or talk to the BMC). ... working on it. -- -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: st(4) and mt eom
On Mar 21, 6:48am, Frank Kardel wrote: } } As I wrote, extending MTIO ioctls to support more mt commands is not a } real issue (e. g. add access to the LOCATE command). We just need to } decide which features are useful to support and find the time to } implement them. Yes, those would be enhancements, not bug fixes. Also, what software would use these enhancements besides mt(1). Some of the stuff, like informational items and control items, look to be useful in their own right, but other stuff, like locate, look to be redundant. } For Bacula it wouldn't help right now, thus the EOM code needs fixing. For this, it is debatable if it is an enhancement or a bug fix. However, it is needed by a significant application, which should raise its priority. } There are also some other things that are useful to add to the SCSI } subsystem like acquiring timeout information from the device to avoid } early device resets while the drive is still fighting with its servo } errors (just happened to me with a bad LTO7 drive). I have that update } currently sitting in my tree. Getting timeout information from the device would be useful. I had a problem with an earlier tape changer. If I issued a tape move command without first rewinding the tape, the operation would often timeout. This would cause a SCSI bus reset to be issued while the tape was rewinding. Then it would try to probe the changer while it was still resetting. This drove the changer crazy. At first, I tried to remember to rewind the tape before issuing a move command. Eventually I found the right spot in our code and changed the timeout. } On 03/21/19 04:27, John Nemeth wrote: } > On Mar 20, 8:08pm, Frank Kardel wrote: } > } } > } This seems to be a long standing deficiency of the driver. 
Looking at } > } the SCSI spec it is recommended to issue a READ POSITION command } > } > I did read a book about SCSI sometime ago, so I have a good } > overview of CCBs and the protocol, but it has been some time since } > I've looked at it in any depth. And, although I've made minor } > changes to our SCSI code, I can't say that I know it in any depth. } > } > } } [snip] }-- End of excerpt from Frank Kardel
Re: st(4) and mt eom
Hi John!

As I wrote, extending the MTIO ioctls to support more mt commands is not a real issue (e.g., adding access to the LOCATE command). We just need to decide which features are useful to support and find the time to implement them.

For Bacula it wouldn't help right now, thus the EOM code needs fixing.

There are also some other things that would be useful to add to the SCSI subsystem, like acquiring timeout information from the device to avoid early device resets while the drive is still fighting with its servo errors (this just happened to me with a bad LTO7 drive). I have that update currently sitting in my tree.

Frank

On 03/21/19 04:27, John Nemeth wrote:
On Mar 20, 8:08pm, Frank Kardel wrote:
} } This seems to be a long standing deficiency of the driver. Looking at
} the SCSI spec it is recommended to issue a READ POSITION command

I did read a book about SCSI some time ago, so I have a good overview of CCBs and the protocol, but it has been some time since I've looked at it in any depth. And, although I've made minor changes to our SCSI code, I can't say that I know it in any depth.

} [snip]
Re: mcelog?
On Mar 20, 2:12pm, Brian Buhrow wrote: } } hello. Does the server on which you're running Xen have a BMC } controller that keeps track of hardware conditions and the like? If it } does, then, if mcelog is too hard to port, you might be able to get the } details you want from ipmitool through the BMC. This might be a useful approach. Not sure why I didn't think about it. On that note, I really wish that our ipmi(4) was a lot more useful (all it does is get info for envstat, it can't be used to configure or talk to the BMC). } To answer your question, it looks like mcelog has been ported to } FreeBSD, with some limitations. However, if I remember correctly, there } needs to be some support in the kernel for trapping and logging the mce } errors and I'm not sure the NetBSD kernel does that. I did note the presence of /dev/mcelog on Linux with nothing that corresponds on NetBSD. It is interesting that FreeBSD has support and that might be a good place to start; however, porting FreeBSD device drivers can be a lot of work. } On Mar 20, 11:22am, John Nemeth wrote: } } Subject: mcelog? } } I originally posted this on port-amd64, but didn't get any } } response, so now trying a list with a wider audience. } } } } One of my Xen hosts has been getting this error a lot: } } } } (XEN) Bank 4: 945a4000fd080813 atef3581180 } } (XEN) MCE: polling routine found correctable error. Use mcelog to parse above e } } rror output. } } } } My research tells me that "mcelog" is a Linux program for } } reading and interpreting the MCE registers. Do we have anything } } like mcelog or anyway to read MCE errors? If not, any idea what } } it would take to port mcelog? It appears to need a device called, } } /dev/mcelog. } } } } In any event, if I'm reading the above correctly, I believe } } that it is telling that there is bad memory? } >-- End of excerpt from John Nemeth }-- End of excerpt from Brian Buhrow
Re: st(4) and mt eom
On Mar 20, 11:13pm, Frank Kardel wrote: } } I just finished implementing the EOM fix. On SPACE(EOM) a READ } POSITION(LONG_FORMAT) is } } done and current file number is set to the number of filemarks since BOT } } which is what we need. } } on a 10 file taoe (LTO6) both commands } } mt fsf 64 } } mt eom } } now return a current file number of 10. } } no additional changes to mt like adding a locate command are needed, } Though adding LOCATE might also be an option to be added separately for } mt even if it does not help with bacula at all as bacula uses the MT } ioctls directly and not via scripts. } } Thanks for the hint anyway. I looked at the FreeBSD's mt command source code and manpage. There appear to be a bunch of "non-standard" extensions, such as "locate". Many of them look useful, and would probably be worth acquiring. However, many of them also need new ioctls() and FreeBSD's SCSI stack is very different from NetBSD's SCSI stack, so some porting work would be needed. } On 03/20/19 21:09, Adrian Bocaniciu wrote: } > On Wed, 20 Mar 2019 20:08:17 +0100 } > Frank Kardel wrote: } > } >> This seems to be a long standing deficiency of the driver. Looking at } >> the SCSI spec it is recommended to issue a READ POSITION command } >> } >> get the current position. Looking at the spec and code it should be } >> possible to handle the SP_EOM case better with respect to the position } >> information } > } > I suggest that you should look at what FreeBSD does for the command: } > mt locate -e } > } > I have not attempted to use NetBSD with a tape, but I am using FreeBSD with an LTO-7 drive and "mt locate -e" works flawlessly. } > } > I always follow that command in my scripts with a "mt rdspos", to verify that it worked correctly, by comparing the result with the last written position on that tape (the position after "mt locate -e" should be the position read after writing the previous last file + 1). } > } > Best regards ! } > } }-- End of excerpt from Frank Kardel
Re: st(4) and mt eom
On Mar 20, 8:08pm, Frank Kardel wrote: } } This seems to be a long standing deficiency of the driver. Looking at } the SCSI spec it is recommended to issue a READ POSITION command I did read a book about SCSI sometime ago, so I have a good overview of CCBs and the protocol, but it has been some time since I've looked at it in any depth. And, although I've made minor changes to our SCSI code, I can't say that I know it in any depth. } get the current position. Looking at the spec and code it should be } possible to handle the SP_EOM case better with respect to the position } information } } by issuing READ POSITION (service action code 6 = LONG FORM) at that } point to set the correct position. } } I also ran bacula (now I run freshly ported bareos 18.2.5 for HW } encryption, tapealert) with HW EOM set to false. btape test made those } recommendation a long time ago. btape test recommended: Hardware End of Medium = No Fast Forward Space File = No Apparently it doesn't try all combinations. This combination made restores ridiculously slow (I'm currently using LTO-4, 800GB drives). I spent a fair bit of time reading the source code, running ktrace, instrumenting things, and finally reading the kernel source code. The kernel source code explained everything. } On 03/20/19 19:02, John Nemeth wrote: } > If you issue an "mt eom" (forward to end of media), the driver } > loses track of the tape position. This seriously messes with } > Bacula's tape handling. Since Bacula expects the driver not to } > lose the tape position I get the feeling there are other operating } > systems that don't. 
I found this code in st.c:st_space(): } > } > error = scsipi_command(st->sc_periph, (void *)&cmd, sizeof(cmd), 0, 0, } > 0, ST_SPC_TIME, NULL, flags); } > } > if (error == 0 && (st->flags & ST_POSUPDATED) == 0) { } > number = number - st->last_ctl_resid; } > if (what == SP_BLKS) { } > if (st->blkno != -1) } > st->blkno += number; } > } else if (what == SP_FILEMARKS) { } > if (st->fileno != -1) { } > st->fileno += number; } > if (number > 0) } > st->blkno = 0; } > else if (number < 0) } > st->blkno = -1; } > } } > } else if (what == SP_EOM) { } > /* This loses us relative position. */ } > st->fileno = st->blkno = -1; } > } } > } } > return error; } > } } > } > Notice the SP_EOM case. Can any SCSI experts, in particular SCSI } > tape experts, shed some light on this and what can be done about } > it? } > } > I have found a workaround for Bacula which is to tell it about } > this problem. If you do that, Bacula will do "mt fsf 65535" (and } > pray that there aren't more files then that on the tape). The tape } > I have with the largest number of files is at 1186, so this will } > do for now. Still, it would be nice to fix the underlying problem. } > } > For those wondering, my bacula-sd.conf contains: } > } > Device { } >Name = LTO-4a } >Archive Device = /dev/nrst0 } >Device Type = Tape } >Media Type = LTO-4 } >AutoChanger = yes } >LabelMedia = no } >Drive Index = 0 } >AlwaysOpen = yes; } >Removable Media = yes; } >Random Access = no; } >Maximum File Size = 2GB } >Automatic Mount = yes; # when device opened, read it } >Spool Directory = /bacula/spool-sd } >Hardware End of Medium = No } >#Fast Forward Space File = No } >BSF at EOM = yes } ># } ># New alert command in Bacula 9.0.0 } ># Note: you must have the sg3_utils (rpms) or the } >#sg3-utils (deb) installed on your system. } >#and you must set the correct control device that } >#corresponds to the Archive Device } > # Control Device = /dev/sg?? 
# must be SCSI ctl for /dev/nrst0 } > # Alert Command = "/usr/pkg/libexec/bacula/tapealert %l" } > } ># Enable the Alert command only if you have the mtx package loaded } > # Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'" } > # If you have smartctl, enable this, it has more info than tapeinfo } > # Alert Command = "sh -c 'smartctl -H -l error %c'" } > } } }-- End of excerpt from Frank Kardel
daily CVS update output
Updating src tree: P src/bin/pax/file_subs.c P src/bin/pax/tar.c P src/distrib/sets/lists/base/shl.mi P src/distrib/sets/lists/comp/md.amd64 P src/distrib/sets/lists/comp/md.i386 P src/distrib/sets/lists/comp/mi P src/distrib/sets/lists/debug/mi P src/distrib/sets/lists/debug/shl.mi P src/external/bsd/llvm/bin/Makefile P src/external/bsd/llvm/include/Makefile P src/external/bsd/llvm/lib/Makefile P src/external/gpl3/gcc/lib/libgomp/arch/ia64/config.h P src/external/gpl3/gcc/lib/libstdc++-v3/Makefile P src/external/gpl3/gcc/lib/libstdc++-v3/arch/alpha/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/arm/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/armeb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earm/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmeb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmhf/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmhfeb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv4/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv4eb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv6/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv6eb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv6hf/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv6hfeb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv7/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv7eb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv7hf/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/earmv7hfeb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/hppa/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/i386/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/ia64/c++config.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/ia64/defs.mk P src/external/gpl3/gcc/lib/libstdc++-v3/arch/ia64/gstdint.h U src/external/gpl3/gcc/lib/libstdc++-v3/arch/ia64/symver-config.h P 
src/external/gpl3/gcc/lib/libstdc++-v3/arch/m68000/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/m68k/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/mips64eb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/mips64el/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/mipseb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/mipsel/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/powerpc/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/sh3eb/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/sh3el/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/sparc/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/sparc64/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/vax/gstdint.h P src/external/gpl3/gcc/lib/libstdc++-v3/arch/x86_64/gstdint.h P src/external/gpl3/gcc/usr.bin/gcc/arch/aarch64/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/alpha/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/arm/auto-host.h P src/external/gpl3/gcc/usr.bin/gcc/arch/arm/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/arm/defs.mk P src/external/gpl3/gcc/usr.bin/gcc/arch/arm/tm.h P src/external/gpl3/gcc/usr.bin/gcc/arch/armeb/auto-host.h P src/external/gpl3/gcc/usr.bin/gcc/arch/armeb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/armeb/defs.mk P src/external/gpl3/gcc/usr.bin/gcc/arch/armeb/tm.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earm/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmeb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmhf/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmhfeb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv4/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv4eb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv6/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv6eb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv6hf/configargs.h P 
src/external/gpl3/gcc/usr.bin/gcc/arch/earmv6hfeb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv7/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv7eb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv7hf/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/earmv7hfeb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/hppa/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/i386/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/ia64/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/m68000/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/m68k/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/mips64eb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/mips64el/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/mipseb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/mipsel/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/powerpc/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/sh3eb/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/sh3el/configargs.h P src/external/gpl3/gcc/usr.bin/gcc/arch/sparc/configar
Re: st(4) and mt eom
Hi Adrian!

I just finished implementing the EOM fix. On SPACE(EOM), a READ POSITION(LONG_FORMAT) is done and the current file number is set to the number of filemarks since BOT, which is what we need.

On a 10-file tape (LTO6), both commands

    mt fsf 64
    mt eom

now return a current file number of 10.

No additional changes to mt, like adding a locate command, are needed. LOCATE could still be added to mt separately, even though it does not help with Bacula at all, as Bacula uses the MT ioctls directly and not via scripts.

Thanks for the hint anyway.

Frank

On 03/20/19 21:09, Adrian Bocaniciu wrote:
On Wed, 20 Mar 2019 20:08:17 +0100 Frank Kardel wrote:

This seems to be a long standing deficiency of the driver. Looking at the SCSI spec it is recommended to issue a READ POSITION command get the current position. Looking at the spec and code it should be possible to handle the SP_EOM case better with respect to the position information

I suggest that you should look at what FreeBSD does for the command:

    mt locate -e

I have not attempted to use NetBSD with a tape, but I am using FreeBSD with an LTO-7 drive and "mt locate -e" works flawlessly.

I always follow that command in my scripts with a "mt rdspos", to verify that it worked correctly, by comparing the result with the last written position on that tape (the position after "mt locate -e" should be the position read after writing the previous last file + 1).

Best regards !
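The bookkeeping change described in this message (asking the drive for its position after SPACE(EOM) instead of marking the position unknown) can be sketched as a small stand-alone model. This is not the committed driver change: `struct tape_pos`, `update_position()`, the `read_position_fn` hook and `fake_read_position()` are hypothetical names made up for illustration, and leaving the block number unknown after EOM is an assumption, since the thread only mentions the file number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Space targets, named after the constants in the quoted st_space() excerpt. */
enum space_what { SP_BLKS, SP_FILEMARKS, SP_EOM };

/* Hypothetical position record; -1 means "unknown", as in the excerpt. */
struct tape_pos {
	int64_t fileno;
	int64_t blkno;
};

/*
 * Hypothetical hook standing in for a READ POSITION (long form) request.
 * Returns 0 and stores the file number (filemarks since BOT) on success.
 */
typedef int (*read_position_fn)(void *cookie, int64_t *fileno);

/*
 * Update the cached position after a successful SPACE command.
 * "requested" is the count asked for; "resid" is the part the drive
 * did not perform (last_ctl_resid in the excerpt).
 */
void
update_position(struct tape_pos *pos, enum space_what what,
    int64_t requested, int64_t resid, read_position_fn rp, void *cookie)
{
	int64_t moved = requested - resid;
	int64_t fileno;

	assert(pos != NULL);

	switch (what) {
	case SP_BLKS:
		if (pos->blkno != -1)
			pos->blkno += moved;
		break;
	case SP_FILEMARKS:
		if (pos->fileno != -1) {
			pos->fileno += moved;
			if (moved > 0)
				pos->blkno = 0;
			else if (moved < 0)
				pos->blkno = -1;
		}
		break;
	case SP_EOM:
		if (rp != NULL && rp(cookie, &fileno) == 0) {
			/* The drive told us where we are: keep the file number. */
			pos->fileno = fileno;
			/* Assumption: block number stays unknown after EOM. */
			pos->blkno = -1;
		} else {
			/* Old behaviour: relative position is lost. */
			pos->fileno = pos->blkno = -1;
		}
		break;
	}
}

/* Demo hook: pretends the drive reports the file number stored in cookie. */
int
fake_read_position(void *cookie, int64_t *fileno)
{
	*fileno = *(const int64_t *)cookie;
	return 0;
}
```

With no hook, an SP_EOM call leaves both fields at -1, which is the situation Bacula trips over. With a hook that reports 10 filemarks, the file number becomes 10. The residual arithmetic is the same as in the excerpt: a request for 64 filemarks with a residual of 54 advances the file number by 10.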
re: /dev/stdin: Device not configured
> After building today's sources (8.99.36), on i386 I get: > > # sort > sort: /dev/stdin: Device not configured > > 8.99.29 kernel works correctly. > > Same sources, but amd64, also work correctly. > > Any suggestions? :) what does "ls -l /dev/stdin" show? also -lL. .mrg.
Re: st(4) and mt eom
On Wed, 20 Mar 2019 20:08:17 +0100 Frank Kardel wrote: > This seems to be a long standing deficiency of the driver. Looking at > the SCSI spec it is recommended to issue a READ POSITION command > > get the current position. Looking at the spec and code it should be > possible to handle the SP_EOM case better with respect to the position > information I suggest that you should look at what FreeBSD does for the command: mt locate -e I have not attempted to use NetBSD with a tape, but I am using FreeBSD with an LTO-7 drive and "mt locate -e" works flawlessly. I always follow that command in my scripts with a "mt rdspos", to verify that it worked correctly, by comparing the result with the last written position on that tape (the position after "mt locate -e" should be the position read after writing the previous last file + 1). Best regards !
Re: mcelog?
hello. Does the server on which you're running Xen have a BMC controller that keeps track of hardware conditions and the like? If it does, then, if mcelog is too hard to port, you might be able to get the details you want from ipmitool through the BMC. To answer your question, it looks like mcelog has been ported to FreeBSD, with some limitations. However, if I remember correctly, there needs to be some support in the kernel for trapping and logging the mce errors and I'm not sure the NetBSD kernel does that. -thanks -Brian On Mar 20, 11:22am, John Nemeth wrote: } Subject: mcelog? } I originally posted this on port-amd64, but didn't get any } response, so now trying a list with a wider audience. } } One of my Xen hosts has been getting this error a lot: } } (XEN) Bank 4: 945a4000fd080813 atef3581180 } (XEN) MCE: polling routine found correctable error. Use mcelog to parse above e } rror output. } } My research tells me that "mcelog" is a Linux program for } reading and interpreting the MCE registers. Do we have anything } like mcelog or anyway to read MCE errors? If not, any idea what } it would take to port mcelog? It appears to need a device called, } /dev/mcelog. } } In any event, if I'm reading the above correctly, I believe } that it is telling that there is bad memory? >-- End of excerpt from John Nemeth
Re: st(4) and mt eom
Hi John!

This seems to be a long-standing deficiency of the driver. Looking at the SCSI spec, it is recommended to issue a READ POSITION command to get the current position. Looking at the spec and code, it should be possible to handle the SP_EOM case better with respect to the position information by issuing READ POSITION (service action code 6 = LONG FORM) at that point to set the correct position.

I also ran bacula (now I run freshly ported bareos 18.2.5 for HW encryption, tapealert) with HW EOM set to false. btape test made those recommendations a long time ago.

Time permitting, I will try to update the driver this week. Would you be willing to test?

Frank

On 03/20/19 19:02, John Nemeth wrote:

If you issue an "mt eom" (forward to end of media), the driver loses track of the tape position. This seriously messes with Bacula's tape handling. Since Bacula expects the driver not to lose the tape position, I get the feeling there are other operating systems that don't. I found this code in st.c:st_space():

        error = scsipi_command(st->sc_periph, (void *)&cmd, sizeof(cmd), 0, 0,
            0, ST_SPC_TIME, NULL, flags);

        if (error == 0 && (st->flags & ST_POSUPDATED) == 0) {
                number = number - st->last_ctl_resid;
                if (what == SP_BLKS) {
                        if (st->blkno != -1)
                                st->blkno += number;
                } else if (what == SP_FILEMARKS) {
                        if (st->fileno != -1) {
                                st->fileno += number;
                                if (number > 0)
                                        st->blkno = 0;
                                else if (number < 0)
                                        st->blkno = -1;
                        }
                } else if (what == SP_EOM) {
                        /* This loses us relative position. */
                        st->fileno = st->blkno = -1;
                }
        }
        return error;
    }

Notice the SP_EOM case. Can any SCSI experts, in particular SCSI tape experts, shed some light on this and what can be done about it?

I have found a workaround for Bacula which is to tell it about this problem. If you do that, Bacula will do "mt fsf 65535" (and pray that there aren't more files than that on the tape). The tape I have with the largest number of files is at 1186, so this will do for now. Still, it would be nice to fix the underlying problem.
For those wondering, my bacula-sd.conf contains: Device { Name = LTO-4a Archive Device = /dev/nrst0 Device Type = Tape Media Type = LTO-4 AutoChanger = yes LabelMedia = no Drive Index = 0 AlwaysOpen = yes; Removable Media = yes; Random Access = no; Maximum File Size = 2GB Automatic Mount = yes; # when device opened, read it Spool Directory = /bacula/spool-sd Hardware End of Medium = No #Fast Forward Space File = No BSF at EOM = yes # # New alert command in Bacula 9.0.0 # Note: you must have the sg3_utils (rpms) or the #sg3-utils (deb) installed on your system. #and you must set the correct control device that #corresponds to the Archive Device # Control Device = /dev/sg?? # must be SCSI ctl for /dev/nrst0 # Alert Command = "/usr/pkg/libexec/bacula/tapealert %l" # Enable the Alert command only if you have the mtx package loaded # Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'" # If you have smartctl, enable this, it has more info than tapeinfo # Alert Command = "sh -c 'smartctl -H -l error %c'" }
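For readers unfamiliar with the READ POSITION long form mentioned in Frank's reply above, here is a rough sketch of decoding such a response buffer. The layout used (a 32-byte response; flags in byte 0 with MPU = 0x08 and LONU = 0x04; partition in bytes 4-7; block number in bytes 8-15; file number in bytes 16-23; all big-endian) is written from memory of the SSC tape command set and should be treated as an assumption to check against the standard before relying on it. `struct rp_long`, `be_decode()` and `parse_read_position_long()` are hypothetical names, not NetBSD kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RP_LONG_LEN	32	/* assumed length of the long-form response */
#define RP_BOP		0x80	/* at beginning of partition */
#define RP_EOP		0x40	/* at end of partition */
#define RP_MPU		0x08	/* mark (file) position unknown */
#define RP_LONU		0x04	/* logical object (block) number unknown */

/* Hypothetical decoded form of the response. */
struct rp_long {
	int at_bop;
	int at_eop;
	int file_known;
	int block_known;
	uint32_t partition;
	uint64_t block;		/* blocks since beginning of partition */
	uint64_t file;		/* filemarks since beginning of partition */
};

/* Decode an n-byte big-endian unsigned integer (n <= 8). */
uint64_t
be_decode(const uint8_t *p, size_t n)
{
	uint64_t v = 0;
	size_t i;

	assert(n <= 8);
	for (i = 0; i < n; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Returns 0 on success, -1 if the arguments or buffer length are invalid. */
int
parse_read_position_long(const uint8_t *buf, size_t len, struct rp_long *out)
{
	if (buf == NULL || out == NULL || len < RP_LONG_LEN)
		return -1;

	out->at_bop = (buf[0] & RP_BOP) != 0;
	out->at_eop = (buf[0] & RP_EOP) != 0;
	out->file_known = (buf[0] & RP_MPU) == 0;
	out->block_known = (buf[0] & RP_LONU) == 0;
	out->partition = (uint32_t)be_decode(buf + 4, 4);
	out->block = be_decode(buf + 8, 8);
	out->file = be_decode(buf + 16, 8);
	return 0;
}
```

A caller would only trust `file` when `file_known` is set; otherwise falling back to "position unknown" (-1 in the driver excerpt) seems the safer choice.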
mcelog?
I originally posted this on port-amd64, but didn't get any response, so I am now trying a list with a wider audience.

One of my Xen hosts has been getting this error a lot:

    (XEN) Bank 4: 945a4000fd080813 atef3581180
    (XEN) MCE: polling routine found correctable error. Use mcelog to parse above error output.

My research tells me that "mcelog" is a Linux program for reading and interpreting the MCE registers. Do we have anything like mcelog, or any way to read MCE errors? If not, any idea what it would take to port mcelog? It appears to need a device called /dev/mcelog.

In any event, if I'm reading the above correctly, I believe that it is telling me that there is bad memory?
st(4) and mt eom
If you issue an "mt eom" (forward to end of media), the driver loses track of the tape position. This seriously messes with Bacula's tape handling. Since Bacula expects the driver not to lose the tape position, I get the feeling there are other operating systems that don't. I found this code in st.c:st_space():

        error = scsipi_command(st->sc_periph, (void *)&cmd, sizeof(cmd), 0, 0,
            0, ST_SPC_TIME, NULL, flags);

        if (error == 0 && (st->flags & ST_POSUPDATED) == 0) {
                number = number - st->last_ctl_resid;
                if (what == SP_BLKS) {
                        if (st->blkno != -1)
                                st->blkno += number;
                } else if (what == SP_FILEMARKS) {
                        if (st->fileno != -1) {
                                st->fileno += number;
                                if (number > 0)
                                        st->blkno = 0;
                                else if (number < 0)
                                        st->blkno = -1;
                        }
                } else if (what == SP_EOM) {
                        /* This loses us relative position. */
                        st->fileno = st->blkno = -1;
                }
        }
        return error;
    }

Notice the SP_EOM case. Can any SCSI experts, in particular SCSI tape experts, shed some light on this and what can be done about it?

I have found a workaround for Bacula which is to tell it about this problem. If you do that, Bacula will do "mt fsf 65535" (and pray that there aren't more files than that on the tape). The tape I have with the largest number of files is at 1186, so this will do for now. Still, it would be nice to fix the underlying problem.

For those wondering, my bacula-sd.conf contains:

    Device {
      Name = LTO-4a
      Archive Device = /dev/nrst0
      Device Type = Tape
      Media Type = LTO-4
      AutoChanger = yes
      LabelMedia = no
      Drive Index = 0
      AlwaysOpen = yes;
      Removable Media = yes;
      Random Access = no;
      Maximum File Size = 2GB
      Automatic Mount = yes;               # when device opened, read it
      Spool Directory = /bacula/spool-sd
      Hardware End of Medium = No
      #Fast Forward Space File = No
      BSF at EOM = yes
      #
      # New alert command in Bacula 9.0.0
      #  Note: you must have the sg3_utils (rpms) or the
      #        sg3-utils (deb) installed on your system.
      #        and you must set the correct control device that
      #        corresponds to the Archive Device
      # Control Device = /dev/sg??  # must be SCSI ctl for /dev/nrst0
      # Alert Command = "/usr/pkg/libexec/bacula/tapealert %l"

      # Enable the Alert command only if you have the mtx package loaded
      # Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
      # If you have smartctl, enable this, it has more info than tapeinfo
      # Alert Command = "sh -c 'smartctl -H -l error %c'"
    }
/dev/stdin: Device not configured
Greetings, After building today's sources (8.99.36), on i386 I get: # sort sort: /dev/stdin: Device not configured 8.99.29 kernel works correctly. Same sources, but amd64, also work correctly. Any suggestions? :) Kind regards, Adam