On 07/02/18 02:01, Kenneth Garges wrote:
> Permissions and ownership are ok. I think I figured out the problem.
>
> Both mtx and chio would work for a while, then stop working returning only 
> “Inappropriate ioctl for device” or other errors. The culprit I think is a 
> library management tool by Quantum. The library provides a web interface that 
> allows you to check status, move tapes around, and various utilities. 
>
> Problem is if you use that web interface it gets the server (both Bacula SD 
> and Director on the same FreeBSD box) wedged such that mtx and chio always 
> fail.

Yes - and if it's about to do anything which takes the changer offline
the web UI gives a warning message and asks if you want to continue.

> Unknown to me, the operations staff had been using that web interface while I 
> was testing. After I asked them to stop my system seems to work reliably. 

I hope you changed the password and put the changer in its own IP subnet
after that.

I wrote some kludgy shell scripts (originally for the Neo 4000, then the
Neo 8000 and now the Quantum i500 - portable as far as I know) which automate loading
and unloading of the changer (imported tapes are scattered randomly
around available slots to ensure even wear and tear) - these mean that
all operations staff have to do is open the magazine to load/unload
tapes when emailed to do so by Bacula.
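To illustrate the "scattered randomly" part: my scripts are shell, but the idea can be sketched in a few lines of Python. This is only a sketch under assumptions; the function name and the slot numbers are made up for illustration, and a real script would take the slot lists from the changer's status output and then issue the actual transfer commands.

```python
import random


def scatter_imports(import_slots, empty_slots, rng=None):
    """Pair each loaded import/export slot with a randomly chosen empty slot.

    Returns a list of (source, destination) slot-number pairs. Each
    destination is used at most once.
    """
    rng = rng or random.Random()
    if len(import_slots) > len(empty_slots):
        raise ValueError("not enough empty storage slots for the imported tapes")
    # sample() picks distinct destinations, so no slot is assigned twice.
    destinations = rng.sample(list(empty_slots), len(import_slots))
    return list(zip(import_slots, destinations))


# Hypothetical slot numbers, purely for illustration.
moves = scatter_imports([101, 102, 103], [4, 9, 17, 23, 31])
for src, dst in moves:
    print(f"transfer tape from slot {src} to slot {dst}")
```

Picking destinations at random rather than filling the lowest free slot first is what spreads the wear across the magazine.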

Dan can probably improve them considerably.

As an adjunct to that another script works out which tapes are oldest
(if in the scratch pool) or about to expire (if none are in the scratch
pool) and asks staff to pull them out of the safe.
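The selection rule is simple enough to sketch. The following is not the script itself, just a minimal Python illustration assuming each volume is a record with a name, a pool, a last-written time and an expiry time; those field names and the sample volumes are hypothetical, and in practice the data would come from the Bacula catalog.

```python
from datetime import datetime


def tapes_to_request(volumes, count):
    """Return the names of up to `count` volumes to fetch from the safe.

    If any volume is in the scratch pool, take the oldest scratch volumes
    (earliest last-written time first). If none are, fall back to the
    volumes that expire soonest.
    """
    scratch = [v for v in volumes if v["pool"] == "Scratch"]
    if scratch:
        candidates = sorted(scratch, key=lambda v: v["last_written"])
    else:
        candidates = sorted(volumes, key=lambda v: v["expires"])
    return [v["name"] for v in candidates[:count]]


# Hypothetical volumes, purely for illustration.
volumes = [
    {"name": "A00001", "pool": "Scratch",
     "last_written": datetime(2016, 3, 1), "expires": datetime(2017, 3, 1)},
    {"name": "A00002", "pool": "Full",
     "last_written": datetime(2017, 9, 1), "expires": datetime(2018, 9, 1)},
    {"name": "A00003", "pool": "Scratch",
     "last_written": datetime(2015, 6, 1), "expires": datetime(2016, 6, 1)},
]
print("Please pull from the safe:", ", ".join(tapes_to_request(volumes, 2)))
```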

I also submitted some changes to the mtx-changer script a while ago which
check that the changer's actually ready before attempting to send it
commands. Kern's been sitting on them for a couple of years.
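The shape of that check is roughly the following. This is a Python sketch of the idea, not the submitted patch (mtx-changer itself is a shell script); `probe` stands in for whatever command actually tells you the changer is ready, and the attempt count and delay are arbitrary assumptions.

```python
import time


def wait_until_ready(probe, attempts=10, delay=30.0, sleep=time.sleep):
    """Poll `probe` until it reports ready or the attempts run out.

    `probe` is any callable returning True when the changer accepts
    commands. Returns True as soon as it does, False if it never does.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            # Do not sleep after the final failed attempt.
            sleep(delay)
    return False


# Stand-in probe for illustration: not ready twice, then ready.
answers = iter([False, False, True])
ready = wait_until_ready(lambda: next(answers), attempts=5, delay=0)
print("changer ready" if ready else "changer still not ready, giving up")
```

Only when this returns True would the script go on to send the load/unload command; otherwise it fails cleanly instead of throwing commands at an offline changer.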


There are some pretty whizzy things you can do to monitor tape health
when a tape is in the changer using the sg_attr and sg_logs commands
that go _far_ beyond TapeAlert or smartmontools capabilities.

I've submitted a number of updates to sg3_utils and am in the process of
(slowly) reverse engineering what I can of the poorly documented parts
of the MAM (the cartridge's Medium Auxiliary Memory), using text output
provided by VeriTape (a proprietary Windows tool from mptapes.com which
reads MAMs).

Selfish motivation is (of course) to reduce the amount of time I have to
spend manually scanning tapes with the Windows program/standalone
scanner and then interpreting the output. All of that information is
available when the tape is in the drive, and with appropriate MAM reads
at load/unload it's possible to have the Bacula server tell me when a
tape is approaching end of life (this is far more accurate than the
simple load-cycle count Bacula uses) - and, more importantly, when tape
drives are going bad.

Having spent several man-years dealing with the fallout of bad LTO tapes
damaging drives and then those drives damaging tapes, I want to minimize
the pain if it ever happens again.

(LTO drives even return detailed information about the condition of each
of the heads if queried the right way. IBM's and HP's proprietary tools
interpret this data to produce reports, but they get it with the same
queries that sg_logs uses.)

