Re: [Bug-tar] Can't extract file from multivolume archive

Alford, Seth Mon, 21 Feb 2011 14:54:13 -0800

On Fri, Feb 04, 2011 at 06:43:09PM -0600, Paul Eggert wrote:
> On 02/04/11 07:21, [email protected] wrote:
> > Any ideas?
> 
> Is there some simple, self-contained shell script
> that reproduces the problem without requiring access
> to special hardware?  That might help us diagnose it.
> At least we could add it to the test cases.
> 
> Otherwise, I dunno, I suppose you can use GDB to figure
> out what went wrong.
>


I saw the same problem which stinga described, twice.  

Once, like stinga, in a non-portable environment.

Once, while I was writing a script, test_multiple_volumes, where
I tried to get the problem to happen again.
test_multiple_volumes is attached.  

I have not been able to get the problem to repeat.

So I think that the problem which stinga saw exists, but it is
intermittent.

Information about the script is in the comment block at the
beginning of the script.  Basically it creates 3 loopback disks,
a bunch of files to write to those "disks", and experiments with
filling the disks using a tar command with the --multi-volume
option.
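
For anyone who wants to poke at multi-volume behavior without root or
loopback mounts, here is a minimal sketch of a different, smaller setup
than the attached script.  It assumes GNU tar and uses --tape-length
with several --file options so that ordinary files act as the volumes.
The function name make_volumes, the member names, and the sizes are
invented for illustration, and there is no claim that this setup
triggers the intermittent failure.

```shell
#!/bin/sh
# Sketch only: builds a three-volume archive from ordinary files, no root needed.
# Assumes GNU tar; make_volumes, member names, and sizes are made up for illustration.
make_volumes() {
    # return 2 if the tar on PATH is not GNU tar
    tar --version 2>/dev/null | grep -q 'GNU tar' || return 2
    workdir=$(mktemp -d) || return 1
    cd "$workdir" || return 1
    for n in 1 2 3 4 5 6
    do
        # six 40 KiB members, about 240 KiB in total
        dd if=/dev/zero of="member$n" bs=1024 count=40 2>/dev/null
    done
    # --tape-length counts units of 1024 bytes; each --file names one volume.
    # stdin comes from /dev/null so tar cannot block on a volume prompt.
    tar --create --multi-volume --tape-length=100 \
        --file=vol1.tar --file=vol2.tar --file=vol3.tar member? </dev/null
}
```

After make_volumes succeeds, the shell is left in the scratch
directory, and something like

rm member4; tar --verbose --extract --file=vol2.tar member4

exercises extraction from a later volume on its own.  With these sizes
member4 should sit wholly inside vol2.tar, though that depends on tar's
header overhead.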

Rather than try to continue to get this script to "fail" in the
right way, I thought I would send it along to the list.  Maybe
someone else can get it to mis-behave.  That way we can get
the problem to repeat, and maybe get it fixed.

To get this script to work, you have to run it as root.  THIS
IS POTENTIALLY DANGEROUS.  While developing this script, it
inadvertently deposited files in the / directory, rather than
in the local directory where the script ran.  I think I fixed
that problem, but you never know.  So I recommend that you:
* go over this script very carefully before you run it,
* run it on a test machine with no important data,
* and/or run it on a test machine that may become
  non-functional without anyone caring.
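
Part of that caution can be automated.  A minimal sketch of a guard
that refuses to proceed when the working directory is missing or
resolves to /; the function name safe_workdir is hypothetical and the
attached script does not contain such a check.

```shell
# Hypothetical helper, not part of the attached script.
# Succeeds only if $1 is an existing directory whose physical path is not "/".
safe_workdir() {
    resolved=$(cd "$1" 2>/dev/null && pwd -P) || return 1
    [ "$resolved" != "/" ]
}

if safe_workdir .
then
    echo "working directory looks safe"
else
    echo "refusing to run in / or in a missing directory" >&2
fi
```

This only catches the specific accident described above (files landing
in /); it is no substitute for reading the script first.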

To run this script, find a large empty directory on your test
machine (as described above) on a partition that has at least
1Gbyte of available disk space.  Save this script as
test_multiple_volumes in that directory.  As root, type

./test_multiple_volumes

By default, the script asks if you see desired output from df
at a couple of places.  Add the --noninteractive flag to get it
to just run without asking you questions.
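
Because the failure looks intermittent, it may take many runs to see
it.  The script exits non-zero when it detects missing files, so a
wrapper can rerun it until that happens.  A minimal sketch;
run_until_failure is a hypothetical helper, not something shipped with
tar or with the script.

```shell
# Hypothetical helper: run a command up to MAX times, stop at the first failure.
# Usage: run_until_failure MAX command [args...]
run_until_failure() {
    max=$1
    shift
    n=0
    while [ "$n" -lt "$max" ]
    do
        n=$((n + 1))
        if ! "$@"
        then
            echo "failed on run $n"
            return 1
        fi
    done
    echo "no failure in $n runs"
    return 0
}

# Example (as root, in the test directory; SCRIPT is whatever name you
# saved the attachment under):
#   run_until_failure 50 ./SCRIPT --noninteractive
```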

--Seth Alford
[email protected]



#!/bin/bash
# test_multiple_volumes: test restores from individual multi-volume tar archives
# Copyright (C) 2011 Automatic Data Processing
# $Id: test_multiple_volumes /main/4 2011/02/21 20:34:24 alfords Exp $
#
# Licensed under terms of the GPLv3+.  Find a copy of the license at
# http://www.gnu.org/licenses/ .
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# This is a script to test extracting files from later volumes of
# a multi-volume tar archive.  This is version 0.1 of this
# script.
#
# I saw the same failure which stinga described in
# http://lists.gnu.org/archive/html/bug-tar/2011-02/msg00008.html
# Like stinga, the system on which I saw the problem is not portable.
#
# Paul Eggert (see
# http://lists.gnu.org/archive/html/bug-tar/2011-02/msg00009.html)
# asked for a self-contained script to duplicate the problem.  I
# tried to write that script.  I even saw the problem, once,
# using the script.  But then the problem went away.  tar behaved
# itself.  I could retrieve files from second and later tar
# archives, without having to run through the entire archive set.
#
# So I think that the problem which stinga saw exists, but it is
# intermittent.
#
# Rather than try to continue to get this script to "fail" in the
# right way, I thought I would send it along to the list.  Maybe
# someone else can get tar to mis-behave.  That way we can get
# the problem to repeat, and maybe get it fixed.
#
# To get this script to work, you have to run it as root.  THIS
# IS POTENTIALLY DANGEROUS.  While developing this script, it
# inadvertently deposited files in the / directory, rather than
# in the local directory where the script ran.  I think I fixed
# that problem, but you never know.  So I recommend that you:
# * go over this script very carefully before you run it,
# * run it on a test machine with no important data,
# * and/or run it on a test machine that may become
#   non-functional without anyone caring.
#
# To run this script, find a large empty directory on your test
# machine (as described above) on a partition that has at least
# 1Gbyte of available disk space.  Save this script as
# test_multiple_volumes in that directory.  As root, type
#
# ./test_multiple_volumes
#
# By default, the script asks if you see desired output from df
# at a couple of places.  Add the --noninteractive flag to get it
# to just run without asking you questions.
# --Seth Alford
# [email protected]

rm_files_on_backup2="testfile036 testfile041 testfile042 testfile047 testfile051 testfile060"
rm_files_on_backup3="testfile067 testfile071 testfile080"
INTERACTIVE=1

if [ "$1" = "--noninteractive" ]
then
        INTERACTIVE=0
fi

cat << ClEaNuP
First, clean up junk files that might be left from the previous
time this script was run.  You may see some error messages about
non-existent mount points or files.

ClEaNuP

for i in 1 2 3
do
        umount backup.$i
        rm -rf file.$i backup.$i testfile* the_volume_list \
                the_volume_list.orig the_info_script
done

cat << setUPdIsKs

Next: set up some "disks".  No, they are not really disks.
Instead, they are big empty files that will be mounted as
loopback devices.  Make each one 100 Mbyte big.  dd the /dev/zero
file into files called file.1, file.2, and file.3.

setUPdIsKs

for i in 1 2 3
do
        dd if=/dev/zero of=file.$i bs=1M count=100 &
done

cat << DoThEDds

There are 3 dd's going in the background.  Wait for them to run
to completion.

DoThEDds

wait    # wait for the background dd's to finish

cat << doMkE2fs

Now put a filesystem on each of the "disks", using mke2fs.
Again, to save time, I'm putting them in the background.  You
will notice messages coming from the mke2fs processes.

doMkE2fs

for i in 1 2 3
do
        mke2fs -F file.$i &
done

cat << WaIt4It

There are 3 mke2fs's going in the background.  Wait for them to run
to completion.

WaIt4It

wait    # for all the background mke2fs's to finish


cat << setUPLoOpBaCkS

Now set up loopback mount points for the "disks".  And
mount the disks on the mount points.  If you are not running as
root, this may fail.

setUPLoOpBaCkS

for i in 1 2 3
do
        mkdir backup.$i
        mount file.$i backup.$i -o loop
        echo backup.$i >> the_volume_list.orig
done

cat << sHoUlDbE

There should be 3 disks mounted on 3 loopback devices on your
system.  Here is what df -k says are filesystems on your system.

sHoUlDbE

df -k

yesno="y"
while [ "$INTERACTIVE" -eq 1 ]
do
        read -p "Do you see the new filesystems? [yn] " yesno

        if [[ $yesno == [YyNn]* ]]
        then
                break
        fi

        echo "Please type y[es] or n[o]"
done

if [[ $yesno == [nN]* ]]
then
        cat << ReGrEt

I am not sure what happened, but without the test volumes this
script cannot test tar.  Please make sure you have a large enough
filesystem to hold the files which hold the loopback devices.

ReGrEt

        exit 1
fi

cat << MaKeTeStFiLeS

Next step: create some test files which are big enough and numerous enough
to fill up most of the 3 loopback "disks" just created, above.  Then
start the tar to write the files to the "disks".

MaKeTeStFiLeS

# Create the first testfile "by hand"
let fileno=0
let lineno=0
testfile_name=`printf "testfile%.3d" $fileno`
while [ $lineno -lt 100000 ]
do
        printf "testfile%.3d line %.12d\n" $fileno $lineno
        let lineno=lineno+1
done > $testfile_name

let fileno=1
# Let sed do the work in creating the rest of the test files.
while [ $fileno -lt 90 ]
do
        testfile_name=`printf "testfile%.3d" $fileno`
        sed -e "s/^.* line/$testfile_name line/" < testfile000 > "$testfile_name"
        let fileno=fileno+1
done

cat > the_info_script << OuTpUt_tar_info_script
#!/bin/bash

if [ ! -s the_volume_list ]
then
        echo Sorry, no more mount points to write tar files, exiting
        exit 1
fi

next_mount_point=\`head -1 the_volume_list\`
echo A message from the_info_script
echo About to write \$next_mount_point/full.tar into the TAR_FD descriptor \$TAR_FD
echo \$next_mount_point/full.tar >&\$TAR_FD
tail -n +2 the_volume_list > the_volume_list.new
mv the_volume_list.new the_volume_list
echo TAR_VOLUME is \$TAR_VOLUME and TAR_ARCHIVE is \$TAR_ARCHIVE
mountpoint=\`dirname \$TAR_ARCHIVE\`
echo umounting \$mountpoint
umount \$mountpoint
exit 0
OuTpUt_tar_info_script

chmod 755 the_info_script

# Create the_volume_list from the_volume_list.orig, for the next instance
# of running tar with multiple volumes.  Strip off the first entry since
# that directory is provided to tar on the command line.
volume1=`head -1 the_volume_list.orig`
tail -n +2 the_volume_list.orig > the_volume_list

tar --verbose --create --multi-volume --info-script=./the_info_script testfile* \
        --file=$volume1/full.tar

cat << rEmOuNtInG
Re-mounting backup.1 and backup.2, which the_info_script should have
unmounted.

rEmOuNtInG

for i in 1 2
do
        mount file.$i backup.$i -o loop
done

cat <<wEsHoUlD

Now, backup.1 and backup.2 directories should be full,
and backup.3 should be mostly full.  Here is what df says:

wEsHoUlD

df -k

yesno="y"
while [ "$INTERACTIVE" -eq 1 ]
do
        read -p "Do you see the full and almost full filesystems? [yn] " yesno

        if [[ $yesno == [YyNn]* ]]
        then
                break
        fi

        echo "Please type y[es] or n[o]"
done

if [[ $yesno == [nN]* ]]
then
        cat << ReGrEt2

Unknown error.  Please check if your partition ran out of room.

ReGrEt2

        exit 1
fi

cp the_volume_list.orig the_volume_list

cat <<sHoUlDwOrK

The files:

$rm_files_on_backup2

should be in the middle of the second backup volume, which should be under
backup.2/full.tar.

The files

$rm_files_on_backup3

should be in the middle of the third backup volume, which should be under
backup.3/full.tar.

Next: remove both sets of files, then extract them using a multiple
volume extraction with the command:

tar --verbose --extract --multi-volume --info-script=./the_info_script --file=$volume1/full.tar $rm_files_on_backup2 $rm_files_on_backup3

Yes, the script re-populated the_volume_list with the list of backup volumes.
So this should work.

sHoUlDwOrK

rm $rm_files_on_backup2 $rm_files_on_backup3

cat << yOuShOuLd

Now to retrieve both sets of files.  You should see messages from
the_info_script at the end of each volume, where it tells tar from where to
read the next volume.  In between the volumes, you should see the names
of files which tar extracts, as well as some tar warnings.

yOuShOuLd


# Create the_volume_list from the_volume_list.orig, for the next instance
# of running tar with multiple volumes.  Strip off the first entry since
# that directory is provided to tar on the command line.
tail -n +2 the_volume_list.orig > the_volume_list

tar --verbose --extract --multi-volume --info-script=./the_info_script \
        --file=$volume1/full.tar $rm_files_on_backup2 $rm_files_on_backup3

for i in $rm_files_on_backup2 $rm_files_on_backup3
do
        if [ ! -f $i ]
        then
                echo $i did not get restored.  Something is wrong. Exiting.
                exit 1
        fi
done

cat << rEmOuNtInG
Re-mounting backup.1 and backup.2, which the_info_script should have
unmounted, again.

rEmOuNtInG

for i in 1 2
do
        mount file.$i backup.$i -o loop
done

cat << FaIls4Me

Now to once again remove both sets of files.  But, this time, the
script will retrieve each file from the second and third tar
volumes, individually, without going through volume1 first.  The
script will use these commands to retrieve the files from the
tar archives:

for each_file in $rm_files_on_backup2
do
tar --verbose --extract --file=backup.2/full.tar \$each_file
done

for each_file in $rm_files_on_backup3
do
tar --verbose --extract --file=backup.3/full.tar \$each_file
done

FaIls4Me

rm $rm_files_on_backup2 $rm_files_on_backup3

for each_file in $rm_files_on_backup2
do
        tar --verbose --extract --file=backup.2/full.tar $each_file
done

for each_file in $rm_files_on_backup3
do
        tar --verbose --extract --file=backup.3/full.tar $each_file
done

echo Are the files back?

missing=""
for i in $rm_files_on_backup2 $rm_files_on_backup3
do
        if [ ! -f $i ]
        then
                missing="$missing $i"
        fi
done

if [ -z "$missing" ]
then
        cat << AlLbAcK
All the files were successfully retrieved.  Some tar users have
seen what appear to be intermittent errors when retrieving files
from multi-volume backup sets like this script tries to do.  If
you see this error, please report it to the bug-tar email list.
AlLbAcK
        exit 0
fi

cat << iNfOsAyS

This failed despite what the info tar page says:

   You can read each individual volume of a multi-volume archive as if
   it were an archive by itself.  For example, to list the contents of one
   volume, use \`--list', without \`--multi-volume' specified.  To extract
   an archive member from one volume (assuming it is described that
   volume), use \`--extract', again without \`--multi-volume'.

Run "info tar" on your nearest Linux system and look for

   You can read each individual volume

iNfOsAyS

# Exit non-zero so that if someone feels like running this over
# and over in a loop they can stop when the script detects an
# error.
exit 1
