Hi Jean-Louis,

I thought I had applied the patches on this machine also, but it turns out I
didn't (sorry about that). I applied the patches and ran a new dump. This
time all 4 file systems succeeded, though /usr got the same "strange"
message as on the other machine:

1251752282.483677: sendbackup: pid 15574 ruid 150 euid 150 version
2.6.2alpha: start at Mon Aug 31 14:58:02 2009
1251752282.484788: sendbackup: Version 2.6.2alpha
1251752282.523952: sendbackup: pid 15574 ruid 150 euid 150 version
2.6.2alpha: rename at Mon Aug 31 14:58:02 2009
1251752282.527795: sendbackup:   Parsed request as: program `DUMP'
1251752282.527880: sendbackup:                      disk `/usr'
1251752282.527931: sendbackup:                      device `/usr'
1251752282.527978: sendbackup:                      level 0
1251752282.528025: sendbackup:                      since NODATE
1251752282.528072: sendbackup:                      options `'
1251752282.533704: sendbackup: start: bromine.example.com:/usr lev 0
1251752282.537190: sendbackup: dumping device '/dev/rsd0d' with 'ffs'
1251752282.543664: sendbackup: Spawning "/sbin/dump dump 0usf 1048576 -
/dev/rsd0d" in pipeline
1251752282.548532: sendbackup: Started backup
1251752282.561558: sendbackup: fd 3 is set with O_NONBLOCK before dup2
1251752282.625055: sendbackup: Started index creator: "/sbin/restore -tvf -
2>&1 | sed -e '
s/^leaf[        ]*[0-9]*[       ]*\.//
t
/^dir[  ]/ {
s/^dir[         ]*[0-9]*[       ]*\.//
s%$%/%
t
}
d
'"
1251752282.627853: sendbackup:  90:  normal(|):   DUMP: Date of this level 0
dump: Mon Aug 31 14:58:02 2009
1251752282.641040: sendbackup:  90:  normal(|):   DUMP: Date of last level 0
dump: the epoch
1251752282.754862: sendbackup:  90:  normal(|):   DUMP: Dumping /dev/rsd0d
(/usr) to standard output
1251752282.792706: sendbackup:  90:  normal(|):   DUMP: mapping (Pass I)
[regular files]
1251752286.777085: sendbackup:  90:  normal(|):   DUMP: mapping (Pass II)
[directories]
1251752286.779538: sendbackup:  90:  normal(|):   DUMP: estimated 653358
tape blocks.
1251752286.789190: sendbackup:  90:  normal(|):   DUMP: Volume 1 started at:
Mon Aug 31 14:58:06 2009
1251752286.827782: sendbackup:  90:  normal(|):   DUMP: dumping (Pass III)
[directories]
1251752290.646004: sendbackup:  90:  normal(|):   DUMP: dumping (Pass IV)
[regular files]
1251752290.822019: sendbackup: 115: strange(?): sed: stdout: Resource
temporarily unavailable
1251752424.852658: sendbackup:  43:    size(|):   DUMP: 664333 tape blocks
1251752424.854957: sendbackup:  90:  normal(|):   DUMP: Date of this level 0
dump: Mon Aug 31 14:58:02 2009
1251752424.857083: sendbackup:  90:  normal(|):   DUMP: Volume 1 completed
at: Mon Aug 31 15:00:24 2009
1251752424.859064: sendbackup:  90:  normal(|):   DUMP: Volume 1 took
0:02:18
1251752424.860994: sendbackup:  90:  normal(|):   DUMP: Volume 1 transfer
rate: 4814 KB/s
1251752424.863042: sendbackup:  90:  normal(|):   DUMP: Date this dump
completed:  Mon Aug 31 15:00:24 2009
1251752424.864993: sendbackup:  90:  normal(|):   DUMP: Average transfer
rate: 4814 KB/s
1251752424.869138: sendbackup:  90:  normal(|):   DUMP: level 0 dump on Mon
Aug 31 14:58:02 2009
1251752424.878026: sendbackup:  90:  normal(|):   DUMP: DUMP IS DONE
1251752424.888324: sendbackup: Index pipe exited with status 1
1251752424.906986: sendbackup: Parsed backup messages
1251752424.907361: sendbackup: pid 15574 finish time Mon Aug 31 15:00:24
2009

So now I've seen the same behavior that Stan noted: applying the debug patch
seems to "fix" the dump.

Also, in case you didn't see it before, here is a snippet from an email
thread Stan started on an OpenBSD list:
> >  I am trying to get Amanda of a recent vintage working on 4.5. The
developrs
> >  don't understand how writing to a blocking pipe can return EAGAIN.
> >  Should the write just block?
>
> Yes, it should, and a quick, about-to-go-to-bed readthrough of the
> pipe write path shows that it shouldn't; is Amanda doing fcntls to
> the fd? If so, check those to be absolutely sure that they're not
> doing something hinkey.

Thanks,
Michael

On Mon, Aug 31, 2009 at 2:41 PM, Jean-Louis Martineau
<martin...@zmanda.com>wrote:

> Do you have the patch I sent to stan on this system?
>
> The patch check before and after the write if the pipe is in O_NONBLOCK or
> not and give an error if it is.
>
> I'm totally lost since it is in blocking mode and you get EAGAIN, which is
> impossible!!!!
>
> Jean-louis
>
> Michael Burk wrote:
>
>> Here are the two sendbackup.*.debug files for the / fs.
>>
>> First:
>>
>> 1251693102.343436: sendbackup: pid 29087 ruid 150 euid 150 version
>> 2.6.2alpha: start at Sun Aug 30 22:31:42 2009
>> 1251693102.345326: sendbackup: Version 2.6.2alpha
>> 1251693102.379331: sendbackup: pid 29087 ruid 150 euid 150 version
>> 2.6.2alpha: rename at Sun Aug 30 22:31:42 2009
>> 1251693102.382860: sendbackup:   Parsed request as: program `DUMP'
>> 1251693102.382940: sendbackup:                      disk `/'
>> 1251693102.382991: sendbackup:                      device `/'
>> 1251693102.383039: sendbackup:                      level 0
>> 1251693102.383086: sendbackup:                      since NODATE
>> 1251693102.383134: sendbackup:                      options `'
>> 1251693102.388833: sendbackup: start: bromine.example.com:/ lev 0
>> 1251693102.391475: sendbackup: dumping device '/dev/rsd0a' with 'ffs'
>> 1251693102.398325: sendbackup: Spawning "/sbin/dump dump 0usf 1048576 -
>> /dev/rsd0a" in pipeline
>> 1251693102.403120: sendbackup: Started backup
>> 1251693102.461663: sendbackup:  90:  normal(|):   DUMP: Date of this level
>> 0 dump: Sun Aug 30 22:31:42 2009
>> 1251693102.468669: sendbackup:  90:  normal(|):   DUMP: Date of last level
>> 0 dump: the epoch
>> 1251693102.487536: sendbackup: Started index creator: "/sbin/restore -tvf
>> - 2>&1 | sed -e '
>> s/^leaf[        ]*[0-9]*[       ]*\.//
>> t
>> /^dir[  ]/ {
>> s/^dir[         ]*[0-9]*[       ]*\.//
>> s%$%/%
>> t
>> }
>> d
>> '"
>> 1251693102.519939: sendbackup:  90:  normal(|):   DUMP: Dumping /dev/rsd0a
>> (/) to standard output
>> 1251693102.588086: sendbackup:  90:  normal(|):   DUMP: mapping (Pass I)
>> [regular files]
>> 1251693104.583392: sendbackup:  90:  normal(|):   DUMP: mapping (Pass II)
>> [directories]
>> 1251693104.585824: sendbackup:  90:  normal(|):   DUMP: estimated 35134
>> tape blocks.
>> 1251693104.595314: sendbackup:  90:  normal(|):   DUMP: Volume 1 started
>> at: Sun Aug 30 22:31:44 2009
>> 1251693104.615444: sendbackup:  90:  normal(|):   DUMP: dumping (Pass III)
>> [directories]
>> 1251693104.919691: sendbackup:  90:  normal(|):   DUMP: dumping (Pass IV)
>> [regular files]
>> 1251693105.800795: sendbackup: critical (fatal): index tee cannot write
>> [Resource temporarily unavailable]
>> 1251693105.803486: sendbackup: 115: strange(?): sendbackup: index tee
>> cannot write [Resource temporarily unavailable]
>> 1251693105.827339: sendbackup:  90:  normal(|):   DUMP: Broken pipe
>> 1251693105.832068: sendbackup:  90:  normal(|):   DUMP: The ENTIRE dump is
>> aborted.
>> 1251693105.839675: sendbackup: critical (fatal): error [dump (7257)
>> /sbin/dump returned 3]
>>
>> Second:
>>
>> 1251693123.840560: sendbackup: pid 32144 ruid 150 euid 150 version
>> 2.6.2alpha: start at Sun Aug 30 22:32:03 2009
>> 1251693123.841613: sendbackup: Version 2.6.2alpha
>> 1251693123.892773: sendbackup: pid 32144 ruid 150 euid 150 version
>> 2.6.2alpha: rename at Sun Aug 30 22:32:03 2009
>> 1251693123.896360: sendbackup:   Parsed request as: program `DUMP'
>> 1251693123.896437: sendbackup:                      disk `/'
>> 1251693123.896487: sendbackup:                      device `/'
>> 1251693123.896535: sendbackup:                      level 0
>> 1251693123.896584: sendbackup:                      since NODATE
>> 1251693123.896632: sendbackup:                      options `'
>> 1251693123.902332: sendbackup: start: bromine.example.com:/ lev 0
>> 1251693123.905012: sendbackup: dumping device '/dev/rsd0a' with 'ffs'
>> 1251693123.911778: sendbackup: Spawning "/sbin/dump dump 0usf 1048576 -
>> /dev/rsd0a" in pipeline
>> 1251693123.916860: sendbackup: Started backup
>> 1251693123.992261: sendbackup: Started index creator: "/sbin/restore -tvf
>> - 2>&1 | sed -e '
>> s/^leaf[        ]*[0-9]*[       ]*\.//
>> t
>> /^dir[  ]/ {
>> s/^dir[         ]*[0-9]*[       ]*\.//
>> s%$%/%
>> t
>> }
>> d
>> '"
>> 1251693123.995743: sendbackup:  90:  normal(|):   DUMP: Date of this level
>> 0 dump: Sun Aug 30 22:32:03 2009
>> 1251693124.009013: sendbackup:  90:  normal(|):   DUMP: Date of last level
>> 0 dump: the epoch
>> 1251693124.038977: sendbackup:  90:  normal(|):   DUMP: Dumping /dev/rsd0a
>> (/) to standard output
>> 1251693124.143454: sendbackup:  90:  normal(|):   DUMP: mapping (Pass I)
>> [regular files]
>> 1251693126.173916: sendbackup:  90:  normal(|):   DUMP: mapping (Pass II)
>> [directories]
>> 1251693126.176287: sendbackup:  90:  normal(|):   DUMP: estimated 35134
>> tape blocks.
>> 1251693126.185677: sendbackup:  90:  normal(|):   DUMP: Volume 1 started
>> at: Sun Aug 30 22:32:06 2009
>> 1251693126.205305: sendbackup:  90:  normal(|):   DUMP: dumping (Pass III)
>> [directories]
>> 1251693126.505055: sendbackup:  90:  normal(|):   DUMP: dumping (Pass IV)
>> [regular files]
>> 1251693126.797271: sendbackup: critical (fatal): index tee cannot write
>> [Resource temporarily unavailable]
>> 1251693126.799929: sendbackup: 115: strange(?): sendbackup: index tee
>> cannot write [Resource temporarily unavailable]
>> 1251693126.821488: sendbackup:  90:  normal(|):   DUMP: Broken pipe
>> 1251693126.825377: sendbackup:  90:  normal(|):   DUMP: The ENTIRE dump is
>> aborted.
>> 1251693126.833573: sendbackup: critical (fatal): error [dump (22791)
>> /sbin/dump returned 3]
>>
>> Thanks,
>> Michael
>>
>> On Mon, Aug 31, 2009 at 12:49 PM, Jean-Louis Martineau <
>> martin...@zmanda.com <mailto:martin...@zmanda.com>> wrote:
>>
>>    You get the error in the index pipe instead of the data path. The
>>    backup is correct but your index is empty. That's why you get a
>>    STRANGE result instead of a failure.
>>
>>    Can you post a sendbackup.*.debug for a dle that failed?
>>
>>    Jean-Louis
>>
>>    Michael Burk wrote:
>>
>>        Hello,
>>
>>        I applied the patches to the 20090827 snapshot. I tried it on
>>        two OpenBSD 4.5 sparc64 systems, forcing both to do full
>>        backups. One system seemed to work on all 3 file systems, the
>>        other failed on all 4 file systems with the same errors as
>>        before (exactly like what Stan reported). I'm using bsdtcp auth.
>>
>>        The system that worked, however, did give a "strange" result
>>        in the report on one filesystem. Here's the corresponding
>>        "sendbackup" debug file:
>>
>>        1251692700.529842: sendbackup: pid 30459 ruid 150 euid 150
>>        version 2.6.2alpha: start at Sun Aug 30 2
>>        2:25:00 2009
>>        1251692700.530411: sendbackup: Version 2.6.2alpha
>>        1251692700.553135: sendbackup: pid 30459 ruid 150 euid 150
>>        version 2.6.2alpha: rename at Sun Aug 30
>>        22:25:00 2009
>>        1251692700.555394: sendbackup:   Parsed request as: program `DUMP'
>>        1251692700.555435: sendbackup:                      disk `/usr'
>>        1251692700.555461: sendbackup:                      device `/usr'
>>        1251692700.555484: sendbackup:                      level 0
>>        1251692700.555507: sendbackup:                      since NODATE
>>        1251692700.555530: sendbackup:                      options `'
>>        1251692700.556750: sendbackup: start:
>>        zirconium.example.com:/usr lev 0
>>        1251692700.559495: sendbackup: dumping device '/dev/rsd1d'
>>        with 'ffs'
>>        1251692700.566094: sendbackup: Spawning "/sbin/dump dump 0usf
>>        1048576 - /dev/rsd1d" in pipeline
>>        1251692700.571185: sendbackup: Started backup
>>        1251692700.572425: sendbackup: fd 3 is set with O_NONBLOCK
>>        before dup2
>>        1251692700.597388: sendbackup: Started index creator:
>>        "/sbin/restore -tvf - 2>&1 | sed -e '
>>        s/^leaf[        ]*[0-9]*[       ]*\.//
>>        t
>>        /^dir[  ]/ {
>>        s/^dir[         ]*[0-9]*[       ]*\.//
>>        s%$%/%
>>        t
>>        }
>>        d
>>        '"
>>        1251692700.611526: sendbackup:  90:  normal(|):   DUMP: Date
>>        of this level 0 dump: Sun Aug 30 22:25:
>>        00 2009
>>        1251692700.669703: sendbackup:  90:  normal(|):   DUMP: Date
>>        of last level 0 dump: the epoch
>>        1251692700.670724: sendbackup:  90:  normal(|):   DUMP:
>>        Dumping /dev/rsd1d (/usr) to standard output
>>        1251692700.763600: sendbackup:  90:  normal(|):   DUMP:
>>        mapping (Pass I) [regular files]
>>        1251692705.618345: sendbackup:  90:  normal(|):   DUMP:
>>        mapping (Pass II) [directories]
>>        1251692705.620040: sendbackup:  90:  normal(|):   DUMP:
>>        estimated 1330085 tape blocks.
>>        1251692705.623976: sendbackup:  90:  normal(|):   DUMP: Volume
>>        1 started at: Sun Aug 30 22:25:05 2009
>>        1251692705.650687: sendbackup:  90:  normal(|):   DUMP:
>>        dumping (Pass III) [directories]
>>        1251692745.051547: sendbackup:  90:  normal(|):   DUMP:
>>        dumping (Pass IV) [regular files]
>>        1251692747.915767: sendbackup: 115: strange(?): sed: stdout:
>>        Resource temporarily unavailable
>>        1251692954.633083: sendbackup:  43:    size(|):   DUMP:
>>        1437829 tape blocks
>>        1251692954.637414: sendbackup:  90:  normal(|):   DUMP: Date
>>        of this level 0 dump: Sun Aug 30 22:25:00 2009
>>        1251692954.641414: sendbackup:  90:  normal(|):   DUMP: Volume
>>        1 completed at: Sun Aug 30 22:29:14 2009
>>        1251692954.641858: sendbackup: Index pipe exited with status 1
>>        1251692954.642719: sendbackup:  90:  normal(|):   DUMP: Volume
>>        1 took 0:04:09
>>        1251692954.643918: sendbackup:  90:  normal(|):   DUMP: Volume
>>        1 transfer rate: 5774 KB/s
>>        1251692954.645095: sendbackup:  90:  normal(|):   DUMP: Date
>>        this dump completed:  Sun Aug 30 22:29:14 2009
>>        1251692954.646239: sendbackup:  90:  normal(|):   DUMP:
>>        Average transfer rate: 5774 KB/s
>>        1251692954.647430: sendbackup:  90:  normal(|):   DUMP: level
>>        0 dump on Sun Aug 30 22:25:00 2009
>>        1251692954.653480: sendbackup:  90:  normal(|):   DUMP: DUMP
>>        IS DONE
>>        1251692954.653841: sendbackup: Parsed backup messages
>>        1251692954.654032: sendbackup: pid 30459 finish time Sun Aug
>>        30 22:29:14 2009
>>
>>        -- Michael
>>
>>        On Fri, Aug 28, 2009 at 5:22 AM, stan <st...@panix.com
>>        <mailto:st...@panix.com> <mailto:st...@panix.com
>>
>>        <mailto:st...@panix.com>>> wrote:
>>
>>           On Thu, Aug 27, 2009 at 03:35:47PM -0600, Michael Burk wrote:
>>           > Cool - can you send me the patch? Or is it already in the
>>        0826
>>           snapshot?
>>           >
>>           > FYI - I got 2.5.0p2 working, but could never get 2.5.1p3
>>           working, even with
>>           > the same config. No idea why.
>>           > I started looking at the source code last night in the 0825
>>           snapshot. I'll
>>           > discontinue that search if there's a patch.
>>           >
>>           This is interesting. I sent Jean-Louis Martineau an email,
>>        saying
>>           teaks for
>>           fixing things, and he replied that the patch he sent me
>>        just added
>>           debugging,
>>           it was not intended to fix anything.
>>
>>           Bit, using this snapshot 20090813, and the patch, which I will
>>           send to you.
>>           I have it working on my 3 OpenBSD 4.5 machines that I have
>>           upgraded so far.
>>           I ran several test yesterday, and a full fledged backup run
>>        last
>>           night,
>>           and I have not seen any issues since installing this code.
>>
>>           I am going to attach the patch to this email.
>>
>>           I would greatly appreciate it, if you would apply this patch to
>>           the above
>>           snapshot, test, and tell me what your results are.
>>
>>           I am very puzzled at the moment.
>>
>>           BTW, is your failure the same as mine? That is amcheck
>>        passes, and
>>           most
>>           times you get a PARTIAL backup?
>>
>>           BTW, I am using bsdtcp "auth" and client compression now, but I
>>           have made
>>           this work with this code, with bsd auth, and no
>>        compression, so I
>>           don't
>>           think that has any affect on this issue.
>>
>>           Thanks for the help!
>>
>>           --
>>           One of the main causes of the fall of the roman empire was
>>        that,
>>           lacking
>>           zero, they had no way to indicate successful termination of
>>        their C
>>           programs.
>>
>>
>>
>>
>>
>

Reply via email to