bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-16 Thread Linda A. Walsh

   Jim Meyering wrote:

Linda A. Walsh wrote:

Hmmm  Dang strange processes on bugs...  can't submit directly but
can just by emailing it to the email list?   ...  (bureaucracy!)

Linda Walsh wrote:

This should be filed under bugs, not under support, but it seems that users of
the core utils are not allowed to find bugs...convenient.

Thanks for the report.

Please do not use savannah's bug or support interfaces for coreutils.
We deliberately disabled the former.
Now, when you send a message to the bug-coreutils mailing list,
it creates a ticket for you.  Yours is here:

[1]http://bugs.gnu.org/10055

Simply replying to any mail about it adds entries to its log.

   
   But that's not the bug DB interface... that's just a log.  Where is the bug
   DB interface for the bug in the bug database?

References

   1. http://bugs.gnu.org/10055

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-16 Thread Jim Meyering

Linda A. Walsh wrote:
...
But that's not the bug DB interface... that's just a log.  Where is the bug
DB interface for the bug in the bug database?

 References

1. http://bugs.gnu.org/10055

Here's a description of the interface:

  http://debbugs.gnu.org/

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Linda Walsh

 Original Message 
Subject: 	[sr #107875] BUG cp -u corrupts 'fs'' information if 
interupted; can't recover on future invoctions

Date:   Tue, 15 Nov 2011 17:58:23 +
From:   Linda A. Walsh invalid.nore...@gnu.org
To: Linda A. Walsh 



URL: http://savannah.gnu.org/support/?107875

Summary: BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions
Project: GNU Core Utilities
Submitted by: law
Submitted on: Tue Nov 15 09:58:22 2011
Category: None
Priority: 5 - Normal
Severity: 3 - Normal
Status: None
Privacy: Public
Assigned to: None
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Operating System: None

   ___

Details:

This should be filed under bugs, not under support, but it seems that users of
the core utils are not allowed to find bugs...convenient.  No wonder quality
metrics are worthless.

Not trying for a sensationalist summary, but you try coming up with a SHORT
accurate summary for this.

The problem is bad (in the sense of providing false assurance and not being
reliable), but not as bad as the summary might sound...

If you copy a bunch of files (or 1 file for that matter, but then it _might_
be more quickly noticed), and the copy is interrupted (most often by control-C,
cuz some param was forgotten, but there could be other causes), a partial file
with the current time stamp is left in the target location.  The corrupt copy
is not removed upon interruption, though it is marked as being current
(w/current DT stamp).

This creates a corrupt copy of the file in a collection of files that
subsequent cp -u won't correct.  This is a problem.

There is no indication of how many files in a collection are corrupted in
this manner, and the sources may have long since been deleted.
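The failure mode described above can be shown with a small sketch. It assumes only the documented rule for cp -u (copy when the destination is missing or the source is newer); the helper name update_would_copy and the file names are made up for illustration, and this is Python, not cp's own code.

```python
import os
import tempfile

def update_would_copy(src, dst):
    """Approximate the cp -u rule: copy if dst is missing or src is newer."""
    if not os.path.exists(dst):
        return True
    return os.stat(src).st_mtime_ns > os.stat(dst).st_mtime_ns

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "src.dat")
    dst = os.path.join(d, "dst.dat")
    with open(src, "wb") as f:
        f.write(b"x" * 1000)
    # Pretend the source was last modified an hour ago.
    old = os.stat(src).st_mtime - 3600
    os.utime(src, (old, old))
    # Simulate an interrupted copy: only the first 100 bytes arrived,
    # and the partial file carries the current time stamp.
    with open(dst, "wb") as f:
        f.write(b"x" * 100)
    skipped = not update_would_copy(src, dst)
    truncated = os.path.getsize(dst) < os.path.getsize(src)
    print(skipped, truncated)
```

Both values come out true: the destination is short, yet the time-stamp rule alone sees nothing to update.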


If interrupted, the cp tool should remove any partials or ensure they are not
created to begin with.

Possible ways of addressing:
A) catch INT (and other catchable signals), and remove any files that are
'incomplete'.
Besides that, several other steps could be taken to provide increasing
protections (some are orthogonal, some dependent):
B) 1) open destination name for write (verifying access) w/
      Exclusive Write;
   2) open tmp file for the actual cp operation;
   3) use posix_fallocate (if available) to allocate sufficient space for the
      copy;
   4) do the copy;
   5) rename tmp over original (closing original before rename on systems
      that don't support separation of names and FDs (Win systems et al.)).
C) reset DT stamps on newly opened files to '0' (~1969/70?) in all
non-auto-updated fields, then start the copy.  Any future
invocations of cp -u could examine the time stamps, and if the
non-auto-updated fields appear to be zero, do the copy (and correct the time
stamps), with 2 possible exception conditions being noted:
 (a) if the source file also has '0'd time fields, then check file
     sizes:
   if they match, presume 'ok' (a statistical 'guess', possibly warned
   about with a -verbose option);
   if sizes don't match, presume not a correct update and do the copy.
D) others?
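Steps 2 to 5 of option B can be sketched roughly as follows. This is a minimal Python illustration, not cp's implementation; the function name copy_via_temp and the ".cp-tmp-" prefix are invented here, step 1 (the exclusive open of the destination) is left out, and os.posix_fallocate is used only where the platform provides it.

```python
import errno
import os
import shutil
import tempfile

def copy_via_temp(src, dst, chunk=1 << 16):
    """Copy src to dst so that dst is only ever replaced by a complete copy."""
    size = os.stat(src).st_size
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Step 2: the temp file lives next to dst so the rename stays on one file system.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".cp-tmp-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            # Step 3: reserve the space up front where the platform offers it.
            if size > 0 and hasattr(os, "posix_fallocate"):
                try:
                    os.posix_fallocate(out.fileno(), 0, size)
                except OSError as e:
                    if e.errno == errno.ENOSPC:
                        raise  # no room: fail before dst is touched
                    # any other error: treat preallocation as unsupported
            # Step 4: do the copy.
            shutil.copyfileobj(inp, out, chunk)
            out.truncate()  # drop any preallocated tail if src shrank meanwhile
            out.flush()
            os.fsync(out.fileno())
        shutil.copystat(src, tmp)  # carry over mode and time stamps
        # Step 5: replace dst with the finished copy in one rename.
        os.replace(tmp, dst)
    except BaseException:
        # Interrupted or failed: remove the temp file, leave dst as it was.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

With this shape, an interrupt at any point before the rename leaves the old destination intact, so a later cp -u still sees the stale (but valid) file and updates it.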


As this is, it creates a situation of cp being unreliable.

Note, 'rsync' isn't a great substitute either, as I've noted
that when I was updating files with 'rsync' (which is always slower on full
file copies) with equivalent options, a later
usage of cp -uav to copy the files recopied most of the files
(all? not sure) that rsync had copied with -aUVHAX (supposedly the same info
as cp -au, from my understanding).

The same was not true for the reverse case (files cp'ed and updated by cp
were not updated by rsync), leading me to suspect rsync of being not only
significantly slower, but also not as thorough in copying over information.

FWIW, I feel it important to file bugs about tools that are currently the best
in their class...(and tend to devote my attentions to wanting to see them
enhanced, even beyond their original scope at times);  rsync used to have a
very basic feature which put it above cp, ... it copied extended attrs and
ACLS.  Now that cp does that, and that cp was about 2-3x faster
than rsync for full files...



bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Paul Eggert

Thanks for your thoughtful suggestions.
I like many of the ideas and hope that somebody can find the time
to code them up.  Here are some more-detailed comments.

On 11/15/11 11:07, Linda Walsh wrote:

   3). use posix_fallocate (if available) to allocate sufficient space for the
 copy

This seems like a good idea, independently of the other points.
That is, if A and B are regular files, cp A B could
use A's size to preallocate B's storage, and it could
fail immediately (without trashing B!) if there's not
enough storage.  I like this.
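A rough sketch of that idea, in Python rather than in cp's C: the destination is opened without truncation, space for A's size is requested first, and only then is B overwritten. The name copy_with_preallocation is hypothetical, os.posix_fallocate is available only on some Unix platforms, and note that a successful allocation may already extend a shorter B.

```python
import errno
import os

def copy_with_preallocation(src, dst, chunk=1 << 16):
    """Reserve room for src's size in dst before overwriting any of dst."""
    size = os.stat(src).st_size
    # No O_TRUNC: existing data in dst survives if the allocation fails.
    fd = os.open(dst, os.O_WRONLY | os.O_CREAT, 0o666)
    try:
        if size > 0 and hasattr(os, "posix_fallocate"):
            try:
                os.posix_fallocate(fd, 0, size)
            except OSError as e:
                if e.errno == errno.ENOSPC:
                    raise  # not enough storage: stop before writing
                # other errors: assume preallocation is unsupported here
        written = 0
        with open(src, "rb") as inp:
            while True:
                buf = inp.read(chunk)
                if not buf:
                    break
                view = memoryview(buf)
                while view:  # os.write may write fewer bytes than asked
                    n = os.write(fd, view)
                    view = view[n:]
                    written += n
        # Cut off whatever remains of a longer old dst or an unused reservation.
        os.ftruncate(fd, written)
    finally:
        os.close(fd)
```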

 A) catch INT ( catchable signals), and remove any files that are
 'incomplete'

That might cause trouble in other cases.  For example, cp A B where
B already exists.  In this case it's unwise to remove B if interrupted
-- people won't expect that.  And in general 'cp' has behaved the way
that it does for decades, and we need to be careful about changing its
default behavior in such a fairly-drastic way.

But we could add an option to 'cp' to have this behavior.
Perhaps --remove-destination=signal?  That is --remove-destination
could have an optional list of names of places where the destination
could be removed, where the default is not to remove it, and
plain --remove-destination means --remove-destination=before.
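What such an opt-in behavior might look like, as a Python sketch only: --remove-destination=signal is a proposal in this thread, not an existing cp option, and copy_remove_on_interrupt is a made-up name. The data is passed in as an iterable of chunks so that an interrupt can be simulated; Python delivers SIGINT as KeyboardInterrupt by default.

```python
import os

def copy_remove_on_interrupt(chunks, dst):
    """Write chunks to dst; if interrupted, remove the partial dst and re-raise."""
    started = False
    try:
        with open(dst, "wb") as out:  # this open truncates any existing dst
            started = True
            for buf in chunks:
                out.write(buf)
    except KeyboardInterrupt:
        # Only remove a file that this call has already opened and truncated.
        if started:
            try:
                os.unlink(dst)
            except FileNotFoundError:
                pass
        raise
```

The destination is removed only after this call has itself truncated it, so an untouched pre-existing B is never deleted by the handler.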

 B) 1). open destination name for write (verifying accesses) w/
   Exclusive Write;

This could be another new option, though (as you write) it's
orthogonal to the main point.  I would suggest that this option be
called --oflag=excl (by analogy with dd's oflag= option).  We can add
support for the other output flags while we're at it, e.g.,
--oflag=excl,append,noatime.
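For reference, the system-call behavior that an excl flag would map to can be sketched in Python. --oflag for cp is only a suggestion in this thread; open_exclusive is a hypothetical helper showing O_CREAT combined with O_EXCL, which makes the open fail if the name already exists.

```python
import os

def open_exclusive(dst, mode=0o666):
    """Create dst for writing; raise FileExistsError if it already exists."""
    return os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
```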

   2). open tmp file for actual cp operation.
   5); rename tmp over original; (closing original before rename on systems
 that don't support separation of names and FD's (Win systems et al).

Yes, that could be another option.  I see (2) and (5) as being the
same feature.  Perhaps --remove-destination=after?

 C) reset DT stamps on newly opened files to '0' (~1969/70?)'

I dunno, this kind of time stamp munging sounds like it'd cause more
trouble than it'd cure.  It's more natural (and easier to debug
failures) if the last-modified time of a file is the time that the
file was last modified.

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Paul Eggert

On 11/15/11 12:46, Linda A. Walsh wrote:

 Better than leaving *doo doo* in a file

Sometimes, but not always.  I can think of plausible cases where I'd
rather have a partial copy than no copy at all.  As an extreme example,
if I'm doing 'cp /dev/tty A', I do not want A removed on interrupt
even if A has already been truncated and overwritten,
as A contains the only copy of the data that I just typed in by hand.

 But we could add an option to 'cp' to have this behavior.
 Perhaps --remove-destination=signal?  That is --remove-destination
 could have an optional list of names of places where the destination
 could be removed, where the default is not to remove it, and
 plain --remove-destination means --remove-destination=before.
 
 
 I think you misunderstood the problem.

Perhaps I did.  But could you explain the problem then?  For example,
how would the proposed cp --remove-destination=signal A B
not address the problem?

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Linda A. Walsh




Paul Eggert wrote:

 A) catch INT ( catchable signals), and remove any files that are
 'incomplete'

 That might cause trouble in other cases.  For example, cp A B where
 B already exists.

I am **only** suggesting this where 'B' has already been opened
and truncated by stuff being copied from 'A'...

The point is to not leave a 'B' that is *indeterminate*.

 In this case it's unwise to remove B if interrupted
 -- people won't expect that.

Better than leaving *doo doo* in a file where they expect
something valid.

 And in general 'cp' has behaved the way
 that it does for decades, and we need to be careful about changing its
 default behavior in such a fairly-drastic way.

It's a bug... Fixing a bug isn't usually considered
drastic.

 But we could add an option to 'cp' to have this behavior.
 Perhaps --remove-destination=signal?  That is --remove-destination
 could have an optional list of names of places where the destination
 could be removed, where the default is not to remove it, and
 plain --remove-destination means --remove-destination=before.

I think you misunderstood the problem.

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Linda A. Walsh

Hmmm   Dang strange processes on bugs...  can't submit directly but
can just by emailing it to the email list?   ...  (bureaucracy!)

Linda Walsh wrote:
This should be filed under bugs, not under support, but it seems that
users of the core utils are not allowed to find bugs...convenient.

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Pádraig Brady

On 11/15/2011 08:23 PM, Paul Eggert wrote:
 Thanks for your thoughtful suggestions.
 I like many of the ideas and hope that somebody can find the time
 to code them up.  Here are some more-detailed comments.
 
 On 11/15/11 11:07, Linda Walsh wrote:
 
   3). use posix_fallocate (if available) to allocate sufficient space for the
 copy
 
 This seems like a good idea, independently of the other points.
 That is, if A and B are regular files, cp A B could
 use A's size to preallocate B's storage, and it could
 fail immediately (without trashing B!) if there's not
 enough storage.  I like this.

I'll take a look at this at some stage.
I was intending to do it right after the fiemap stuff
as it was quite related, but that needed to be bypassed
for normal copies. Anyway I'll bump fallocate
up my priority list.

 
 A) catch INT ( catchable signals), and remove any files that are
 'incomplete'
 
 That might cause trouble in other cases.  For example, cp A B where
 B already exists.  In this case it's unwise to remove B if interrupted
 -- people won't expect that.  And in general 'cp' has behaved the way
 that it does for decades, and we need to be careful about changing its
 default behavior in such a fairly-drastic way.
 
 But we could add an option to 'cp' to have this behavior.
 Perhaps --remove-destination=signal?  That is --remove-destination
 could have an optional list of names of places where the destination
 could be removed, where the default is not to remove it, and
 plain --remove-destination means --remove-destination=before.
 
 B) 1). open destination name for write (verifying accesses) w/
   Exclusive Write;
 
 This could be another new option, though (as you write) it's
 orthogonal to the main point.  I would suggest that this option be
 called --oflag=excl (by analogy with dd's oflag= option).  We can add
 support for the other output flags while we're at it, e.g.,
 --oflag=excl,append,noatime.
 
   2). open tmp file for actual cp operation.
   5); rename tmp over original; (closing original before rename on systems
 that don't support separation of names and FD's (Win systems et al).
 
 Yes, that could be another option.  I see (2) and (5) as being the
 same feature.  Perhaps --remove-destination=after?

There are lots of implementation issues with tmp files,
many of which are noted here:
http://www.pixelbeat.org/docs/unix_file_replacement.html

 
 C) reset DT stamps on newly opened files to '0' (~1969/70?)'
 
 I dunno, this kind of time stamp munging sounds like it'd cause more
 trouble than it'd cure.  It's more natural (and easier to debug
 failures) if the last-modified time of a file is the time that the
 file was last modified.

Not a bad idea and least invasive, but if the Ctrl-C happened
between the creat() and utime() you'd get a newer zero length file.
Then subsequent `cp -u` would have to treat zero length files specially.

cheers,
Pádraig.

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Linda A. Walsh

   Paul Eggert wrote:

On 11/15/11 12:46, Linda A. Walsh wrote:


Better than leaving *doo doo* in a file

Sometimes, but not always.  I can think of plausible cases where I'd
rather have a partial copy than no copy at all.  As an extreme example,
if I'm doing 'cp /dev/tty A', I do not want A removed on interrupt
even if A has already been truncated and overwritten,
as A contains the only copy of the data that I just typed in by hand.


But we could add an option to 'cp' to have this behavior.
Perhaps --remove-destination=signal?  That is --remove-destination
could have an optional list of names of places where the destination
could be removed, where the default is not to remove it, and
plain --remove-destination means --remove-destination=before.


I think you misunderstood the problem.

Perhaps I did.  But could you explain the problem then?  For example,
how would the proposed cp --remove-destination=signal A B
not address the problem?

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Linda A. Walsh

   [Thought I sent out a response to this, but can't find it in my outbox,
   so... recomposing/sending; sorry for the delay]

On 11/15/11 12:46, Linda A. Walsh wrote:


Better than leaving *doo doo* in a file

Sometimes, but not always.  I can think of plausible cases where I'd
rather have a partial copy than no copy at all.  As an extreme example,
if I'm doing 'cp /dev/tty A', I do not want A removed on interrupt
even if A has already been truncated and overwritten,
as A contains the only copy of the data that I just typed in by hand.

   Um... yeah, you could try to apply the idea in general, but it
   might have unforeseen side effects like the one you are demonstrating.
   Why don't we focus on the specific problem mentioned, which was using
   it in the context of the -u flag (and with -a/-r and/or a wildcard),
   where you expect it to update the contents of 'Dst' with 'Src'.

   In that case, you get an interrupt, and you end up with a truncated file
   in Dst that has some DateTime (not even the DT of the src file, but the
   DT at which the file was opened, or more likely closed) that will
   guarantee that a correct copy will never get updated over the now
   destroyed, bogus copy.

   Not only that, but weeks later, when you go through your backup dir and
   wonder why some file 'x' is only 1/10th the size of the rest of the
   similar backups, your original can be very gone... (not that one of the
   multiple other backups might not sub in, but that's not the point!)...
   You don't want the partially copied update, which has already destroyed
   an original, to now leave a turd in place so that no future cp -uav
   will correct the problem.

   Though (I'm sure you'd love to see this in 'cp' (*cough*)), cp could
   check file sizes and see if the target is smaller, and if so, assume,
   if the DTs were equal, that the file cp was interrupted... and finish
   it...
   Actually, that might not be a bad idea...
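One possible reading of that size check, sketched in Python: the usual -u time-stamp rule is kept, and a destination that would otherwise be skipped is still recopied when it is smaller than the source. The name update_should_copy is hypothetical, and applying the check whenever the time stamps alone would cause a skip is an assumption that goes beyond "DTs equal".

```python
import os

def update_should_copy(src, dst):
    """cp -u style decision with an extra size check for interrupted copies."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    if s.st_mtime_ns > d.st_mtime_ns:
        return True  # the plain -u rule: source is newer
    # Assumed extra rule: a smaller target is treated as an interrupted copy.
    return d.st_size < s.st_size
```

A complete copy of the same size with a newer time stamp is still skipped; a destination that happens to be legitimately smaller than the source would be recopied, which is the cost of the guess.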

But we could add an option to 'cp' to have this behavior.
Perhaps --remove-destination=signal?  That is --remove-destination
could have an optional list of names of places where the destination
could be removed, where the default is not to remove it, and
plain --remove-destination means --remove-destination=before.


I think you misunderstood the problem.

Perhaps I did.  But could you explain the problem then?  For example,
how would the proposed cp --remove-destination=signal A B
not address the problem?

   Well, if it were the default case, sure, but if default is to trash
   files, that's bad.

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Paul Eggert

On 11/15/11 19:33, Linda A. Walsh wrote:
 Why don't we
 focus on the specific problem mentioned which was using it in the context of
 the -u flag, (and with -a/-r and/or a wildcard), where you expect it to 
 update
 contents of 'Dst' with 'Src'.

I'd rather not have a heuristic that says cp removes the destination
when interrupted, if you use the -u flag with -a or -r or a wildcard.
That'd be a hard rule to remember, and it's probably not the best
rule anyway, for somebody's opinion of best.  We need a simple rule
that's easy to document and to remember, even if it isn't necessarily
the best by some other measure.

It'd be OK if cp -a implies the new --remove-destination=signal
(or whatever) option.  Then you could just use cp -a.

 cp could check file sizes and see
 if the target is smaller and if so.. assume, if the DT's were equal that the 
 file cp was
 interrupted...and finish it...

I'm still not convinced by the idea about trusting the time stamp on
the destination.  Every time 'cp' writes to its destination, it will
update the destination's time stamp.  Sure, 'cp' can use utime immediately
afterwards to alter the time stamp, but there's still a window where
the destination's time stamp will be 'now'.  In general 'cp' must
continue to work in that case -- so why should it bother to reset the
destination's time stamp after every write?
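The window described here can be observed directly. The following Python sketch (file name B and variable names are illustrative) stamps a freshly created file with the epoch and then writes to it through the still-open descriptor; on a typical local file system the write moves the modification time back to the present.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    dst = os.path.join(d, "B")
    fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    try:
        os.utime(dst, (0, 0))                # stamp atime/mtime with the epoch
        stamped = os.stat(dst).st_mtime      # the marker as just set
        os.write(fd, b"some data")           # a write updates mtime again
        after_write = os.fstat(fd).st_mtime
    finally:
        os.close(fd)
    print(stamped, after_write > stamped)
```

So a zeroed time stamp set before the copy does not survive the copy itself; it would have to be reapplied after each write, and an interrupt can still land in between.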

bug#10055: [sr #107875] BUG cp -u corrupts 'fs'' information if interupted; can't recover on future invoctions

2011-11-15 Thread Jim Meyering

Linda A. Walsh wrote:
 Hmmm  Dang strange processes on bugs...  can't submit directly but
 can just by emailing it to the email list?   ...  (bureaucracy!)

 Linda Walsh wrote:
 This should be filed under bugs, not under support, but it seems that users
 of the core utils are not allowed to find bugs...convenient.

Thanks for the report.

Please do not use savannah's bug or support interfaces for coreutils.
We deliberately disabled the former.
Now, when you send a message to the bug-coreutils mailing list,
it creates a ticket for you.  Yours is here:

http://bugs.gnu.org/10055

Simply replying to any mail about it adds entries to its log.
