Re: Safely modify a file in place -- am I doing it right?

2011-06-30 Thread Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:

> I have a script running under Python 2.5 that needs to modify files in
> place. I want to do this with some level of assurance that I won't lose
> data. E.g. this is not safe:
[snip]

Thanks to all who replied, your comments were helpful.


-- 
Steven

-- 
http://mail.python.org/mailman/listinfo/python-list


Safely modify a file in place -- am I doing it right?

2011-06-29 Thread steve+comp . lang . python
I have a script running under Python 2.5 that needs to modify files in
place. I want to do this with some level of assurance that I won't lose
data. E.g. this is not safe:

def unsafe_modify(filename):
    fp = open(filename, 'r')
    data = modify(fp.read())
    fp.close()
    fp = open(filename, 'w')  # <== original data lost here
    fp.write(data)
    fp.close()  # <== new data not saved until here

If something goes wrong writing the new data, I've lost the previous
contents.

I have come up with this approach:

import os, tempfile
def safe_modify(filename):
    fp = open(filename, 'r')
    data = modify(fp.read())
    fp.close()
    # Use a temporary file.
    loc = os.path.dirname(filename)
    fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
    # In my real code, I need a proper Python file object,
    # not just a file descriptor.
    outfile = os.fdopen(fd, 'w')
    outfile.write(data)
    outfile.close()
    # Move the temp file over the original.
    os.rename(tmpname, filename)

os.rename is an atomic operation, at least under Linux and Mac, so if the
move fails, the original file should be untouched.

This seems to work for me, but is this the right way to do it? Is there a
better/safer way?



-- 
Steven



Re: Safely modify a file in place -- am I doing it right?

2011-06-29 Thread Grant Edwards
On 2011-06-29, steve+comp.lang.pyt...@pearwood.info wrote:
> I have a script running under Python 2.5 that needs to modify files in
> place. I want to do this with some level of assurance that I won't lose
> data. E.g. this is not safe:
>
> def unsafe_modify(filename):
>     fp = open(filename, 'r')
>     data = modify(fp.read())
>     fp.close()
>     fp = open(filename, 'w')  # <== original data lost here
>     fp.write(data)
>     fp.close()  # <== new data not saved until here
>
> If something goes wrong writing the new data, I've lost the previous
> contents.
>
> I have come up with this approach:
>
> import os, tempfile
> def safe_modify(filename):
>     fp = open(filename, 'r')
>     data = modify(fp.read())
>     fp.close()
>     # Use a temporary file.
>     loc = os.path.dirname(filename)
>     fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
>     # In my real code, I need a proper Python file object,
>     # not just a file descriptor.
>     outfile = os.fdopen(fd, 'w')
>     outfile.write(data)
>     outfile.close()
>     # Move the temp file over the original.
>     os.rename(tmpname, filename)
>
> os.rename is an atomic operation, at least under Linux and Mac, so if
> the move fails, the original file should be untouched.
>
> This seems to work for me, but is this the right way to do it?

That's how Unix programs have modified files in place since time
immemorial.

> Is there a better/safer way?

Many programs rename the original file with a backup suffix (a tilde
is popular).
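
A minimal sketch of that backup-suffix variant, written for Python 3.3 or later rather than the 2.5 used in the original post. The function name modify_with_backup, the modify argument and the suffix parameter are hypothetical names chosen for illustration. Note that between the two renames there is a short window in which filename does not exist.

```python
import os
import tempfile

def modify_with_backup(filename, modify, suffix="~"):
    """Rewrite filename, leaving the old contents in filename + suffix."""
    with open(filename, "r") as fp:
        data = modify(fp.read())
    # Write the new contents to a temp file in the same directory first.
    loc = os.path.dirname(filename) or "."
    fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
    with os.fdopen(fd, "w") as outfile:
        outfile.write(data)
    # Rename the original to the backup name (replacing any older backup),
    # then move the new file into place.
    os.replace(filename, filename + suffix)
    os.rename(tmpname, filename)
```

If that window matters, the hard-link sequence discussed elsewhere in this thread avoids it.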

-- 
Grant Edwards               grant.b.edwards        Yow! It's NO USE ... I've
                                  at               gone to CLUB MED!!
                              gmail.com


Re: Safely modify a file in place -- am I doing it right?

2011-06-29 Thread Chris Torek
In article 4e0b6383$0$29996$c3e8da3$54964...@news.astraweb.com
 steve+comp.lang.pyt...@pearwood.info wrote:
> I have a script running under Python 2.5 that needs to modify files in
> place. I want to do this with some level of assurance that I won't lose
> data. ... I have come up with this approach:

[create temp file in suitable directory, write new data, and
use os.rename() to atomically swap out the old file for the
new]

As Grant Edwards said, this is the right general idea.  There
are lots of variations.  If you want to make the original
be a backup, the sequence:

os.link(original_name, backup_name)
os.rename(new_synced_file, original_name)

should generally do the trick (rename will unlink the target
which means that the backup name will refer to the original
inode).
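
The link-then-rename sequence above might be wrapped up roughly as follows. This is only a sketch: the function name replace_keeping_backup is hypothetical, it assumes all three paths are on the same filesystem and that the filesystem supports hard links, and removing a stale backup first is an added assumption, since os.link() fails when the target name already exists.

```python
import os

def replace_keeping_backup(original_name, new_synced_file, backup_name):
    """Install new_synced_file as original_name, keeping the old inode
    reachable under backup_name."""
    # Assumption: an older backup may be discarded.
    if os.path.exists(backup_name):
        os.unlink(backup_name)
    # Give the current contents a second name...
    os.link(original_name, backup_name)
    # ...then atomically point original_name at the new file.
    os.rename(new_synced_file, original_name)
```

At no point in this sequence is original_name missing: it refers to the old inode until the rename, and to the new one afterwards.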

> import os, tempfile
> def safe_modify(filename):
>     fp = open(filename, 'r')
>     data = modify(fp.read())
>     fp.close()
>     # Use a temporary file.
>     loc = os.path.dirname(filename)
>     fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
>     # In my real code, I need a proper Python file object,
>     # not just a file descriptor.
>     outfile = os.fdopen(fd, 'w')
>     outfile.write(data)
>     outfile.close()

It is a good idea to use outfile.flush() and then os.fsync() before
doing the close, as well.  Among other things, this *usually* gets
you some kind of notice-of-failure in the case of deferred writes
across a network (e.g., NFS).  (While it would be nice for os.close()
to deliver failure notices, in practice the fsync() is at least
sometimes required.  This is the OS's fault, not Python's. :-) )
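
As a rough illustration of the flush-then-fsync ordering, here is a sketch in Python 3. The helper name write_synced_temp is hypothetical. Any I/O error raised by flush() or os.fsync() propagates to the caller before the temp file is ever renamed.

```python
import os
import tempfile

def write_synced_temp(directory, data):
    """Write data to a new temp file in directory and return its name."""
    fd, tmpname = tempfile.mkstemp(dir=directory, text=True)
    outfile = os.fdopen(fd, "w")
    try:
        outfile.write(data)
        outfile.flush()              # Python's buffer -> operating system
        os.fsync(outfile.fileno())   # operating system -> storage device
    finally:
        outfile.close()
    return tmpname
```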

>     # Move the temp file over the original.
>     os.rename(tmpname, filename)
>
> os.rename is an atomic operation, at least under Linux and Mac,
> so if the move fails, the original file should be untouched.
>
> This seems to work for me, but is this the right way to do it?
> Is there a better/safer way?

For additional checking and cleanup purposes, you may want to catch
exceptions and delete the temporary file if the rename has not yet
been done (and therefore the original file is still intact).
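
One possible shape for that cleanup, sketched in Python 3. Passing modify in as a parameter is an assumption made to keep the example self-contained; in the original post it is a global function.

```python
import os
import tempfile

def safe_modify(filename, modify):
    """Rewrite filename via a temp file, removing the temp file on failure."""
    with open(filename, "r") as fp:
        data = modify(fp.read())
    loc = os.path.dirname(filename) or "."
    fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
    try:
        with os.fdopen(fd, "w") as outfile:
            outfile.write(data)
            outfile.flush()
            os.fsync(outfile.fileno())
        os.rename(tmpname, filename)
    except BaseException:
        # The rename did not complete, so the original is still intact;
        # only the temp file needs removing.
        try:
            os.unlink(tmpname)
        except OSError:
            pass
        raise
```

Catching BaseException rather than Exception means a KeyboardInterrupt during the write also cleans up; the exception is always re-raised.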

You will likely also need to fiddle with the permission bits
on the file resulting from the mkstemp() call (to make them
match those on the original file).  Alternatively, you may want
to build your own mkstemp() (this can be a bit of a challenge!).
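
For the permission bits, a minimal sketch in Python 3 follows. The helper name mkstemp_like is hypothetical. It copies only the mode bits; ownership, ACLs and extended attributes are not handled here.

```python
import os
import stat
import tempfile

def mkstemp_like(filename):
    """Create a temp file next to filename with the same permission bits."""
    loc = os.path.dirname(filename) or "."
    # mkstemp() creates the file readable and writable by the owner only.
    fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
    mode = stat.S_IMODE(os.stat(filename).st_mode)
    os.chmod(tmpname, mode)
    return fd, tmpname
```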

Finally, as I implied above in talking about the os.link()-then-
os.rename() sequence, if the original file has multiple links to
it, note that this breaks the links.  If this is not what you
want, the problem has no fully general solution (but there are
various application-specific solutions).
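
Detecting the multiple-links case, at least, is straightforward on POSIX systems. In this small sketch the name has_other_links is hypothetical; what to do when it returns True is the application-specific part.

```python
import os

def has_other_links(filename):
    """Return True if filename's inode is reachable under another name."""
    return os.stat(filename).st_nlink > 1
```

A caller might refuse the rename-based update, or fall back to rewriting the file in place, when this returns True.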
-- 
In-Real-Life: Chris Torek, Wind River Systems
Intel require I note that my opinions are not those of WRS or Intel
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)  http://web.torek.net/torek/index.html