[EMAIL PROTECTED] wrote: > i put this together to fix a bunch of files with wierd names, please > gimme feedback, i am a newbie
Ok, so let's go... Hope you won't hate me too much !-) > > #!/usr/bin/env python > import os > import sys > import string > import platform > dir = sys.argv[1] This will shadow the builtin 'dir' function. > noworky = sys.argv[2] If the user fails to provide args, the program will crash with an IndexError - which may not be very helpful. Also, a better scheme would be myprog [-opt1 [-opt2=val [-optN]]] arg1 arg2 argN Hint: the optparse module is your friend http://www.python.org/doc/2.4.2/lib/module-optparse.html > if platform.system() == 'Linux': > uglychars = ''.join( set(string.punctuation+' ') - set('/_.') ) > else: > if platform.system() == 'Windows':#this is broken because windows > is gay with case > uglychars = ''.join( set(string.punctuation+' ') - > set(':\\/_.') ) > else: > print "wtf... what platform is this anyway?" May be MacOS Classic, MacOS X or any *n*x or *BSD variant, or any other platform supporting Python - are there are some... > underscore = '_' > underscore = underscore * len(uglychars) You don't need the intermediate value: underscores = '_' * len(uglychars) > chars = string.maketrans(uglychars, underscore) > print "# PHASE I, DIRECTORIES" Why not processing dirs and files in one pass ? > for path, subdirs, files in os.walk(dir, topdown=True): Err... is the 'dir' argument supposed to be an absolute path or a relative path ? And why using the topdown option ? > oldname = path woops ! this may be the absolute path. Are you sure you want to process an absolute path ? I think you'd better process files and dirs in one path, walking bottom up (so you don't process any file twice). > newname = oldname.translate(chars) > newname = string.lower(newname) Use the 'str' object methods instead of functions from the string module: newname = newname.lower() You can also chain method/function calls: newname = oldname.translate(chars).lower() > while string.count(newname, "__") > 0: in the context of a boolean expression, 0 evaluate to False, non-zero to True. So you don't have to be so explicit: while newname.count('__'): > newname = string.replace(newname,"__","_") You don't need to actually *count* the occurrences of '__' - if there's one, that's enough: while '__' in newname: # proceed Also, a regexp may be more effective here. > while string.count(newname, "..") > 0: > newname = string.replace(newname,"..",".") Don't forget the '..' and '.' special directories in unix filesystems... > if oldname != newname: > if os.path.isfile(newname) or os.path.isdir(newname): And if there's a special file (device etc) ? hint : os.path.exists() > print oldname, "-->\n", newname, "\t\t\tERROR: file/dir > exists\n" stdout is for 'normal' program outputs (ie: outputs that may be used as inputs to another program). This kind of output should go to stderr: print >> sys.stdout, "%s --> %s : \t\t\tERROR: " "file/dir exists" % (oldname, newname,) > else: > print oldname, "-->\n", newname, "\t\t\tYAY: file > renamed\n" > if noworky == "doit": > os.renames(oldname, newname) How is the user supposed to know that he has to pass the string "doit" as a second arg ? There are some (more or less agreed upon) conventions about cli options. Like '--dry-run' to express the fact that the program shouldn't actually do more than simulate it's execution. > print "# PHASE II, FILES" > for path, subdirs, files in os.walk(dir, topdown=True): > for oldname in files: > oldname = os.path.join(path, oldname) Are you *sure* you want to operate on the *whole* *absolute* path ? (cf my thoughts about the "first phase" and the whole algorithm) > newname = oldname.translate(chars) > newname = string.lower(newname) Aren't you repeating some code here ? hint : all duplicated code should be factored out into a function. > newname = string.replace(newname,".mpeg",".mpg") > newname = string.replace(newname,".ram",".rm") > newname = string.replace(newname,".jpeg",".jpg") > newname = string.replace(newname,".qt",".mov") # outside the loop, define a dict like: ext_map = {'mpeg': 'mpg', 'ram' : 'rm', 'jpeg' : 'jpg', # etc } # then in the loop: base, ext = os.path.split(newname) if ext in ext_map: newname = "%s.%s" % (base, ext_map[ext]) > while string.count(newname, "__") > 0: > newname = string.replace(newname,"__","_") duplicated code... > while string.count(newname, "..") > 0: > newname = string.replace(newname,"..",".") > newname = string.replace(newname,"._","_") > newname = string.replace(newname,"_.",".") all this is a perfect usecase for regexps... > if oldname != newname: > if os.path.isfile(newname) or os.path.isdir(newname): > print oldname, "-->\n", newname, "\t\t\tERROR: file/dir > exists\n" > else: > print oldname, "-->\n", newname, "\t\t\tYAY: file > renamed\n" > if noworky == "doit": > os.renames(oldname, newname) duplicated code. Still alive ?-) I thing the first step would be to correct your algorithm, then use some more efficient or idiomatic constructs where possible (like using dicts instead of repeated tests, sending messages to stderr etc). Then here are some more hints: - put the processing algorithm in a dedicated function, if possible one that doesn't rely on any global variable, - factor out any duplicated code into helper functions - then add a 'main' function that take cares of options and call the 'processing' function if everything's ok. The main function should return 0 if ok, non-zero if errors (usually 2 for invalid cli options or args, 1 for other errors) - and finally add this at the end of your module: if __name__ == '__main__': sys.exit(main(sys.argv)) This will allow your script to be used either as a program or as a module. -- bruno desthuilliers python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for p in '[EMAIL PROTECTED]'.split('@')])" -- http://mail.python.org/mailman/listinfo/python-list