Re: [Tutor] Newbie & Unittest ...
Hello again everyone - and thanks for your responses. Adding the
unittest method message was something I didn't realize I could do!

On Thu, May 6, 2010 at 10:20 PM, Steven D'Aprano wrote:
> With respect to Lie, dynamically adding methods is an advanced technique
> that is overkill for what you seem to be doing, and the code he gave
> you can't work without major modification.

I think you make a good argument for simple testing ... and I already
fell victim to "It's working great! My tests pass!" when in fact the
test wasn't working at all!

Here is what I ended up doing, and it (currently) runs 52 tests.

I'm not sure if it is worth the trade-off, but I think it saved me some
typing (and makes it easy to add another file or tag key/value pair).

#!/usr/bin/env python

''' unit tests for tagging.py '''

import unittest

from mlc import filetypes

TAG_VALUES = (
    ('title', 'Christmas Waltz'),
    ('artist', 'Damon Timm'),
    ('album', 'Homemade'),
    ('albumartist', 'Damon Timm'),
    ('compilation', False ),
    ('composer', 'Damon Timm'),
    ('date', '2005'),
    ('description', 'For more music, visit: damonjustisntfunny.com'),
    ('discnumber', 1),
    ('disctotal', 1),
    ('genre', 'Folk'),
    ('tracknumber', 1),
    ('tracktotal', 10),
)

FILES = (
    filetypes.FLACFile('data/lossless/01 - Christmas Waltz.flac'),
    filetypes.MP3File('data/lossy/04 - Christmas Waltz (MP3-79).mp3'),
    filetypes.OGGFile('data/lossy/01 - Christmas Waltz (OGG-77).ogg'),
    filetypes.MP4File('data/lossy/06 - Christmas Waltz (M4A-64).m4a'),
)

class TestFileTags(unittest.TestCase):
    pass

def add_assert_equal(cls, test_name, value1, value2):
    new_test = lambda self: self.assertEqual(value1, value2)
    new_test.__doc__ = test_name
    setattr(cls, test_name, new_test)

for file in FILES:
    for key, value in TAG_VALUES:
        test_name = 'test_' + file.exts[0] + '_' + key # test_ext_key
        add_assert_equal(TestFileTags, test_name, file.tags[key], value)

if __name__ == '__main__':
    unittest.main()

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Newbie & Unittest ...
Sorry for the multiple posts ... I'll be quiet for a while until I find
a real answer!

What I wrote below doesn't actually work -- it appears to work because
all the functions have different names but they all reference a single
function ... I should have looked more closely at my initial output...
I'm going to have to look into why that is. I need a way to make each
function unique ...

On Thu, May 6, 2010 at 2:04 PM, Damon Timm wrote:
> class TestFileTags(unittest.TestCase):
>     pass
>
> for test_name, file, key, value in list_of_tests:
>     def test_func(self):
>         self.assertEqual(file.tags[key], value)
>
>     setattr(TestFileTags, test_name, test_func)
Re: [Tutor] Newbie & Unittest ...
Ooh! Wait! I found another method that is similar in style and appears
to work ...

class TestFileTags(unittest.TestCase):
    pass

for test_name, file, key, value in list_of_tests:
    def test_func(self):
        self.assertEqual(file.tags[key], value)

    setattr(TestFileTags, test_name, test_func)

I'm not sure if it is the *best* or *right* way to do it, but it does
the trick!

Damon

On Thu, May 6, 2010 at 1:53 PM, Damon Timm wrote:
> Hi Lie -
>
> Thanks for that idea -- I tried it but am getting an error. I read a
> little about the __dict__ feature but couldn't figure it out. I am going
> to keep searching around for how to dynamically add methods to a class
> ... here is the error and then the code.
>
> Thanks.
>
> # ERROR:
>
> $ python tests_tagging.py
> Traceback (most recent call last):
>   File "tests_tagging.py", line 25, in <module>
>     class TestFileTags(unittest.TestCase):
>   File "tests_tagging.py", line 31, in TestFileTags
>     __dict__[test] = new_test
> NameError: name '__dict__' is not defined
>
> # CODE:
>
> import unittest
> from mlc.filetypes import *
>
> TAG_VALUES = (
>     ('title', 'Christmas Waltz'),
>     ('artist', 'Damon Timm'),
>     ('album', 'Homemade'),
> )
>
> FILES = (
>     FLACFile('data/lossless/01 - Christmas Waltz.flac'),
>     MP3File('data/lossy/04 - Christmas Waltz (MP3-79).mp3'),
>     OGGFile('data/lossy/01 - Christmas Waltz (OGG-77).ogg'),
>     MP4File('data/lossy/06 - Christmas Waltz (M4A-64).m4a'),
> )
>
> list_of_tests = []
> for file in FILES:
>     for k, v in TAG_VALUES:
>         test_name = 'test_' + file.exts[0] + '_' + k
>         list_of_tests.append((test_name, file, k, v))
>
> class TestFileTags(unittest.TestCase):
>
>     for test in list_of_tests:
>         def new_test(self):
>             self.assertEqual(test[1].tags[test[2]],test[3])
>
>         __dict__[test] = new_test
>
> if __name__ == '__main__':
>     unittest.main()
>
>
> On Thu, May 6, 2010 at 12:26 PM, Lie Ryan wrote:
>> On 05/06/10 10:37, Damon Timm wrote:
>>> Hi - am trying to write some unit tests for my little python project -
>>> I had been hard coding them when necessary here or there but I figured
>>> it was time to try and learn how to do it properly.
>>>
>>> This test works, however, it only runs as *one* test (which either
>>> fails or passes) and I want it to run as 12 different tests (three for
>>> each file type) and be able to see which key is failing for which file
>>> type. I know I could write them all out individually but that seems
>>> unnecessary.
>>
>> One way to do what you wanted is to harness python's dynamicity and
>> generate the methods by their names:
>>
>> class TestFiles(unittest.TestCase):
>>     for methname, case in somedict:
>>         def test(self):
>>             ...
>>         __dict__[methname] = test
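
A note on the NameError in the quoted traceback: while a class body is executing, the class object does not exist yet, and __dict__ is not a name defined in that namespace. One possible way to still "generate the methods by their names" is to collect them in an ordinary dict and hand that dict to type(). This is only a sketch, assuming Python 3; ACTUAL_TAGS, EXPECTED and make_test are made-up stand-ins for the real mlc.filetypes objects.

```python
import unittest

# Hypothetical stand-in for the tags read from the audio files.
ACTUAL_TAGS = {
    "flac": {"title": "Christmas Waltz", "artist": "Damon Timm"},
    "mp3": {"title": "Christmas Waltz", "artist": "Damon Timm"},
}
EXPECTED = (("title", "Christmas Waltz"), ("artist", "Damon Timm"))


def make_test(ext, key, value):
    """Build one test method; the arguments are bound per call."""
    def test(self):
        self.assertEqual(ACTUAL_TAGS[ext][key], value)
    return test


# Collect the generated methods in a plain dict first ...
namespace = {}
for ext in ACTUAL_TAGS:
    for key, value in EXPECTED:
        namespace["test_%s_%s" % (ext, key)] = make_test(ext, key, value)

# ... then let type() build the TestCase subclass from that dict.
TestFileTags = type("TestFileTags", (unittest.TestCase,), namespace)
```

The test loader then finds four separate tests (test_flac_title, test_flac_artist, and so on) on the generated class, just as if they had been written out by hand.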
Re: [Tutor] Newbie & Unittest ...
Hi Lie -

Thanks for that idea -- I tried it but am getting an error. I read a
little about the __dict__ feature but couldn't figure it out. I am going
to keep searching around for how to dynamically add methods to a class
... here is the error and then the code.

Thanks.

# ERROR:

$ python tests_tagging.py
Traceback (most recent call last):
  File "tests_tagging.py", line 25, in <module>
    class TestFileTags(unittest.TestCase):
  File "tests_tagging.py", line 31, in TestFileTags
    __dict__[test] = new_test
NameError: name '__dict__' is not defined

# CODE:

import unittest
from mlc.filetypes import *

TAG_VALUES = (
    ('title', 'Christmas Waltz'),
    ('artist', 'Damon Timm'),
    ('album', 'Homemade'),
)

FILES = (
    FLACFile('data/lossless/01 - Christmas Waltz.flac'),
    MP3File('data/lossy/04 - Christmas Waltz (MP3-79).mp3'),
    OGGFile('data/lossy/01 - Christmas Waltz (OGG-77).ogg'),
    MP4File('data/lossy/06 - Christmas Waltz (M4A-64).m4a'),
)

list_of_tests = []
for file in FILES:
    for k, v in TAG_VALUES:
        test_name = 'test_' + file.exts[0] + '_' + k
        list_of_tests.append((test_name, file, k, v))

class TestFileTags(unittest.TestCase):

    for test in list_of_tests:
        def new_test(self):
            self.assertEqual(test[1].tags[test[2]],test[3])

        __dict__[test] = new_test

if __name__ == '__main__':
    unittest.main()

On Thu, May 6, 2010 at 12:26 PM, Lie Ryan wrote:
> On 05/06/10 10:37, Damon Timm wrote:
>> Hi - am trying to write some unit tests for my little python project -
>> I had been hard coding them when necessary here or there but I figured
>> it was time to try and learn how to do it properly.
>>
>> This test works, however, it only runs as *one* test (which either
>> fails or passes) and I want it to run as 12 different tests (three for
>> each file type) and be able to see which key is failing for which file
>> type. I know I could write them all out individually but that seems
>> unnecessary.
>
> One way to do what you wanted is to harness python's dynamicity and
> generate the methods by their names:
>
> class TestFiles(unittest.TestCase):
>     for methname, case in somedict:
>         def test(self):
>             ...
>         __dict__[methname] = test
Re: [Tutor] Newbie & Unittest ...
Hi Vincent - Thanks for your input.

Where would I put that string? In the function's docstring? Or just
as a print statement?

I have been looking online some more and it appears there may be a way
to create some sort of generator ... it's still a little confusing to
me, though. I was hoping there was an easier way. I can't imagine I
am the first person with this task to accomplish ...

Thanks,
Damon

On Thu, May 6, 2010 at 9:46 AM, Vincent Davis wrote:
> By the way, you shouldn't need to use str(file) as I did, unless it is
> not a string already. Bad habit. I am used to numbers.
> Vincent
>
> On Thursday, May 6, 2010, Vincent Davis wrote:
>> I can't think of a way to do what you ask without defining a test for each.
>> But I think what you might actually want is to define the error message to
>> report which one failed, i.e., it's one test with a meaningful error message:
>> 'Failed to load' + str(file) + ' ' + str(k) + ', ' + str(v)
>> I am no expert on unittests.
>>
>> On Wed, May 5, 2010 at 6:37 PM, Damon Timm wrote:
>> Hi - am trying to write some unit tests for my little python project -
>> I had been hard coding them when necessary here or there but I figured
>> it was time to try and learn how to do it properly.
>>
>> I've read over Python's guide
>> (http://docs.python.org/library/unittest.html) but I am having a hard
>> time understanding how I can apply it *properly* to my first test case
>> ...
>>
>> What I am trying to do is straightforward, I am just not sure how to
>> populate the tests easily. Here is what I want to accomplish:
>>
>> # code
>> import unittest
>> from mlc.filetypes import * # the module I am testing
>>
>> # here are the *correct* key, value pairs I am testing against
>> TAG_VALUES = (
>>     ('title', 'Christmas Waltz'),
>>     ('artist', 'Damon Timm'),
>>     ('album', 'Homemade'),
>> )
>>
>> # list of different file types that I want to test my tag grabbing
>> # capabilities
>> # the tags inside these files are set to match my TAG_VALUES
>> # I want to make sure my code is extracting them correctly
>> FILES = (
>>     FLACFile('data/lossless/01 - Christmas Waltz.flac'),
>>     MP3File('data/lossy/04 - Christmas Waltz (MP3-79).mp3'),
>>     OGGFile('data/lossy/01 - Christmas Waltz (OGG-77).ogg'),
>>     MP4File('data/lossy/06 - Christmas Waltz (M4A-64).m4a'),
>> )
>>
>> class TestFiles(unittest.TestCase):
>>
>>     # this is the basic test
>>     def test_values(self):
>>         '''see if values from my object match what they should match'''
>>         for file in FILES:
>>             for k, v in TAG_VALUES:
>>                 self.assertEqual(file.tags[k], v)
>>
>> This test works, however, it only runs as *one* test (which either
>> fails or passes) and I want it to run as 12 different tests (three for
>> each file type) and be able to see which key is failing for which file
>> type. I know I could write them all out individually but that seems
>> unnecessary.
>>
>> I suspect my answer lies in the Suites but I can't wrap my head around it.
>>
>> Thanks!
>>
>> Damon
[Tutor] Newbie & Unittest ...
Hi - am trying to write some unit tests for my little python project -
I had been hard coding them when necessary here or there but I figured
it was time to try and learn how to do it properly.

I've read over Python's guide
(http://docs.python.org/library/unittest.html) but I am having a hard
time understanding how I can apply it *properly* to my first test case
...

What I am trying to do is straightforward, I am just not sure how to
populate the tests easily. Here is what I want to accomplish:

# code
import unittest
from mlc.filetypes import * # the module I am testing

# here are the *correct* key, value pairs I am testing against
TAG_VALUES = (
    ('title', 'Christmas Waltz'),
    ('artist', 'Damon Timm'),
    ('album', 'Homemade'),
)

# list of different file types that I want to test my tag grabbing
# capabilities
# the tags inside these files are set to match my TAG_VALUES
# I want to make sure my code is extracting them correctly
FILES = (
    FLACFile('data/lossless/01 - Christmas Waltz.flac'),
    MP3File('data/lossy/04 - Christmas Waltz (MP3-79).mp3'),
    OGGFile('data/lossy/01 - Christmas Waltz (OGG-77).ogg'),
    MP4File('data/lossy/06 - Christmas Waltz (M4A-64).m4a'),
)

class TestFiles(unittest.TestCase):

    # this is the basic test
    def test_values(self):
        '''see if values from my object match what they should match'''
        for file in FILES:
            for k, v in TAG_VALUES:
                self.assertEqual(file.tags[k], v)

This test works, however, it only runs as *one* test (which either
fails or passes) and I want it to run as 12 different tests (three for
each file type) and be able to see which key is failing for which file
type. I know I could write them all out individually but that seems
unnecessary.

I suspect my answer lies in the Suites but I can't wrap my head around it.

Thanks!

Damon
Re: [Tutor] How to map different keys together ?
Thanks again for your input. Comments below with working (yea!) code.

On Sun, Apr 18, 2010 at 3:23 PM, ALAN GAULD wrote:
> Does what I've shown make sense?

Alan - I think I got my mind around it -- I had never used lambda
functions before, so this is new territory to me. Thanks for the
examples and I think it has me in the right direction. I included
updated code below, which is working.

2010/4/18 spir ☣:
> All of this is not only plain data, but constant! The case of non-existing
> tag for a given format is, as shown by your 'pass', not to be handled here.
> Using a class is not only overkill but (wrong in my opinion and) misleading.
>
> Anyway it won't work since you need more than simple lookup in the case
> above, meaning some process must be done, and also the case of mp3 (iirc)
> titles shown in your first post.

Denis - I thought that with all the extra functions that would be
needed, making a class would simplify it for the user (me, in this
case). I actually have another class that this will be merged into
(maybe it will make more sense then) ... I was just setting up the Tag
classes so I didn't confuse everyone with all the other code ...

Again - I believe I am on the right track (even if it is overkill --
wink). Here is what I updated and what is working:

class TagNotSupported(Exception):
    '''Raised when trying to use a tag that is not currently supported across
    filetypes. Hopefully we cover enough so this does not happen!'''

def set_tag(mutagen, tag, value):
    mutagen[tag] = value

class Tags(object):
    '''Wrapper class for a mutagen music file object.'''
    _get_tags = {}
    _set_tags = {}

    def __init__(self, mutagen):
        '''Requires a loaded mutagen object to get us rolling'''
        self._mutagen = mutagen

    def keys(self):
        '''Get list of tag keys in the file.'''
        keys = []
        for key in self._get_tags.keys():
            try:
                self._get_tags[key](self._mutagen)
                keys.append(key)
            except KeyError:
                pass

        return keys

    def save(self):
        '''Save the mutagen changes.'''
        self._mutagen.save()

class MP4Tags(Tags):
    _get_tags = {
        'album' : lambda x: x['\xa9alb'],
        'artist': lambda x: x['\xa9ART'],
        'albumartist' : lambda x: x['aART'],
        'compilation' : lambda x: x['cpil'],
        'composer' : lambda x: x['\xa9wrt'],
        'description' : lambda x: x['\xa9cmt'],
        'discnumber': lambda x: [str(x['disk'][0][0]).decode('utf-8')],
        'disctotal' : lambda x: [str(x['disk'][0][1]).decode('utf-8')],
        'genre' : lambda x: x['\xa9gen'],
        'title' : lambda x: x['\xa9nam'],
        'tracknumber' : lambda x: [str(x['trkn'][0][0]).decode('utf-8')],
        'tracktotal': lambda x: [str(x['trkn'][0][1]).decode('utf-8')],
        'date' : lambda x: x['\xa9day'],
    }

    _set_tags = {
        'album' : lambda x, v: set_tag(x._mutagen, '\xa9alb', v),
        'albumartist' : lambda x, v: set_tag(x._mutagen, 'aART',v),
        'artist': lambda x, v: set_tag(x._mutagen, '\xa9ART', v),
        'compilation' : lambda x, v: set_tag(x._mutagen, 'cpil',v),
        'composer' : lambda x, v: set_tag(x._mutagen, '\xa9wrt', v),
        'description' : lambda x, v: set_tag(x._mutagen, '\xa9cmt', v),
        'discnumber': lambda x, v: x.x_of_y('disk', 0, v),
        'disctotal' : lambda x, v: x.x_of_y('disk', 1, v),
        'genre' : lambda x, v: set_tag(x._mutagen, '\xa9gen', v),
        'title' : lambda x, v: set_tag(x._mutagen, '\xa9nam', v),
        'tracknumber' : lambda x, v: x.x_of_y('trkn', 0, v),
        'tracktotal': lambda x, v: x.x_of_y('trkn', 1, v),
        'date' : lambda x, v: set_tag(x._mutagen, '\xa9day', v),
    }

    def __getitem__(self, key):
        try:
            return self._get_tags[key](self._mutagen)
        except KeyError:
            pass

    def __setitem__(self, key, value):
        try:
            self._set_tags[key](self, value)
        except KeyError:
            raise TagNotSupported('The tag "' + key + '" is not supported.')

    def x_of_y(self, key, index, value):
        '''Used to set our disc and track information. MP4 stores everything
        in a tuple of (x,y).'''

        try: # if this value is not already set, we need defaults
            init_val = self._mutagen[key][0]
        except KeyError:
            init_val = (0,0)

        try: # mutagen often passes things in lists, eg [u'1']
            value = int(value)
        except TypeError:
            value = int(value[0])

        if not index: # if index == 0
            self._mutagen[key] = [(value, init_val[1])]
        else: # if index == 1
            self._mutagen[key] = [(init_val[0], value)]

Thanks again.
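
The (x, y) handling in x_of_y can be tried in isolation. The sketch below uses a plain dict in place of a loaded mutagen object and a hypothetical helper name, set_pair_item; it only illustrates the "keep the other half of the pair" logic and assumes Python 3.

```python
def set_pair_item(tags, key, index, value):
    """Update one half of an (x, y) pair stored as [(x, y)] under tags[key].

    `tags` is a plain dict standing in for a mutagen object (assumption).
    """
    try:
        current = tags[key][0]      # existing pair, e.g. (3, 10)
    except KeyError:
        current = (0, 0)            # nothing stored yet, start from defaults

    if isinstance(value, (list, tuple)):
        value = value[0]            # values may arrive wrapped, e.g. ['3']
    value = int(value)

    pair = list(current)
    pair[index] = value             # index 0 is the number, 1 is the total
    tags[key] = [tuple(pair)]


tags = {}
set_pair_item(tags, "trkn", 0, ["3"])   # track number, given as a list
set_pair_item(tags, "trkn", 1, 10)      # track total, given as an int
```

After the two calls, tags holds a single 'trkn' entry containing the pair (3, 10), which is the shape the MP4 getters above read back with [0][0] and [0][1].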
Re: [Tutor] How to map different keys together ?
Hi Alan, et al - thanks for your response and your ideas. I sat down
and did a little more coding so that I might tackle what I can and
bring back where I am having trouble. I have implemented the basic
'tag_map' you suggested without a hitch using my own class and the
__getitem__/__setitem__ special methods.

Right now, if there is a one-to-one correlation between my *generic*
key (as a standard between my music files) and the specific tag for
the filetype, I am doing well.

However, everything is not so clear cut in the world of metatagging!
My first stumbling block is that M4A files store the track number and
track total in a single tuple ... but I need them as separate fields
(which is how some of the other formats do it). This is going to be
one of many hurdles -- I need a way to accomplish more than a
one-to-one data map.

See my code, below, as well as some command line examples where I am
having trouble. I feel there may be a way to pass functions through
my tag_map dictionary (maybe a lambda?!) but I can't get my head
around what approach is best (I can't think of any approach, right
now, actually).

Code follows. Thanks again.

class Tags(object):
    '''Wrapper class for a mutagen music file object.'''
    tag_map = {}

    def __init__(self, mutagen):
        self._mutagen = mutagen
        self.tags = {}

    def keys(self):
        '''Get list of generic tag keys in use'''
        keys = []
        for k in self.tag_map.keys():
            try:
                self._mutagen[self.tag_map[k]]
                keys.append(k)
            except KeyError:
                pass

        return keys

    def save(self):
        '''Save the mutagen changes.'''
        self._mutagen.save()

class MP4Tags(Tags):
    tag_map = {
    #   GENERIC : SPECIFIC
        'title' : '\xa9nam',
        'album' : '\xa9alb',
        'artist': '\xa9ART',
        'albumartist' : 'aART',
        'comment' : '\xa9cmt',
        'compilation' : 'cpil',
        'composer' : '\xa9wrt',
        'genre' : '\xa9gen',
        'discnumber': 'disk', # returns: (2,10) need lambda or something ?!
        'disctotal' : 'disk', # returns: (2,10) need lambda or something ?!
        'year' : '\xa9day',
        'tracknumber' : 'trkn', # returns: (2,10) need lambda or something ?!
        'tracktotal': 'trkn' # returns: (2,10) need lambda or something ?!
    }

    def __getitem__(self, key):
        try:
            return self._mutagen[self.tag_map[key]]
        except KeyError:
            pass

    def __setitem__(self, key, value):
        self._mutagen[self.tag_map[key]] = value

#EOF

**Here is how it works:

>>> import tagging
>>> from mutagen.mp4 import MP4
>>> mp4 = MP4('../tests/data/Compressed/M4A-256.m4a')
>>> mp4_tags = tagging.MP4Tags(mp4)
>>> mp4_tags['title']
[u'bob the builder']  # woo hoo! it works!
>>> mp4_tags['title'] = [u'I can change the title!']
>>> mp4_tags['title']
[u'I can change the title!']  # changing the titles works too
>>> mp4_tags['discnumber']
[(1, 1)]  # TODO - I need to return disk[0][0] ... not the tuple
>>> mp4_tags.save()

So, I need to modify how the data is shown to me as well as how I
would go about writing the data, something like:

def return_tag(disk):
    return disk[0][0]

def save_tag(num):
    return [(num, somehow_get_the_original_second_value_before_re_saving)]

Thanks again and any advice or guidance about how to approach this is
greatly appreciated.

Damon

On Sat, Apr 17, 2010 at 3:55 PM, Alan Gauld wrote:
>
> "Damon Timm" wrote
>
>> I am struggling, on a theoretical level, on how to map the various
>> filetype's tag/key naming conventions. I have already created my own
>> MusicFile objects to handle each filetype (with functions for
>> encoding/decoding) but I can't wrap my head around how to map the
>> tags.
>
> I'd define a mapping table per file type that maps from a standard
> set of keys to the file specific tag.
>
> Then define a class that works with the generic tags.
> You can then either subclass per file type and use the file specific
> mapping (a class variable) to translate internally or just create a function
> that takes the file mapping as a parameter (maybe in the init()) and sets
> it up for the generic methods to use.
>
>> And here is what I would need to do to find the song'
[Tutor] How to map different keys together ?
Hello - I am writing a script that converts an entire music library
into a single desired output format. The source music library has a
variety of music filetypes (flac, mp3, m4a, ogg, etc) and I am
attempting to use mutagen (a music file tagging module,
http://code.google.com/p/mutagen/) in order to handle the tagging.

I am struggling, on a theoretical level, on how to map the various
filetype's tag/key naming conventions. I have already created my own
MusicFile objects to handle each filetype (with functions for
encoding/decoding) but I can't wrap my head around how to map the
tags.

Here is a relevant (I think) example from mutagen for three different
filetypes all with the same tags (but different keys identifying
them).

>>> flac.keys()
['album', 'disctotal', 'artist', 'title', 'tracktotal', 'genre',
'composer', 'date', 'tracknumber', 'discnumber']
>>> mp3.keys()
['TPOS', u'APIC:', 'TDRC', 'TIT2', 'TPE2', 'TPE1', 'TALB', 'TCON', 'TCOM']
>>> mp4.keys()
['\xa9alb', 'tmpo', '\xa9ART', '\xa9cmt', '\xa9too', 'cpil',
':com.apple.iTunes:iTunSMPB', '\xa9wrt', '\xa9nam', 'pgap', '\xa9gen',
'covr', 'disk', ':com.apple.iTunes:Encoding Params',
':com.apple.iTunes:iTunNORM']

And here is what I would need to do to find the song's TITLE text:

>>> flac['title']
[u'Christmas Waltz']
>>> mp3['TIT2'].text  # notice this one takes an additional step, specifying .text
[u'Christmas Waltz']
>>> mp4['\xa9nam']
[u"Christmas Waltz"]

In the end, after "the mapping", I would like to be able to do
something along these approaches:

[1] >>> target_file.tags = src_file.tags
[2] >>> target_file.set_tags(src_file.get_tags())

However, none of the keys match, so I need to somehow map them to a
central common source first ... and I am not sure about how to
approach this. I know I could manually assign each key to a class
property (using the @property decorator) ... but this seems tedious.

Any insight on where I can start with mapping all these together?

Thanks,
Damon
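
One possible starting point, sketched with plain dicts standing in for loaded mutagen files (an assumption; real mutagen objects need extra handling, such as the .text step for mp3): keep one small table per file type that maps a generic name to that format's key, then translate through the generic names. The keys are taken from the listings above; FLAC_MAP, MP3_MAP, MP4_MAP, to_generic and from_generic are made-up names, and the code assumes Python 3.

```python
# generic name -> format-specific key, one table per file type
FLAC_MAP = {"title": "title", "album": "album", "artist": "artist"}
MP3_MAP = {"title": "TIT2", "album": "TALB", "artist": "TPE1"}
MP4_MAP = {"title": "\xa9nam", "album": "\xa9alb", "artist": "\xa9ART"}


def to_generic(raw_tags, tag_map):
    """Translate format-specific keys into the shared generic names."""
    return {
        generic: raw_tags[specific]
        for generic, specific in tag_map.items()
        if specific in raw_tags         # skip tags the file does not carry
    }


def from_generic(generic_tags, tag_map):
    """Translate generic names back into one format's own keys."""
    return {
        tag_map[generic]: value
        for generic, value in generic_tags.items()
        if generic in tag_map           # skip tags this format cannot hold
    }


# Plain dict standing in for a loaded flac file (assumption).
flac_raw = {"title": ["Christmas Waltz"], "artist": ["Damon Timm"]}

common = to_generic(flac_raw, FLAC_MAP)
mp4_raw = from_generic(common, MP4_MAP)
```

Here mp4_raw ends up with the title under '\xa9nam' and the artist under '\xa9ART'. Tags that need more than a renamed key (track and disc pairs, for example) would need a function per tag rather than a plain string.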
Re: [Tutor] Script Feedback
Hello Denis & Steven -

Thanks for your replies. I have taken another stab at things to try
and bring it a little further up to snuff ... some more
comments/thoughts follow ...

On Tue, Mar 30, 2010 at 12:57 PM, Steven D'Aprano wrote:
> I usually create a function "print_" or "pr", something like this:
>
> def print_(obj, verbosity=1):
>     if verbosity > 0:
>         print obj
>
> and then have a variable "verbosity" which defaults to 1 and is set to 0
> if the user passes the --quiet flag. Then in my code, I write:
>
> print_("this is a message", verbosity)

Your suggestion prompted me to remember having looked into this
earlier (and found an old thread of mine) -- some folks had
recommended using the logging module -- which I have implemented in
round two (seems to work). I think it accomplishes the same thing
that you are suggesting, only using a module from Python's standard
library.

On Tue, Mar 30, 2010 at 12:57 PM, Steven D'Aprano wrote:
> Separate the underlying functionality from the application-level code.
> These functions should NEVER print anything: they do all communication
> through call-backs, or by returning a value, or raising an exception.

I tried to implement this, however, I am not sure how the 'callback'
works ... is that just a function that a user would pass to *my*
function that gets called at the end of the script? Also, I tried to
separate out the logic a little so the functions make more sense ... I
think I may remove the 'ignore_walk' function and just add it to the
tar_bz2_directory function (see below) ... but am still unclear about
the callback concept.

2010/3/30 spir ☣:
> First, it seems you use [:] only to preserve the object identity so that it
> remains a generator. But it may be better (at least clearer for me) to filter
> and transform the generation process so as to get what you actually need, I
> guess: iterating on (dirpath,filename) pairs. If I'm right on this, maybe try
> to figure out how to do that.
> I would call the func eg "filtered_dir_walk" or "relevant_dir_walk".

I am not sure where I first got this 'ignore_walk' bit but I do
remember taking it from another program of mine ... to be honest,
though, I am rethinking its use and may implement it using fnmatch
testing so that I may implement wildcards (eg, *.pyc) ... right now,
it won't match wildcards and that might be helpful.

Again, thank you both for your feedback. I made some changes tonight
(posted below) and also updated the changes on:
http://blog.damontimm.com/python-script-clean-bzip/ (if you want
pretty colors).

Damon

CODE BELOW ---

#! /usr/bin/env python

'''Script to perform a "clean" bzip2 on a directory (or directories). Removes
extraneous files that are created by Apple/AFP/netatalk before compressing.
'''

import os
import tarfile
import logging
from optparse import OptionParser

# Default files and directories to exclude from the bzip tar
IGNORE_DIRS = ('.AppleDouble',)
IGNORE_FILES = ('.DS_Store',)

class DestinationTarFileExists(Exception):
    '''If the destination tar.bz2 file already exists.'''

def ignore_walk(directory, ignore_dirs=None, ignore_files=None):
    '''Ignore defined files and directories when doing the walk.'''

    # TODO: this does not currently take wild cards into account. For example,
    # if you wanted to exclude *.pyc files ... should fix that. Perhaps
    # consider moving this entirely into the below function (or making it more
    # reusable for other apps).

    for dirpath, dirnames, filenames in os.walk(directory):
        if ignore_dirs:
            dirnames[:] = [dn for dn in dirnames if dn not in ignore_dirs]
        if ignore_files:
            filenames[:] = [fn for fn in filenames if fn not in ignore_files]
        yield dirpath, dirnames, filenames

def tar_bzip2_directory(directory, ignore_dirs=IGNORE_DIRS,
                        ignore_files=IGNORE_FILES ):
    '''Takes a directory and creates a tar.bz2 file (based on the directory
    name). You can exclude files and sub-directories as desired.'''

    file_name = '-'.join(directory.split(' '))
    tar_name = file_name.replace('/','').lower() + ".tar.bz2"

    if os.path.exists(tar_name):
        msg = ("The file %s already exists. " +
               "Please move or rename it and try again.") % tar_name
        raise DestinationTarFileExists(msg)

    tar = tarfile.open(tar_name, 'w:bz2')

    for dirpath, dirnames, filenames in ignore_walk(directory, ignore_dirs,
                                                    ignore_files):
        for file in filenames:
            logging.info(os.path.join(dirpath, file))
            tar.add(os.path.join(dirpath, file))

    tar.close()

def main(args=None, callback=None):
    directories = []

    for arg in args:
        if os.path.isdir(arg):
            directories.append(arg)
        else:
            logging.error("Ignoring: %s (it's not a directory)." % arg)

    for dir in directories:
        try:
            tar_bzip2_directory(dir)
        except Destin
[Tutor] Script Feedback
As a self-taught Python user I am still looking for insight on the
most pythonic and programmatically-friendly way of accomplishing a
given task. In this case, I have written a script that will perform a
"clean bzip2" of a directory (or directories). Mac OS X (via AFP and
netatalk, in my case) tends to leave a bunch of ugly files/directories
hanging around and I would rather not include them in my compressed
tar file.

In writing the script, though, I ran into some questions and I am not
sure what the recommended approach would be. The script works, as it
is, but I feel it's a little hacked together and also a little limited
in its application. There is something to be said for programs that
"just work" (this does) but I want to take it a little further as an
educational endeavor and would like it to appear robust,
future-thinking, and pythonic.

My initial questions are:

1. Is there a better way to implement a --quiet flag?
2. I am not very clear on the use of Exceptions (or even if I am using
   it in a good way here) -- is what I have done the right approach?
3. Finally, in general: any feedback on how to improve this? (I am
   thinking, just now, that the script is only suitable for command
   line usage, and couldn't be imported by another script, for example.)

Any feedback is greatly appreciated. Writing a script like this is a
good learning tool (for me, at least). I have posted this email
online if you want to see the script with pretty code formatting:
http://blog.damontimm.com/python-script-clean-bzip/

Thanks for any insight you may provide.

Damon

Script follows

#! /usr/bin/env python

'''Script to perform a "clean" bzip2 on a directory (or directories). Removes
extraneous files that are created by Apple/AFP/netatalk before compressing.
'''

import os
import tarfile
from optparse import OptionParser

IGNORE_DIRS = ( '.AppleDouble', )
IGNORE_FILES = ('.DS_Store', )

class DestinationTarFileExists(Exception):
    '''If the destination tar.bz2 file already exists.'''

def ignore_walk(directory):
    '''Ignore defined files and directories when doing the walk.'''
    for dirpath, dirnames, filenames in os.walk(directory):
        dirnames[:] = [ dn for dn in dirnames if dn not in IGNORE_DIRS ]
        filenames[:] = [ fn for fn in filenames if fn not in IGNORE_FILES ]
        yield dirpath, dirnames, filenames

def tar_bzip2_directories(directories):
    for directory in directories:
        file_name = '-'.join(directory.split(' '))
        tar_name = file_name.replace('/','').lower() + ".tar.bz2"

        if os.path.exists(tar_name):
            raise DestinationTarFileExists()

        if not options.quiet:
            print 'Compressing files into: ' + tar_name

        tar = tarfile.open(tar_name, 'w:bz2')

        for dirpath, dirnames, filenames in ignore_walk(directory):
            for file in filenames:
                if not options.quiet:
                    print os.path.join(dirpath, file)

                tar.add(os.path.join(dirpath, file))

        tar.close()

if __name__ == "__main__":

    parser = OptionParser(usage="%prog [options: -q ] [directory]")
    parser.add_option("-q", "--quiet", action="store_true", dest="quiet")
    options, args = parser.parse_args()

    directories = []

    for arg in args:
        if os.path.isdir(arg):
            directories.append(arg)
        else:
            print "Ignoring: %s (it's not a directory)." % arg

    try:
        tar_bzip2_directories(directories)
    except DestinationTarFileExists:
        print "A tar file already exists with this directory name."
        print "Move or rename it and try again."
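
On questions 1 and 3, one possible direction (a sketch only, assuming Python 3; filtered_walk and its on_file parameter are made-up names) is to keep the walking code free of print calls and free of the global options object. The caller passes in the patterns to skip and, optionally, a function to be told about each file, so a command-line wrapper can print or stay quiet while another script can import the same function.

```python
import fnmatch
import os
import tempfile


def filtered_walk(directory, ignore_patterns=(), on_file=None):
    """Yield file paths under directory, skipping names matching a pattern."""
    def ignored(name):
        return any(fnmatch.fnmatch(name, pat) for pat in ignore_patterns)

    for dirpath, dirnames, filenames in os.walk(directory):
        # Prune in place so os.walk does not descend into ignored directories.
        dirnames[:] = [d for d in dirnames if not ignored(d)]
        for name in filenames:
            if ignored(name):
                continue
            path = os.path.join(dirpath, name)
            if on_file is not None:
                on_file(path)       # the caller decides: print, log, or nothing
            yield path


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, ".AppleDouble"))
    for rel in ("song.flac", ".DS_Store", "cache.pyc",
                os.path.join(".AppleDouble", "song.flac")):
        open(os.path.join(root, rel), "w").close()

    reported = []
    kept = sorted(
        os.path.relpath(path, root)
        for path in filtered_walk(root,
                                  (".AppleDouble", ".DS_Store", "*.pyc"),
                                  on_file=reported.append)
    )
```

In the demonstration only song.flac at the top level survives the filter. A --quiet flag then reduces to whether the command-line part passes a reporting function (or configures logging) at all.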
Re: [Tutor] Initialize Class Variables by Dictionary ...
Hey! I am a newbie too but it works for me:

>>> class Test(object):
...     def __init__(self, dict):
...         for key in dict:
...             self.__setattr__(key, dict[key])
...
>>> t = Test(dict)
>>> t.test1
'hi there'
>>> t.test2
'not so much'
>>> t.test3
'etc'

Thanks!

On Sat, Aug 29, 2009 at 4:59 PM, Mac Ryan wrote:
> On Sat, 2009-08-29 at 16:31 -0400, Damon Timm wrote:
>> Hi again - thanks for your help with my question early today (and last
>> night). Tried searching google for this next question but can't get
>> an answer ... here is what I would like to do (but it is not working)
>> ...
>>
>> >>> dict = {'test1': 'value1', 'test2': 'value2', 'test3': 'value3'}
>> >>> class Test():
>> ...     def __init__(self):
>> ...         for key in dict:
>> ...             self.key = dict[key]
>> ...
>> >>> t = Test()
>> >>> t.test1
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> AttributeError: Test instance has no attribute 'test1'
>> >>> t.key
>> 'value3'
>>
>> Can I do what I am dreaming of ?
>
> Yes you can, but not that way. Here is how I did on my console:
>
> >>> class A(object):
> ...     pass
> ...
> >>> dir(A)
> ['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
> '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
> '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
> '__str__', '__subclasshook__', '__weakref__']
> >>> setattr(A, 'test1', 'mytest')
> >>> dir(A)
> ['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
> '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
> '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
> '__str__', '__subclasshook__', '__weakref__', 'test1']
> >>> A.test1
> 'mytest'
>
> So, the key here is to use the setattr() builtin function.
>
> However keep in mind that I am a python beginner, so you would probably
> like to hear from some real tutors before implementing this solution all
> over your code...
>
> Mac.
[Tutor] Initialize Class Variables by Dictionary ...
Hi again - thanks for your help with my question earlier today (and last night). Tried searching google for this next question but can't get an answer ... here is what I would like to do (but it is not working) ...

>>> dict = {'test1': 'value1', 'test2': 'value2', 'test3': 'value3'}
>>> class Test():
...     def __init__(self):
...         for key in dict:
...             self.key = dict[key]
...
>>> t = Test()
>>> t.test1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Test instance has no attribute 'test1'
>>> t.key
'value3'

Can I do what I am dreaming of ?

Thanks,
Damon
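A minimal sketch of why the attempt above fails and one way around it, assuming Python 3. Writing self.key always assigns to an attribute literally named "key", which is why t.key ends up holding the last value seen. setattr() takes the attribute name as a string instead. The parameter name values is my own choice, used to avoid shadowing the built-in dict.

```python
class Test(object):
    def __init__(self, values):
        # setattr(obj, 'test1', x) is the dynamic form of obj.test1 = x
        for key, value in values.items():
            setattr(self, key, value)


values = {'test1': 'value1', 'test2': 'value2', 'test3': 'value3'}
t = Test(values)
print(t.test1)  # value1
print(t.test3)  # value3
```

Passing the dictionary in as an argument, rather than reading a module-level name inside __init__, also lets each instance be built from different data.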
Re: [Tutor] Store Class in Tuple Before Defining it ...
Hi Everyone - thanks for your responses. They answered my direct questions ([1] it can't be done at the top and [2] I would have to move the tuple somewhere else) and also gave me some new ideas about completely rethinking the design ... I love keeping the RE definitions with the child classes ... makes it easy just to add a child anywhere without having to fool with the original code.

One of my issues has been fixed using the @staticmethod decorator (which I did not know about). Here is what I have now, and my final question (perhaps) follows.

#this is a django app, by the way
class Video(models.Model):
    url = models.URLField('Video URL')
    # more fields and functions here

    def sync(self):
        'Update videos external data - for children only'
        pass

    @staticmethod
    def add_video(url):
        'add a video only if it matches correct regex -- need ExceptionHandler'
        for sc in Video.__subclasses__():
            if sc.RE.match(url):
                return sc(url=url)

class YoutubeVideo(Video):
    RE = re.compile(r'([^(]|^)http://www\.youtube\.com/watch\?\S*v=(?P[A-Za-z0-9_-]+)\S*')

    def sync(self):
        # do custom syncing here
        print "am syncing a YOUTUBE video"

class VimeoVideo(Video):
    RE = re.compile(r'([^(]|^)http://(www.|)vimeo\.com/(?P\d+)\S*')

    def sync(self):
        # do custom syncing here
        print "am syncing a VIMEO video"

##

So, this is *great* because now I can "add_video" without knowing the video url type at all.

>>> v = Video.add_video(url="http://www.youtube.com/watch?v=UtEg3EQwN9A")
>>> v
>>>

Perfect! The last part is figuring out the syncing -- what I would like to do would be either:

>>> Video.sync_all_videos() #create another static method

Or, if I had to, create another function that did a:

>>> for video in Video.objects.all(): # objects.all() is a django function that returns everything
...     video.sync()

However, I am not sure how to determine the "actual" class of a video when I am dealing only with the parent. That is, how do I call a child's sync() function when I am dealing with the parent object?

As suggested (in email below) I could re-run the regex for each parent video for each sync, but that seems like it could be an expensive operation (supposing one day I had thousands of videos to deal with). Is there another easy way to find the "real" class of an object when dealing with the parent? I know so little, I have to think there might be something (like the @staticmethod decorator!).

Thanks again!
Damon

On Sat, Aug 29, 2009 at 8:05 AM, Lie Ryan wrote:
> what I suggest you could do:
>
> class Video(object):
>     # Video is a mixin class
>     def __init__(self, url):
>         self.url = url
>     def sync(self):
>         # source agnostic sync-ing or just undefined
>         pass
>     @staticmethod
>     def find_video(url):
>         for sc in Video.__subclasses__():
>             if sc.RE.match(url):
>                 return sc(url)
> class Youtube(Video):
>     RE = re.compile('... some regex here ...')
>     def sync(self):
>         # custom sync-ing
>         pass
> class Blip(Video):
>     RE = re.compile('... other regex here ...')
>     def sync(self):
>         # custom sync-ing
>         pass
> a = Video.find_video('http://www.youtube.com/')
>
> that way, url detection will only happen on class initialization instead of
> every time sync is called.
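One way to avoid re-running the regex on every sync is to remember the matching subclass's name when the video is first created, and look the class up by that name later. The following is a plain-Python sketch of that idea only, assuming Python 3. It is not Django code: the kind attribute and the as_child() method are hypothetical names, the regexes are simplified stand-ins for the ones in the post, and in a real Django model kind would have to be a stored field.

```python
import re


class Video(object):
    RE = None

    def __init__(self, url, kind=None):
        self.url = url
        # Remember which class this video belongs to (hypothetical attribute).
        self.kind = kind or type(self).__name__

    def sync(self):
        return 'nothing to sync'

    @staticmethod
    def add_video(url):
        '''Run the regexes once, when the video is first added.'''
        for sc in Video.__subclasses__():
            if sc.RE.match(url):
                return sc(url)
        return None

    def as_child(self):
        '''Rebuild the object as its real subclass using the stored name.'''
        for sc in Video.__subclasses__():
            if sc.__name__ == self.kind:
                return sc(self.url)
        return self


class YoutubeVideo(Video):
    # Simplified pattern, assumed for this sketch only.
    RE = re.compile(r'http://www\.youtube\.com/watch\?\S*v=[A-Za-z0-9_-]+')

    def sync(self):
        return 'syncing a YOUTUBE video'


class VimeoVideo(Video):
    # Simplified pattern, assumed for this sketch only.
    RE = re.compile(r'http://(www\.)?vimeo\.com/\d+')

    def sync(self):
        return 'syncing a VIMEO video'


v = Video.add_video('http://www.youtube.com/watch?v=UtEg3EQwN9A')
parent = Video(v.url, kind=v.kind)   # what a parent-level query might hand back
print(parent.as_child().sync())      # syncing a YOUTUBE video
```

The lookup by name is a short loop over the subclasses rather than a regex match per video, so the cost of detecting the type is paid once per video instead of once per sync.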
Re: [Tutor] Store Class in Tuple Before Defining it ...
Sorry for the double post! Went off by mistake before I was done ...

Anyhow, I would like to have a tuple defined at the beginning of my code that includes classes *before* they are defined ... as such (this is on-the-fly-hack-code just for demonstrating my question):

VIDEO_TYPES = (
    (SyncYoutube, re.compile(r'([^(]|^)http://www\.youtube\.com/watch\?\S*v=(?P[A-Za-z0-9_-]+)\S*'),),
    (SyncVimeo, re.compile(#more regex here#),),
    (SyncBlip, re.compile(#more regex here#),),
)

class Video(object):
    url = "http://youtube.com/"
    #variables ...

    def sync(self):
        for videotype in VIDEO_TYPES:
            #check the url against the regex,
            # if it matches then initiate the appropriate class and pass it the current "self" object
            sync = videotype[0](self).sync()

class SyncYoutube(object):
    def __init__(self,video):
        self.video = video

    def sync(self):
        #do some custom Youtube syncing here

class SyncBlip(object):
    #etc

This way, I can get any video object and simply run Video.sync() and it will figure out which "sync" to run. However, I am finding (of course) that I can't reference a class that hasn't been defined!

I know this is a rush-job question, but I am hoping someone sees my quandary and maybe has a way around it. I am learning python as we speak!

Thanks! And sorry for the double post.

Damon

On Fri, Aug 28, 2009 at 5:10 PM, Damon Timm wrote:
> Hi -
>
> I would like to have a tuple that holds information, as such:
>
> VIDEO_TYPES = (
>     (SyncYoutube,
>      re.compile(r'([^(]|^)http://www\.youtube\.com/watch\?\S*v=(?P[A-Za-z0-9_-]+)\S*'),),
> )
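A name has to exist at the moment the tuple is built, so the tuple itself cannot come first. One possible workaround, sketched here assuming Python 3 (class decorators need Python 2.6 or later), is to start with an empty list at the top and let each sync class add itself as it is defined. The decorator name video_type is hypothetical and the regex is a simplified stand-in for the one in the post.

```python
import re

# Declared at the top, filled in as the classes below are defined.
VIDEO_TYPES = []


def video_type(pattern):
    '''Class decorator (hypothetical name): register cls with its regex.'''
    def register(cls):
        VIDEO_TYPES.append((cls, re.compile(pattern)))
        return cls
    return register


class Video(object):
    def __init__(self, url):
        self.url = url

    def sync(self):
        # VIDEO_TYPES is only read when sync() is called, by which time
        # every decorated class in the module has been registered.
        for cls, regex in VIDEO_TYPES:
            if regex.match(self.url):
                return cls(self).sync()
        return None


@video_type(r'http://www\.youtube\.com/watch\?\S*v=[A-Za-z0-9_-]+')
class SyncYoutube(object):
    def __init__(self, video):
        self.video = video

    def sync(self):
        return 'youtube sync for ' + self.video.url


video = Video('http://www.youtube.com/watch?v=UtEg3EQwN9A')
print(video.sync())
```

Adding a new video type then means writing one decorated class anywhere in the module, without touching Video or the list at the top.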
[Tutor] Store Class in Tuple Before Defining it ...
Hi -

I would like to have a tuple that holds information, as such:

VIDEO_TYPES = (
    (SyncYoutube,
     re.compile(r'([^(]|^)http://www\.youtube\.com/watch\?\S*v=(?P[A-Za-z0-9_-]+)\S*'),),
)
Re: [Tutor] fnmatch -or glob question
Hi Kent and Alan - thanks for the responses ... I was sure there was some part of the "search" I wasn't getting, but with your follow-up questions it seemed something else was amiss ...

So, I went back to re-create the problem with some more python output to show you and realized my mistake. Sigh. All the files were DSC_00XX ... and the one I had chosen to test didn't have any thumbnails. So, of course, I wasn't getting any responses ... though, because the filenames were so close, I sure thought I should have been. Good lesson here is: check the variables !

My goal is to delete the main image plus its thumbnails in one fell swoop -- I plan to use fnmatch to do that (though glob works too). Unless there is something else I should consider?

For those who stumble like me, here is what is working.

Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from kissfist.read.models import Issue #custom model that stores issue info
>>> import os, fnmatch, glob
>>> i = Issue.objects.get(pk=1) # get the first issue (contains cover path)
>>> i.cover.path
'/media/uploads/issue-covers/DSC_0065.jpg'
>>> working_dir, file_name = os.path.split(i.cover.path)
>>> file_base, file_ext = os.path.splitext(file_name)
>>> glob_text = file_base + "*" + file_ext
>>> for f in os.listdir(working_dir):
...     if fnmatch.fnmatch(f, glob_text):
...         print f
...
DSC_0065.400x400.jpg
DSC_0065.jpg
DSC_0065.300.jpg
>>> os.chdir(working_dir)
>>> glob.glob(glob_text)
['DSC_0065.400x400.jpg', 'DSC_0065.jpg', 'DSC_0065.300.jpg']

Thanks again.
Damon

On Sat, Jul 4, 2009 at 4:27 AM, Alan Gauld wrote:
>
> "Damon Timm" wrote
>
>> And I thought I could just construct something for glob or fnmatch like:
>>
>> glob.glob("DSC_0065*.jpg") --or-- fnmatch.fnmatch(file, "DSC_0065*.jpg")
>>
>> But I'm not getting anywhere.
>
> Can you give us a clue as to what you are getting?
> What is happening and what do you expect to happen?
> Are you finding any files? some files? too many files?
> Do you get an error message?
>
> --
> Alan Gauld
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
[Tutor] fnmatch -or glob question
Hi - I am trying to find a group of thumbnail files for deletion -- the files all have similar naming structure (though the details vary). When the main file is deleted, I want all the little ones to go too.

For example, here is a directory listing:

DSC_0063.100.jpg
DSC_0063.100x150.jpg
DSC_0063.jpg
DSC_0065.300.jpg
DSC_0065.400x400.jpg
DSC_0065.jpg

Using os.path.splitext and os.path.split I am able to break the files down into the two parts I need:

base = DSC_0065
ext = .jpg

And I thought I could just construct something for glob or fnmatch like:

glob.glob("DSC_0065*.jpg") --or-- fnmatch.fnmatch(file, "DSC_0065*.jpg")

But I'm not getting anywhere. I feel like there is something I am missing in terms of using the wildcard correctly in the middle of a filename.

Any ideas ?

Thanks,
Damon
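The wildcard in the middle of the name does work as written. A small sketch of the pattern built from the two parts, assuming Python 3 and using the directory listing above as a plain list so it runs without touching the disk; the function name related_files is made up for illustration.

```python
import fnmatch
import os


def related_files(main_file, names):
    '''Return the main file plus its thumbnails from a list of names.'''
    base, ext = os.path.splitext(os.path.basename(main_file))
    # e.g. 'DSC_0065' + '*' + '.jpg' -> 'DSC_0065*.jpg'
    return sorted(fnmatch.filter(names, base + '*' + ext))


listing = [
    'DSC_0063.100.jpg', 'DSC_0063.100x150.jpg', 'DSC_0063.jpg',
    'DSC_0065.300.jpg', 'DSC_0065.400x400.jpg', 'DSC_0065.jpg',
]
print(related_files('DSC_0065.jpg', listing))
# ['DSC_0065.300.jpg', 'DSC_0065.400x400.jpg', 'DSC_0065.jpg']
```

One caveat before deleting anything with this: the pattern 'DSC_0065*.jpg' would also match a different picture named 'DSC_00650.jpg'. If that can occur, matching the exact name plus base + '.*' + ext is a tighter test.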
Re: [Tutor] General Feedback, Script Structure
Kent and Alan - thanks! I moved things around a bit and I think it "looks" better: http://python.pastebin.com/m64e4565d On Sun, Feb 15, 2009 at 2:25 PM, Kent Johnson wrote: > Exactly. dict.get() does a key lookup with a default for missing keys, > then the result is used as a string format. See > http://docs.python.org/library/stdtypes.html#dict.get > > BTW I strongly recommend becoming familiar with the Built-in Functions > and Built-in Types sections of the standard library docs: > http://docs.python.org/library/ Great - thanks. That's exactly what I needed! My issue is that I don't use Python enough (or computers, for that matter) to remember everything I've learned previously ... after I saw that, I remember I had read it before ... I just do it for fun, so one of the challenges is finding the right word/answer to what I need to do ... hopefully, if I stick with Python long enough it will become more second nature ... This list is a great resource because I learn a lot from other people's issues ... and is easy to search, too, for my own! Thanks again, everyone. > > Kent > ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] General Feedback, Script Structure
Hi Kent - thanks for taking a look! Follow-up below:

On Sun, Feb 15, 2009 at 12:20 PM, Kent Johnson wrote:
> - put the main code in a main() function rather than splitting it
> across the file.

That's a good idea - I will do that. Is it proper to create a def main() or just put it under: if __name__ == "__main__"

> - the use of tmpfile is awkward, can you make the gmailme script take
> its input in strings?

That gmail script needs an actual file to attach ... or rather, the location of a file to attach ... I would have to change something so it could take text that it could save as a file. But that would probably be better.

> - I would use plain positional parameters to getMessage(), rather than
> supplying defaults.
> - the status dict could be built once, containing just the format strings,
> e.g.
> status = dict(
>     TestMessage = "This was a crazy test message! Woo hoo!",
>     RebuildStarted = "The rebuilding of %(mddevice)s has begun!",
>     # etc
> )
>
> then you can build the message as
> body = status.get(event, nomatch) % vars()
> message = header + body + footer

That last part I am not so clear on ... how does:

body = status.get(event, nomatch) % vars()

work ? Does it say, first look for "event" as a key and then, if it doesn't find a match with event, use the "nomatch" key ? I was trying to do something like that but couldn't figure out how to make it work ...

Thanks!

>
> Kent
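A small sketch of what that one line does, assuming Python 3. The function name get_message and the wording of the nomatch string are my own inventions for illustration. Note that nomatch is not a key: it is the default value that dict.get() hands back when event is missing. The % vars() part then fills any %(name)s placeholders from the local variables.

```python
def get_message(event, mddevice):
    status = dict(
        TestMessage="This was a crazy test message! Woo hoo!",
        RebuildStarted="The rebuilding of %(mddevice)s has begun!",
    )
    # Used as-is when `event` is not one of the keys above.
    nomatch = "Unrecognised event %(event)s on %(mddevice)s."

    # Step 1: pick a format string (or fall back to nomatch).
    # Step 2: vars() is a dict of the local names, so %(mddevice)s
    #         is replaced by the value of the mddevice argument.
    return status.get(event, nomatch) % vars()


print(get_message('RebuildStarted', '/dev/md0'))
# The rebuilding of /dev/md0 has begun!
print(get_message('SomethingElse', '/dev/md0'))
# Unrecognised event SomethingElse on /dev/md0.
```

A format string with no placeholders, such as the TestMessage one, passes through unchanged.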
[Tutor] General Feedback, Script Structure
Hi - Am still new to python -- was writing a script that is used by mdadm (linux software raid utility) when there is a raid "event" ... the script then sends an email (using another python script called "gmailme") to me with the information from the event and attaches the details of the raid device that triggered the problem.

It works fine -- seems to do what I need -- but, to be very frank, I don't think it is very *pretty* ! But, I am not sure what the "python-way" would be to re-structure this so it continues to do what it needs to do but is more readable and future-friendly.

Could you take a look ? http://python.pastebin.com/m4e1694d5

Would be interested in any feedback (obviously, I should add some doc strings!).

Thanks,
Damon
Re: [Tutor] Sys.stdin Question
On Tue, Jan 13, 2009 at 9:05 PM, John Fouhy wrote:
> It's easy to test:
>
> ### test.py ###
> import time
> time.sleep(1)
> print 'foo'
> ###
>
> $ python test.py | python stdintest.py
> Nothing to see here.
> close failed: [Errno 32] Broken pipe
>
> [not shown: the 1 second pause after "Nothing to see here." showed up]
>
> You could "fix" it by adding a delay to your consumer script.. as long
> as none of your input scripts take longer than the delay to generate
> output. Or do things differently, which might be smarter :-)

Yea - I can see where that would become a problem ... hmm ... I think I am going to look into what Steve is suggesting with the fileinput module ...

On Tue, Jan 13, 2009 at 9:10 PM, Steve Willoughby wrote:
> As the Zen of Python states, "explicit is better than implicit."
>
> Scripts which just magically "do the right thing" can be cool to play
> with, but they have a nasty tendency to guess wrong about what the right
> thing might be. In this case, you're making assumptions about how to
> *guess* whether there's standard input piped at your script, and running
> with those assumptions... but there are easily identified cases where
> those assumptions don't hold true.

That's a good point ... here is more of what I am trying to do ... minus exceptions and the actual function that sends the mail ... I could use the "-" stdin bit to either signify the body message of the email, or an attachment ... or, I guess, both ... (I left out the details of actually emailing stuff and all the imports):

def sendMail(recipient, subject, text, *attachmentFilePaths):
    #here is the function used to send the mail and attachments...

def main():
    parser = OptionParser(usage="%prog [options: -s -t -m ] [attachments, if any...]")
    parser.add_option("-q", "--quiet", action="store_true", dest="quiet")
    parser.add_option("-s", "--subject", action="store", type="string",
                      dest="subject", default=default_subject)
    parser.add_option("-t", "--to", action="store", type="string",
                      dest="recipient", default=default_recipient)
    parser.add_option("-m", "--message", action="store", type="string",
                      dest="message")
    (options, args) = parser.parse_args()

    sendMail(options.recipient, options.subject, options.message, *args)

$ sendmail.py -t m...@me.com -s "my subject" -m - attachment.jpg

or maybe

$ dmesg | sendmail.py -t m...@me.com -s "my subject" -m "here is output from dmesg" - someotherattachmenttoo.doc

Thanks for all this help.

On Tue, Jan 13, 2009 at 9:10 PM, Steve Willoughby wrote:
> On Tue, January 13, 2009 17:59, Damon Timm wrote:
>> ... then, I guess, I can just have it do an if statement that asks: if
>> args[0] == "-" then ... blah. I may do that ... the script, itself,
>
> Look at the fileinput module. If you're looking at the command line for a
> list of filenames, which may include "-" to mean (read stdin at this point
> in the list of files), your script's logic is reduced to simply:
>
> for line in fileinput.input():
>     # process the line
>
> and your script focuses on its specific task.
>
>>> or not. So instead of seeing if anything's showing up (and introducing
>>> timing dependencies and uncertainty), see if it's attached to a real
>>> terminal at all. On Unix, os.isatty(sys.stdin.fileno()) will tell you this.
>>
>> Does this concern still apply with John's suggestion? I just tested
>> it in my little application and didn't have an issue ... of course, I
>
> Yes, it does. And in a trivial case, it will usually work. But don't
> base your solutions on something that looks like it sorta works most of
> the time but isn't really the recommended practice, because it will break
> later and you'll spend a lot of time figuring out why it's not being
> reliable.
>
>> I can go to using the "-" option ... although, to be honest, I like
>> the idea of the script thinking for itself ... that is: if there is
>> stdin, use it -- if not, not ... and, I was thinking of attaching the
>> stdin as a text file, if present. And not attaching anything, if not.
>
> As the Zen of Python states, "explicit is better than implicit."
>
> Scripts which just magically "do the right thing" can be cool to play
> with, but they have a nasty tendency to guess wrong about what the right
> thing might be. In this case, you're
Re: [Tutor] Sys.stdin Question
On Tue, Jan 13, 2009 at 8:28 PM, Alan Gauld wrote:
> The way other Unix style programs deal with this is to write the code
> to read from a general file and if you want to use stdin provide an
> argument of '-':

That's a good idea, actually -- I hadn't thought of that. Although I use that "-" a lot in command line programs ... It would force someone to be specific about what it was they wanted the script to do ... then, I guess, I can just have it do an if statement that asks: if args[0] == "-" then ... blah. I may do that ... the script, itself, actually handles attachments, too ... so I could use that flag, also, to say: attach the standard input to an email.

On Tue, Jan 13, 2009 at 8:45 PM, John Fouhy wrote:
> This might work:
>
> import select, sys
> def isData():
>     return select.select([sys.stdin], [], [], 0) == ([sys.stdin], [], [])
>
> if isData():
>     print 'You typed:', sys.stdin.read()
> else:
>     print 'Nothing to see here.'

Oh ho ho! Very neat! I will have to head over to the python docs to see what that is but it seems to work!

> I say "might" because it is OS-dependent, but I guess you are using
> unix/linux.

Yea - you guessed right.

> Source:
> http://www.darkcoding.net/software/non-blocking-console-io-is-not-possible/
>
> I found that by searching for "python stdin non-blocking". This is
> because "blocking" is jargon for "waiting until something happens".
> In this case, stdin.read() is blocking until it sees some data with an
> EOF.

Thanks for that tip -- as you probably guessed, google wasn't turning up too many results for me. But this is working now. However ... reading the next comment has me thinking again:

On Tue, Jan 13, 2009 at 8:55 PM, Steve Willoughby wrote:
> This is playing a dangerous game, though, of introducing a race condition.
> Is there nothing on the standard input RIGHT NOW because the source on
> the other end of the pipe hasn't managed to generate anything yet, or
> because there's nothing piped?
>
> A better approach is either to explicitly specify whether to read from
> stdin or a file, as Alan demonstrated (and the fileinput module implements
> this for you, by the way), or to see if stdin is connected to a terminal
> or not. So instead of seeing if anything's showing up (and introducing
> timing dependencies and uncertainty), see if it's attached to a real
> terminal at all. On Unix, os.isatty(sys.stdin.fileno()) will tell you this.

Does this concern still apply with John's suggestion? I just tested it in my little application and didn't have an issue ... of course, I only ran a couple different command line items before throwing up my hands in celebration.

I can go to using the "-" option ... although, to be honest, I like the idea of the script thinking for itself ... that is: if there is stdin, use it -- if not, not ... and, I was thinking of attaching the stdin as a text file, if present. And not attaching anything, if not.

Thanks everyone!
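The explicit "-" convention Alan describes can be quite small in practice. A minimal sketch, assuming Python 3; the function name read_message and its stdin parameter are made up for illustration (the parameter is only there so the function can be tried without a real pipe).

```python
import sys


def read_message(arg, stdin=None):
    '''Return the message body: the literal text, or stdin when arg is "-".'''
    stdin = sys.stdin if stdin is None else stdin
    if arg == '-':
        # The user asked for stdin explicitly, so blocking here is expected.
        return stdin.read()
    return arg
```

With this, the value given to -m is either used directly or, when it is "-", replaced by whatever was piped in. There is no guessing about timing, which is the race condition Steve warns about.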
[Tutor] Sys.stdin Question
Hi - am writing a script to send myself email messages from the command line ... I want to have the option to be able to input the message body via a pipe so I can easily shoot emails to myself (like from: ls, cat, df, du, mount, etc) ... what I want to be able to do is:

$ ls -la | myscript.py

and in the script use something like this (just an example to keep it short):

cli_input = sys.stdin.read()

if cli_input:
    print "I piped this in:", cli_input
else:
    print "nothing got piped in, moving on."

This works when I do have something coming via stdin ... but if I run the script without piping something first ... it just sits there (I assume, waiting for some stdin) ...

How do I tell it: if there is no stdin, just move on?

Thanks,
Damon
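One common way to "just move on" is to ask whether stdin is attached to a terminal before reading from it. A minimal sketch, assuming Python 3; read_piped_input is a made-up name and the stream parameter exists only so the function can be exercised without a real pipe. When nothing is piped in, stdin is normally the terminal, and reading from it is what makes the script sit and wait.

```python
import sys


def read_piped_input(stream=None):
    '''Return piped-in text, or None when input comes from a terminal.'''
    stream = sys.stdin if stream is None else stream
    if stream.isatty():
        # Interactive terminal: nothing was piped, so do not block on read().
        return None
    return stream.read()


if __name__ == '__main__':
    cli_input = read_piped_input()
    if cli_input:
        print("I piped this in:", cli_input)
    else:
        print("nothing got piped in, moving on.")
```

This checks how the script was started rather than whether data has arrived yet, so it does not depend on how quickly the program on the other end of the pipe produces output. It can still guess wrong in some setups, for example when the script is run from cron with no terminal at all.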
Re: [Tutor] Can subprocess run functions ?
On Wed, Jan 7, 2009 at 7:36 PM, wesley chun wrote:
> this has been a highly-desired feature for quite awhile.
>
> starting in 2.6, you can use the new multiprocessing module
> (originally called pyprocessing):
> http://docs.python.org/library/multiprocessing.html
>
> there is a backport to 2.4 and 2.5 here:
> http://pypi.python.org/pypi/multiprocessing/2.6.0.2
>
> there are similar packages called pypar and pprocess:
> http://datamining.anu.edu.au/~ole/pypar/
> http://www.boddie.org.uk/python/pprocess.html
>
> hope this helps!

Thanks Wesley - it does!

As is often the case, the minute I ask a question like this I have some anxiety that the answer must be right under my nose and I rush about the tubes of the internet searching for a clue ... I also found:

http://chrisarndt.de/projects/threadpool/threadpool.py.html

Since I have been doing a bit of reading since I started, I was able to download this "threadpool", import it, and actually get it to work. Here is what I messily put together (using my first referenced script, as an example) ... I think I may stick with this for a while, since it seems well-thought out and, as far as I can tell, works! (No shame in "borrowing" ... though not as cool as making it up myself.)

import subprocess
import threadpool
import os

totProcs = 2 #number of processes to spawn before waiting
flacFiles = ["test.flac","test2.flac","test3.flac","test4.flac","test5.flac","test6.flac"]

def flac_to_mp3(flacfile):
    print "Processing: " + flacfile
    mp3file = flacfile.rsplit('.', 1)[0] + '.mp3'

    p = subprocess.Popen(["flac","--decode","--stdout","--silent",flacfile], stdout=subprocess.PIPE)
    p1 = subprocess.Popen(["lame","--silent","-",mp3file], stdin=p.stdout)
    p1.communicate()

    #Test file size so we know it is actually waiting until it has been created.
    size = os.path.getsize(mp3file)
    return str("File: " + mp3file + "* Size: " + str(size))

def print_result(request, result):
    print "* Result from request #%s: %r" % (request.requestID, result)

pool = threadpool.ThreadPool(totProcs)
convert = threadpool.makeRequests(flac_to_mp3,flacFiles,print_result)
for req in convert:
    pool.putRequest(req)
pool.wait()

print "All Done!"
[Tutor] Can subprocess run functions ?
Hi everyone - I was playing with subprocess (with some success, I might add) to implement threading in my script (audio conversion). My goal is to be able to spawn off threads to make use of my multiprocessor system (and speed up encoding). With your help, I was successful.

Anyhow, subprocess is working -- but I wonder if there is a way I can send the entire *function* into its own subprocess ? Because, in my case, I want my function to: [a] convert files, [b] tag files, [c] do some other stuff to files. Steps [b] and [c] require step [a] to be complete ... but the minute I spawn off step [a] it acts like it is already done (even though it is still working) ... I was hoping they could all run in a single thread, one after another ...

I tried just giving subprocess.Popen the function name (rather than the external program) but that didn't work; and I read through the docs over at python.org ... but I can't find my answer.

With the code I have, I am not sure how to both wait for my subprocess to finish (in the function) and allow the multithreading bit to work together ... I have experimented myself but didn't really get anywhere. I commented in where I want to "do other stuff" before it finishes ... wonder if you can take a look and show me where I may try to head next?

Thanks!
--

import time
import subprocess

totProcs = 2 #number of processes to spawn before waiting
flacFiles = [["test.flac","test.mp3"],["test2.flac","test2.mp3"],\
             ["test3.flac","test3.mp3"],["test4.flac","test4.mp3"],\
             ["test5.flac","test5.mp3"],["test6.flac","test6.mp3"]]
procs = []

def flac_to_mp3(flacfile,mp3file):
    print "beginning to process " + flacfile
    p = subprocess.Popen(["flac","--decode","--stdout","--silent",flacfile], stdout=subprocess.PIPE)
    p1 = subprocess.Popen(["lame","--silent","-",mp3file], stdin=p.stdout)

    # I want to do more stuff before this function ends, but need to wait for p1 to finish first;
    # and, at the same time, I need to "return" p1 so the while loop (below) works [I think]

    return p1

while flacFiles or procs:
    procs = [p for p in procs if p.poll() is None]
    while flacFiles and len(procs) < totProcs:
        file = flacFiles.pop(0)
        procs.append(flac_to_mp3(file[0],file[1]))
    time.sleep(1)
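One way to let each job wait for its own external program while several jobs still run side by side is a small pool of worker threads. The following is a generic sketch, assuming Python 3 (multiprocessing.pool.ThreadPool is also in the standard library from 2.6). The names run_and_finish and run_all are made up, and the flac/lame pipeline from the post is replaced by an arbitrary command list so the sketch runs on any machine.

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool

TOT_PROCS = 2  # number of jobs allowed to run at the same time


def run_and_finish(command):
    '''Run one external command, wait for it, then do the follow-up work.'''
    # subprocess.call() blocks only this worker thread, not the whole script.
    returncode = subprocess.call(command)
    # Steps [b] and [c] (tagging and so on) would go here, after the wait.
    return command, returncode


def run_all(commands, workers=TOT_PROCS):
    '''Run every command, at most `workers` at a time; results keep input order.'''
    pool = ThreadPool(workers)
    try:
        return pool.map(run_and_finish, commands)
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    # Stand-in commands: each just starts a Python interpreter that exits.
    jobs = [[sys.executable, '-c', 'pass'] for _ in range(4)]
    for command, returncode in run_all(jobs):
        print(command[-1], returncode)
```

Threads are enough here because the heavy lifting happens in the external programs, which the operating system can schedule on separate processors. The worker thread itself spends its time waiting.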
Re: [Tutor] SMTP Module Help
On 1/6/09, Marco Petersen wrote:
> I'm using Python 2.5.4. I wanted to try out the SMTP module. I tried to send
> an email through my Gmail account but it keeps saying that the connection
> was refused.

I used this example to get emailing from python through gmail smtp to work:

http://codecomments.wordpress.com/2008/01/04/python-gmail-smtp-example/

Note, there are a couple of errors in the code that prevent it from working out-of-the-box ... however, if you read the comments they get worked out.

Also, I think I would echo what the other folks mentioned here, which is that you probably need to specify the port (587) as in:

server = smtplib.SMTP('smtp.gmail.com', 587)

Damon

>
> This is the code that I used :
>
> import smtplib
> msg = 'Test'
>
>
> server = smtplib.SMTP('smtp.gmail.com')
> server.set_debuglevel(1)
> server.ehlo()
> server.starttls()
> server.ehlo()
> server.login('marco.m.peter...@gmail.com', 'password')
> server.sendmail('marco.m.peter...@gmail.com',
>                 'marcoleepeter...@gmail.com', msg)
> server.close()
>
> This error message keeps coming up:
>
>
> Traceback (most recent call last):
>   File "C:/Python25/send_mail.py", line 5, in <module>
>     server = smtplib.SMTP('smtp.gmail.com')
>   File "C:\Python25\Lib\smtplib.py", line 244, in __init__
>     (code, msg) = self.connect(host, port)
>   File "C:\Python25\Lib\smtplib.py", line 310, in connect
>     raise socket.error, msg
> error: (10061, 'Connection refused')
>
>
> Can anyone help me with this?
>
> Thanks.
>
> -Marco
Re: [Tutor] Better way - fnmatch with list ? CORRECTION
On Sat, Jan 3, 2009 at 4:35 PM, Mark Tolonen wrote:
> fnmatch already takes into account systems with case-sensitive filenames:
>
> >>> help(fnmatch.fnmatch)
>
> Help on function fnmatch in module fnmatch:
>
> fnmatch(name, pat)
>     Test whether FILENAME matches PATTERN.
>
>     Patterns are Unix shell style:
>
>     *       matches everything
>     ?       matches any single character
>     [seq]   matches any character in seq
>     [!seq]  matches any char not in seq
>
>     An initial period in FILENAME is not special.
>     Both FILENAME and PATTERN are first case-normalized
>     if the operating system requires it.
>     If you don't want this, use fnmatchcase(FILENAME, PATTERN).
>
> -Mark

Hey Mark - thanks for your reply and the details ... I saw fnmatch did *not* match case (which was great) ... but it also couldn't match any item from another list of items ... had to do a single PATTERN at a time ... that's why it was suggested I try to use the ext matching from os.path.splitext() ... and why I needed to drop it to lowercase. It seems like that is the easiest way to search for a match among a list of file extensions ...

Thanks!
Damon
Re: [Tutor] Better way - fnmatch with list ? CORRECTION
On Fri, Jan 2, 2009 at 7:16 PM, Jervis Whitley wrote:
> for fn in files:
>     base, ext = os.path.splitext(fn)
>     if ext.lower() in ['.flac', '.mp3', '.mp4']:
>
> takes into account systems with case sensitive filenames.

Thanks! Will throw that in there. I'm getting it ... bit by little bit.

>
> cheers,
>
> Jervis
Re: [Tutor] Better way - fnmatch with list ?
On Fri, Jan 2, 2009 at 6:44 PM, bob gailer wrote:
> Since file is a built-in function it is a good idea to not use it as a
> variable name.

Oooh! I did not know that ... thanks ... went through and changed them all.

> for fn in files:
>     base, ext = os.path.splitext(fn)
>     if ext in ['*.flac','*.mp3','*.m4a']:
>         #do some stuff if the someFile matches one of the items in the list

I caught the * bit - I must be learning! One thought though ... because fnmatch ignores case I could get away with: .FLAC, .flac, .FLac, or any other such foolishness for file extensions ... Using the above approach, however, matches by case ... so, I think, to be safe, I would have to list each iteration of the case in the list ... is there a way to account for that ?

Thanks again -
Damon
[Tutor] Better way - fnmatch with list ?
Hi - am learning Python and loving it!  Anyhow, what I have works, but I wondered if there was a "better" (more python-y) way.

Here is what I am doing with fnmatch ... am thinking there has to be a one-line way to do this with a lambda or list comprehension ... but my mind can't get around it ... I have other code, but the main part is that I have a list of files that I am going through to check if they have a given extension ... in this part, if the file matches any of the extensions I am looking for, I want it to do some stuff ... later I check it against some other extensions to do "other" stuff:

for file in files:
    for ext in ['*.flac','*.mp3','*.m4a']:
        if fnmatch.fnmatch(someFile, ext):
            #do some stuff if the someFile matches one of the items in the list

Can it get any better ?  I was hoping fnmatch would *accept* a list instead of a string ... but it didn't (frown).  I thought maybe:

if fnmatch(someFile, ['*.flac','*.mp3','*.m4a']):

But that didn't work.

Thanks in advance,
Damon

PS: just reading the conversations on this list is a little like taking a python class (only the classes don't progress in any particular order!).
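fnmatch.fnmatch() takes a single pattern, but the nested loop in the question can be folded into one expression with any(). A minimal sketch, where the helper name matches_any and the PATTERNS tuple are illustrative assumptions:

```python
import fnmatch

# The patterns from the question, gathered in one place.
PATTERNS = ("*.flac", "*.mp3", "*.m4a")


def matches_any(filename, patterns=PATTERNS):
    """Return True as soon as filename matches one of the patterns."""
    return any(fnmatch.fnmatch(filename, pattern) for pattern in patterns)
```

Inside the loop over files this reads as "if matches_any(name): ...", and any() stops at the first pattern that matches.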
Re: [Tutor] MP3Info class usage
On 12/10/08, Todd Zullinger <[EMAIL PROTECTED]> wrote:
> I'd recommend eyeD3¹ and/or mutagen² for tag reading.  Both are pretty
> easy to use.

I would second eyeD3 -- I use the command line version and it is pretty versatile.

D

> ¹ http://eyed3.nicfit.net/
> ² http://code.google.com/p/quodlibet/wiki/Development/Mutagen
>
> --
> Todd        OpenPGP -> KeyID: 0xBEAF0CE3 | URL: www.pobox.com/~tmz/pgp
> ~~
> A paranoid is someone who knows a little of what's going on.
>     -- William S. Burroughs
Re: [Tutor] list.index() question
On Mon, Dec 8, 2008 at 7:55 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:
> index() searches for a specific matching item, it doesn't have any
> wildcard ability.

Ah ha!

> There is actually an index:
> http://docs.python.org/genindex.html

Heh heh - and the info I was looking for is at: http://docs.python.org/library/stdtypes.html#index-584 ... I've become google dependent ... if it's not on google I don't know where to look.

Thanks for the .endswith() tip.

On 12/8/08 7:47 PM, Alan Gauld wrote:
> Check out the glob module.
>
>> for dirpath, subFolders, files in os.walk(rootDir):
>>     try:
>>         i = files.index("*.flac") #how do I make it search for files that end in ".flac" ?
>
> If you call glob.glob() with the dirpath you will get a list of all
> the flac files in the current dir.

Heading to check out glob.glob() now ...

On 12/8/08 7:29 PM, John Fouhy wrote:
> The fnmatch module will help here.  It basically implements unix-style
> filename patterns.  For example:
>
> import os
> import fnmatch
>
> files = os.listdir('.')
> flac_files = fnmatch(files, '*.flac')
>
> So, to test whether you have any flac files, you can just test whether
> fnmatch(files, '*.flac') is empty.
>
> If you wanted to roll your own solution (the fnmatch module is a bit
> obscure, I think), you could do something with os.path.splitext:
>
> files = os.listdir('.')
> extensions = [os.path.splitext(f)[1] for f in files]
> if '.flac' in extensions:
>     print 'FLAC files found!'

And then to look at fnmatch!  Thanks for the direction -- on my way ...

On 12/8/08 7:55 PM, Kent Johnson wrote:
> On Mon, Dec 8, 2008 at 7:05 PM, Damon Timm <[EMAIL PROTECTED]> wrote:
>> Hi again!
>>
>> (Now that everyone was so helpful the first time you'll never get rid of me!)
>
> That's fine, pretty soon you'll be answering other people's questions :-)

Not quite there yet ... one day, maybe.  I can show people where the index for index is!

Damon
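One note on the fnmatch example quoted above: a module object cannot be called, so "fnmatch(files, '*.flac')" would not run as written. The standard-library function that filters a whole list of names is fnmatch.filter(). A minimal sketch (the wrapper name flac_files_in is an illustrative assumption):

```python
import fnmatch


def flac_files_in(names):
    """Return the names matching '*.flac', in their original order."""
    return fnmatch.filter(names, "*.flac")
```

An empty list is falsy, so "if flac_files_in(files):" is enough to test whether a directory listing holds any flac files.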
[Tutor] list.index() question
Hi again!

(Now that everyone was so helpful the first time you'll never get rid of me!)

I had a question about using the index() function on a list -- as I walk the directory path, I want to see if a directory contains any files ending in a certain type ... if it does, I wanna do some stuff ... if not, I would like to move on ...

for dirpath, subFolders, files in os.walk(rootDir):
    try:
        i = files.index("*.flac") #how do I make it search for files that end in ".flac" ?
        for file in files:
            #do some things in here to sort my files
    except ValueError:
        pass

Basically: how do I make it match *.flac ?  I couldn't find anything on google (searching for "python index" just gets me a lot of indexes of python docs - wink)

Thanks again,
Damon
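list.index() only finds an exact item, so the "does this directory hold any .flac files" test needs a per-name check instead. A minimal sketch using str.endswith(), where the generator name dirs_containing and its parameters are illustrative assumptions:

```python
import os


def dirs_containing(root_dir, suffix=".flac"):
    """Yield (dirpath, matching_files) for each directory under root_dir
    that holds at least one file whose name ends with suffix (any case)."""
    for dirpath, _subdirs, files in os.walk(root_dir):
        matches = [name for name in files if name.lower().endswith(suffix)]
        if matches:
            # Directories without a match are simply skipped.
            yield dirpath, matches
```

With this shape there is no try/except: "for dirpath, flacs in dirs_containing(rootDir): ..." only ever sees directories worth processing.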
Re: [Tutor] Newbie Wondering About Threads
On Sun, Dec 7, 2008 at 9:35 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:
> There is no need to include both the flac file name and the mp3 file
> name if the roots match. You can use os.path functions to split the
> extension or the quick-and-dirty way:
> mp3file = flacfile.rsplit('.', 1)[0] + '.mp3'

That is *so* what I was looking for!  You guys are awesome.

Damon

> Kent
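The "os.path functions" route mentioned above might look like the sketch below (the function name mp3_name_for is an illustrative assumption). The practical difference from rsplit('.', 1) is that os.path.splitext() only looks for a dot in the last path component, so a dotted directory name is left alone.

```python
import os.path


def mp3_name_for(flac_path):
    """Return flac_path with its extension, if any, replaced by '.mp3'."""
    root, _ext = os.path.splitext(flac_path)
    return root + ".mp3"
```

For ordinary names such as "test.flac" both approaches give "test.mp3".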
Re: [Tutor] Newbie Wondering About Threads
On Sun, Dec 7, 2008 at 10:47 AM, Kent Johnson <[EMAIL PROTECTED]> wrote:
> A function as mentioned above would help. For the threaded solution
> the function could just start the child process and wait for it to
> finish, it doesn't have to return anything. Each thread will block on
> its associated child.

I think I did it!  Woo hoo!  (cheers all around! drinks on me!)

First, I found that using the Popen.communicate() function wasn't going to work (because it sits there and waits until it's done before continuing); so, I ditched that, created my own little function that returned the Popen object and went from there ... I mixed in one super-long audio file with all the others and it seems to work without a hitch (so far) ... watching top I see both processors running at max during the lame processing.

Check it out (there are probably sexier ways to populate the *.mp3 files but I was more interested in the threads):

---
import time
import subprocess

totProcs = 2 #number of processes to spawn before waiting

flacFiles = [["test.flac","test.mp3"],["test2.flac","test2.mp3"],
             ["test3.flac","test3.mp3"],["test4.flac","test4.mp3"],
             ["test5.flac","test5.mp3"],["test6.flac","test6.mp3"]]
procs = []

def flac_to_mp3(flacfile,mp3file):
    print("beginning to process " + flacfile)
    # flac decodes to stdout; lame reads that stream from stdin
    p = subprocess.Popen(["flac","--decode","--stdout","--silent",flacfile], stdout=subprocess.PIPE)
    p1 = subprocess.Popen(["lame","--silent","-",mp3file], stdin=p.stdout)
    p.stdout.close() # lame holds the pipe now; lets flac get SIGPIPE if lame exits early
    return p1

while flacFiles or procs:
    # keep only the encoders that are still running
    procs = [p for p in procs if p.poll() is None]
    # top up to totProcs running conversions
    while flacFiles and len(procs) < totProcs:
        file = flacFiles.pop(0)
        procs.append(flac_to_mp3(file[0],file[1]))
    time.sleep(1)
--[EOF]--

Thanks again - onward I go!

Damon
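The polling loop in this message can be sketched in a general form. Popen.poll() returns None while a child is still running and its exit status once it has finished, and since a successful exit status is 0 the comparison has to be "is None" rather than a truthiness test. Filtering the list on that condition keeps only the live processes. The function name run_limited and its parameters are illustrative assumptions, and the demo commands just start the Python interpreter instead of flac or lame.

```python
import subprocess
import sys
import time


def run_limited(commands, max_procs=2, poll_interval=0.1):
    """Run each command (an argv list) with at most max_procs children alive.

    Returns the exit codes in the order the children were seen to finish.
    """
    pending = list(commands)
    running = []
    finished = []
    while pending or running:
        still_running = []
        for proc in running:
            if proc.poll() is None:      # None means "not finished yet"
                still_running.append(proc)
            else:
                finished.append(proc)
        running = still_running
        # Start new children until the limit is reached again.
        while pending and len(running) < max_procs:
            running.append(subprocess.Popen(pending.pop(0)))
        if running:
            time.sleep(poll_interval)    # avoid spinning the CPU while waiting
    return [proc.returncode for proc in finished]


if __name__ == "__main__":
    demo = [[sys.executable, "-c", "pass"] for _ in range(4)]
    print(run_limited(demo, max_procs=2))
```

The sleep interval is a trade-off: a shorter value starts the next job sooner after one finishes, a longer one wakes the script up less often.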
Re: [Tutor] Newbie Wondering About Threads
On Sun, Dec 7, 2008 at 12:33 AM, Martin Walsh <[EMAIL PROTECTED]> wrote:
> I'm not certain this completely explains the poor performance, if at
> all, but the communicate method of Popen objects will wait until EOF is
> reached and the process ends. So IIUC, in your example the process 'p'
> runs to completion and only then is its stdout (p.communicate()[0])
> passed to stdin of 'p2' by the outer communicate call.
>
> You might try something like this (untested!) ...
>
> p1 = subprocess.Popen(
>     ["flac","--decode","--stdout","test.flac"],
>     stdout=subprocess.PIPE, stderr=subprocess.PIPE
> )
> p2 = subprocess.Popen(
>     ["lame","-","test.mp3"], stdin=p1.stdout, # <--
>     stdout=subprocess.PIPE, stderr=subprocess.PIPE
> )
> p2.communicate()

That did the trick!  Got it back down to 20s ... which is what it was taking on the command line.  Thanks for that!

> Here is my simplistic, not-very-well-thought-out, attempt in
> pseudo-code, perhaps it will get you started ...
>
> paths = ["file1.flac","file2.flac", ... "file11.flac"]
> procs = []
> while paths or procs:
>     procs = [p for p in procs if p.poll() is None]
>     while paths and len(procs) < 2:
>         flac = paths.pop(0)
>         procs.append(Popen(['...', flac], ...))
>     time.sleep(1)

I think I got a little lost with the "procs = [p for p in procs if p.poll() is None]" statement -- I'm not sure exactly what that is doing ... but otherwise, I think that makes sense ... will have to try it out (if not one of the more "robust" thread pool suggestions below).

On Sun, Dec 7, 2008 at 2:58 AM, Lie Ryan <[EMAIL PROTECTED]> wrote:
> I think when you do that (p2.wait() then p3.wait() ), if p3 finishes
> first, you wouldn't start another p3 until p2 have finished (i.e. until
> p2.wait() returns) and if p2 finishes first, you wouldn't start another
> p2 until p3 finishes (i.e. until p3.wait() returns ).
>
> The solution would be to start and wait() the subprocesses in two
> threads. Use threading module or -- if you use python2.6 -- the new
> multiprocessing module.
>
> Alternatively, you could do a "non-blocking wait", i.e. poll the thread.
>
> while True:
>     if p1.poll(): # start another p1
>     if p2.poll(): # start another p2

Yea, looks like it - I think the trick, for me, will be getting a dynamic list that can be iterated through ... I experimented a little with the .poll() function and I think I follow how it is working ... but really, I am going to have to do a little more "pre-thinking" than I had to do with the bash version ... not sure if I should create a class containing the list of flac files or just a number of functions to handle the list ... whatever way it ends up being, is going to take a little thought to get it straightened out.  And the object oriented part is different than bash -- so, I have to "think different" too.

On Sun, Dec 7, 2008 at 8:31 AM, Kent Johnson <[EMAIL PROTECTED]> wrote:
> A simple way to do this would be to use poll() instead of wait(). Then
> you can check both processes for completion in a loop and start a new
> process when one of the current ones ends. You could keep the list of
> active processes in a list. Make sure you put a sleep() in the polling
> loop, otherwise the loop will consume your CPU!

Thanks for that tip - I already throttled my CPU and had to abort the first time (without the sleep() function) ... smile.

> Another approach is to use a thread pool with one worker for each
> process. The thread would call wait() on its child process; when it
> finishes the thread will take a new task off the queue. There are
> several thread pool recipes in the Python cookbook, for example
> http://code.activestate.com/recipes/203871/
> http://code.activestate.com/recipes/576576/ (this one has many links
> to other pool implementations)

Oh neat!  I will be honest, more than one screen full of code and I get a little overwhelmed (at this point) but I am going to check that idea out.  I was thinking something along these lines, where I can send all the input/output variables along with a number argument (threads) to a class/function that would then handle everything ... so using a thread pool may make sense ...

Looks like I would create a loop that went through the list of all the files to be converted and then sent them all off, one by one, to the thread pool -- which would then just dish them out so that no more than 2 (if I chose that) would be converting at a time?  I gotta try and wrap my head around it ... also, I will be using two subprocesses to accomplish a single command (one for stdoutput and the other taking stdinput) as well ... so they have to be packaged together somehow ... hmm!

Great help everyone.  Not quite as simple as single threading but am learning quite a bit.  One day, I will figure it out.  Smile.

Damon
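The thread-pool idea quoted above (one worker per concurrent child, each worker blocking on its own process) can be sketched with concurrent.futures. That module did not exist when this thread was written, so this assumes a current Python 3 rather than the cookbook recipes linked above. The names run_and_wait and run_all are illustrative assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_and_wait(command):
    """Start one child process and block this worker thread until it exits."""
    return subprocess.call(command)


def run_all(commands, max_workers=2):
    """Run every command, never more than max_workers at a time.

    Exit codes come back in the same order as the commands were given.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_and_wait, commands))
```

The pool hands the next command to whichever worker becomes free, which is the "dish them out" behavior described in the message. For a two-process conversion, the worker function would start both children and wait on both before returning.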
Re: [Tutor] Newbie Wondering About Threads
On Sat, Dec 6, 2008 at 6:25 PM, Python Nutter <[EMAIL PROTECTED]> wrote:
> I'm on my phone so excuse the simple reply.
> From what I skimmed you are wrapping shell commands which is what I do
> all the time. Some hints. 1) look into popen or subprocess in place of
> execute for more flexibility. I use popen a lot and assigning a popen
> call to an object name lets you parse the output and make informed
> decisions depending on what the shell program outputs.

So I took a peek at subprocess.Popen --> looks like that's the direction I would be headed for parallel processes ... a real simple way to see it work for me was:

p2 = subprocess.Popen(["lame","--silent","test.wav","test.mp3"])
p3 = subprocess.Popen(["lame","--silent","test2.wav","test2.mp3"])
p2.wait()
p3.wait()

top showed that both cores get busy and it takes half the time!  So that's great -- when I tried to add the flac decoding through stdout I was able to accomplish it as well ... I was mimicking the command of "flac --decode --stdout test.flac | lame - test.mp3" ... see:

p = subprocess.Popen(["flac","--decode","--stdout","test.flac"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["lame","-","test.mp3"], stdin=subprocess.PIPE)
p2.communicate(p.communicate()[0])

That did the trick - it worked!  However, it was *very* slow!  The python script has a "real" time of 2m22.504s whereas if I run it from the command line it is only 0m18.594s.  Not sure why this is ...

The last piece of my puzzle though, I am having trouble wrapping my head around ... I will have a list of files ["file1.flac","file2.flac","file3.flac","etc"] and I want the program to tackle compressing two at a time ... but not more than two at a time (or four, or eight, or whatever) because that's not going to help me at all (I have dual cores right now) ... I am having trouble thinking how I can create the algorithm that would do this for me ...

Thanks everyone.  Maybe after a good night's sleep it will come to me.  If you have any ideas - would love to hear them.

Damon
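One likely contributor to the slowdown reported here is that p.communicate()[0] collects the whole decoded stream in Python's memory before the encoder receives any of it, so the two programs run one after the other instead of side by side. A shell pipe connects them directly, and subprocess can do the same by handing the first child's stdout to the second child's stdin. A minimal sketch (the function name pipe_commands is an illustrative assumption; any two commands can be passed in):

```python
import subprocess


def pipe_commands(producer_cmd, consumer_cmd):
    """Run 'producer | consumer' and return both exit codes.

    The data flows through an OS pipe, never through this script.
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout)
    # Drop our copy of the read end so the consumer is the only reader.
    producer.stdout.close()
    consumer_code = consumer.wait()
    producer_code = producer.wait()
    return producer_code, consumer_code
```

For the conversion in this message the call would be pipe_commands(["flac","--decode","--stdout","test.flac"], ["lame","-","test.mp3"]), assuming both tools are installed.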
[Tutor] Newbie Wondering About Threads
Hi Everyone - I am a complete and utter Python newbie (as of today, honestly) -- am interested in expanding my programming horizons beyond bash scripting and thought Python would be a nice match for me.

To start, I thought I may try re-writing some of my bash scripts in Python as a learning tool for me ... and the first one I wanted to tackle was a script that converts .flac audio files into .mp3 files ... basic idea is I supply a sourceDirectory and targetDirectory and then recursively convert the source file tree into an identical target file tree filled with mp3 files.

I'm sure this has been done before (by those much wiser than me) but I figured I can learn something as I go ... for what I've accomplished so far, it seems pretty ugly!  But I'm learning ...

Anyhow, I think I got the basics down but I had a thought: can I thread this program to utilize all of my cores?  And if so, how?

Right now, the lame audio encoder is only hitting one core ... I could do all this faster if I could pass a variable that says: open 2 or 4 threads instead.

Here is what I've been working on so far -- would appreciate any insight you may have.

Thanks,
Damon

#!/usr/bin/env python

import os
import sys
import fnmatch
from os import system

fileList = []
rootDir = sys.argv[1]
targetDir = sys.argv[2]

def shell_quote(s):
    """Quote and escape the given string (if necessary) for inclusion in
    a shell command"""
    return "\"%s\"" % s.replace('"', '\\"')

def _mkdir(newdir):
    """works the way a good mkdir should :)
        - already exists, silently complete
        - regular file in the way, raise an exception
        - parent directory(ies) does not exist, make them as well
    http://code.activestate.com/recipes/82465/
    """
    if os.path.isdir(newdir):
        pass
    elif os.path.isfile(newdir):
        raise OSError("a file with the same name as the desired " \
                      "dir, '%s', already exists." % newdir)
    else:
        head, tail = os.path.split(newdir)
        if head and not os.path.isdir(head):
            _mkdir(head)
        #print "_mkdir %s" % repr(newdir)
        if tail:
            os.mkdir(newdir)

# get all the flac files and directory structures
for dirpath, subFolders, files in os.walk(rootDir):
    for file in files:
        if fnmatch.fnmatch(file, '*.flac'):
            # slice off the rootDir prefix (lstrip() strips characters, not a prefix)
            relDir = dirpath[len(rootDir):] + "/"
            flacFileInfo = [os.path.join(dirpath,file),dirpath+"/",file,relDir]
            fileList.append(flacFileInfo)

# create new directory structure and mp3 files
for sourceFile,dir,flacfile,strip in fileList:
    # splitext() drops only the extension (strip('.flac') strips characters)
    mp3File = shell_quote(targetDir + strip + os.path.splitext(flacfile)[0] + ".mp3")
    mp3FileDir = targetDir + strip
    sourceFile = shell_quote(sourceFile)
    _mkdir(mp3FileDir)
    flacCommand = "flac --decode --stdout --silent " + sourceFile + " | lame -V4 --silent - " + mp3File
    system(flacCommand)
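Working out where each mp3 belongs is the step in a script like this that is easiest to get subtly wrong, because str.strip() and str.lstrip() remove any of the given characters from the ends of a string rather than a whole prefix or suffix. For example, "calf.flac".strip(".flac") is the empty string. A minimal sketch of the mapping using os.path helpers instead (the function name target_path_for is an illustrative assumption):

```python
import os


def target_path_for(flac_path, root_dir, target_dir):
    """Map root_dir/sub/name.flac to target_dir/sub/name.mp3."""
    # Path of the source file relative to the root of the source tree.
    rel_path = os.path.relpath(flac_path, root_dir)
    # Drop only the final extension, whatever characters the name contains.
    rel_root, _ext = os.path.splitext(rel_path)
    return os.path.join(target_dir, rel_root + ".mp3")
```

os.path.dirname() of the result gives the directory that has to exist before the encoder writes the file.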