Re: [Tutor] Flatten a list in tuples and remove doubles
On Sat, Jul 28, 2012 at 7:12 PM, Francesco Loffredo wrote:
>
> My bad, now I'll RTFM again and I will study very carefully the operator and
> itertools modules.

I forgot to mention a gotcha about groupby's implementation. The groupby
object and the yielded _grouper objects share a single iterator. Here's a
(somewhat contrived) example of a mistake:

    >>> groups = groupby(sorted(data, key=keyfunc), keyfunc)
    >>> groups = list(groups)  #DON'T DO THIS

    >>> groups
    [((0, '3eA', 'Dupont', 'Juliette'), ),
     ((1, '3eA', 'Pop', 'Iggy'), )]

    >>> list(groups[0][1])  #EMPTY
    []
    >>> list(groups[1][1])  #ONLY THE LAST ITEM
    [(1, '3eA', 'Pop', 'Iggy', 5, 40.5, 60.0)]

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
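A minimal sketch of the safe pattern, assuming a cut-down copy of the data
list used in this thread: each group's records are copied into a list while
that group is still the current one, before groupby moves on to the next
key. What is left in a stale group may differ between Python versions, so
the sketch shows only the safe way and makes no claim about the broken one.

```python
from itertools import groupby
from operator import itemgetter

# Cut-down sample in the same 7-field layout as the thread's data
data = [
    (0, '3eA', 'Dupont', 'Juliette', 0, 11.0, 10.0),
    (0, '3eA', 'Dupont', 'Juliette', 1, 4.0, 5.0),
    (1, '3eA', 'Pop', 'Iggy', 0, 12.0, 10.0),
    (1, '3eA', 'Pop', 'Iggy', 1, 3.5, 5.0),
]
keyfunc = itemgetter(0, 1, 2, 3)

# list(records) runs while this group is current, so nothing is lost
groups = [(key, list(records))
          for key, records in groupby(sorted(data, key=keyfunc), keyfunc)]

# Every group still holds its records after the loop has finished
sizes = [len(records) for key, records in groups]
```

After this, groups can be indexed and re-read as often as needed, at the
cost of holding all records in memory at once.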
Re: [Tutor] Flatten a list in tuples and remove doubles
On 07/28/2012 07:12 PM, Francesco Loffredo wrote: > Il 28/07/2012 20:41, eryksun wrote: >> On Sat, Jul 28, 2012 at 11:12 AM, Francesco Loffredo >> wrote: >>> I had to study carefully your present and desired lists, and I >>> understood >>> what follows (please, next time explain !): >>> - each 7-tuple in your present list is a record for some measure >>> relative to >>> a person. Its fields are as follows: >>> - field 0: code (I think you want that in growing order) >>> - field 1: group code (could be a class or a group to which >>> both of your >>> example persons belong) >>> - fields 2, 3: surname and name of the person >>> - field 4: progressive number of the measure (these are in order >>> already, but I think you want to enforce this) that you want to >>> exclude from >>> the output list while keeping the order >>> - field 5, 6: numerator and denominator of a ratio that is the >>> measure. >>> you want the ratio to be written as a single string: "%s/%s" % field5, >>> field6 >> This looks like a good problem for itertools.groupby. 
My solution >> below needs error checking and testing, but this is where I'd start: >> >> data = [ >>(0, '3eA', 'Dupont', 'Juliette', 0, 11.0, 10.0), >>(0, '3eA', 'Dupont', 'Juliette', 1, 4.0, 5.0), >>(0, '3eA', 'Dupont', 'Juliette', 2, 17.5, 30.0), >>(0, '3eA', 'Dupont', 'Juliette', 3, 3.0, 5.0), >>(0, '3eA', 'Dupont', 'Juliette', 4, 4.5, 10.0), >>(0, '3eA', 'Dupont', 'Juliette', 5, 35.5, 60.0), >>(1, '3eA', 'Pop', 'Iggy', 0, 12.0, 10.0), >>(1, '3eA', 'Pop', 'Iggy', 1, 3.5, 5.0), >>(1, '3eA', 'Pop', 'Iggy', 2, 11.5, 30.0), >>(1, '3eA', 'Pop', 'Iggy', 3, 4.0, 5.0), >>(1, '3eA', 'Pop', 'Iggy', 4, 5.5, 10.0), >>(1, '3eA', 'Pop', 'Iggy', 5, 40.5, 60.0), >> ] >> >> from operator import itemgetter >> from itertools import groupby >> >> #first sort by keyfunc, then group by it >> keyfunc = itemgetter(0,1,2,3) >> groups = groupby(sorted(data, key=keyfunc), keyfunc) >> >> result = [] >> for group, records in groups: >> temp = tuple('%s/%s' % r[5:] for r in sorted(records, >> key=itemgetter(4))) >> result.append(group + temp) >> > result >> [(0, '3eA', 'Dupont', 'Juliette', '11.0/10.0', '4.0/5.0', '17.5/30.0', >> '3.0/5.0', '4.5/10.0', '35.5/60.0'), (1, '3eA', 'Pop', 'Iggy', >> '12.0/10.0', '3.5/5.0', '11.5/30.0', '4.0/5.0', '5.5/10.0', >> '40.5/60.0')] >> > H... it happened again. I spend some time and effort to solve a > problem, I feel like I'm almost a Real Python Programmer... and > someone spoils my pride showing me some standard module whose name I > barely remember, that can solve that same problem in a few lines... > > Hey! That's a GREAT solution, Eryksun! Nothing wrong with you, really! > > Every time this happens, I have to admit that I'm a newbie and I've > still got a lot to learn about Python. Especially about its wonderful > standard library. Better than Apple's App Store: for anything you can > think, there's a Module. Problem is, I still can't readily recall > which to use for a given problem. 
> My bad, now I'll RTFM again and I will study very carefully the
> operator and itertools modules. Who knows, maybe in a few decades I'll
> be able to say "This looks like a good problem for Module X" too.
>
> Francesco
>

You might find it enlightening to look up:

http://www.doughellmann.com/PyMOTW/

which explores Python's standard library.

--
DaveA
Re: [Tutor] Flatten a list in tuples and remove doubles
On 29/07/12 00:12, Francesco Loffredo wrote:

> Every time this happens, I have to admit that I'm a newbie and I've
> still got a lot to learn about Python. Especially about its wonderful
> standard library.

Don't worry, I've been using Python for 15 years and there are plenty of
modules I haven't explored yet - and I must admit itertools is one that
I really should get to grips with but never seem to find the time!

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] Topic #2 of Tutor Digest
On 28/07/12 18:53, Todd Tabern wrote:

> Even if I were to purposefully ask the question in multiple places,
> why does that concern you? I wasn't aware that asking for
> help in multiple places is forbidden.

It's not. But it is considered bad Netiquette. It's also likely to get
you noticed as a "nuisance" and someone to be ignored by the more
experienced news users - usually the very ones you want to be answering
your questions!

Some of the reasons it's bad are:

- often the same people read all of the groups you choose to post in
  and so they read the same message multiple times - which is tedious,
  especially if they have already replied.
- often the person replying unwittingly sends to all of the forums,
  further compounding the issue above.
- if they don't send to all forums, multiple people from different
  forums duplicate the solution, wasting everyone's time.
- It uses up extra bandwidth which for smartphone users (and many
  third world users on dial up) may mean extra cash payments to read
  duplicate messages.

So it's better to try the most likely group and, only if you don't get
a reply after a day or two, try another.

> I'm sorry that it offended you so much that you felt the
> need to respond in that manner instead of providing assistance...

He was providing assistance, he was pointing out that you had breached
good practice for posting on internet fora... :-)

Speaking of which

tutor-requ...@python.org wrote:

> Send Tutor mailing list submissions to
> tutor@python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
< snip >
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Tutor digest..."

Please follow those instructions...

And also please don't post the entire digest. Trim it back to the
bit that is relevant - see the last point above about consideration
for other users' bandwidth limitations.

HTH,
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] Flatten a list in tuples and remove doubles
Il 28/07/2012 20:41, eryksun wrote: On Sat, Jul 28, 2012 at 11:12 AM, Francesco Loffredo wrote: I had to study carefully your present and desired lists, and I understood what follows (please, next time explain !): - each 7-tuple in your present list is a record for some measure relative to a person. Its fields are as follows: - field 0: code (I think you want that in growing order) - field 1: group code (could be a class or a group to which both of your example persons belong) - fields 2, 3: surname and name of the person - field 4: progressive number of the measure (these are in order already, but I think you want to enforce this) that you want to exclude from the output list while keeping the order - field 5, 6: numerator and denominator of a ratio that is the measure. you want the ratio to be written as a single string: "%s/%s" % field5, field6 This looks like a good problem for itertools.groupby. My solution below needs error checking and testing, but this is where I'd start: data = [ (0, '3eA', 'Dupont', 'Juliette', 0, 11.0, 10.0), (0, '3eA', 'Dupont', 'Juliette', 1, 4.0, 5.0), (0, '3eA', 'Dupont', 'Juliette', 2, 17.5, 30.0), (0, '3eA', 'Dupont', 'Juliette', 3, 3.0, 5.0), (0, '3eA', 'Dupont', 'Juliette', 4, 4.5, 10.0), (0, '3eA', 'Dupont', 'Juliette', 5, 35.5, 60.0), (1, '3eA', 'Pop', 'Iggy', 0, 12.0, 10.0), (1, '3eA', 'Pop', 'Iggy', 1, 3.5, 5.0), (1, '3eA', 'Pop', 'Iggy', 2, 11.5, 30.0), (1, '3eA', 'Pop', 'Iggy', 3, 4.0, 5.0), (1, '3eA', 'Pop', 'Iggy', 4, 5.5, 10.0), (1, '3eA', 'Pop', 'Iggy', 5, 40.5, 60.0), ] from operator import itemgetter from itertools import groupby #first sort by keyfunc, then group by it keyfunc = itemgetter(0,1,2,3) groups = groupby(sorted(data, key=keyfunc), keyfunc) result = [] for group, records in groups: temp = tuple('%s/%s' % r[5:] for r in sorted(records, key=itemgetter(4))) result.append(group + temp) result [(0, '3eA', 'Dupont', 'Juliette', '11.0/10.0', '4.0/5.0', '17.5/30.0', '3.0/5.0', '4.5/10.0', '35.5/60.0'), (1, '3eA', 
'Pop', 'Iggy', '12.0/10.0', '3.5/5.0', '11.5/30.0', '4.0/5.0', '5.5/10.0', '40.5/60.0')]

H... it happened again. I spend some time and effort to solve a
problem, I feel like I'm almost a Real Python Programmer... and
someone spoils my pride showing me some standard module whose name I
barely remember, that can solve that same problem in a few lines...

Hey! That's a GREAT solution, Eryksun! Nothing wrong with you, really!

Every time this happens, I have to admit that I'm a newbie and I've
still got a lot to learn about Python. Especially about its wonderful
standard library. Better than Apple's App Store: for anything you can
think of, there's a Module. Problem is, I still can't readily recall
which to use for a given problem.

My bad, now I'll RTFM again and I will study very carefully the
operator and itertools modules. Who knows, maybe in a few decades I'll
be able to say "This looks like a good problem for Module X" too.

Francesco
Re: [Tutor] Flatten a list in tuples and remove doubles
On 28/07/2012 19:43, Steven D'Aprano wrote:
> Francesco Loffredo wrote:
>
>> but I must avoid reading my function again, or I'll find some more bugs!
>
> Perhaps you should run your function, and test it.

Of course I did. Just not as thoroughly as I would if this were a job
commitment. Unfortunately, I don't know anything about the possible
contents of PyProg's list, so my testing can only be partial. I think my
function answers the question, though.

> Finding bugs is not the problem. Once you find them, you can fix them.
> It is the bugs that you don't know about that are the problem.

Absolutely. Ignorance is the No. 1 cause of software failure.

But I'm not sure that a complete, fully tested and robust function would
be proper in this tutoring environment, unless the OP had asked about
unit testing or proven software reliability. The many lines of code
needed to take care of all possible input cases could even make it more
difficult for him/her to understand how that function solves the problem.

At first, I wrote flatten(inlist) without any error testing. Then I
added a couple of lines, just to show that input control is a Good
Thing. And every time I looked at it, some more checks asked to be
coded... but I wanted to stop at some point.

If you'd like to show us what can be done to really make it rock-solid,
feel free and welcome! ;-))

Francesco
Re: [Tutor] Flatten a list in tuples and remove doubles
On Sat, Jul 28, 2012 at 11:12 AM, Francesco Loffredo wrote:
>
> I had to study carefully your present and desired lists, and I understood
> what follows (please, next time explain !):
> - each 7-tuple in your present list is a record for some measure relative to
> a person. Its fields are as follows:
>     - field 0: code (I think you want that in growing order)
>     - field 1: group code (could be a class or a group to which both of your
> example persons belong)
>     - fields 2, 3: surname and name of the person
>     - field 4: progressive number of the measure (these are in order
> already, but I think you want to enforce this) that you want to exclude from
> the output list while keeping the order
>     - field 5, 6: numerator and denominator of a ratio that is the measure.
> you want the ratio to be written as a single string: "%s/%s" % field5,
> field6

This looks like a good problem for itertools.groupby. My solution
below needs error checking and testing, but this is where I'd start:

    data = [
      (0, '3eA', 'Dupont', 'Juliette', 0, 11.0, 10.0),
      (0, '3eA', 'Dupont', 'Juliette', 1, 4.0, 5.0),
      (0, '3eA', 'Dupont', 'Juliette', 2, 17.5, 30.0),
      (0, '3eA', 'Dupont', 'Juliette', 3, 3.0, 5.0),
      (0, '3eA', 'Dupont', 'Juliette', 4, 4.5, 10.0),
      (0, '3eA', 'Dupont', 'Juliette', 5, 35.5, 60.0),
      (1, '3eA', 'Pop', 'Iggy', 0, 12.0, 10.0),
      (1, '3eA', 'Pop', 'Iggy', 1, 3.5, 5.0),
      (1, '3eA', 'Pop', 'Iggy', 2, 11.5, 30.0),
      (1, '3eA', 'Pop', 'Iggy', 3, 4.0, 5.0),
      (1, '3eA', 'Pop', 'Iggy', 4, 5.5, 10.0),
      (1, '3eA', 'Pop', 'Iggy', 5, 40.5, 60.0),
    ]

    from operator import itemgetter
    from itertools import groupby

    #first sort by keyfunc, then group by it
    keyfunc = itemgetter(0,1,2,3)
    groups = groupby(sorted(data, key=keyfunc), keyfunc)

    result = []
    for group, records in groups:
        temp = tuple('%s/%s' % r[5:] for r in sorted(records, key=itemgetter(4)))
        result.append(group + temp)

    >>> result
    [(0, '3eA', 'Dupont', 'Juliette', '11.0/10.0', '4.0/5.0', '17.5/30.0',
    '3.0/5.0', '4.5/10.0', '35.5/60.0'), (1, '3eA', 'Pop', 'Iggy',
    '12.0/10.0', '3.5/5.0', '11.5/30.0', '4.0/5.0', '5.5/10.0',
    '40.5/60.0')]
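The "first sort by keyfunc, then group by it" comment in the solution above
matters because groupby only merges records that sit next to each other. A
small sketch with made-up two-field records (the names records and first are
illustrative only, not from the thread) shows what happens without the sort:

```python
from itertools import groupby

# Hypothetical records whose keys are deliberately interleaved
records = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]


def first(record):
    """Key function: group on the first field."""
    return record[0]


# Without sorting, every change of key starts a new group
unsorted_keys = [key for key, _ in groupby(records, first)]

# Sorting by the same key function first gives one group per key
sorted_keys = [key for key, _ in groupby(sorted(records, key=first), first)]
```

Here unsorted_keys comes out as ['b', 'a', 'b', 'a'] while sorted_keys is
['a', 'b'], which is why the sort and the grouping use the same keyfunc.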
Re: [Tutor] Topic #2 of Tutor Digest
On 7/28/2012 10:53 AM Todd Tabern said...

> Even if I were to purposefully ask the question in multiple places,
> why does that concern you? I wasn't aware that asking for help in
> multiple places is forbidden.

It's not forbidden -- simply disrespectful. We all volunteer our
available time to respond to questions, and most of us tend to follow
multiple lists, so by asking on both you'll waste the time of those
responders unaware that an appropriate response has already been
provided, although in an alternate forum. I generally don't even read
threads where I recognize that a respected or known respondent has
already replied.

When I see a question posed on multiple forums I note the poster and
tend to disregard their future posts. When it continues you'll start
seeing *plonk* as potential respondents add you to their kill file.

See http://www.catb.org/~esr/faqs/smart-questions.html#forum in
particular, but the entire article should be committed to memory.

Emile
Re: [Tutor] Topic #2 of Tutor Digest
Mark Lawrence: Yes, I did... I kept encountering errors when trying to
post the first time. I didn't think my question went through, so I tried
this one.

Even if I were to purposefully ask the question in multiple places, why
does that concern you? I wasn't aware that asking for help in multiple
places is forbidden.
I'm sorry that it offended you so much that you felt the need to respond
in that manner instead of providing assistance...

Cheers

tutor-requ...@python.org wrote:

Send Tutor mailing list submissions to
        tutor@python.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://mail.python.org/mailman/listinfo/tutor
or, via email, send a message with subject or body 'help' to
        tutor-requ...@python.org

You can reach the person managing the list at
        tutor-ow...@python.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Tutor digest..."

Today's Topics:

   1. Re: Encoding error when reading text files in Python 3 (Steven D'Aprano)
   2. Re: Search and replace text in XML file? (Mark Lawrence)
   3. Re: Encoding error when reading text files in Python 3 (Dat Huynh)
   4. Re: Flatten a list in tuples and remove doubles (Francesco Loffredo)
   5. Re: Flatten a list in tuples and remove doubles (Francesco Loffredo)

--

Message: 1
Date: Sat, 28 Jul 2012 20:09:28 +1000
From: Steven D'Aprano
To: tutor@python.org
Subject: Re: [Tutor] Encoding error when reading text files in Python 3
Message-ID: <5013ba58.1040...@pearwood.info>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Dat Huynh wrote:
> Dear all,
>
> I have written a simple application by Python to read data from text files.
>
> Current I have both Python version 2.7.2 and Python 3.2.3 on my laptop.
> I don't know why it does not run on Python version 3 while it runs
> well on Python 2.

Python 2 is more forgiving of beginner errors when dealing with text and
bytes, but makes it harder to deal with text correctly.
Python 3 makes it easier to deal with text correctly, but is less forgiving. When you read from a file in Python 2, it will give you *something*, even if it is the wrong thing. It will not give an decoding error, even if the text you are reading is not valid text. It will just give you junk bytes, sometimes known as moji-bake. Python 3 no longer does that. It tells you when there is a problem, so you can fix it. > Could you please tell me how I can run it on python 3? > Following is my Python code. > > -- >for subdir, dirs, files in os.walk(rootdir): > for file in files: > print("Processing [" +file +"]...\n" ) > f = open(rootdir+file, 'r') > data = f.read() > f.close() > print(data) > -- > > This is the error message: [...] > UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1 in position > 4980: ordinal not in range(128) This tells you that you are reading a non-ASCII file but haven't told Python what encoding to use, so by default Python uses ASCII. Do you know what encoding the file is? Do you understand about Unicode text and bytes? If not, I suggest you read this article: http://www.joelonsoftware.com/articles/Unicode.html In Python 3, you can either tell Python what encoding to use: f = open(rootdir+file, 'r', encoding='utf8') # for example or you can set an error handler: f = open(rootdir+file, 'r', errors='ignore') # for example or both f = open(rootdir+file, 'r', encoding='ascii', errors='replace') You can see the list of encodings and error handlers here: http://docs.python.org/py3k/library/codecs.html Unfortunately, Python 2 does not support this using the built-in open function. Instead, you have to uses codecs.open instead of the built-in open, like this: import codecs f = codecs.open(rootdir+file, 'r', encoding='utf8') # for example which fortunately works in both Python 2 or 3. 
Or you can read the file in binary mode, and then decode it into text: f = open(rootdir+file, 'rb') data = f.read() f.close() text = data.decode('cp866', 'replace') print(text) If you don't know the encoding, you can try opening the file in Firefox or Internet Explorer and see if they can guess it, or you can use the chardet library in Python. http://pypi.python.org/pypi/chardet Or if you don't care about getting moji-bake, you can pretend that the file is encoded using Latin-1. That will pretty much read anything, although what it gives you may be junk. -- Steven -- Message: 2 Date: Sat, 28 Jul 2012 11:25:30 +0100 From: Mark Lawrence To: tutor@python.org Subject: Re: [Tutor] Search and replace text in XML file? Message-ID: Content-Type: text/plain; charset=ISO-8859-1; format=flowed On 28/07/2012 02:38, Todd Tabern wrote: > I'm looking to search an enti
Re: [Tutor] Flatten a list in tuples and remove doubles
Francesco Loffredo wrote:

> but I must avoid reading my function again, or I'll find some more bugs!

Perhaps you should run your function, and test it.

Finding bugs is not the problem. Once you find them, you can fix them.
It is the bugs that you don't know about that are the problem.

--
Steven
Re: [Tutor] Flatten a list in tuples and remove doubles
On 28/07/2012 17:12, Francesco Loffredo wrote:
> On 19/07/2012 19:33, PyProg PyProg wrote:
>> Hi all,
>>
>> I would get a new list as:
>>
>> [(0, '3eA', 'Dupont', 'Juliette', '11.0/10.0', '4.0/5.0', '17.5/30.0',
>> '3.0/5.0', '4.5/10.0', '35.5/60.0'), (1, '3eA', 'Pop', 'Iggy',
>> '12.0/10.0', '3.5/5.0', '11.5/30.0', '4.0/5.0', '5.5/10.0',
>> '7.5/10.0', '40.5/60.0')]
>>
>> ... from this one:
>>
>> [(0, '3eA', 'Dupont', 'Juliette', 0, 11.0, 10.0),
>> (0, '3eA', 'Dupont', 'Juliette', 1, 4.0, 5.0),
>> (0, '3eA', 'Dupont', 'Juliette', 2, 17.5, 30.0),
>> (0, '3eA', 'Dupont', 'Juliette', 3, 3.0, 5.0),
>> (0, '3eA', 'Dupont', 'Juliette', 4, 4.5, 10.0),
>> (0, '3eA', 'Dupont', 'Juliette', 5, 35.5, 60.0),
>> (1, '3eA', 'Pop', 'Iggy', 0, 12.0, 10.0),
>> (1, '3eA', 'Pop', 'Iggy', 1, 3.5, 5.0),
>> (1, '3eA', 'Pop', 'Iggy', 2, 11.5, 30.0),
>> (1, '3eA', 'Pop', 'Iggy', 3, 4.0, 5.0),
>> (1, '3eA', 'Pop', 'Iggy', 4, 5.5, 10.0),
>> (1, '3eA', 'Pop', 'Iggy', 5, 40.5, 60.0)]
>>
>> How to make that ? I'm looking for but for now I can't do it.
>>
>> Thanks in advance.
>>
>> a+
>
> I had to study carefully your present and desired lists, and I
> understood what follows (please, next time explain !):
> - each 7-tuple in your present list is a record for some measure
>   relative to a person. Its fields are as follows:
>     - field 0: code (I think you want that in growing order)
>     - field 1: group code (could be a class or a group to which both
>       of your example persons belong)
>     - fields 2, 3: surname and name of the person
>     - field 4: progressive number of the measure (these are in order
>       already, but I think you want to enforce this) that you want to
>       exclude from the output list while keeping the order
>     - field 5, 6: numerator and denominator of a ratio that is the
>       measure. you want the ratio to be written as a single string:
>       "%s/%s" % field5, field6
>
> Taking for granted this structure and my educated guesses about what
> you didn't tell us, here's my solution:
>
> def flatten(inlist)
>     """
>     takes PyProg PyProg's current list and returns his/her desired one,
>     given my guesses about the structure of inlist and the desired result.
>     """
>     tempdict = {}
>     for item in inlist:
>         if len(item) != 7:
>             print "Item errato: \n", item
>         id = tuple(item[:4])
>         progr = item[4]
>         payload = "%s/%s" % item[5:]
>         if id in tempdict:
>             tempdict[id].extend([(progr, payload)])
>         else:
>             tempdict[id] = [(progr, payload)]
>     for item in tempdict:
>         tempdict[item].sort() # so we set payloads in progressive order, if they aren't already
>     # print "Temporary Dict: ", tempdict
>     tmplist2 = []
>     for item in tempdict:
>         templist = []
>         templist.extend(item)
>         templist.extend(tempdict[item])
>         tmplist2.append(tuple(templist))
>     tmplist2.sort()# so we set IDs in order
>     # print "Temporary List: ", tmplist2
>     outlist = []
>     for item in tmplist2:
>         templist = []
>         if isinstance(item, tuple):
>             for subitem in item:
>                 if isinstance(subitem, tuple):
>                     templist.append(subitem[1])
>                 else:
>                     templist.append(subitem)
>             outlist.append(tuple(templist))
>         else:
>             outlist.append(item)
>     # print "\nOutput List: ", outlist
>     return outlist

ok, as usual when I look again at something I wrote, I found some little
mistakes. Here's my errata:

1- of course, a function definition must end with a colon...
   line 1:
       def flatten(inlist):

2- sorry, English is not my first language...
   line 9:
       print "Item length wrong!\n", item

3- I didn't insert a break statement after line 9, but if inlist
   contained a wrong item it would be nice to do something more than
   simply tell the user: for example we could skip that item, or trim /
   pad it, or stop the execution, or raise an exception... I just told
   the unsuspecting user, and this would very probably lead to some
   exception at a later point, or (much worse) to wrong results. So:
   line 8-9:
       if len(item) != 7:
           print "Item length wrong!\n", item
           raise ValueError("item length != 7")

... now I feel better ... but I must avoid reading my function again, or
I'll find some more bugs!

Francesco
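For readers on Python 3, here is a condensed sketch of the same
dictionary-based idea with the three corrections above applied. It is not
Francesco's code: the use of collections.defaultdict, the shortened sample
data and the variable names grouped and ident are my own assumptions, and it
assumes the same 7-field record layout guessed at in the thread.

```python
from collections import defaultdict


def flatten(inlist):
    """Group 7-field records by their first four fields (sketch only)."""
    grouped = defaultdict(list)
    for item in inlist:
        if len(item) != 7:
            # Fail early instead of producing wrong results later
            raise ValueError("item length != 7: %r" % (item,))
        ident = tuple(item[:4])
        # Keep the progressive number so payloads can be ordered by it
        grouped[ident].append((item[4], "%s/%s" % tuple(item[5:])))
    outlist = []
    for ident in sorted(grouped):  # IDs in growing order
        payloads = tuple(payload for _, payload in sorted(grouped[ident]))
        outlist.append(ident + payloads)
    return outlist


# Shortened sample, deliberately out of order
sample = [
    (1, '3eA', 'Pop', 'Iggy', 1, 3.5, 5.0),
    (0, '3eA', 'Dupont', 'Juliette', 0, 11.0, 10.0),
    (1, '3eA', 'Pop', 'Iggy', 0, 12.0, 10.0),
    (0, '3eA', 'Dupont', 'Juliette', 1, 4.0, 5.0),
]
flat = flatten(sample)
```

The dictionary does the grouping, so the input does not need to be sorted
first; only the keys and each group's payloads are sorted at the end.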
Re: [Tutor] Flatten a list in tuples and remove doubles
On 19/07/2012 19:33, PyProg PyProg wrote:
> Hi all,
>
> I would get a new list as:
>
> [(0, '3eA', 'Dupont', 'Juliette', '11.0/10.0', '4.0/5.0', '17.5/30.0',
> '3.0/5.0', '4.5/10.0', '35.5/60.0'), (1, '3eA', 'Pop', 'Iggy',
> '12.0/10.0', '3.5/5.0', '11.5/30.0', '4.0/5.0', '5.5/10.0',
> '7.5/10.0', '40.5/60.0')]
>
> ... from this one:
>
> [(0, '3eA', 'Dupont', 'Juliette', 0, 11.0, 10.0),
> (0, '3eA', 'Dupont', 'Juliette', 1, 4.0, 5.0),
> (0, '3eA', 'Dupont', 'Juliette', 2, 17.5, 30.0),
> (0, '3eA', 'Dupont', 'Juliette', 3, 3.0, 5.0),
> (0, '3eA', 'Dupont', 'Juliette', 4, 4.5, 10.0),
> (0, '3eA', 'Dupont', 'Juliette', 5, 35.5, 60.0),
> (1, '3eA', 'Pop', 'Iggy', 0, 12.0, 10.0),
> (1, '3eA', 'Pop', 'Iggy', 1, 3.5, 5.0),
> (1, '3eA', 'Pop', 'Iggy', 2, 11.5, 30.0),
> (1, '3eA', 'Pop', 'Iggy', 3, 4.0, 5.0),
> (1, '3eA', 'Pop', 'Iggy', 4, 5.5, 10.0),
> (1, '3eA', 'Pop', 'Iggy', 5, 40.5, 60.0)]
>
> How to make that ? I'm looking for but for now I can't do it.
>
> Thanks in advance.
>
> a+

I had to study carefully your present and desired lists, and I
understood what follows (please, next time explain !):
- each 7-tuple in your present list is a record for some measure
  relative to a person. Its fields are as follows:
    - field 0: code (I think you want that in growing order)
    - field 1: group code (could be a class or a group to which both
      of your example persons belong)
    - fields 2, 3: surname and name of the person
    - field 4: progressive number of the measure (these are in order
      already, but I think you want to enforce this) that you want to
      exclude from the output list while keeping the order
    - field 5, 6: numerator and denominator of a ratio that is the
      measure. you want the ratio to be written as a single string:
      "%s/%s" % field5, field6

Taking for granted this structure and my educated guesses about what
you didn't tell us, here's my solution:

def flatten(inlist)
    """
    takes PyProg PyProg's current list and returns his/her desired one,
    given my guesses about the structure of inlist and the desired result.
    """
    tempdict = {}
    for item in inlist:
        if len(item) != 7:
            print "Item errato: \n", item
        id = tuple(item[:4])
        progr = item[4]
        payload = "%s/%s" % item[5:]
        if id in tempdict:
            tempdict[id].extend([(progr, payload)])
        else:
            tempdict[id] = [(progr, payload)]
    for item in tempdict:
        tempdict[item].sort() # so we set payloads in progressive order, if they aren't already
    # print "Temporary Dict: ", tempdict
    tmplist2 = []
    for item in tempdict:
        templist = []
        templist.extend(item)
        templist.extend(tempdict[item])
        tmplist2.append(tuple(templist))
    tmplist2.sort()# so we set IDs in order
    # print "Temporary List: ", tmplist2
    outlist = []
    for item in tmplist2:
        templist = []
        if isinstance(item, tuple):
            for subitem in item:
                if isinstance(subitem, tuple):
                    templist.append(subitem[1])
                else:
                    templist.append(subitem)
            outlist.append(tuple(templist))
        else:
            outlist.append(item)
    # print "\nOutput List: ", outlist
    return outlist
Re: [Tutor] Encoding error when reading text files in Python 3
I changed my code and it runs on Python 3 now.

    f = open(rootdir+file, 'rb')
    data = f.read().decode('utf8', 'ignore')

Thank you very much.
Sincerely,
Dat.

On Sat, Jul 28, 2012 at 6:09 PM, Steven D'Aprano wrote:
> Dat Huynh wrote:
>>
>> Dear all,
>>
>> I have written a simple application by Python to read data from text
>> files.
>>
>> Current I have both Python version 2.7.2 and Python 3.2.3 on my laptop.
>> I don't know why it does not run on Python version 3 while it runs
>> well on Python 2.
>
> Python 2 is more forgiving of beginner errors when dealing with text and
> bytes, but makes it harder to deal with text correctly.
>
> Python 3 makes it easier to deal with text correctly, but is less forgiving.
>
> When you read from a file in Python 2, it will give you *something*, even if
> it is the wrong thing. It will not give a decoding error, even if the text
> you are reading is not valid text. It will just give you junk bytes,
> sometimes known as mojibake.
>
> Python 3 no longer does that. It tells you when there is a problem, so you
> can fix it.
>
>> Could you please tell me how I can run it on python 3?
>> Following is my Python code.
>>
>> --
>> for subdir, dirs, files in os.walk(rootdir):
>>     for file in files:
>>         print("Processing [" +file +"]...\n" )
>>         f = open(rootdir+file, 'r')
>>         data = f.read()
>>         f.close()
>>         print(data)
>> --
>>
>> This is the error message:
>
> [...]
>
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1 in position
>> 4980: ordinal not in range(128)
>
> This tells you that you are reading a non-ASCII file but haven't told Python
> what encoding to use, so by default Python uses ASCII.
>
> Do you know what encoding the file is?
>
> Do you understand about Unicode text and bytes? If not, I suggest you read
> this article:
>
> http://www.joelonsoftware.com/articles/Unicode.html
>
> In Python 3, you can either tell Python what encoding to use:
>
>     f = open(rootdir+file, 'r', encoding='utf8') # for example
>
> or you can set an error handler:
>
>     f = open(rootdir+file, 'r', errors='ignore') # for example
>
> or both
>
>     f = open(rootdir+file, 'r', encoding='ascii', errors='replace')
>
> You can see the list of encodings and error handlers here:
>
> http://docs.python.org/py3k/library/codecs.html
>
> Unfortunately, Python 2 does not support this using the built-in open
> function. Instead, you have to use codecs.open instead of the built-in
> open, like this:
>
>     import codecs
>     f = codecs.open(rootdir+file, 'r', encoding='utf8') # for example
>
> which fortunately works in both Python 2 or 3.
>
> Or you can read the file in binary mode, and then decode it into text:
>
>     f = open(rootdir+file, 'rb')
>     data = f.read()
>     f.close()
>     text = data.decode('cp866', 'replace')
>     print(text)
>
> If you don't know the encoding, you can try opening the file in Firefox or
> Internet Explorer and see if they can guess it, or you can use the chardet
> library in Python.
>
> http://pypi.python.org/pypi/chardet
>
> Or if you don't care about getting mojibake, you can pretend that the file
> is encoded using Latin-1. That will pretty much read anything, although what
> it gives you may be junk.
>
> --
> Steven
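The three Python 3 options discussed in this message can be tried side by
side. The sketch below is only an illustration under my own assumptions: it
creates a throwaway directory with one small UTF-8 file (the file name and
contents are made up) and reads it back in each of the three ways. It also
joins paths with os.path.join(subdir, name) rather than rootdir+file, which
I believe is needed once os.walk descends into subdirectories.

```python
import os
import tempfile

# Throwaway directory with one non-ASCII sample file (hypothetical content)
rootdir = tempfile.mkdtemp()
with open(os.path.join(rootdir, "sample.txt"), "wb") as f:
    f.write("caf\u00e9".encode("utf8"))

results = {}
for subdir, dirs, files in os.walk(rootdir):
    for name in files:
        path = os.path.join(subdir, name)

        # 1. Tell Python which encoding the file uses
        with open(path, "r", encoding="utf8") as f:
            results["explicit"] = f.read()

        # 2. Wrong encoding plus an error handler: undecodable bytes
        #    are replaced instead of raising UnicodeDecodeError
        with open(path, "r", encoding="ascii", errors="replace") as f:
            results["replaced"] = f.read()

        # 3. Read raw bytes, then decode them by hand
        with open(path, "rb") as f:
            results["decoded"] = f.read().decode("utf8", "ignore")
```

Options 1 and 3 recover the original text because the encoding named really
is the file's encoding; option 2 keeps running but loses the accented
character, which is the trade-off an error handler makes.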
Re: [Tutor] Search and replace text in XML file?
On 28/07/2012 02:38, Todd Tabern wrote:
> I'm looking to search an entire XML file for specific text and replace
> that text, while maintaining the structure of the XML file. The text
> occurs within multiple nodes throughout the file.
> I basically need to replace every occurrence of C:\Program Files with
> C:\Program Files (x86), regardless of location. For example, that text
> appears within:
> C:\Program Files\\Map Data\Road_Centerlines.shp
> and also within:
> C:\Program Files\Templates\RoadNetwork.rtx
> ...among others.
> I've tried some non-python methods and they all ruined the XML
> structure. I've been Google searching all day and can only seem to find
> solutions that look for a specific node and replace the whole string
> between the tags.
> I've been looking at using minidom to achieve this but I just can't
> seem to figure out the right method.
> My end goal, once I have working code, is to compile an exe that can
> work on machines without python, which a user can click in order to
> perform the XML modification.
> Thanks in advance.

Did you really have to ask the same question on two separate Python
mailing lists and only 15 minutes apart?

--
Cheers.

Mark Lawrence.
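One possible way to do the replacement described in the question is to walk
every element and rewrite its text, its tail and its attribute values. This
is a sketch only: it uses xml.etree.ElementTree from the standard library
rather than the minidom module mentioned in the question, and the element
and attribute names in the sample (project, layer, path, source) are
invented for the demonstration, since the real file's layout is not shown
in the thread.

```python
import xml.etree.ElementTree as ET

OLD = r"C:\Program Files"
NEW = r"C:\Program Files (x86)"


def replace_everywhere(root, old=OLD, new=NEW):
    """Replace old with new in all text, tails and attribute values."""
    for elem in root.iter():
        if elem.text and old in elem.text:
            elem.text = elem.text.replace(old, new)
        if elem.tail and old in elem.tail:
            elem.tail = elem.tail.replace(old, new)
        # Copy the items so the attribute dict is not changed mid-loop
        for name, value in list(elem.attrib.items()):
            if old in value:
                elem.set(name, value.replace(old, new))
    return root


# Hypothetical document; the tag names are assumptions, not the real file
sample = (
    "<project>"
    r"<layer path='C:\Program Files\Templates\RoadNetwork.rtx'>"
    r"<source>C:\Program Files\Map Data\Road_Centerlines.shp</source>"
    "</layer>"
    "</project>"
)
root = replace_everywhere(ET.fromstring(sample))
xml_out = ET.tostring(root, encoding="unicode")
```

Two caveats: running the function twice would add "(x86)" twice, because
the new path still contains the old one, and ElementTree may not reproduce
comments or the original quoting and whitespace exactly, so the result is
worth checking against the application that reads the file.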
Re: [Tutor] Encoding error when reading text files in Python 3
Dat Huynh wrote:
> Dear all,
>
> I have written a simple application in Python to read data from text files.
> Currently I have both Python version 2.7.2 and Python 3.2.3 on my laptop.
> I don't know why it does not run on Python version 3 while it runs
> well on Python 2.

Python 2 is more forgiving of beginner errors when dealing with text and bytes, but makes it harder to deal with text correctly.

Python 3 makes it easier to deal with text correctly, but is less forgiving.

When you read from a file in Python 2, it will give you *something*, even if it is the wrong thing. It will not give a decoding error, even if the text you are reading is not valid text. It will just give you junk bytes, sometimes known as mojibake.

Python 3 no longer does that. It tells you when there is a problem, so you can fix it.

> Could you please tell me how I can run it on python 3?
> Following is my Python code.
>
> --
> for subdir, dirs, files in os.walk(rootdir):
>     for file in files:
>         print("Processing [" +file +"]...\n" )
>         f = open(rootdir+file, 'r')
>         data = f.read()
>         f.close()
>         print(data)
> --
>
> This is the error message:
[...]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1 in position 4980:
> ordinal not in range(128)

This tells you that you are reading a non-ASCII file but haven't told Python what encoding to use, so Python falls back on the default encoding for your environment, which here is ASCII.

Do you know what encoding the file is?

Do you understand about Unicode text and bytes? If not, I suggest you read this article:

http://www.joelonsoftware.com/articles/Unicode.html

In Python 3, you can either tell Python what encoding to use:

    f = open(rootdir+file, 'r', encoding='utf8')  # for example

or you can set an error handler:

    f = open(rootdir+file, 'r', errors='ignore')  # for example

or both

    f = open(rootdir+file, 'r', encoding='ascii', errors='replace')

You can see the list of encodings and error handlers here:

http://docs.python.org/py3k/library/codecs.html

Unfortunately, Python 2 does not support this using the built-in open function. Instead, you have to use codecs.open, like this:

    import codecs
    f = codecs.open(rootdir+file, 'r', encoding='utf8')  # for example

which fortunately works in both Python 2 and 3.

Or you can read the file in binary mode, and then decode it into text:

    f = open(rootdir+file, 'rb')
    data = f.read()
    f.close()
    text = data.decode('cp866', 'replace')
    print(text)

If you don't know the encoding, you can try opening the file in Firefox or Internet Explorer and see if they can guess it, or you can use the chardet library in Python.

http://pypi.python.org/pypi/chardet

Or if you don't care about getting mojibake, you can pretend that the file is encoded using Latin-1. That will pretty much read anything, although what it gives you may be junk.

--
Steven
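The options above can be put together in a minimal Python 3 sketch. The helper names read_text and read_tree, and the 'utf8' default, are illustrative assumptions and not part of the original script. The sketch also joins each file name to the directory that os.walk is currently visiting (subdir) rather than to rootdir, on the assumption that files in subdirectories should be opened too.

```python
import os


def read_text(path, encoding='utf8', errors='replace'):
    """Read a whole file as text (hypothetical helper).

    The encoding is stated explicitly instead of relying on the
    environment's default, and errors='replace' turns undecodable
    bytes into U+FFFD instead of raising UnicodeDecodeError.
    """
    with open(path, 'r', encoding=encoding, errors=errors) as f:
        return f.read()


def read_tree(rootdir, encoding='utf8'):
    """Return a dict mapping each file path under rootdir to its text."""
    texts = {}
    for subdir, dirs, files in os.walk(rootdir):
        for name in files:
            # Join with subdir, the directory currently being walked.
            path = os.path.join(subdir, name)
            texts[path] = read_text(path, encoding)
    return texts
```

Passing errors='strict' to read_text brings back the exception, which is usually what you want while you are still working out the right encoding.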
[Tutor] Encoding error when reading text files in Python 3
Dear all,

I have written a simple application in Python to read data from text files.
Currently I have both Python version 2.7.2 and Python 3.2.3 on my laptop.
I don't know why it does not run on Python version 3 while it runs well on Python 2.

Could you please tell me how I can run it on python 3?
Following is my Python code.

--
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        print("Processing [" +file +"]...\n" )
        f = open(rootdir+file, 'r')
        data = f.read()
        f.close()
        print(data)
--

This is the error message:

--
Traceback (most recent call last):
  File "/Users/dathuynh/Documents/workspace/PyTest/MyParser.py", line 53, in <module>
    main()
  File "/Users/dathuynh/Documents/workspace/PyTest/MyParser.py", line 20, in main
    data = f.read()
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1 in position 4980: ordinal not in range(128)
--

Thank you very much for your help.

Sincerely,
Dat Huynh.
Re: [Tutor] Search and replace text in XML file?
On 28/07/12 02:38, Todd Tabern wrote:
> I'm looking to search an entire XML file for specific text and replace that text,
> while maintaining the structure of the XML file.

Do you mean the physical layout of the file or the technical XML structure? I'm assuming it's the latter. If it's the former, just use 'tidy' to reformat the file.

> I basically need to replace every occurrence of C:\Program Files with C:\Program Files (x86),
> regardless of location.
> ...
> I've tried some non-Python methods and they all ruined the XML structure.

Can you give examples of what you tried? I'd have gone for sed for a job like this.

> I've been Google searching all day and can only seem to find solutions that look for a specific node
> and replace the whole string between the tags.

Because that's usually the requirement, but generally you can extract the existing string, do the substitution, and then write back the new version.

> I've been looking at using minidom to achieve this

I'd consider element tree if you must do it via a parser, but it sounds like you don't need that; sed or simple string replacements should suffice here.

> compile an exe that can work on machines without python,

You can do that with Python but it's not my favourite approach; you'd be better off with a sed-based solution. (You can get GNU sed for Windows, and it is already installed on MacOS/Linux boxes.)

HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
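If a parser does turn out to be necessary, the "extract the string, substitute, write it back" idea might look roughly like the sketch below, which uses the standard xml.etree.ElementTree module. The function name replace_in_tree and the two constants are illustrative assumptions. It is assumed that the paths may appear in element text, in tail text, or in attribute values, since the original post does not say which.

```python
import xml.etree.ElementTree as ET

OLD = r'C:\Program Files'
NEW = r'C:\Program Files (x86)'


def replace_in_tree(root, old=OLD, new=NEW):
    """Replace old with new in every element's text, tail and attributes."""
    for elem in root.iter():
        if elem.text:
            elem.text = elem.text.replace(old, new)
        if elem.tail:
            elem.tail = elem.tail.replace(old, new)
        # Copy the items first so the attributes can be updated safely.
        for key, value in list(elem.attrib.items()):
            elem.set(key, value.replace(old, new))
    return root
```

To use it on a file, load the document with ET.parse(path), call replace_in_tree(tree.getroot()), and save it with tree.write(path). Two caveats: ElementTree does not promise to keep comments or the exact original layout when it writes the file back, and running this plain str.replace twice would append a second " (x86)" to paths that were already converted.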
Re: [Tutor] Search and replace text in XML file?
On Jul 28, 2012 2:39 AM, "Todd Tabern" wrote:
>
> I'm looking to search an entire XML file for specific text and replace that text, while maintaining the structure of the XML file. The text occurs within multiple nodes throughout the file.
> I basically need to replace every occurrence of C:\Program Files with C:\Program Files (x86), regardless of location. For example, that text appears within:
> C:\Program Files\\Map Data\Road_Centerlines.shp
> and also within:
> C:\Program Files\Templates\RoadNetwork.rtx
> ...among others.
> I've tried some non-Python methods and they all ruined the XML structure. I've been Google searching all day and can only seem to find solutions that look for a specific node and replace the whole string between the tags.
> I've been looking at using minidom to achieve this but I just can't seem to figure out the right method.
> My end goal, once I have working code, is to compile an exe that can work on machines without Python, so that a user can click it to perform the XML modification.
> Thanks in advance.

I'm not sure what you have tried already, but this should be as simple as reading the file, replacing the strings and then writing the file. Because you don't care about just a specific entry, you don't really need to be concerned that it's an XML file.

If you continue having formatting problems, send us a sample and the code you've tried.

Bodsda
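The read, replace, write approach could be sketched as follows in Python 3. The names fix_paths and fix_file and the 'utf-8' default are assumptions for illustration; the real file may use a different encoding. A regular expression with a negative lookahead is used instead of a plain str.replace so that paths which already contain "(x86)" are left alone.

```python
import re

# Match the old prefix only when it is not already followed by " (x86)".
PATTERN = re.compile(r'C:\\Program Files(?! \(x86\))')


def fix_paths(text):
    """Return text with " (x86)" appended to each unconverted prefix."""
    return PATTERN.sub(lambda m: m.group(0) + ' (x86)', text)


def fix_file(path, encoding='utf-8'):
    """Rewrite the file in place, leaving everything else untouched."""
    # newline='' keeps the file's original line endings on read and write.
    with open(path, 'r', encoding=encoding, newline='') as f:
        text = f.read()
    with open(path, 'w', encoding=encoding, newline='') as f:
        f.write(fix_paths(text))
```

Because the file is treated as plain text, tags, comments and whitespace pass through unchanged, and running the script a second time makes no further changes.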