[Tutor] Please Help
I am new to Python. I would like to calculate the average of the numbers read from the file 'digi_2.txt'. I have written the following code:

    def average(s): return sum(s) * 1.0 / len(s)

    f = open("digi_2.txt", "r+")
    list_of_lists1 = f.readlines()

    for index in range(len(list_of_lists1)):
        tt = list_of_lists1[index]
        print 'Current value :', tt

    avg = average(tt)

This gives an error:

    def average(s): return sum(s) * 1.0 / len(s)
    TypeError: unsupported operand type(s) for +: 'int' and 'str'

I also attach the file I am reading. Please help to rectify.

Regards,
Arijit Ukil
Tata Consultancy Services
Mailto: arijit.u...@tcs.com
Website: http://www.tcs.com

From: Alan Gauld alan.ga...@btinternet.com
To: tutor@python.org
Date: 03/21/2013 06:00 AM
Subject: Re: [Tutor] Help
Sent by: Tutor tutor-bounces+arijit.ukil=tcs@python.org

On 20/03/13 19:57, travis jeanfrancois wrote:
> I create a function that allows the user to create a sentence by inputting a string, and to end the sentence with a period, meaning inputting ".". The problem is that while keeps overwriting the previous input

'While' does not do any such thing. Your code is doing that all by itself. What while does is repeat your code until a condition becomes false or you explicitly break out of the loop.

> Here is my code:
>
> def B1():

Try to give your functions names that describe what they do. B1() is meaningless, readSentence() would be better.

>     period = "."  # The variable period is assigned

It's normal programming practice to put the comment above the code, not after it. Also, comments should indicate why you are doing something, not what you are doing - we can see that from the code.

>     first = input("Enter the first word in your sentence ")
>     next1 = input("Enter the next word in you sentence or enter period:")
>     # I need store the value so when while overwrites next1 with the next input the previous input is stored and will print output when I call it later along with last one
>     # I believe the solution is some how implementing this expression x = x+ variable

You could be right. Addition works for strings as well as numbers. There are other (better) options, but you may not have covered them in your class yet.

>     while next1 != (period) :

You don't need the parentheses around period. Also nextWord might be a better name than next1. Saving 3 characters of typing is not usually worthwhile.

>         next1 = input("Enter the next word in you sentence or enter period:")

Right, here you are overwriting next1. It's not the while's fault - it is just repeating your code. It is you who are overwriting the variable.

Notice that you are not using the first that you captured? Maybe you should add next1 to first at some point? Then you can safely overwrite next1 as much as you like?

>         if next1 == (period):

Again, you don't need the parentheses around period.

>             next1 = next1 + period

Here, you add the period to next1, which the 'if' has already established is now a period.

>             print ("Your sentence is:",first,next1,period)

And now you print out the first word plus next1 (= 2 periods) plus a period = 3 periods in total... preceded by the phrase "Your sentence is:"

This tells us that the sample output you posted is not from this program... Always match the program and the output when debugging or you will be led seriously astray!

> PS : The # is I just type so I can understand what each line does

The # is a comment marker. Comments are a very powerful tool that programmers use to explain to themselves and other programmers why they have done what they have.

When trying to debug faults like this it is often worthwhile grabbing a pen and drawing a chart of your variables and their values after each time round the loop. In this case it would have looked like

    iteration  period  first  next1
    0          .       I      am
    1          .       I      a
    2          .       I      novice
    3          .       I      ..

If you aren't sure of the values, insert a print statement and get the program to tell you, but working it out in your head is more likely to show you the error.

HTH,
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
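The "x = x + variable" idea discussed in the quoted reply can be sketched as follows. This is only an illustration in Python 3, not the original poster's program: the function name build_sentence is hypothetical, and the words are passed in as a list instead of being read with input() so the example can run on its own.

```python
def build_sentence(words, period="."):
    """Accumulate words into one string until the period marker is seen.

    Hypothetical helper: the words come from a list rather than input().
    """
    sentence = ""
    for word in words:
        if word == period:
            break
        # Accumulate instead of overwriting: sentence = sentence + word
        if sentence:
            sentence = sentence + " " + word
        else:
            sentence = word
    return sentence + period


print(build_sentence(["I", "am", "a", "novice", "."]))
```

Because every word is added to the accumulated string before the loop variable is reused, nothing is lost when the next word arrives.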
Re: [Tutor] Please Help
Please trim unrelated text from emails.

On 21 March 2013 10:42, Arijit Ukil arijit.u...@tcs.com wrote:
> I am new to python. I like to calculate average of the numbers by reading the file 'digi_2.txt'. I have written the following code:
>
>     def average(s): return sum(s) * 1.0 / len(s)
>
>     f = open("digi_2.txt", "r+")
>     list_of_lists1 = f.readlines()
>
>     for index in range(len(list_of_lists1)):
>         tt = list_of_lists1[index]
>         print 'Current value :', tt
>
>     avg = average(tt)
>
> This gives an error:
>
>     def average(s): return sum(s) * 1.0 / len(s)
>     TypeError: unsupported operand type(s) for +: 'int' and 'str'

tt is a string, as it's read from the file, so it has to be converted to a number (with int() or float()) before you can do arithmetic on it. But in addition, you're also not actually calculating the average.

    def average(s):
        return sum(s) / len(s)

    # convert the list of strings to a list of floats
    tt = [float(x) for x in list_of_lists1]
    avg = average(tt)

--
./Sven
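A minimal sketch of why the TypeError appears, written for Python 3 (the thread itself uses Python 2). The sample strings are made up and assume one number per line; a comma-separated line would have to be split first.

```python
def average(numbers):
    return sum(numbers) / len(numbers)


# A line as readlines() would return it: a string, not a number.
line = "448.0\n"

try:
    # sum() starts from 0 and tries 0 + '4', which is int + str.
    average(line)
except TypeError as exc:
    print("TypeError:", exc)

# Converting each string to a float first makes the arithmetic valid.
values = [float(text) for text in ["448.0\n", "538660.0\n"]]
print(average(values))
```

Note that float() ignores the trailing newline, so the lines do not need to be stripped before conversion.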
Re: [Tutor] Please Help
On 03/21/2013 06:42 AM, Arijit Ukil wrote:
> I am new to python.

Since you're new to Python, I won't try to supply you an answer using list comprehensions, since you've probably not learned them yet.

> I like to calculate average of the numbers by reading the file 'digi_2.txt'. I have written the following code:
>
>     def average(s): return sum(s) * 1.0 / len(s)

This function presumably expects to be passed a list (or iterable) of ints or a list of floats as its argument. It'll fail if given a list of strings. A comment or docstring to that effect would be useful to remind yourself.

>     f = open("digi_2.txt", "r+")

Why is there a plus sign in the mode string? It isn't necessary, since you're just going to read the file straight through.

>     list_of_lists1 = f.readlines()

Not a good name, since that's not what readlines() returns. It'll return a list of strings, each string representing one line of the file.

>     for index in range(len(list_of_lists1)):

Since your file is only one line long, this loop doesn't do much.

>         tt = list_of_lists1[index]
>         print 'Current value :', tt

At this point, tt is the string read from the last line of the file. The other lines are not represented in any way.

>     avg = average(tt)
>
> This gives an error:
>
>     def average(s): return sum(s) * 1.0 / len(s)
>     TypeError: unsupported operand type(s) for +: 'int' and 'str'
>
> I also attach the file i am reading.

You shouldn't assume everyone can read the attached data. Since it's short, you should just include it in your message. It's MUCH shorter than all the irrelevant data you included at the end of your message.

For those others who may be reading this portion of the hijacked thread, here's the one line in digi_2.txt

    1350696461, 448.0, 538660.0, 1350696466, 448.0

Now to try to solve the problem. First, you don't specify what the numbers in the file will look like. Looking at your code, I naturally assumed you had one value per line. Instead I see a single line with multiple numbers separated by commas.

I'll assume that the data will always be in a single line, or that if there are multiple lines, every line but the last will end with a comma, and that the last one will NOT have a trailing comma. If I don't assume something, the problem can't be solved.

Since we don't care about newlines, we can read the whole file into one string, with the read() function.

    f = open("digi_2.txt", "r")
    filedata = f.read()
    f.close()

Now we have to separate the data by the commas.

    numstrings = filedata.split(",")

And now we have to convert each of these numstring values from a string into a float.

    nums = []
    for numstring in numstrings:
        nums.append(float(numstring))

Now we can call the average function, since we have a list of floats.

    avg = average(nums)

Completely untested, so there may be typos in it.

--
DaveA
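The steps above (read everything, split on commas, convert, average) can be put together as a short Python 3 sketch. The helper name parse_numbers is an assumption for illustration, and the file contents are supplied as a string so the example runs without digi_2.txt; skipping empty pieces is an extra safeguard that the message above does not include.

```python
def average(nums):
    """Return the mean of a non-empty list of numbers."""
    return sum(nums) / len(nums)


def parse_numbers(filedata):
    """Split comma-separated text and convert each piece to a float."""
    nums = []
    for numstring in filedata.split(","):
        numstring = numstring.strip()
        # Skip empty pieces, e.g. after a trailing comma.
        if numstring:
            nums.append(float(numstring))
    return nums


# The one line quoted from digi_2.txt, as f.read() would return it.
filedata = "1350696461, 448.0, 538660.0, 1350696466, 448.0\n"
nums = parse_numbers(filedata)
print(nums)
print(average(nums))
```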
Re: [Tutor] Please Help
Hi Arijit,

On Thu, Mar 21, 2013 at 8:42 PM, Arijit Ukil arijit.u...@tcs.com wrote:
> I am new to python. I like to calculate average of the numbers by reading the file 'digi_2.txt'. I have written the following code:
>
>     def average(s): return sum(s) * 1.0 / len(s)
>
>     f = open("digi_2.txt", "r+")
>     list_of_lists1 = f.readlines()
>
>     for index in range(len(list_of_lists1)):
>         tt = list_of_lists1[index]
>         print 'Current value :', tt
>
>     avg = average(tt)
>
> This gives an error:
>
>     def average(s): return sum(s) * 1.0 / len(s)
>     TypeError: unsupported operand type(s) for +: 'int' and 'str'
>
> I also attach the file i am reading. Please help to rectify.

The main issue here is that when you are reading from a file, to Python, it's all strings. And although 'abc' + 'def' is valid, 'abc' + 5 isn't (for example). Hence, besides the fact that your average calculation is not right, you will have to 'convert' the string to an integer/float to do any arithmetic operation on it. (If you know C, this is similar to typecasting.) So, coming back to your program, I will first demonstrate a few things and then you can write the program yourself.

If you were to break down this program into simple steps, they would be:

1. Read the lines from a file (assume a generic case, where you have more than one line in the file, and you have to calculate the average for each such row)
2. Create a list of floating point numbers for each of those lines
3. And call your average function on each of these lists

You could of course do 2 & 3 together, so you create the list and call the average function.

So, here is step 1:

    with open('digi_2.txt','r') as f:
        lines = f.readlines()

Please refer to http://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects for an explanation of the advantage of using 'with'.

Now, you have *all* the lines of the file in 'lines'. Now, you want to perform step 2 for each line in this file.

Here you go:

    for line in lines:
        number_list = []
        for number in line.split(','):
            number_list.append(float(number))

(To learn more about Python lists, see http://effbot.org/zone/python-list.htm.) It is certainly possible to use the index of an element to access elements from a list, but this is a more Pythonic way of doing it. To understand this better: in the variable 'line', you will have the numbers from a single line of the file, as one string. For example: 1350696461, 448.0, 538660.0, 1350696466, 448.0. Note how they are separated by a ','? To get each element, we use the split() function, which returns a list of the individual numbers. (See: http://docs.python.org/2/library/stdtypes.html#str.split.) And then, we use the .append() method to create the list. Now, you have a number_list which is a list of floating point numbers for each line.

Now, steps 2 & 3 combined:

    for line in lines:
        number_list = []
        for number in line.split(','):
            number_list.append(float(number))
        print average(number_list)

Where average() is defined as:

    def average(num_list):
        return sum(num_list)/len(num_list)

There may be a number of unknown things I may have talked about, but I hope the links will help you learn more and write your program now.

Good Luck.
-Amit.

--
http://amitsaha.github.com/
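The three steps can be combined into one runnable Python 3 sketch. The function name line_averages and the temporary file are assumptions made so the example is self-contained; skipping blank lines is an added safeguard, since float('') would raise a ValueError.

```python
import os
import tempfile


def average(num_list):
    return sum(num_list) / len(num_list)


def line_averages(path):
    """Return one average per non-blank line of comma-separated numbers."""
    averages = []
    with open(path, "r") as f:
        for line in f:
            if not line.strip():
                continue  # ignore blank lines
            number_list = []
            for number in line.split(","):
                number_list.append(float(number))
            averages.append(average(number_list))
    return averages


# Write a small stand-in file (hypothetical data) and process it.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1.0, 2.0, 3.0\n\n10, 20\n")
    path = tmp.name

try:
    print(line_averages(path))
finally:
    os.remove(path)
```

Iterating over the file object directly reads one line at a time, which avoids holding the whole file in memory the way readlines() does.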
Re: [Tutor] Please Help
On 03/21/2013 08:09 AM, Arijit Ukil wrote:
> Thanks for the help.

You're welcome.

You replied privately, instead of including the list, so I'm forwarding the response so everyone can see it. You also top-posted, so the context is backwards.

> After running your code, I am getting the following error:
>
>     Traceback (most recent call last):
>       File "C:\Documents and Settings\207564\Desktop\Imp privacy analyzer\New Folder\test\src\test.py", line 53, in ?
>         nums.append(float(numstsrings))
>     TypeError: float() argument must be a string or a number

Check carefully, apparently the copy/paste on your machine inserts extra letters.

--
DaveA
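For readers puzzling over that traceback: the failing line passes the whole list to float() rather than one item from it. A small Python 3 sketch with made-up data shows the difference.

```python
# split() returns a list of strings.
numstrings = "448.0, 538660.0".split(",")

try:
    # Passing the list itself to float() raises TypeError.
    float(numstrings)
except TypeError as exc:
    print("TypeError:", exc)

# Converting the items one at a time works.
nums = [float(numstring) for numstring in numstrings]
print(nums)
```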
[Tutor] Importing data from a file.
I have an elementary question provoked by another post today.

1. Is it the case that ALL imported data from a file is a string?
2. Does this therefore imply that said data has to be processed appropriately to generate the data in the form required by the program?
3. Are there defined procedures for doing the required processing?

With many thanks,
Sydney

--
Professor Sydney Shall,
Department of Haematological Medicine,
King's College London, Medical School,
123 Coldharbour Lane,
LONDON SE5 9NU,
Tel & Fax: +44 (0)207 848 5902,
E-Mail: sydney.shall,
[correspondents outside the College should add; @kcl.ac.uk]
www.kcl.ac.uk
Re: [Tutor] Importing data from a file.
On Thu, Mar 21, 2013 at 11:43 PM, Shall, Sydney sydney.sh...@kcl.ac.uk wrote:
> I have an elementary question provoked by another post today.
>
> 1. Is it the case that ALL imported data from a file is a string?
> 2. Does this therefor imply that said data has to be processed appropriately to generate the data in the form required by the program?

To the best of my knowledge, yes to both of your queries. Once you have the element you want to process, you can make use of the type-converting functions (int(), float(), ...) and use them appropriately.

> 3. Are there defined procedures for doing the required processing?

If you meant conversion functions, int() and float() are examples of those. You of course (most of the time) have to make use of string manipulation functions (strip(), rstrip(), etc.) to extract the exact data item you might be looking for. So, they would be the building blocks for your processing functions.

I hope that makes some things clear.

-Amit.

--
http://amitsaha.github.com/
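A short Python 3 sketch of those building blocks, using a made-up line of text: string methods isolate the piece of interest, then int() or float() converts it.

```python
# A hypothetical line as it might come back from a text file.
raw = "  count = 42 \n"

# partition() splits once on "="; strip() removes surrounding whitespace.
name, _, value_text = raw.partition("=")
name = name.strip()
count = int(value_text.strip())

# int() and float() also tolerate surrounding whitespace on their own.
ratio = float("3.5\n")

print(name, count, ratio)
```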
Re: [Tutor] Importing data from a file.
On 21/03/2013 13:54, Amit Saha wrote:
> On Thu, Mar 21, 2013 at 11:43 PM, Shall, Sydney sydney.sh...@kcl.ac.uk wrote:
>> I have an elementary question provoked by another post today.
>>
>> 1. Is it the case that ALL imported data from a file is a string?
>> 2. Does this therefor imply that said data has to be processed appropriately to generate the data in the form required by the program?
>
> To the best of my knowledge, yes to both of your queries. Once you have the element you want to process, you can make use of the type-converting functions (int(), float(), ...) and use them appropriately.
>
>> 3. Are there defined procedures for doing the required processing?
>
> If you meant conversion functions, int() and float() are examples of those. You of course (most of the time) have to make use of string manipulation functions (strip(), rstrip(), etc.) to extract the exact data item you might be looking for. So, they would be the building blocks for your processing functions.
>
> I hope that makes some things clear.
>
> -Amit.

Yes, thanks. This is now quite clear.

Sydney
Re: [Tutor] Importing data from a file.
On 03/21/2013 09:43 AM, Shall, Sydney wrote:
> I have an elementary question provoked by another post today.
>
> 1. Is it the case that ALL imported data from a file is a string?

No, the imported data is a module. For example

    import sys
    print type(sys)

    <type 'module'>

At this point, sys is an object of type module.

Perhaps you really mean data returned by the readline() method of the file object. In that case, it's a list of strings.

Or data returned from the readline() method of the file object. That is a string.

Or data returned from the read() method of the file object. The return type of that depends on the version of Python.

Be more specific, since the answer greatly depends on how you read this data.

> 2. Does this therefor imply that said data has to be processed appropriately to generate the data in the form required by the program?

Again, you have to be specific. The program might well want exactly what one of these methods returns.

> 3. Are there defined procedures for doing the required processing?

Sure, hundreds of thousands of them, most of them to be found in other people's programs.

Sorry, but your questions are so vague as to defy definitive answers.

--
DaveA
Re: [Tutor] Importing data from a file.
>> 3. Are there defined procedures for doing the required processing?
>
> If you meant conversion functions, int() and float() are examples of those. You of course (most of the times) have to make use of string manipulation functions (strip(), rstrip(), etc) to extract the exact data item you might be looking for.

Let's add the pickle module though. Pickle converts (most) Python objects into string representations; when you unpickle these string representations you get back your object. A dictionary object becomes a dictionary object and so on.

Read more here: http://docs.python.org/3.3/library/pickle.html#module-pickle

--
best regards,
Robert S.
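A minimal round trip with pickle, sketched in Python 3 with a made-up dictionary. In Python 3 the pickled form is a bytes object rather than a text string.

```python
import pickle

# Hypothetical data to store.
record = {"date": "2013-03-21", "values": [448.0, 538660.0]}

# dumps() serialises the object to bytes; loads() rebuilds it.
data = pickle.dumps(record)
restored = pickle.loads(data)

print(type(data))
print(restored == record)
```

Only unpickle data from a source you trust, because loading a pickle can execute arbitrary code.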
Re: [Tutor] Importing data from a file.
On 03/21/2013 10:03 AM, Dave Angel wrote:

A typo below; sorry.

> On 03/21/2013 09:43 AM, Shall, Sydney wrote:
>> I have an elementary question provoked by another post today.
>>
>> 1. Is it the case that ALL imported data from a file is a string?
>
> No, the imported data is a module. For example
>
>     import sys
>     print type(sys)
>
>     <type 'module'>
>
> At this point, sys is an object of type module.
>
> Perhaps you really mean data returned by the readline() method of the

Perhaps you really mean data returned by the readlines() method of the

> file object. In that case, it's a list of strings.
>
> Or data returned from the readline() method of the file object. That is a string.
>
> Or data returned from the read() method of the file object. The return type of that depends on the version of Python.
>
> Be more specific, since the answer greatly depends on how you read this data.
>
>> 2. Does this therefor imply that said data has to be processed appropriately to generate the data in the form required by the program?
>
> Again, you have to be specific. The program might well want exactly what one of these methods returns.
>
>> 3. Are there defined procedures for doing the required processing?
>
> Sure, hundreds of thousands of them, most of them to be found in other people's programs.
>
> Sorry, but your questions are so vague as to defy definitive answers.

--
DaveA
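The three methods can be compared side by side. In this Python 3 sketch, io.StringIO stands in for a file opened in text mode and io.BytesIO for one opened in binary mode, so no real file is needed; the sample text is made up.

```python
import io

text = "first line\nsecond line\n"

# io.StringIO behaves like a file opened in text mode.
whole = io.StringIO(text).read()        # one string with everything
one = io.StringIO(text).readline()      # one string: the first line
many = io.StringIO(text).readlines()    # a list of strings, one per line

# io.BytesIO behaves like a file opened in binary mode ("rb").
raw = io.BytesIO(b"\x01\x02").read()    # a bytes object

print(type(whole), type(one), type(many), type(raw))
```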
Re: [Tutor] Importing data from a file.
On 21/03/13 13:43, Shall, Sydney wrote:
> I have an elementary question provoked by another post today.
>
> 1. Is it the case that ALL imported data from a file is a string?

Assuming you mean data read from a file rather than modules imported using 'import', then the answer is 'it depends'.

Most files are text files and the data is stored as strings, and therefore when you read them back they will be strings. You then convert them to the native data using int(), float() etc.

Some files are binary files, and then the data read back will be bytes and need to be decoded into the original data. This is often done using the struct module.

Either way, if you use the Python read() operation on a file you will get back a bunch of bytes. What those bytes represent depends on how they were written. How they are interpreted is down to the programmer.

> 2. Does this therefor imply that said data has to be processed appropriately to generate the data in the form required by the program?

Yes, always.

> 3. Are there defined procedures for doing the required processing?

Yes, for the standard types. For custom types and arbitrary binary data you need to find out what the original encoding was and reverse it.

HTH,
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
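A small Python 3 sketch of decoding binary data with the struct module. The record layout used here (a 4-byte integer followed by an 8-byte float, little-endian) is an assumption chosen for illustration; real binary files need the layout their writer actually used.

```python
import struct

# "<id" means: little-endian, one 4-byte int, one 8-byte double.
layout = "<id"

# pack() produces the bytes a binary file might contain.
packed = struct.pack(layout, 448, 538660.0)

# unpack() reverses the encoding, given the same layout.
number, value = struct.unpack(layout, packed)

print(len(packed), number, value)
```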
Re: [Tutor] Importing data from a file.
On 21/03/2013 16:17, Alan Gauld wrote:
> On 21/03/13 13:43, Shall, Sydney wrote:
>> I have an elementary question provoked by another post today.
>>
>> 1. Is it the case that ALL imported data from a file is a string?
>
> Assuming you mean data read from a file rather than modules imported using 'import', then the answer is 'it depends'.
>
> Most files are text files and the data is stored as strings, and therefore when you read them back they will be strings. You then convert them to the native data using int(), float() etc.
>
> Some files are binary files, and then the data read back will be bytes and need to be decoded into the original data. This is often done using the struct module.
>
> Either way, if you use the Python read() operation on a file you will get back a bunch of bytes. What those bytes represent depends on how they were written. How they are interpreted is down to the programmer.
>
>> 2. Does this therefor imply that said data has to be processed appropriately to generate the data in the form required by the program?
>
> Yes, always.
>
>> 3. Are there defined procedures for doing the required processing?
>
> Yes, for the standard types. For custom types and arbitrary binary data you need to find out what the original encoding was and reverse it.
>
> HTH,

Thank you Alan,

That was most useful.

Cheers,
Sydney
[Tutor] Help with iterators
Dear list,

I have been trying to understand how to use iterators and in particular groupby statements. I am, however, quite lost.

I wish to subset the below list, selecting the observations that have an ID ('realtime_start') value that is greater than some date (I've used the variable name maxDate), and in the case that there is more than one such record, returning only the one that has the largest ID ('realtime_start').

The code below does the job; however, I have the impression that it might be done in a more Pythonic way using iterators and groupby statements.

Could someone please help me understand how to go from this code to the Pythonic idiom?

Thanks in advance,

Matt Johnson

_

    ## Code example

    import pprint

    obs = [{'date': '2012-09-01', 'realtime_end': '2013-02-18', 'realtime_start': '2012-10-15', 'value': '231.951'},
           {'date': '2012-09-01', 'realtime_end': '2013-02-18', 'realtime_start': '2012-11-15', 'value': '231.881'},
           {'date': '2012-10-01', 'realtime_end': '2013-02-18', 'realtime_start': '2012-11-15', 'value': '231.751'},
           {'date': '2012-10-01', 'realtime_end': '-12-31', 'realtime_start': '2012-12-19', 'value': '231.623'},
           {'date': '2013-02-01', 'realtime_end': '-12-31', 'realtime_start': '2013-03-21', 'value': '231.157'},
           {'date': '2012-11-01', 'realtime_end': '2013-02-18', 'realtime_start': '2012-12-14', 'value': '231.025'},
           {'date': '2012-11-01', 'realtime_end': '-12-31', 'realtime_start': '2013-01-19', 'value': '231.071'},
           {'date': '2012-12-01', 'realtime_end': '2013-02-18', 'realtime_start': '2013-01-16', 'value': '230.979'},
           {'date': '2012-12-01', 'realtime_end': '-12-31', 'realtime_start': '2013-02-19', 'value': '231.137'},
           {'date': '2012-12-01', 'realtime_end': '-12-31', 'realtime_start': '2013-03-19', 'value': '231.197'},
           {'date': '2013-01-01', 'realtime_end': '-12-31', 'realtime_start': '2013-02-21', 'value': '231.198'},
           {'date': '2013-01-01', 'realtime_end': '-12-31', 'realtime_start': '2013-03-21', 'value': '231.222'}]

    maxDate = "2013-03-21"

    dobs = dict([(d, []) for d in set([e['date'] for e in obs])])

    for o in obs:
        dobs[o['date']].append(o)

    dobs_subMax = dict([(k, [d for d in v if d['realtime_start'] <= maxDate])
                        for k, v in dobs.items()])

    rts = lambda x: x['realtime_start']

    mmax = [sorted(e, key=rts)[-1] for e in dobs_subMax.values() if e]

    mmax.sort(key = lambda x: x['date'])

    pprint.pprint(mmax)
Re: [Tutor] Help with iterators
On 03/21/2013 08:39 PM, Matthew Johnson wrote:
> Dear list,
>
> I have been trying to understand out how to use iterators and in particular groupby statements. I am, however, quite lost.
>
> I wish to subset the below list, selecting the observations that have an ID ('realtime_start') value that is greater than some date (i've used the variable name maxDate), and in the case that there is more than one such record, returning only the one that has the largest ID ('realtime_start').
>
> The code below does the job, however i have the impression that it might be done in a more python way using iterators and groupby statements.
>
> could someone please help me understand how to go from this code to the pythonic idiom?
>
> thanks in advance,
>
> Matt Johnson
>
> _
>
>     ## Code example
>
>     import pprint
>
>     obs = [{'date': '2012-09-01', 'realtime_end': '2013-02-18', 'realtime_start': '2012-10-15', 'value': '231.951'},
>            {'date': '2012-09-01', 'realtime_end': '2013-02-18', 'realtime_start': '2012-11-15', 'value': '231.881'},
>            {'date': '2012-10-01', 'realtime_end': '2013-02-18', 'realtime_start': '2012-11-15', 'value': '231.751'},
>            {'date': '2012-10-01', 'realtime_end': '-12-31', 'realtime_start': '2012-12-19', 'value': '231.623'},
>            {'date': '2013-02-01', 'realtime_end': '-12-31', 'realtime_start': '2013-03-21', 'value': '231.157'},
>            {'date': '2012-11-01', 'realtime_end': '2013-02-18', 'realtime_start': '2012-12-14', 'value': '231.025'},
>            {'date': '2012-11-01', 'realtime_end': '-12-31', 'realtime_start': '2013-01-19', 'value': '231.071'},
>            {'date': '2012-12-01', 'realtime_end': '2013-02-18', 'realtime_start': '2013-01-16', 'value': '230.979'},
>            {'date': '2012-12-01', 'realtime_end': '-12-31', 'realtime_start': '2013-02-19', 'value': '231.137'},
>            {'date': '2012-12-01', 'realtime_end': '-12-31', 'realtime_start': '2013-03-19', 'value': '231.197'},
>            {'date': '2013-01-01', 'realtime_end': '-12-31', 'realtime_start': '2013-02-21', 'value': '231.198'},
>            {'date': '2013-01-01', 'realtime_end': '-12-31', 'realtime_start': '2013-03-21', 'value': '231.222'}]
>
>     maxDate = "2013-03-21"
>
>     dobs = dict([(d, []) for d in set([e['date'] for e in obs])])
>
>     for o in obs:
>         dobs[o['date']].append(o)
>
>     dobs_subMax = dict([(k, [d for d in v if d['realtime_start'] <= maxDate])
>                         for k, v in dobs.items()])
>
>     rts = lambda x: x['realtime_start']
>
>     mmax = [sorted(e, key=rts)[-1] for e in dobs_subMax.values() if e]
>
>     mmax.sort(key = lambda x: x['date'])
>
>     pprint.pprint(mmax)

You can do it with groupby like so:

    from itertools import groupby
    from operator import itemgetter

    maxDate = "2013-03-21"
    mmax = list()

    obs.sort(key=itemgetter('date'))

    for k, group in groupby(obs, key=itemgetter('date')):
        group = [dob for dob in group if dob['realtime_start'] <= maxDate]
        if group:
            group.sort(key=itemgetter('realtime_start'))
            mmax.append(group[-1])

    pprint.pprint(mmax)

Note that writing multiply-nested comprehensions like you did results in very unreadable code. Do you find this code more readable?

 -m

--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

Many a man fails as an original thinker simply because his memory is too good. Friedrich Nietzsche
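A runnable Python 3 sketch of the groupby approach, using a trimmed subset of the posted records. Two things here are assumptions: the comparison operator is garbled in the archived code and is taken to be `<=`, and the function name latest_per_date is hypothetical. The sketch uses max() in place of sort-and-take-last, which gives the same record unless two entries in a group share the same realtime_start.

```python
from itertools import groupby
from operator import itemgetter

# Trimmed subset of the posted data ('realtime_end' omitted for brevity).
obs = [
    {'date': '2012-09-01', 'realtime_start': '2012-10-15', 'value': '231.951'},
    {'date': '2012-09-01', 'realtime_start': '2012-11-15', 'value': '231.881'},
    {'date': '2012-10-01', 'realtime_start': '2012-11-15', 'value': '231.751'},
    {'date': '2012-10-01', 'realtime_start': '2012-12-19', 'value': '231.623'},
]


def latest_per_date(records, max_date):
    """For each 'date', keep the record with the latest
    'realtime_start' that is not after max_date."""
    latest = []
    # groupby only groups adjacent items, so sort by the key first.
    by_date = sorted(records, key=itemgetter('date'))
    for date, group in groupby(by_date, key=itemgetter('date')):
        candidates = [rec for rec in group
                      if rec['realtime_start'] <= max_date]
        if candidates:
            latest.append(max(candidates, key=itemgetter('realtime_start')))
    return latest


result = latest_per_date(obs, '2012-12-01')
for rec in result:
    print(rec['date'], rec['realtime_start'], rec['value'])
```

Using sorted() rather than obs.sort() leaves the caller's list in its original order.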
Re: [Tutor] Help with iterators
On 22/03/13 11:39, Matthew Johnson wrote:
> Dear list,
>
> I have been trying to understand out how to use iterators and in particular groupby statements. I am, however, quite lost.

groupby is a very specialist function which is not very intuitive to use. Sometimes I think that groupby is an excellent solution in search of a problem.

> I wish to subset the below list, selecting the observations that have an ID ('realtime_start') value that is greater than some date (i've used the variable name maxDate), and in the case that there is more than one such record, returning only the one that has the largest ID ('realtime_start').

The code that you show does not do what you describe here. The most obvious difference is that it doesn't return or display a single record, but shows multiple records. In your case, it selects six records, four of which have a realtime_start that occurs BEFORE the given maxDate.

To solve the problem you describe here, of finding at most a single record, the solution is much simpler than what you have done. Prepare a list of observations, sorted by realtime_start. Take the latest such observation. If its realtime_start is greater than (or equal to) the maxDate, you have your answer. If not, there is no answer.

The simplest solution is usually the best. The simpler your code, the fewer bugs it will contain.

    obs.sort(key=lambda rec: rec['realtime_start'])
    rec = obs[-1]
    if rec['realtime_start'] >= maxDate:
        print rec
    else:
        print "no record found"

which prints:

    {'date': '2013-01-01', 'realtime_start': '2013-03-21', 'realtime_end': '-12-31', 'value': '231.222'}

--
Steven
Re: [Tutor] Help with iterators
On 22/03/13 12:39, Mitya Sirenef wrote:
> You can do it with groupby like so:
>
>     from itertools import groupby
>     from operator import itemgetter
>
>     maxDate = "2013-03-21"
>     mmax = list()
>
>     obs.sort(key=itemgetter('date'))
>
>     for k, group in groupby(obs, key=itemgetter('date')):
>         group = [dob for dob in group if dob['realtime_start'] <= maxDate]
>         if group:
>             group.sort(key=itemgetter('realtime_start'))
>             mmax.append(group[-1])
>
>     pprint.pprint(mmax)

This suffers from the same problem of finding six records instead of one, and that four of the six have start dates before the given date instead of after it.

Here's another solution that finds all the records that start on or after the given date (the poorly named maxDate) and displays them sorted by date.

    selected = [rec for rec in obs if rec['realtime_start'] >= maxDate]
    selected.sort(key=lambda rec: rec['date'])
    print selected

--
Steven
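A Python 3 sketch of the filter-and-sort approach, with a trimmed subset of the posted data. The function name on_or_after is hypothetical, and `>=` is assumed because the operator is garbled in the archived message. The string comparison works only because the dates are in ISO format (YYYY-MM-DD), where alphabetical order matches chronological order.

```python
# Trimmed subset of the posted data.
obs = [
    {'date': '2013-02-01', 'realtime_start': '2013-03-21', 'value': '231.157'},
    {'date': '2012-12-01', 'realtime_start': '2013-03-19', 'value': '231.197'},
    {'date': '2013-01-01', 'realtime_start': '2013-03-21', 'value': '231.222'},
]


def on_or_after(records, cutoff):
    """Return the records whose 'realtime_start' is on or after cutoff,
    sorted by 'date'. Dates are compared as ISO-format strings."""
    selected = [rec for rec in records if rec['realtime_start'] >= cutoff]
    selected.sort(key=lambda rec: rec['date'])
    return selected


selected = on_or_after(obs, '2013-03-21')
for rec in selected:
    print(rec['date'], rec['value'])
```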
Re: [Tutor] Help with iterators
On 03/21/2013 10:20 PM, Steven D'Aprano wrote:
> On 22/03/13 12:39, Mitya Sirenef wrote:
>> You can do it with groupby like so:
>>
>>     from itertools import groupby
>>     from operator import itemgetter
>>
>>     maxDate = "2013-03-21"
>>     mmax = list()
>>
>>     obs.sort(key=itemgetter('date'))
>>
>>     for k, group in groupby(obs, key=itemgetter('date')):
>>         group = [dob for dob in group if dob['realtime_start'] <= maxDate]
>>         if group:
>>             group.sort(key=itemgetter('realtime_start'))
>>             mmax.append(group[-1])
>>
>>     pprint.pprint(mmax)
>
> This suffers from the same problem of finding six records instead of one, and that four of the six have start dates before the given date instead of after it.

OP said his code produces the needed result and I think his description probably doesn't match what he really intends to do (he also said he wants the same code rewritten using groupby). I reproduced the logic of his code... hopefully he can step in and clarify!

> Here's another solution that finds all the records that start on or after the given date (the poorly named maxDate) and displays them sorted by date.
>
>     selected = [rec for rec in obs if rec['realtime_start'] >= maxDate]
>     selected.sort(key=lambda rec: rec['date'])
>     print selected

--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

A little bad taste is like a nice dash of paprika. Dorothy Parker