[Tutor] Please Help

2013-03-21 Thread Arijit Ukil
I am new to python. I like to calculate average of the numbers by reading 
the file 'digi_2.txt'. I have written the following code:

def average(s): return sum(s) * 1.0 / len(s)

f = open('digi_2.txt', 'r+')

list_of_lists1 = f.readlines()

for index in range(len(list_of_lists1)):
    tt = list_of_lists1[index]
    print 'Current value :', tt

avg = average(tt)


This gives an error:

def average(s): return sum(s) * 1.0 / len(s)
TypeError: unsupported operand type(s) for +: 'int' and 'str'

I also attach the file i am reading.



Please help to rectify.

Regards,
Arijit Ukil

From: Alan Gauld alan.ga...@btinternet.com
To: tutor@python.org
Date: 03/21/2013 06:00 AM
Subject: Re: [Tutor] Help
Sent by: Tutor tutor-bounces+arijit.ukil=tcs@python.org



On 20/03/13 19:57, travis jeanfrancois wrote:
 I create a function that allows the user to create a sentence by
 inputting a string and to end the sentence with a period, meaning
 inputting ".". The problem is that while keeps overwriting the previous input

'While' does not do any such thing. Your code is doing that all by 
itself. What while does is repeat your code until a condition
becomes false or you explicitly break out of the loop.

 Here is my code:

 def B1():

Try to give your functions names that describe what they do.
B1() is meaningless, readSentence() would be better.


   period = "."
   # The variable period is assigned

It's normal programming practice to put the comment above
the code not after it. Also comments should indicate why you
are doing something not what you are doing - we can see that
from the code.

   first = input("Enter the first word in your sentence ")
   next1 = input("Enter the next word in your sentence or enter period:")

   # I need to store the value so when while overwrites next1 with the next
   # input the previous input is stored and will print output when I call it
   # later along with last one
   # I believe the solution is somehow implementing this expression x = x+
   # variable

You could be right. Addition works for strings as well as numbers.
There are other (better) options, but you may not have covered
them in your class yet.

   while  next1 != (period) :

You don't need the parentheses around period.
Also nextWord might be a better name than next1.
Saving 3 characters of typing is not usually worthwhile.

  next1  = input("Enter the next word in your sentence or enter period:")

Right, here you are overwriting next1. It's not the while's
fault - it is just repeating your code. It is you who are
overwriting the variable.

Notice that you are not using the first that you captured?
Maybe you should add next1 to first at some point? Then you
can safely overwrite next1 as much as you like?

  if next1 == (period):

Again you don't need the parentheses around period.

  next1 = next1 + period

Here, you add the period to next1 which the 'if' has
already established is now a period.

  print ("Your sentence is:",first,next1,period)

And now you print out the first word plus next1 (= 2 periods) plus a 
period = 3 periods in total... preceded by the phrase "Your sentence 
is:". This tells us that the sample output you posted is not from this 
program... Always match the program and the output when debugging or you 
will be led seriously astray!

 PS : The #  is I just type so I can understand what each line does

The # is a comment marker. Comments are a very powerful tool that 
programmers use to explain to themselves and other programmers
why they have done what they have.

When trying to debug faults like this it is often worthwhile
grabbing a pen and drawing a chart of your variables and
their values after each time round the loop.
In this case it would have looked like

iteration  period  first  next1
0          .       I      am
1          .       I      a
2          .       I      novice
3          .       I      ..

If you aren't sure of the values insert a print statement
and get the program to tell you, but working it out in
your head is more likely to show you the error.
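
The advice above (keep each word instead of overwriting it, and trace the variables on every pass) might look roughly like the sketch below. It is written for Python 3. The name read_sentence and the idea of feeding the words from a list rather than from input() are assumptions made so the example runs without a keyboard; they are not from the original post.

```python
def read_sentence(words):
    """Join words into a sentence, stopping at the first '.' entry.

    `words` stands in for successive input() calls (an assumption made
    so the sketch can run non-interactively).
    """
    period = "."
    collected = []
    for word in words:
        if word == period:
            break
        collected.append(word)       # keep every word instead of overwriting
        print("so far:", collected)  # trace the variable on each iteration
    return " ".join(collected) + period


sentence = read_sentence(["I", "am", "a", "novice", "."])
print("Your sentence is:", sentence)
```

Collecting the words in a list and joining them once at the end is one of the "other options" besides repeated string addition; either approach avoids losing the earlier input.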


HTH,

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Please Help

2013-03-21 Thread Sven
Please trim unrelated text from emails.

On 21 March 2013 10:42, Arijit Ukil arijit.u...@tcs.com wrote:

 I am new to python. I like to calculate average of the numbers by reading
 the file 'digi_2.txt'. I have written the following code:

 def average(s): return sum(s) * 1.0 / len(s)

 f = open('digi_2.txt', 'r+')

 list_of_lists1 = f.readlines()

 for index in range(len(list_of_lists1)):
     tt = list_of_lists1[index]
     print 'Current value :', tt

 avg = average(tt)


 This gives an error:

 def average(s): return sum(s) * 1.0 / len(s)
 TypeError: unsupported operand type(s) for +: 'int' and 'str'


tt is a string as it's read from the file. int(tt) would fix the problem.
But in addition you're also not actually calculating the average.


def average(s):
    return sum(s) / len(s)


# convert the list of strings to a list of floats
tt = [float(x) for x in list_of_lists1]

avg = average(tt)

 --
./Sven


Re: [Tutor] Please Help

2013-03-21 Thread Dave Angel

On 03/21/2013 06:42 AM, Arijit Ukil wrote:

I am new to python.


Since you're new to Python, I won't try to supply you an answer using 
list comprehensions, since you've probably not learned them yet.




I like to calculate average of the numbers by reading
the file 'digi_2.txt'. I have written the following code:

def average(s): return sum(s) * 1.0 / len(s)


This function presumably expects to be passed a list (or iterable) of 
ints or a list of floats as its argument.  It'll fail if given a list of 
strings.  A comment or docstring to that effect would be useful to 
remind yourself.
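
As a rough illustration of that suggestion, here is the same function with a docstring, plus a call that shows the failure with strings. This is a Python 3 sketch; the docstring wording and the sample inputs are assumptions, not part of the original code.

```python
def average(numbers):
    """Return the arithmetic mean of a non-empty sequence of ints or floats.

    Strings are not converted here; passing them raises TypeError.
    """
    return sum(numbers) * 1.0 / len(numbers)


print(average([1, 2, 3, 4]))

try:
    average(["1", "2"])          # strings, as read from a file
except TypeError as exc:
    print("TypeError:", exc)
```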




f = open('digi_2.txt', 'r+')


Why is there a plus sign in the mode string?  Not necessary, since 
you're just going to read the file straight through.




list_of_lists1 = f.readlines()



Not a good name, since that's not what readlines() returns.  It'll 
return a list of strings, each string representing one line of the file.




for index in range(len(list_of_lists1)):


Since your file is only one line long, this loop doesn't do much.




 tt = list_of_lists1[index]

 print 'Current value :', tt



At this point, tt is the string read from the last line of the file. 
The other lines are not represented in any way.



avg =average (tt)


This gives an error:

def average(s): return sum(s) * 1.0 / len(s)
TypeError: unsupported operand type(s) for +: 'int' and 'str'

I also attach the file i am reading.

You shouldn't assume everyone can read the attached data.  Since it's 
short, you should just include it in your message.  It's MUCH shorter 
than all the irrelevant data you included at the end of your message.


For those others who may be reading this portion of the hijacked thread, 
here's the one line in digi_2.txt


1350696461, 448.0, 538660.0, 1350696466, 448.0


Now to try to solve the problem.  First, you don't specify what the 
numbers in the file will look like.  Looking at your code, I naturally 
assumed you had one value per line.  Instead I see a single line with 
multiple numbers separated by commas.


I'll assume that the data will always be in a single line, or that if 
there are multiple lines, every line but the last will end with a comma, 
and that the last one will NOT have a trailing comma.  If I don't assume 
something, the problem can't be solved.



Since we don't care about newlines, we can read the whole file into one 
string, with the read() function.



f = open('digi_2.txt', 'r')
filedata = f.read()
f.close()

Now we have to separate the data by the commas.

numstrings = filedata.split(',')

And now we have to convert each of these numstring values from a 
string into a float.


nums = []
for numstring in numstrings:
    nums.append(float(numstring))

Now we can call the average function, since we have a list of floats.

avg = average(nums)

Completely untested, so there may be typos in it.
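
Put together, the steps above could be sketched as follows for Python 3. The helper name average_from_file, the use of a temporary file in place of digi_2.txt, and the strip() / empty-piece check are assumptions added for the demonstration; the split-and-convert logic is the one described above.

```python
import os
import tempfile


def average(numbers):
    return sum(numbers) / float(len(numbers))


def average_from_file(path):
    """Read comma-separated numbers from `path` and return their mean."""
    with open(path, "r") as f:
        filedata = f.read()
    nums = []
    for numstring in filedata.split(","):
        numstring = numstring.strip()
        if numstring:                 # tolerate a trailing comma or blank piece
            nums.append(float(numstring))
    return average(nums)


# Demo with a throwaway file holding the one line quoted above.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1350696461, 448.0, 538660.0, 1350696466, 448.0\n")
    path = tmp.name

result = average_from_file(path)
os.remove(path)
print(result)
```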





--
DaveA


Re: [Tutor] Please Help

2013-03-21 Thread Amit Saha
Hi Arijit,

On Thu, Mar 21, 2013 at 8:42 PM, Arijit Ukil arijit.u...@tcs.com wrote:

 I am new to python. I like to calculate average of the numbers by reading
 the file 'digi_2.txt'. I have written the following code:

 def average(s): return sum(s) * 1.0 / len(s)

 f = open('digi_2.txt', 'r+')

 list_of_lists1 = f.readlines()

 for index in range(len(list_of_lists1)):
     tt = list_of_lists1[index]
     print 'Current value :', tt

 avg = average(tt)


 This gives an error:

 def average(s): return sum(s) * 1.0 / len(s)
 TypeError: unsupported operand type(s) for +: 'int' and 'str'

 I also attach the file i am reading.



 Please help to rectify.

The main issue here is that when you are reading from a file, to
Python, it's all strings. And although 'abc' + 'def' is valid, 'abc' +
5 isn't (for example). Hence, besides the fact that your average
calculation is not right, you will have to 'convert' the string to an
integer/float to do any arithmetic operation on it. (If you know C,
this is similar to typecasting.) So, coming back to your program, I
will first show you a few things and then you can write the
program yourself.
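
A tiny Python 3 sketch of the point about strings and numbers; the variable names text and value are only illustrative.

```python
print('abc' + 'def')        # string + string is concatenation

try:
    'abc' + 5               # string + int is not allowed
except TypeError as exc:
    print("TypeError:", exc)

text = '448.0'              # what you get when reading from a file
value = float(text)         # explicit conversion to a number
print(value + 5)
```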

If you were to break down this program into simple steps, they would be:

1. Read the lines from a file (Assume a generic case, where you have
more than one line in the file, and you have to calculate the average
for each such row)
2. Create a list of floating point numbers for each of those lines
3. And call your average function on each of these lists

You could of course do 2 & 3 together, so you create the list and call
the average function.

So, here is step 1:

with open('digi.txt','r') as f:
    lines = f.readlines()

Please refer to
http://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects
for an explanation of the advantage of using 'with'.

Now, you have *all* the lines of the file in 'lines'. Now, you want to
perform step 2 for each line in this file. Here you go:

for line in lines:
    number_list = []
    for number in line.split(','):
        number_list.append(float(number))

(To learn more about Python lists, see
http://effbot.org/zone/python-list.htm). It is certainly possible to
use the index of an element to access elements from a list, but this
is a more Pythonic way of doing it. To understand this better, in the
variable 'line', you will have a list of numbers on a single line. For
example: 1350696461, 448.0, 538660.0, 1350696466, 448.0. Note how they
are separated by a ',' ? To get each element, we use the split( )
function, which returns a list of the individual numbers. (See:
http://docs.python.org/2/library/stdtypes.html#str.split). And then,
we use the .append() method to create the list. Now, you have a
number_list which is a list of floating point numbers for each line.

Now, steps 2 & 3 combined:

for line in lines:
    number_list = []
    for number in line.split(','):
        number_list.append(float(number))
    print average(number_list)

Where average( ) is defined as:

def average(num_list):
    return sum(num_list)/len(num_list)



There may be a number of unknown things I may have talked about, but i
hope the links will help you learn more and write your program now.

Good Luck.
-Amit.


--
http://amitsaha.github.com/


Re: [Tutor] Please Help

2013-03-21 Thread Dave Angel

On 03/21/2013 08:09 AM, Arijit Ukil wrote:

Thanks for the help.



You're welcome.

You replied privately, instead of including the list, so I'm forwarding 
the response so everyone can see it.  You also top-posted, so the 
context is backwards.



After running your code, I am getting the following error:

Traceback (most recent call last):
  File "C:\Documents and Settings\207564\Desktop\Imp privacy analyzer\New
Folder\test\src\test.py", line 53, in ?
    nums.append(float(numstsrings))
TypeError: float() argument must be a string or a number




Check carefully, apparently the copy/paste on your machine inserts extra 
letters.
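
One plausible reading of that traceback (an assumption, since the edited script itself was not posted) is that the whole list returned by split() was passed to float(), rather than one element at a time. A short Python 3 sketch of the difference:

```python
numstrings = "1350696461, 448.0".split(",")   # a list of two strings

try:
    float(numstrings)             # the whole list: raises TypeError
except TypeError as exc:
    print("TypeError:", exc)

nums = []
for numstring in numstrings:      # one string at a time works
    nums.append(float(numstring))
print(nums)
```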


--
DaveA


[Tutor] Importing data from a file.

2013-03-21 Thread Shall, Sydney

I have an elementary question provoked by another post today.

1. Is it the case that ALL imported data from a file is a string?
2. Does this therefor imply that said data has to be processed 
appropriately to generate the data in the form required by the program?

3. Are there defined procedures for doing the required processing?

With many thanks,

Sydney

--
Professor Sydney Shall,
Department of Haematological Medicine,
King's College London,
Medical School,
123 Coldharbour Lane,
LONDON SE5 9NU,
Tel & Fax: +44 (0)207 848 5902,
E-Mail: sydney.shall,
[correspondents outside the College should add; @kcl.ac.uk]
www.kcl.ac.uk



Re: [Tutor] Importing data from a file.

2013-03-21 Thread Amit Saha
On Thu, Mar 21, 2013 at 11:43 PM, Shall, Sydney sydney.sh...@kcl.ac.uk wrote:
 I have an elementary question provoked by another post today.

 1. Is it the case that ALL imported data from a file is a string?
 2. Does this therefor imply that said data has to be processed appropriately
 to generate the data in the form required by the program?

To the best of my knowledge, yes to both of your queries. Once you
have the element you want to process, you can make use of the type
converting functions (int(), float().. ) and use them appropriately.

 3. Are there defined procedures for doing the required processing?

If you meant conversion functions, int() and float() are examples of
those. You of course  (most of the times) have to make use of string
manipulation functions (strip(), rstrip(), etc) to extract the exact
data item you might be looking for. So, they would be the building
blocks for your processing functions.
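
For instance, a line of comma-separated text might be cleaned up and converted like this (a Python 3 sketch; the sample line and the names raw_line, fields, timestamp and readings are made up for illustration):

```python
raw_line = "  1350696461, 448.0 ,538660.0\n"

# split() cuts the line at the commas; strip() removes stray whitespace
fields = [piece.strip() for piece in raw_line.split(",")]

timestamp = int(fields[0])                          # whole number
readings = [float(piece) for piece in fields[1:]]   # decimal numbers

print(timestamp, readings)
```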

I hope that makes some things clear.

-Amit.

-- 
http://amitsaha.github.com/


Re: [Tutor] Importing data from a file.

2013-03-21 Thread Shall, Sydney

On 21/03/2013 13:54, Amit Saha wrote:

On Thu, Mar 21, 2013 at 11:43 PM, Shall, Sydney sydney.sh...@kcl.ac.uk wrote:

I have an elementary question provoked by another post today.

1. Is it the case that ALL imported data from a file is a string?
2. Does this therefor imply that said data has to be processed appropriately
to generate the data in the form required by the program?

To the best of my knowledge, yes to both of your queries. Once you
have the element you want to process, you can make use of the type
converting functions (int(), float().. ) and use them appropriately.


3. Are there defined procedures for doing the required processing?

If you meant conversion functions, int() and float() are examples of
those. You of course  (most of the times) have to make use of string
manipulation functions (strip(), rstrip(), etc) to extract the exact
data item you might be looking for. So, they would be the building
blocks for your processing functions.

I hope that makes some things clear.

-Amit.


Yes, Thanks. This is now quite clear.
Sydney



Re: [Tutor] Importing data from a file.

2013-03-21 Thread Dave Angel

On 03/21/2013 09:43 AM, Shall, Sydney wrote:

I have an elementary question provoked by another post today.

1. Is it the case that ALL imported data from a file is a string?


No, the imported data is a module.   For example
>>> import sys
>>> print type(sys)
<type 'module'>


At this point, sys is an object of type module.

Perhaps you really mean data returned by the readline() method of the 
file object.  In that case, it's a list of strings.


Or data returned from the readline() method of the file object.  That is 
a string.


Or data returned from the read() method of the file object.  The return 
type of that depends on the version of Python.


Be more specific, since the answer greatly depends on how you read this 
data.
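
To make the differences concrete, here is a small Python 3 sketch that checks what each file method hands back. The temporary file and its two lines are assumptions for the demo. In Python 3, text mode gives str and binary mode gives bytes.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

with open(path, "r") as f:
    one_line = f.readline()      # a single string
    remaining = f.readlines()    # a list of strings (the rest of the file)

with open(path, "r") as f:
    whole_text = f.read()        # one string holding the entire file

with open(path, "rb") as f:
    whole_bytes = f.read()       # bytes, because the file was opened in binary mode

os.remove(path)

print(type(one_line), type(remaining), type(whole_text), type(whole_bytes))
```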




2. Does this therefor imply that said data has to be processed
appropriately to generate the data in the form required by the program?


Again, you have to be specific.  The program might well want exactly 
what one of these methods returns.




3. Are there defined procedures for doing the required processing?



Sure, hundreds of thousands of them, most of them to be found in other 
people's programs.


Sorry, but your questions are so vague as to defy definitive answers.


--
DaveA


Re: [Tutor] Importing data from a file.

2013-03-21 Thread Robert Sjoblom
 3. Are there defined procedures for doing the required processing?

 If you meant conversion functions, int() and float() are examples of
 those. You of course  (most of the times) have to make use of string
 manipulation functions (strip(), rstrip(), etc) to extract the exact
 data item you might be looking for.

Let's add the pickle module though. Pickle converts (most) Python
objects into string representations; when you unpickle these string
representations you get back your object. A dictionary object becomes
a dictionary object and so on. Read more here:
http://docs.python.org/3.3/library/pickle.html#module-pickle
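
A minimal round trip, sketched for Python 3 (where pickle.dumps returns bytes rather than a text string); the sample dictionary is made up for illustration. The pickle documentation warns that you should only unpickle data you trust.

```python
import pickle

record = {"date": "2013-03-21", "values": [448.0, 538660.0]}

blob = pickle.dumps(record)      # serialise the object
restored = pickle.loads(blob)    # rebuild an equal object from the bytes

print(type(blob), restored == record)
```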


-- 
best regards,
Robert S.


Re: [Tutor] Importing data from a file.

2013-03-21 Thread Dave Angel

On 03/21/2013 10:03 AM, Dave Angel wrote:

A typo below;  sorry.


On 03/21/2013 09:43 AM, Shall, Sydney wrote:

I have an elementary question provoked by another post today.

1. Is it the case that ALL imported data from a file is a string?


No, the imported data is a module.   For example
 >>> import sys
 >>> print type(sys)
 <type 'module'>


At this point, sys is an object of type module.

Perhaps you really mean data returned by the readline() method of the


Perhaps you really mean data returned by the readlines() method of the


file object.  In that case, it's a list of strings.

Or data returned from the readline() method of the file object.  That is
a string.

Or data returned from the read() method of the file object.  The return
type of that depends on the version of Python.

Be more specific, since the answer greatly depends on how you read this
data.



2. Does this therefor imply that said data has to be processed
appropriately to generate the data in the form required by the program?


Again, you have to be specific.  The program might well want exactly
what one of these methods returns.



3. Are there defined procedures for doing the required processing?



Sure, hundreds of thousands of them, most of them to be found in other
people's programs.

Sorry, but your questions are so vague as to defy definitive answers.





--
DaveA


Re: [Tutor] Importing data from a file.

2013-03-21 Thread Alan Gauld

On 21/03/13 13:43, Shall, Sydney wrote:

I have an elementary question provoked by another post today.

1. Is it the case that ALL imported data from a file is a string?


Assuming you mean data read from a file rather than modules imported 
using 'import' then the answer is 'it depends'.


Most files are text files and the data is stored as strings and 
therefore when you read them back they will be strings. You then convert 
them to the native data using int(), float() etc.


Some files are binary files and then the data read back will be bytes 
and need to be decoded into the original data. This is often done using 
the struct module.


Either way if you use the Python read() operation on a file
you will get back a bunch of bytes. What those bytes represent depends 
on how they were written. How they are interpreted is down to the 
programmer.
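
As a small illustration of the binary case, here is a Python 3 sketch using the struct module. The format string "<id" (little-endian 32-bit int followed by a 64-bit float) and the two sample values are assumptions chosen for the demo; a real file would need whatever layout its writer used.

```python
import struct

# Pack an int and a float into raw bytes, as a binary file might store them.
packed = struct.pack("<id", 1350696461, 448.0)
print(len(packed), "bytes")

# Reading them back requires knowing the same format string.
count, reading = struct.unpack("<id", packed)
print(count, reading)
```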




2. Does this therefor imply that said data has to be processed
appropriately to generate the data in the form required by the program?


Yes, always.


3. Are there defined procedures for doing the required processing?


Yes, for the standard types. For custom types and arbitrary binary data 
you need to find out what the original encoding was and reverse it.


HTH,

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] Importing data from a file.

2013-03-21 Thread Shall, Sydney

On 21/03/2013 16:17, Alan Gauld wrote:

On 21/03/13 13:43, Shall, Sydney wrote:

I have an elementary question provoked by another post today.

1. Is it the case that ALL imported data from a file is a string?


Assuming you mean data read from a file rather than modules imported 
using 'import' then the answer is 'it depends'.


Most files are text files and the data is stored as strings and 
therefore when you read them back they will be strings. You then 
convert them to the native data using int(), float() etc.


Some files are binary files and then the data read back will be bytes 
and need to be decoded into the original data. This is often done 
using the struct module.


Either way if you use the Python read() operation on a file
you will get back a bunch of bytes. What those bytes represent depends 
on how they were written. How they are interpreted is down to the 
programmer.




2. Does this therefor imply that said data has to be processed
appropriately to generate the data in the form required by the program?


Yes, always.


3. Are there defined procedures for doing the required processing?


Yes, for the standard types. For custom types and arbitrary binary 
data you need to find out what the original encoding was and reverse it.


HTH,


Thank you Alan, That was most useful.
Cheers,
Sydney



[Tutor] Help with iterators

2013-03-21 Thread Matthew Johnson
Dear list,

I have been trying to work out how to use iterators and in
particular groupby statements.  I am, however, quite lost.

I wish to subset the list below, selecting the observations that have
an ID ('realtime_start') value that is greater than some date (I've
used the variable name maxDate), and in the case that there is more
than one such record, returning only the one that has the largest ID
('realtime_start').

The code below does the job; however, I have the impression that it
might be done in a more Pythonic way using iterators and groupby
statements.

Could someone please help me understand how to go from this code to
the Pythonic idiom?

thanks in advance,

Matt Johnson

_

## Code example

import pprint

obs = [{'date': '2012-09-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2012-10-15',
  'value': '231.951'},
 {'date': '2012-09-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2012-11-15',
  'value': '231.881'},
 {'date': '2012-10-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2012-11-15',
  'value': '231.751'},
 {'date': '2012-10-01',
  'realtime_end': '-12-31',
  'realtime_start': '2012-12-19',
  'value': '231.623'},
 {'date': '2013-02-01',
  'realtime_end': '-12-31',
  'realtime_start': '2013-03-21',
  'value': '231.157'},
 {'date': '2012-11-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2012-12-14',
  'value': '231.025'},
 {'date': '2012-11-01',
  'realtime_end': '-12-31',
  'realtime_start': '2013-01-19',
  'value': '231.071'},
 {'date': '2012-12-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2013-01-16',
  'value': '230.979'},
 {'date': '2012-12-01',
  'realtime_end': '-12-31',
  'realtime_start': '2013-02-19',
  'value': '231.137'},
 {'date': '2012-12-01',
  'realtime_end': '-12-31',
  'realtime_start': '2013-03-19',
  'value': '231.197'},
 {'date': '2013-01-01',
  'realtime_end': '-12-31',
  'realtime_start': '2013-02-21',
  'value': '231.198'},
 {'date': '2013-01-01',
  'realtime_end': '-12-31',
  'realtime_start': '2013-03-21',
  'value': '231.222'}]

maxDate = "2013-03-21"

dobs = dict([(d, []) for d in set([e['date'] for e in obs])])

for o in obs:
    dobs[o['date']].append(o)

dobs_subMax = dict([(k, [d for d in v if d['realtime_start'] <= maxDate])
                    for k, v in dobs.items()])

rts = lambda x: x['realtime_start']

mmax = [sorted(e, key=rts)[-1] for e in dobs_subMax.values() if e]

mmax.sort(key = lambda x: x['date'])

pprint.pprint(mmax)


Re: [Tutor] Help with iterators

2013-03-21 Thread Mitya Sirenef

On 03/21/2013 08:39 PM, Matthew Johnson wrote:

Dear list,


 I have been trying to understand out how to use iterators and in
 particular groupby statements. I am, however, quite lost.

 I wish to subset the below list, selecting the observations that have
 an ID ('realtime_start') value that is greater than some date (i've
 used the variable name maxDate), and in the case that there is more
 than one such record, returning only the one that has the largest ID
 ('realtime_start').

 The code below does the job, however i have the impression that it
 might be done in a more python way using iterators and groupby
 statements.

 could someone please help me understand how to go from this code to
 the pythonic idiom?

 thanks in advance,

 Matt Johnson



You can do it with groupby like so:


from itertools import groupby
from operator import itemgetter


maxDate = "2013-03-21"
mmax = list()

obs.sort(key=itemgetter('date'))

for k, group in groupby(obs, key=itemgetter('date')):
    group = [dob for dob in group if dob['realtime_start'] <= maxDate]
    if group:
        group.sort(key=itemgetter('realtime_start'))
        mmax.append(group[-1])

pprint.pprint(mmax)


Note that writing multiply-nested comprehensions like you did results in
very unreadable code. Do you find this code more readable?
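
A self-contained variant of the same idea, sketched for Python 3 on a handful of the records from the original post (with 'realtime_end' left out for brevity). The function name latest_per_date, the cutoff value '2013-01-01', and the assumption that the filter means "on or before the cutoff" are all illustrative choices, not something the original code confirms.

```python
from itertools import groupby
from operator import itemgetter

obs = [
    {'date': '2012-09-01', 'realtime_start': '2012-10-15', 'value': '231.951'},
    {'date': '2012-09-01', 'realtime_start': '2012-11-15', 'value': '231.881'},
    {'date': '2012-10-01', 'realtime_start': '2012-11-15', 'value': '231.751'},
    {'date': '2012-10-01', 'realtime_start': '2012-12-19', 'value': '231.623'},
    {'date': '2013-02-01', 'realtime_start': '2013-03-21', 'value': '231.157'},
]


def latest_per_date(records, cutoff):
    """For each 'date', keep the record with the latest 'realtime_start'
    that is on or before `cutoff` (assumed meaning of the filter)."""
    latest = []
    by_date = sorted(records, key=itemgetter('date'))  # groupby needs sorted input
    for date, group in groupby(by_date, key=itemgetter('date')):
        eligible = [r for r in group if r['realtime_start'] <= cutoff]
        if eligible:
            latest.append(max(eligible, key=itemgetter('realtime_start')))
    return latest


result = latest_per_date(obs, '2013-01-01')
for rec in result:
    print(rec)
```

Using max() with a key avoids sorting each group just to take its last element, and sorted() leaves the caller's list untouched.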

 -m


--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

Many a man fails as an original thinker simply because his memory it too
good.  Friedrich Nietzsche



Re: [Tutor] Help with iterators

2013-03-21 Thread Steven D'Aprano

On 22/03/13 11:39, Matthew Johnson wrote:

Dear list,

I have been trying to understand out how to use iterators and in
particular groupby statements.  I am, however, quite lost.


groupby is a very specialist function which is not very intuitive to
use. Sometimes I think that groupby is an excellent solution in search
of a problem.



I wish to subset the below list, selecting the observations that have
an ID ('realtime_start') value that is greater than some date (i've
used the variable name maxDate), and in the case that there is more
than one such record, returning only the one that has the largest ID
('realtime_start').



The code that you show does not do what you describe here. The most
obvious difference is that it doesn't return or display a single record,
but shows multiple records.

In your case, it selects six records, four of which have a realtime_start
that occurs BEFORE the given maxDate.

To solve the problem you describe here, of finding at most a single
record, the solution is much simpler than what you have done. Prepare a
list of observations, sorted by realtime_start. Take the latest such
observation. If the realtime_start is greater than the maxDate, you have
your answer. If not, there is no answer.

The simplest solution is usually the best. The simpler your code, the fewer
bugs it will contain.


obs.sort(key=lambda rec: rec['realtime_start'])
rec = obs[-1]
if rec['realtime_start'] > maxDate:
    print rec
else:
    print "no record found"


which prints:

{'date': '2013-01-01', 'realtime_start': '2013-03-21', 'realtime_end': 
'-12-31', 'value': '231.222'}




--
Steven


Re: [Tutor] Help with iterators

2013-03-21 Thread Steven D'Aprano

On 22/03/13 12:39, Mitya Sirenef wrote:


You can do it with groupby like so:


from itertools import groupby
from operator import itemgetter

maxDate = "2013-03-21"
mmax = list()

obs.sort(key=itemgetter('date'))

for k, group in groupby(obs, key=itemgetter('date')):
    group = [dob for dob in group if dob['realtime_start'] <= maxDate]
    if group:
        group.sort(key=itemgetter('realtime_start'))
        mmax.append(group[-1])

pprint.pprint(mmax)



This suffers from the same problem of finding six records instead of one,
and that four of the six have start dates before the given date instead
of after it.

Here's another solution that finds all the records that start on or after
the given date (the poorly named maxDate) and displays them sorted by
date.


selected = [rec for rec in obs if rec['realtime_start'] >= maxDate]
selected.sort(key=lambda rec: rec['date'])
print selected
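
The filter above relies on the fact that dates written as YYYY-MM-DD compare correctly as plain strings. The Python 3 sketch below runs the same selection on three of the posted records (trimmed to two keys) and cross-checks it against real datetime objects. The names cutoff, parse and same_by_datetime are illustrative assumptions.

```python
from datetime import datetime

records = [
    {'date': '2012-12-01', 'realtime_start': '2013-03-19'},
    {'date': '2013-02-01', 'realtime_start': '2013-03-21'},
    {'date': '2013-01-01', 'realtime_start': '2013-03-21'},
]
cutoff = '2013-03-21'

# String comparison works because the format is year-month-day, zero padded.
selected = [rec for rec in records if rec['realtime_start'] >= cutoff]
selected.sort(key=lambda rec: rec['date'])


def parse(day):
    return datetime.strptime(day, '%Y-%m-%d')


# The same selection using datetime objects, for comparison.
same_by_datetime = [rec for rec in records
                    if parse(rec['realtime_start']) >= parse(cutoff)]
same_by_datetime.sort(key=lambda rec: parse(rec['date']))

print(selected)
```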




--
Steven


Re: [Tutor] Help with iterators

2013-03-21 Thread Mitya Sirenef

On 03/21/2013 10:20 PM, Steven D'Aprano wrote:

On 22/03/13 12:39, Mitya  Sirenef wrote:


 You can do it with groupby like so:


 from itertools import groupby
 from operator import itemgetter

 maxDate = "2013-03-21"
 mmax = list()

 obs.sort(key=itemgetter('date'))

 for k, group in groupby(obs, key=itemgetter('date')):
     group = [dob for dob in group if dob['realtime_start'] <= maxDate]
     if group:
         group.sort(key=itemgetter('realtime_start'))
         mmax.append(group[-1])

 pprint.pprint(mmax)


 This suffers from the same problem of finding six records instead of one,
 and that four of the six have start dates before the given date instead
 of after it.


OP said his code produces the needed result and I think his description
probably doesn't match what he really intends to do (he also said he
wants the same code rewritten using groupby). I reproduced the logic of
his code... hopefully he can step in and clarify!






 Here's another solution that finds all the records that start on or after
 the given date (the poorly named maxDate) and displays them sorted by
 date.


 selected = [rec for rec in obs if rec['realtime_start'] >= maxDate]
 selected.sort(key=lambda rec: rec['date'])
 print selected






--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

A little bad taste is like a nice dash of paprika.
Dorothy Parker
