[Tutor] sorting a 2 gb file
Hello!

I am wondering about the best way to handle sorting some data from some of my results. I have a file in the form shown at the end (please forgive any wraparounds due to the width of the screen here; each line starting with ENS ends with the e-12, or what have you, on the same line).

What I would like is to generate an output file of any ENSE000...e-4 (or what have you) lines that appear in more than one place and, for each of those, the queries they appear related to. So if the first line, ENSE1098330.2|ENSG0013573.6|ENST0350437.2 assembly=N... etc., appears as a result in any other query, I would like it and the queries it appears as a result for (including the score if possible).

The data set the sample below is taken from is over 2.4 GB, so speed and memory considerations come into play. Are sets more effective than lists for this?

To save space in the new file I really only need the name of the result up to the | and the score at the end for each. To simplify things, the score could be dropped, and I could check it out as needed later.

As always, all feedback is very much appreciated.

Thanks,
Scott

FILE:

This is the number 1 query tested.
Results for scoring against Query= hg17_chainMm5_chr17 range=chr1:2040-3330 5'pad=0 3'pad=0 are:
ENSE1098330.2|ENSG0013573.6|ENST0350437.2 assembly=N...72 1e-12
ENSE1160046.1|ENSG0013573.6|ENST0251758.3 assembly=N...72 1e-12
ENSE1404464.1|ENSG0013573.6|ENST0228264.4 assembly=N...72 1e-12
ENSE1160046.1|ENSG0013573.6|ENST0290818.3 assembly=N...72 1e-12
ENSE1343865.2|ENSG0013573.6|ENST0350437.2 assembly=N...46 8e-05
ENSE1160049.1|ENSG0013573.6|ENST0251758.3 assembly=N...46 8e-05
ENSE1343865.2|ENSG0013573.6|ENST0228264.4 assembly=N...46 8e-05
ENSE1160049.1|ENSG0013573.6|ENST0290818.3 assembly=N...46 8e-05

This is the number 2 query tested.
Results for scoring against Query= hg17_chainMm5_chr1 range=chr1:82719-95929 5'pad=0 3'pad=0 are:
ENSE1373792.1|ENSG0175182.4|ENST0310585.3 assembly=N...80 6e-14
ENSE1134144.2|ENSG0160013.2|ENST0307155.2 assembly=N...78 2e-13
ENSE1433065.1|ENSG0185480.2|ENST0358383.1 assembly=N...78 2e-13
ENSE1422761.1|ENSG0183160.2|ENST0360503.1 assembly=N...74 4e-12
ENSE1431410.1|ENSG0139631.6|ENST0308926.3 assembly=N...74 4e-12
ENSE1433065.1|ENSG0185480.2|ENST0358383.1 assembly=N...72 1e-11
ENSE1411753.1|ENSG0126882.4|ENST0358329.1 assembly=N...72 1e-11
ENSE1428167.1|ENSG0110497.4|ENST0314823.4 assembly=N...72 1e-11
ENSE1401130.1|ENSG0160828.5|ENST0359898.1 assembly=N...72 1e-11
ENSE1414900.1|ENSG0176920.4|ENST0356650.1 assembly=N...72 1e-11
ENSE1428167.1|ENSG0110497.4|ENST0314823.4 assembly=N...72 1e-11
ENSE1400942.1|ENSG0138670.5|ENST0356373.1 assembly=N...72 1e-11
ENSE1400116.1|ENSG0120907.6|ENST0356368.1 assembly=N...70 6e-11
ENSE1413546.1|ENSG0184209.6|ENST0344033.2 assembly=N...70 6e-11
ENSE1433572.1|ENSG0124243.5|ENST0355583.1 assembly=N...70 6e-11
ENSE1423154.1|ENSG0125875.4|ENST0354200.1 assembly=N...70 6e-11
ENSE1400109.1|ENSG0183785.3|ENST0339190.2 assembly=N...70 6e-11
ENSE1268950.4|ENSG0084112.4|ENST0303438.2 assembly=N...68 2e-10
ENSE1057279.1|ENSG0161270.6|ENST0292886.2 assembly=N...68 2e-10
ENSE1434317.1|ENSG0171453.2|ENST0304004.2 assembly=N...68 2e-10

___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
[Tutor] Code review
If anyone has the time to look through an entire script, I would be very grateful for any comments, tips or suggestions on a wiki-engine script I am working on.

http://www.waywood.co.uk/cgi-bin/monkeywiki.py (this will download rather than execute)

It does work, but I have not been using Python very long, and am entirely self-taught in computer programming of any sort, so I have huge doubts about my 'style'. I am also aware that I probably don't 'think like a programmer' (being, in fact, a furniture maker!)

I did post a previous version of this about a year(?) ago, and received some very welcome suggestions, but I have changed it quite a lot since then. Also, please ignore the licensing stuff - I do intend to make the thing available like this when I am more confident about it, and I am just getting a bit ahead of myself: you guys are the first people who know it's there.

Many thanks
RE: [Tutor] sorting a 2 gb file
I'll just "Me Too" on Alan's advice. I had a similar-sized project, only it was binary data in an ISAM file instead of flat ASCII. I tried several pure Python methods and all took forever. Finally I used Python to read, modify, and insert the source data into a MySQL database. Then I pulled the data out via Python and wrote it to a new ISAM file. The whole thing took longer to code that way, but boy, it sure scaled MUCH better and was much quicker in the end.

John Purser

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Alan Gauld
Sent: Tuesday, January 25, 2005 05:09
To: Scott Melnyk; tutor@python.org
Subject: Re: [Tutor] sorting a 2 gb file

> My data set the below is taken from is over 2.4 gb so speed and memory considerations come into play.

To be honest, if this were my problem, I'd probably dump all the data into a database and use SQL to extract what I needed. That's a much more effective tool for this kind of thing.

You can do it with Python, but I think we need more understanding of the problem. For example, what the various fields represent, how much of a comparison (i.e. which fields, case sensitivity etc.) leads to equality, etc.

Alan G.
Re: [Tutor] sorting a 2 gb file
Alan Gauld wrote:

>> My data set the below is taken from is over 2.4 gb so speed and memory considerations come into play.
>
> To be honest, if this were my problem, I'd probably dump all the data into a database and use SQL to extract what I needed. That's a much more effective tool for this kind of thing.
>
> You can do it with Python, but I think we need more understanding of the problem. For example, what the various fields represent, how much of a comparison (i.e. which fields, case sensitivity etc.) leads to equality, etc.

And if the idea of setting up a full-blown SQL server for the problem seems like a lot of work, you might try prototyping the sort and solutions with sqlite, and only migrate to (full-fledged RDBMS of your choice) if the prototype works as you want it to and sqlite seems too slow for your needs.

Andy
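As a rough illustration of the sqlite prototype Andy suggests, here is a minimal sketch using the `sqlite3` module from the Python 3 standard library (the thread predates it; at the time a separate pysqlite package was needed). The table name `hits`, the column names, and the function `results_in_several_queries` are all made up for this example, and it assumes the file has already been parsed into (query, result) pairs.

```python
import sqlite3

def results_in_several_queries(pairs):
    """Return (result, number_of_distinct_queries) for results seen under 2+ queries.

    `pairs` is an iterable of (query, result) tuples; parsing the file into
    such pairs is assumed to happen elsewhere.
    """
    conn = sqlite3.connect(":memory:")  # use a file path instead for large data
    conn.execute("CREATE TABLE hits (query TEXT, result TEXT)")
    conn.executemany("INSERT INTO hits VALUES (?, ?)", pairs)
    rows = conn.execute(
        "SELECT result, COUNT(DISTINCT query) FROM hits "
        "GROUP BY result HAVING COUNT(DISTINCT query) > 1 "
        "ORDER BY result"
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    sample = [("q1", "ENSE1"), ("q1", "ENSE2"), ("q2", "ENSE1"), ("q2", "ENSE1")]
    for result, count in results_in_several_queries(sample):
        print(result, count)
```

`COUNT(DISTINCT query)` is used so that a result listed twice under the same query is not mistaken for one shared between queries.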
RE: [Tutor] sorting a 2 gb file- i shrunk it and turned it around
Thanks for the thoughts so far.

After posting I have been thinking about how to pare down the file (much of the info in the big file was not relevant to the question at hand). After the first couple of responses I was even more motivated to shrink the file so as not to have to set up a db. This test will be run only now and later to verify with another test set, so the db setup seemed like more work than it might be worth.

I was able to reduce my file down to about 160 MB in size by paring out every line not directly related to what I want, using some simple regular expressions and a couple of tests for inclusion.

The format, and what info is compared against what, is different from my original examples, as I believe this is more clear.

My queries are named by lines such as:

ENSE1387275.1|ENSG0187908.1|ENST0339871.1

ENSE is an exon
ENSG is the gene
ENST is a transcript

They all have the above format; they differ in the numbers following ENS[E, G or T]. Each query is for a different exon. For background, each gene has many exons and there are different versions of which exons are in each gene in this dataset. These different collections are the transcripts, i.e. ENST0339871.1. In short, a transcript is a version of a gene here:

transcript 1 may be formed of exons a, b and c
transcript 2 may contain exons a, b, d

The other lines (results) are of the format:

hg17_chainMm5_chr7_random range=chr10:124355404-124355687 5'pad=...44 0.001
hg17_chainMm5_chr14 range=chr10:124355392-124355530 5'pad=0 3'pa...44 0.001

"hg17_chainMm5_chr7_random range=chr10:124355404-124355687" is the important part here; everything from 5'pad on is not important at this point.

What I am trying to do now is make a list of any of the results that appear in more than one transcript.

## FILE SAMPLE:

This is the number 1 query tested.
Results for scoring against Query= ENSE1387275.1|ENSG0187908.1|ENST0339871.1 are:
hg17_chainMm5_chr7_random range=chr10:124355404-124355687 5'pad=...44 0.001
hg17_chainMm5_chr14 range=chr10:124355392-124355530 5'pad=0 3'pa...44 0.001
hg17_chainMm5_chr7 range=chr10:124355391-124355690 5'pad=0 3'pad...44 0.001
hg17_chainMm5_chr6 range=chr10:124355389-124355690 5'pad=0 3'pad...44 0.001
hg17_chainMm5_chr7 range=chr10:124355388-124355687 5'pad=0 3'pad...44 0.001
hg17_chainMm5_chr7_random range=chr10:124355388-124355719 5'pad=...44 0.001

This is the number 3 query tested.
Results for scoring against Query= ENSE1365999.1|ENSG0187908.1|ENST0339871.1 are:
hg17_chainMm5_chr14 range=chr10:124355392-124355530 5'pad=0 3'pa...60 2e-08
hg17_chainMm5_chr7 range=chr10:124355391-124355690 5'pad=0 3'pad...60 2e-08
hg17_chainMm5_chr6 range=chr10:124355389-124355690 5'pad=0 3'pad...60 2e-08
hg17_chainMm5_chr7 range=chr10:124355388-124355687 5'pad=0 3'pad...60 2e-08

##

I would like to generate a file that looks for any results (the hg17_etc line) that occur in more than one transcript (from the query line ENSE1365999.1|ENSG0187908.1|ENST0339871.1). So if

hg17_chainMm5_chr7_random range=chr10:124355404-124355687

shows up again later in the file, I want to know, and want to record where it is used more than once; otherwise I will ignore it.

I am thinking of another regular expression to capture the transcript id, followed by something that captures each of the results, writes to another file any time a result appears more than once, and ties the transcript ids to them somehow.

Any suggestions?

I agree that if I had more time and was going to be doing more of this, the DB would be the way to go.

As an aside, I have not looked into sqlite. I am hoping to avoid the db right now; I'd have to get the sys admin to give me permission to install something again, etc. etc., whereas I am hoping to get this together in a reasonably short script. However, I will look at it later (it could be helpful for other things for me).

Thanks again to all,
Scott
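One way the idea above (one regular expression for the transcript id, another for each result) might look is sketched below in Python 3. The two patterns are guesses based only on the sample lines in this message, and the names `QUERY_RE`, `RESULT_RE`, `results_by_transcript` and `shared_results` are hypothetical; treat this as a starting point rather than a tested solution for the real 160 MB file.

```python
import re
from collections import defaultdict

# Assumed line shapes, taken from the sample in the message above.
QUERY_RE = re.compile(r"Query=\s*(ENSE[\d.]+)\|(ENSG[\d.]+)\|(ENST[\d.]+)")
RESULT_RE = re.compile(r"^(hg17_\S+\s+range=\S+)")

def results_by_transcript(lines):
    """Map each result key to the set of transcript ids whose queries hit it."""
    seen = defaultdict(set)
    transcript = None
    for line in lines:
        query = QUERY_RE.search(line)
        if query:
            transcript = query.group(3)  # the ENST... part of the query name
            continue
        result = RESULT_RE.match(line.strip())
        if result and transcript is not None:
            seen[result.group(1)].add(transcript)
    return seen

def shared_results(lines):
    """Keep only the results that appear under more than one transcript."""
    return {key: ids
            for key, ids in results_by_transcript(lines).items()
            if len(ids) > 1}
```

Because only the short result key and the transcript ids are kept, memory use grows with the number of distinct results rather than with the size of the file. Passing an open file object as `lines` reads it one line at a time.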
[Tutor] Import Site Failed Resolution
Hi,

For no reason I can think of, Python 2.4 on Windows 2000 is suddenly complaining about 'import site' failing. I am so new to Python that I have no idea what this is all about. I have been running programs successfully until today. Now, no matter what program I run, this happens. Funny thing is, if I run Python from the Run dialog, I don't see this, but perhaps it has already gone by before the Python window opens. I usually get a command prompt and do python file.py.

In frustration, I uninstalled Python. How can I intelligently figure out what is going on here? I followed the directions Python gives about using -v, but I don't understand what all the other output is trying to tell me. I have not touched anything in any of the Python directories.

Thanks for any and all help.

Jim
[Tutor] Re: Import Site Failed Resolution
Hi,

Here is all the information I could get from the display of the output from this error. How do I figure out what is going on and fix the problem? This is on a Windows 2000 machine.

graphic 910
C:\WINNT\system32\command.com
C:\PYTHON>python -v
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# c:\python24\lib\site.pyc matches c:\python24\lib\site.py
import site # precompiled from c:\python24\lib\site.pyc
import os # precompiled from os.pyc
'import site' failed; traceback:
Traceback (most recent call last):
  File "c:\python24\lib\site.py", line 61, in ?
    import os
  File "c:\python24\lib\os.py", line 4, in ?
    - all functions from posix, nt, os2, mac, or ce, e.g. unlink, stat, etc.
AttributeError: 'module' object has no attribute 'path'
# c:\python24\lib\warnings.pyc matches c:\python24\lib\warnings.py
import warnings # precompiled from c:\python24\lib\warnings.pyc
# c:\python24\lib\types.pyc matches c:\python24\lib\types.py
import types # precompiled from c:\python24\lib\types.pyc
# c:\python24\lib\linecache.pyc matches c:\python24\lib\linecache.py
import linecache # precompiled from c:\python24\lib\linecache.pyc
import os # precompiled from os.pyc
Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

Thanks.

Jim
Re: [Tutor] Re: Import Site Failed Resolution
OK, getting closer. import site is failing because site imports os and that import is failing. The error message points to a line that should be part of the docstring at the beginning of os.py... strange.

Here are the first 22 lines from my os.py - the entire docstring. If your Python2.4\Lib\os.py doesn't look like this then you could try pasting in these lines instead, or maybe reinstalling to make sure nothing else is corrupted...

Next line is start of os.py
r"""OS routines for Mac, DOS, NT, or Posix depending on what system we're on.

This exports:
  - all functions from posix, nt, os2, mac, or ce, e.g. unlink, stat, etc.
  - os.path is one of the modules posixpath, ntpath, or macpath
  - os.name is 'posix', 'nt', 'os2', 'mac', 'ce' or 'riscos'
  - os.curdir is a string representing the current directory ('.' or ':')
  - os.pardir is a string representing the parent directory ('..' or '::')
  - os.sep is the (or a most common) pathname separator ('/' or ':' or '\\')
  - os.extsep is the extension separator ('.' or '/')
  - os.altsep is the alternate pathname separator (None or '/')
  - os.pathsep is the component separator used in $PATH etc
  - os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
  - os.defpath is the default search path for executables
  - os.devnull is the file path of the null device ('/dev/null', etc.)

Programs that import and use 'os' stand a better chance of being
portable between different platforms. Of course, they must then
only use functions that are defined by all platforms (e.g., unlink
and opendir), and leave all pathname manipulation to os.path
(e.g., split and join).
"""
End of snippet from os.py

Kent

jhomme wrote:

> Hi, Here is all the information I could get from the display of the output from this error. How do I figure out what is going on and fix the problem? This is on a Windows 2000 machine.
Re: [Tutor] sorting a 2 gb file
On Tue, 25 Jan 2005, Scott Melnyk wrote:

> I have an file in the form shown at the end (please forgive any wrapparounds due to the width of the screen here- the lines starting with ENS end with the e-12 or what have you on same line.)
>
> What I would like is to generate an output file of any other ENSE000...e-4 (or whathaveyou) lines that appear in more than one place and for each of those the queries they appear related to.

Hi Scott,

One way to do this might be to do it in two passes across the file.

The first pass through the file can identify records that appear more than once. The second pass can take that knowledge, and then display those records.

In pseudocode, this will look something like:

###
hints = identifyDuplicateRecords(filename)
displayDuplicateRecords(filename, hints)
###

> My data set the below is taken from is over 2.4 gb so speed and memory considerations come into play.
>
> Are sets more effective than lists for this?

Sets or dictionaries make the act of lookup of a key fairly cheap. In the two-pass approach, the first pass can use a dictionary to accumulate the number of times a certain record's key has occurred.

Note that, because your file is so large, the dictionary probably shouldn't accumulate the whole mass of information that we've seen so far: instead, it's sufficient to record the information we need to recognize a duplicate.

If you have more questions, please feel free to ask!
Re: [Tutor] sorting a 2 gb file
On Jan 25, 2005, at 23:40, Danny Yoo wrote:

> In pseudocode, this will look something like:
>
> ###
> hints = identifyDuplicateRecords(filename)
> displayDuplicateRecords(filename, hints)
> ###
>
>> My data set the below is taken from is over 2.4 gb so speed and memory considerations come into play.
>>
>> Are sets more effective than lists for this?
>
> Sets or dictionaries make the act of lookup of a key fairly cheap. In the two-pass approach, the first pass can use a dictionary to accumulate the number of times a certain record's key has occurred.
>
> Note that, because your file is so large, the dictionary probably shouldn't accumulate the whole mass of information that we've seen so far: instead, it's sufficient to record the information we need to recognize a duplicate.

However, the first pass will consume a lot of memory. Considering the worst-case scenario where each record only appears once, you'll find yourself with the whole 2GB file loaded into memory. (Or do you have a smarter way to do this?)

-- Max
maxnoel_fr at yahoo dot fr -- ICQ #85274019
"Look at you hacker... A pathetic creature of meat and bone, panting and sweating as you run through my corridors... How can you challenge a perfect, immortal machine?"
Re: [Tutor] ascii encoding
Ok, urllib.quote worked just fine, and of course so did urllib.pathname2url.

I should have run a dir() on urllib. Those functions don't appear in http://docs.python.org/lib/module-urllib.html

Now, how might one go about calculating the New York time offset from GMT? The server is in the U.S. but time.localtime() is giving me GMT.
Re: [Tutor] Read file line by line
On Tue, 25 Jan 2005, Gilbert Tsang wrote:

> Hey you Python coders out there:
>
> Being a Python newbie, I have this question while trying to write a script to process lines from a text file line-by-line:
>
> #!/usr/bin/python
> fd = open( "test.txt" )
> content = fd.readline()
> while (content != "" ):
>     content.replace( "\n", "" )
>     # process content
>     content = fd.readline()
>
> 1. Why does the assignment-and-test in one line not allowed in Python? For example, while ((content = fd.readline()) != ""):

Hi Gilbert, welcome aboard!

Python's design is to make statements like assignment stand out in the source code. This is different from Perl, C, and several other languages, but I think it's the right thing in Python's case. By making it a statement, we can visually scan by eye for assignments with ease.

There's nothing that really technically prevents us from doing an assignment as an expression, but Python's language designer decided that it encouraged a style of programming that made code harder to maintain. By making it a statement, it removes the possibility of making a mistake like:

###
if ((ch = getch()) = 'q') { ... }
###

There are workarounds that try to reintroduce assignment as an expression:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/202234

but we strongly recommend you don't use it. *grin*

> 2. I know Perl is different, but there's just no equivalent of while ($line = <A_FILE>) { } ?

Python's 'for' loop has built-in knowledge about iterable objects, and that includes files. Try using:

    for line in file:
        ...

which should do the trick.

Hope this helps!
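To make the "for line in file" suggestion concrete, here is a small Python 3 sketch. The helper name `stripped_lines` is invented for this example, and an in-memory `io.StringIO` stands in for the real test.txt so the snippet runs on its own. Note that strings are immutable, so `str.replace` and `str.rstrip` return a new string; the result has to be assigned or used, which the quoted `content.replace(...)` line does not do.

```python
import io

def stripped_lines(fileobj):
    """Yield each line of an open text file without its trailing newline."""
    for line in fileobj:          # the file object hands out one line per iteration
        yield line.rstrip("\n")   # rstrip returns a new string; the original is unchanged

if __name__ == "__main__":
    # Stand-in for open("test.txt"); any open text file works the same way.
    sample = io.StringIO("first line\nsecond line\n\nlast line")
    for content in stripped_lines(sample):
        print(repr(content))
```

With a real file, wrapping the loop in `with open("test.txt") as fd:` also takes care of closing it.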
Re: [Tutor] Read file line by line
> There's nothing that really technically prevents us from doing an assignment as an expression, but Python's language designer decided that it encouraged a style of programming that made code harder to maintain. By making it a statement, it removes the possibility of making a mistake like:
>
> ###
> if ((ch = getch()) = 'q') { ... }
> ###

Hmmm. This doesn't compile. Never mind, I screwed up. *grin*

But the Python FAQ does have an entry about this topic, if you're interested:

http://python.org/doc/faq/general.html#why-can-t-i-use-an-assignment-in-an-expression
Re: [Tutor] sorting a 2 gb file
On Tue, 25 Jan 2005, Max Noel wrote:

>>> My data set the below is taken from is over 2.4 gb so speed and memory considerations come into play.
>>>
>>> Are sets more effective than lists for this?
>>
>> Sets or dictionaries make the act of lookup of a key fairly cheap. In the two-pass approach, the first pass can use a dictionary to accumulate the number of times a certain record's key has occurred.
>>
>> Note that, because your file is so large, the dictionary probably shouldn't accumulate the whole mass of information that we've seen so far: instead, it's sufficient to record the information we need to recognize a duplicate.
>
> However, the first pass will consume a lot of memory. Considering the worst-case scenario where each record only appears once, you'll find yourself with the whole 2GB file loaded into memory. (or do you have a smarter way to do this?)

Hi Max,

My assumptions are that each record consists of some identifying string "key" that's associated to some "value". How are we deciding that two records are talking about the same thing?

I'm hoping that the set of unique keys isn't itself very large. Under this assumption, we can do something like this:

###
from sets import Set
def firstPass(f):
    """Returns a set of the duplicate keys in f."""
    seenKeys = Set()
    duplicateKeys = Set()
    for record in f:
        key = extractKey(record)
        if key in seenKeys:
            duplicateKeys.add(key)
        else:
            seenKeys.add(key)
    return duplicateKeys
###

where we don't store the whole record into memory, but only the 'key' portion of the record.

And if the number of unique keys is small enough, this should be fine enough to recognize duplicate records. So on the second passthrough, we can display the duplicate records on-the-fly. If this assumption is not true, then we need to do something else. *grin*

One possibility might be to implement an external sorting mechanism:

http://www.nist.gov/dads/HTML/externalsort.html

But if we're willing to do an external sort, then we're already doing enough work that we should really consider using a DBMS. The more complicated the data management becomes, the more attractive it becomes to use a real database to handle these data management issues. We're trying to solve a problem that is already solved by a real database management system.

Talk to you later!
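For readers on a current interpreter, here is a hedged Python 3 sketch of both passes described above. It uses the built-in `set` (added in Python 2.4; the `sets` module no longer exists in Python 3). The helpers `extract_key` and `is_record`, and the choice of "the exon name before the first |" as the key, are assumptions based on Scott's original sample, not something the thread settles.

```python
def is_record(line):
    """Assumption: result records are the lines that start with 'ENSE'."""
    return line.startswith("ENSE")

def extract_key(record):
    """Hypothetical key: the exon name before the first '|'."""
    return record.split("|", 1)[0].strip()

def first_pass(lines):
    """Return the set of keys that occur more than once anywhere in `lines`."""
    seen, duplicates = set(), set()
    for line in lines:
        if not is_record(line):
            continue
        key = extract_key(line)
        if key in seen:
            duplicates.add(key)
        else:
            seen.add(key)
    return duplicates

def second_pass(lines, duplicates):
    """Yield (key, query) for every record whose key is a known duplicate."""
    query = None
    for line in lines:
        if "Query=" in line:
            query = line.split("Query=", 1)[1].strip()
            if query.endswith("are:"):
                query = query[:-len("are:")].rstrip()
        elif is_record(line) and extract_key(line) in duplicates:
            yield extract_key(line), query
```

With a real file, each pass would open the file afresh (for example `with open(path) as f: dups = first_pass(f)`), so only the key sets are held in memory. One simplification to be aware of: a key repeated within a single query also counts as a duplicate here.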
Re: [Tutor] ascii encoding
On Jan 26, 2005, at 00:50, Luis N wrote:

> Ok, urllib.quote worked just fine, and of course so did urllib.pathname2url.
>
> I should have run a dir() on urllib. Those functions don't appear in http://docs.python.org/lib/module-urllib.html
>
> Now, how might one go about calculating the New York time off-set from GMT? The server is in the U.S. but time.localtime() is giving me GMT.

time.timezone gives you, I think, the offset between your current timezone and GMT. However, being myself in the GMT zone, I don't know exactly if the returned offset is positive or negative (it returns 0 here, which makes sense :D ).

-- Max
Re: [Tutor] ascii encoding
In other words I have to do some arithmetic:

>>> import time
>>> time.timezone
0

The server is located in Dallas, Texas.

On Wed, 26 Jan 2005 15:44:48 +1300, Tony Meyer [EMAIL PROTECTED] wrote:

> > time.timezone gives you, I think, the offset between your current timezone and GMT. However, being myself in the GMT zone, I don't know exactly if the returned offset is positive or negative (it returns 0 here, which makes sense :D ).
>
> Whether or not it's positive or negative depends on which side of GMT/UTC you are, of course :) Note that the result is in seconds, too:
>
> >>> import time
> >>> time.timezone
> -43200
> >>> time.timezone/60/60
> -12
>
> (I'm in NZ, 12 hours ahead of GMT/UTC).
>
> =Tony.Meyer
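The sign convention is easy to trip over: `time.timezone` is documented as the offset in seconds west of UTC, so it is negative for zones ahead of UTC (Tony's -43200 in New Zealand) and positive in the US. A small Python 3 sketch of the arithmetic follows; the helper name `hours_ahead_of_utc` is made up for illustration, and the New York figure assumes US Eastern standard time (UTC-5) with daylight saving ignored.

```python
import time

def hours_ahead_of_utc(seconds_west):
    """Convert a time.timezone-style value (seconds west of UTC) to hours ahead of UTC."""
    return -seconds_west / 3600

if __name__ == "__main__":
    print(hours_ahead_of_utc(-43200))    # Tony's New Zealand value
    print(hours_ahead_of_utc(5 * 3600))  # US Eastern standard time, assumed UTC-5
    # Whatever this machine is configured for (0 on a server set to GMT/UTC):
    print(hours_ahead_of_utc(time.timezone))
    if time.daylight:                    # a DST zone is defined locally
        print(hours_ahead_of_utc(time.altzone))
```

On a server whose clock is configured for GMT, as in Luis's case, the local values will all be zero, so the New York offset has to come from a fixed value or a timezone database rather than from `time.timezone`.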
[Tutor] Should this be a list comprehension or something?
The following Python code works correctly; but I can't help but wonder if my for loop is better implemented as something else: a list comprehension or something else more Pythonic.

My goal here is not efficiency of the code, but efficiency in my Python thinking; so I'll be thinking, for example, "ah, this should be a list comprehension" instead of a knee-jerk reaction to use a for loop.

Comments?

The point of the code is to take a sequence of objects, each object representing an amount of water with a given mass and temperature, and to return another object that represents all the water ideally combined. The formulae for the combined mass and temp are respectively:

 combined mass = M1 + M2 + M3 (duh)
 combined temp = ((M1*T1) + (M2*T2) + (M3*T3)) / (M1 + M2 + M3)

Here's my code:

class Water:
    def __init__(self, WaterMass, WaterTemperature):
        self.mass = WaterMass
        self.temperature = WaterTemperature
    def __repr__(self):
        return ("%.2f, %.2f" % (self.mass, self.temperature))

def CombineWater(WaterList):
    totalmass=0
    numerator = 0; denominator = 0
    for WaterObject in WaterList:
        totalmass += WaterObject.mass
        numerator += WaterObject.mass * WaterObject.temperature
    return Water(totalmass, numerator/totalmass)

Example use:

w1 = Water(50,0)
w2 = Water(50,100)
w3 = Water(25,50)

print CombineWater((w1,w2,w3))

prints, as expected: 125.00, 50.00
Re: [Tutor] ascii encoding
On Jan 26, 2005, at 02:56, Luis N wrote:

> In other words I have to do some arithmetic:
>
> >>> import time
> >>> time.timezone
> 0
>
> The server is located in Dallas, Texas.

Which means it's not properly configured. On UNIX systems, to configure the timezone, you must adjust /etc/localtime so that it's a symlink that points to the appropriate timezone in /usr/share/zoneinfo. The exact layout of the /usr/share/zoneinfo folder is probably implementation-specific, but for example, here's how it is on my Mac OS X box:

[EMAIL PROTECTED] ~]% ls -l /etc/localtime
lrwxr-xr-x 1 root wheel 33 25 Jan 18:58 /etc/localtime -> /usr/share/zoneinfo/Europe/London

-- Max
Re: [Tutor] Should this be a list comprehension or something?
On Jan 26, 2005, at 03:17, Terry Carroll wrote:

> My goal here is not efficiency of the code, but efficiency in my Python thinking; so I'll be thinking, for example, "ah, this should be a list comprehension" instead of a knee-jerk reaction to use a for loop.
>
> Comments?
>
> The point of the code is to take a sequence of objects, each object representing an amount of water with a given mass and temperature, and to return another object that represents all the water ideally combined. The formulae for the combined mass and temp are respectively:
>
>  combined mass = M1 + M2 + M3 (duh)
>  combined temp = ((M1*T1) + (M2*T2) + (M3*T3)) / (M1 + M2 + M3)
>
> Here's my code:
>
> class Water:
>     def __init__(self, WaterMass, WaterTemperature):
>         self.mass = WaterMass
>         self.temperature = WaterTemperature
>     def __repr__(self):
>         return ("%.2f, %.2f" % (self.mass, self.temperature))
>
> def CombineWater(WaterList):
>     totalmass=0
>     numerator = 0; denominator = 0
>     for WaterObject in WaterList:
>         totalmass += WaterObject.mass
>         numerator += WaterObject.mass * WaterObject.temperature
>     return Water(totalmass, numerator/totalmass)

Well, you can do this with list comprehensions, yeah:

    totalmass = sum([WaterObject.mass for WaterObject in WaterList])
    totaltemp = sum([WaterObject.mass * WaterObject.temperature for WaterObject in WaterList]) / totalmass
    return Water(totalmass, totaltemp)

Doesn't seem that much more Pythonic to me. I find it about as readable as your code, but someone who isn't used to list comprehensions will find that weird-looking. However, someone who uses functional programming languages a lot (Lisp, Scheme, Haskell, ML...) will be familiar with that.

The actual pros of that method are that it's a functional approach and that it has fewer lines than your approach (you can even reduce it to a one-liner by adding a third list comprehension, but at that point it starts to look ugly).

As for the cons, as I said, it may seem less readable than the original version to the non-experienced; and chances are it's slower than the original version since it has to iterate through 4 lists instead of 2.

In any case, when in doubt, do what you think will be easier to maintain.

-- Max
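As a middle ground between the explicit loop and the list comprehensions, the same sums can be written with generator expressions, which avoid building intermediate lists. This is a Python 3 sketch under a few assumptions of its own: the names are lower-cased (`combine_water`, `waters`), and the input is copied into a list first so that it can be traversed twice even if the caller passes an iterator.

```python
class Water:
    def __init__(self, mass, temperature):
        self.mass = mass
        self.temperature = temperature

    def __repr__(self):
        return "%.2f, %.2f" % (self.mass, self.temperature)

def combine_water(waters):
    """Return a Water holding the total mass at the mass-weighted mean temperature."""
    waters = list(waters)  # allows two traversals, even for a one-shot iterator
    total_mass = sum(w.mass for w in waters)
    weighted_temp = sum(w.mass * w.temperature for w in waters)
    return Water(total_mass, weighted_temp / total_mass)

if __name__ == "__main__":
    print(combine_water([Water(50, 0), Water(50, 100), Water(25, 50)]))
```

Whether this reads better than the original for loop is a matter of taste; the loop makes a single pass over the data, while this version makes two.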
[Tutor] how to plot a graph
i'm now going to use matplotlib to plot a graph. i'm currently using python 2.3 (enthought edition) on win 2000/xp, and boa constructor for the GUI part.

i am using an MDIParentFrame. one of the child frames will be used for the table part, and another child frame will be used to show the graph. how am i going to do this? will i just import the child frame containing the tables, and then be able to get the data from the table and use it to plot a graph? how am i going to assign each input to the table to a variable? can you please show me some sample code to do this? i'm a little lost since i'm a bit new to python.

also, how am i going to assign to a variable anything that a user inputs to a wxTextCtrl?

any help would be greatly appreciated. thanks and more power