[Tutor] sorting a 2 gb file

2005-01-25 Thread Scott Melnyk
Hello!

I am wondering about the best way to handle sorting some data from
some of my results.

I have a file in the form shown at the end (please forgive any
wraparounds due to the width of the screen here; the lines starting
with ENS end with the e-12 or what have you on the same line.)

What I would like is to generate an output file of  any other
ENSE000...e-4 (or whathaveyou) lines that appear in more than one
place and for each of those the queries they appear related to.

So if the first line
ENSE1098330.2|ENSG0013573.6|ENST0350437.2 assembly=N...
etc appears as a result in any other query I would like it and the
queries it appears as a result to (including the score if possible).

The data set that the sample below is taken from is over 2.4 GB, so speed
and memory considerations come into play.  Are sets more effective than
lists for this?  To save space in the new file I really only need the name of
the result up to the | and the score at the end for each.
To simplify things, the score could be dropped, and I could check it
out as needed later.

As always all feedback is very appreciated. 

Thanks,
Scott

FILE:

This is the number 1  query tested.
Results for scoring against Query= hg17_chainMm5_chr17
range=chr1:2040-3330 5'pad=0 3'pad=0
 are: 

ENSE1098330.2|ENSG0013573.6|ENST0350437.2 assembly=N...72  1e-12
ENSE1160046.1|ENSG0013573.6|ENST0251758.3 assembly=N...72  1e-12
ENSE1404464.1|ENSG0013573.6|ENST0228264.4 assembly=N...72  1e-12
ENSE1160046.1|ENSG0013573.6|ENST0290818.3 assembly=N...72  1e-12
ENSE1343865.2|ENSG0013573.6|ENST0350437.2 assembly=N...46  8e-05
ENSE1160049.1|ENSG0013573.6|ENST0251758.3 assembly=N...46  8e-05
ENSE1343865.2|ENSG0013573.6|ENST0228264.4 assembly=N...46  8e-05
ENSE1160049.1|ENSG0013573.6|ENST0290818.3 assembly=N...46  8e-05

This is the number 2  query tested.
Results for scoring against Query= hg17_chainMm5_chr1
range=chr1:82719-95929 5'pad=0 3'pad=0
 are: 

ENSE1373792.1|ENSG0175182.4|ENST0310585.3 assembly=N...80  6e-14
ENSE1134144.2|ENSG0160013.2|ENST0307155.2 assembly=N...78  2e-13
ENSE1433065.1|ENSG0185480.2|ENST0358383.1 assembly=N...78  2e-13
ENSE1422761.1|ENSG0183160.2|ENST0360503.1 assembly=N...74  4e-12
ENSE1431410.1|ENSG0139631.6|ENST0308926.3 assembly=N...74  4e-12
ENSE1433065.1|ENSG0185480.2|ENST0358383.1 assembly=N...72  1e-11
ENSE1411753.1|ENSG0126882.4|ENST0358329.1 assembly=N...72  1e-11
ENSE1428167.1|ENSG0110497.4|ENST0314823.4 assembly=N...72  1e-11
ENSE1401130.1|ENSG0160828.5|ENST0359898.1 assembly=N...72  1e-11
ENSE1414900.1|ENSG0176920.4|ENST0356650.1 assembly=N...72  1e-11
ENSE1428167.1|ENSG0110497.4|ENST0314823.4 assembly=N...72  1e-11
ENSE1400942.1|ENSG0138670.5|ENST0356373.1 assembly=N...72  1e-11
ENSE1400116.1|ENSG0120907.6|ENST0356368.1 assembly=N...70  6e-11
ENSE1413546.1|ENSG0184209.6|ENST0344033.2 assembly=N...70  6e-11
ENSE1433572.1|ENSG0124243.5|ENST0355583.1 assembly=N...70  6e-11
ENSE1423154.1|ENSG0125875.4|ENST0354200.1 assembly=N...70  6e-11
ENSE1400109.1|ENSG0183785.3|ENST0339190.2 assembly=N...70  6e-11
ENSE1268950.4|ENSG0084112.4|ENST0303438.2 assembly=N...68  2e-10
ENSE1057279.1|ENSG0161270.6|ENST0292886.2 assembly=N...68  2e-10
ENSE1434317.1|ENSG0171453.2|ENST0304004.2 assembly=N...68  2e-10
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


[Tutor] Code review

2005-01-25 Thread Barnaby Scott
If anyone has the time to look through an entire script, I would be
very grateful for any comments, tips or suggestions on a wiki-engine script
I am working on.

http://www.waywood.co.uk/cgi-bin/monkeywiki.py (this will download rather
than execute)

It does work, but I have not been using Python very long, and am entirely
self-taught in computer programming of any sort, so I have huge doubts about
my 'style'. I am also aware that I probably don't 'think like a programmer'
(being, in fact a furniture maker!)

I did post a previous version of this about a year(?) ago, and received some
very welcome suggestions, but I have changed it quite a lot since then.
Also, please ignore the licensing stuff - I do intend to make the thing
available like this when I am more confident about it, and I am just getting
a bit ahead of myself: you guys are the first people who know it's there.

Many thanks



RE: [Tutor] sorting a 2 gb file

2005-01-25 Thread John Purser
I'll just Me Too on Alan's Advice.  I had a similar sized project only it
was binary data in an ISAM file instead of flat ASCII.  I tried several
pure python methods and all took forever.  Finally I used Python to
read-modify-input source data into a mysql database.  Then I pulled the data
out via python and wrote it to a new ISAM file.  The whole thing took longer
to code that way but boy it sure scaled MUCH better and was much quicker in
the end.

John Purser

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
Of Alan Gauld
Sent: Tuesday, January 25, 2005 05:09
To: Scott Melnyk; tutor@python.org
Subject: Re: [Tutor] sorting a 2 gb file

 My data set the below is taken from is over 2.4 gb so speed and
memory
 considerations come into play.

To be honest, if this were my problem, I'd probably dump all the data
into a database and use SQL to extract what I needed. That's a much
more effective tool for this kind of thing.

You can do it with Python, but I think we need more understanding
of the problem. For example what the various fields represent, how
much of a comparison (ie which fields, case sensitivity etc) leads
to equality etc.

Alan G.



Re: [Tutor] sorting a 2 gb file

2005-01-25 Thread Andrew D. Fant
Alan Gauld wrote:
My data set the below is taken from is over 2.4 gb so speed and
memory
considerations come into play.

To be honest, if this were my problem, I'd probably dump all the data
into a database and use SQL to extract what I needed. That's a much
more effective tool for this kind of thing.
You can do it with Python, but I think we need more understanding
of the problem. For example what the various fields represent, how
much of a comparison (ie which fields, case sensitivity etc) leads
to equality etc.

And if the idea of setting up a full-blown SQL server for the problem 
seems like a lot of work, you might try prototyping the sort and 
solutions with sqlite, and only migrate to (full-fledged RDBMS of your 
choice) if the prototype works as you want it to and sqlite seems too 
slow for your needs.
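
A minimal sketch of what such a sqlite prototype might look like, written for current Python 3 and its standard-library sqlite3 module. The table name hits, its columns, the function name duplicated_results and the query labels are all assumptions made up for illustration; the scores in the sample rows are placeholders, not values checked against the real file.

```python
import sqlite3

def duplicated_results(rows):
    """Return {result: [queries]} for results reported by more than one query.

    rows is an iterable of (query, result, score) tuples.
    """
    conn = sqlite3.connect(":memory:")  # use a file path for a large data set
    conn.execute("CREATE TABLE hits (query TEXT, result TEXT, score TEXT)")
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)
    cur = conn.execute(
        "SELECT DISTINCT result, query FROM hits "
        "WHERE result IN (SELECT result FROM hits GROUP BY result "
        "                 HAVING COUNT(DISTINCT query) > 1) "
        "ORDER BY result, query"
    )
    found = {}
    for result, query in cur:
        found.setdefault(result, []).append(query)
    conn.close()
    return found

# Hypothetical sample rows: (query label, result name up to the first |, score)
rows = [
    ("query1", "ENSE1098330.2", "1e-12"),
    ("query1", "ENSE1160046.1", "1e-12"),
    ("query2", "ENSE1098330.2", "6e-14"),
]
print(duplicated_results(rows))
```

With an on-disk database and an index on the result column, the grouping is done by sqlite rather than held in a Python dictionary, which is the main attraction for a file this size.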

Andy


RE: [Tutor] sorting a 2 gb file- i shrunk it and turned it around

2005-01-25 Thread Scott Melnyk
Thanks for the thoughts so far.  After posting I have been thinking
about how to pare down the file (much of the info in the big file was
not relevant to this question at hand).

After the first couple of responses I was even more motivated to
shrink the file so as not to have to set up a db. This test will be run only
now and later to verify with another test set, so the db setup seemed
like more work than it might be worth.

I was able to reduce my file down about 160 mb in size by paring out
every line not directly related to what I want by some simple regular
expressions and a couple tests for inclusion.

The format and what info is compared against what is different from my
original examples as I believe this is more clear.


my queries are named by the lines such as:
ENSE1387275.1|ENSG0187908.1|ENST0339871.1
ENSE is an exon   ENSG is the gene ENST is a transcript

They all have the above format; they differ in the numbers
following ENS[E, G or T].

Each query is for a different exon.  For background each gene has many
exons and there are different versions of which exons are in each gene
in this dataset.  These different collections are the transcripts ie
ENST0339871.1

in short a transcript is a version of a gene here
transcript 1 may be formed of  exons a,b and c 
transcript 2 may contain exons a,b,d 



the other lines (results) are of the format
hg17_chainMm5_chr7_random range=chr10:124355404-124355687 5'pad=...44  0.001
hg17_chainMm5_chr14 range=chr10:124355392-124355530 5'pad=0 3'pa...44  0.001

hg17_chainMm5_chr7_random range=chr10:124355404-124355687 is the
important part here from 5'pad on is not important at this point


What I am trying to do is now make a list of any of the results that
appear in more than one transcript

##
FILE SAMPLE:

This is the number 1  query tested.
Results for scoring against Query=
ENSE1387275.1|ENSG0187908.1|ENST0339871.1
 are: 

hg17_chainMm5_chr7_random range=chr10:124355404-124355687 5'pad=...44  0.001
hg17_chainMm5_chr14 range=chr10:124355392-124355530 5'pad=0 3'pa...44  0.001
hg17_chainMm5_chr7 range=chr10:124355391-124355690 5'pad=0 3'pad...44  0.001
hg17_chainMm5_chr6 range=chr10:124355389-124355690 5'pad=0 3'pad...44  0.001
hg17_chainMm5_chr7 range=chr10:124355388-124355687 5'pad=0 3'pad...44  0.001
hg17_chainMm5_chr7_random range=chr10:124355388-124355719 5'pad=...44  0.001



This is the number 3  query tested.
Results for scoring against Query=
ENSE1365999.1|ENSG0187908.1|ENST0339871.1
 are: 

hg17_chainMm5_chr14 range=chr10:124355392-124355530 5'pad=0 3'pa...60  2e-08
hg17_chainMm5_chr7 range=chr10:124355391-124355690 5'pad=0 3'pad...60  2e-08
hg17_chainMm5_chr6 range=chr10:124355389-124355690 5'pad=0 3'pad...60  2e-08
hg17_chainMm5_chr7 range=chr10:124355388-124355687 5'pad=0 3'pad...60  2e-08

##

I would like to generate a file that looks for any results (the
hg17_etc  line) that occur in more than one transcript (from the query
line ENSE1365999.1|ENSG0187908.1|ENST0339871.1)


so if  
hg17_chainMm5_chr7_random range=chr10:124355404-124355687 
 shows up again later in the file I want to know and want to record
where it is used more than once, otherwise I will ignore it.

I am thinking of another regular expression to capture the transcript id,
followed by something that captures each of the results, writes
to another file any time a result appears more than once, and ties the
transcript ids to them somehow.

Any suggestions?
I agree if I had more time and was going to be doing more of this the
DB is the way to go.
-As an aside I have not looked into sqlite, I am hoping to avoid the
db right now, I'd have to get the sys admin to give me permission to
install something again etc etc. Where as I am hoping to get this
together in a reasonably short script.

 However I will look at it later (it could be helpful for other things for me).

Thanks again to all,  
Scott
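
One possible shape for the short script described above, as a sketch only and written for current Python 3. It assumes that a line of the form ENSE...|ENSG...|ENST... names the current query, that result lines start with hg17_, and that the result key is the name plus the range= token. The names QUERY_RE, RESULT_RE and shared_results are made up here, and the second transcript id in the sample (ENST0000000.1) is a hypothetical value added so that something is actually shared.

```python
import re
from collections import defaultdict

# Query line: exon | gene | transcript (group 3 is the transcript id)
QUERY_RE = re.compile(r"^(ENSE[\d.]+)\|(ENSG[\d.]+)\|(ENST[\d.]+)")
# Result line: keep the name and the range= token, drop everything after
RESULT_RE = re.compile(r"^(hg17_\S+ range=\S+)")

def shared_results(lines):
    """Map each result seen under more than one transcript to those transcripts."""
    seen = defaultdict(set)
    transcript = None
    for line in lines:
        line = line.strip()
        query = QUERY_RE.match(line)
        if query:
            transcript = query.group(3)
            continue
        result = RESULT_RE.match(line)
        if result and transcript is not None:
            seen[result.group(1)].add(transcript)
    return {key: sorted(ids) for key, ids in seen.items() if len(ids) > 1}

sample = """Results for scoring against Query=
ENSE1387275.1|ENSG0187908.1|ENST0339871.1
 are:

hg17_chainMm5_chr14 range=chr10:124355392-124355530 5'pad=0 3'pa...44  0.001
hg17_chainMm5_chr7 range=chr10:124355391-124355690 5'pad=0 3'pad...44  0.001

Results for scoring against Query=
ENSE1365999.1|ENSG0187908.1|ENST0000000.1
 are:

hg17_chainMm5_chr14 range=chr10:124355392-124355530 5'pad=0 3'pa...60  2e-08
""".splitlines()

for key, transcripts in shared_results(sample).items():
    print(key, transcripts)
```

The dictionary holds one entry per distinct result, so whether this fits in memory depends on how many distinct results the pared-down file contains. Passing an open file object instead of the sample list keeps the input itself out of memory.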


[Tutor] Import Site Failed Resolution

2005-01-25 Thread jhomme
Hi,
For no reason I can think of, Python 2.4 on Windows 2000 is suddenly 
complaining about Import site failing. I am so new to python that I have no 
idea what this is all about. I have been running programs successfully until 
today. Now, no matter what program I run, this happens. Funny thing is, if I 
run Python from the Run dialog, I don't see this, but perhaps it already goes 
by before the Python window opens. I usually get a command prompt and do python 
file.py. In frustration, I uninstalled Python. How can I intelligently figure 
out what is going on here? I follow the directions Python gives about using -v, 
but I don't understand what all the other output is trying to tell me. I have 
not touched anything in any of the Python directories.

Thanks for any and all help.

Jim


[Tutor] Re: Import Site Failed Resolution

2005-01-25 Thread jhomme
Hi,
Here is all the information I could get from the display of the output from 
this error. How do I figure out what is going on and fix the problem? This is 
on a Windows 2000 machine.

graphic 910  C:\WINNT\system32\command.com
C:\PYTHON>python -v
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# c:\python24\lib\site.pyc matches c:\python24\lib\site.py
import site # precompiled from c:\python24\lib\site.pyc
import os # precompiled from os.pyc
'import site' failed; traceback:
Traceback (most recent call last):
  File "c:\python24\lib\site.py", line 61, in ?
    import os
  File "c:\python24\lib\os.py", line 4, in ?
    - all functions from posix, nt, os2, mac, or ce, e.g. unlink, stat, etc.
AttributeError: 'module' object has no attribute 'path'
# c:\python24\lib\warnings.pyc matches c:\python24\lib\warnings.py
import warnings # precompiled from c:\python24\lib\warnings.pyc
# c:\python24\lib\types.pyc matches c:\python24\lib\types.py
import types # precompiled from c:\python24\lib\types.pyc
# c:\python24\lib\linecache.pyc matches c:\python24\lib\linecache.py
import linecache # precompiled from c:\python24\lib\linecache.pyc
import os # precompiled from os.pyc
Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.


Thanks.

Jim



Re: [Tutor] Re: Import Site Failed Resolution

2005-01-25 Thread Kent Johnson
OK, getting closer. import site is failing because site imports os and that import is failing. The 
error message points to a line that should be part of the docstring at the beginning of os.py...strange.

Here are the first 22 lines from my os.py - the entire docstring. If your Python2.4\Lib\os.py 
doesn't look like this then you could try pasting in these lines instead, or maybe reinstalling to 
make sure nothing else is corrupted...

 Next line is start of os.py 
r"""OS routines for Mac, DOS, NT, or Posix depending on what system we're on.

This exports:
  - all functions from posix, nt, os2, mac, or ce, e.g. unlink, stat, etc.
  - os.path is one of the modules posixpath, ntpath, or macpath
  - os.name is 'posix', 'nt', 'os2', 'mac', 'ce' or 'riscos'
  - os.curdir is a string representing the current directory ('.' or ':')
  - os.pardir is a string representing the parent directory ('..' or '::')
  - os.sep is the (or a most common) pathname separator ('/' or ':' or '\\')
  - os.extsep is the extension separator ('.' or '/')
  - os.altsep is the alternate pathname separator (None or '/')
  - os.pathsep is the component separator used in $PATH etc
  - os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
  - os.defpath is the default search path for executables
  - os.devnull is the file path of the null device ('/dev/null', etc.)

Programs that import and use 'os' stand a better chance of being
portable between different platforms.  Of course, they must then
only use functions that are defined by all platforms (e.g., unlink
and opendir), and leave all pathname manipulation to os.path
(e.g., split and join).
"""

 End of snippet from os.py 
Kent



Re: [Tutor] sorting a 2 gb file

2005-01-25 Thread Danny Yoo


On Tue, 25 Jan 2005, Scott Melnyk wrote:


 I have an file in the form shown at the end (please forgive any
 wrapparounds due to the width of the screen here- the lines starting
 with ENS end with the e-12 or what have you on same line.)

 What I would like is to generate an output file of  any other
 ENSE000...e-4 (or whathaveyou) lines that appear in more than one
 place and for each of those the queries they appear related to.

Hi Scott,

One way to do this might be to do it in two passes across the file.

The first pass through the file can identify records that appear more than
once.  The second pass can take that knowledge, and then display those
records.

In pseudocode, this will look something like:

###
hints = identifyDuplicateRecords(filename)
displayDuplicateRecords(filename, hints)
###



 My data set the below is taken from is over 2.4 gb so speed and memory
 considerations come into play.

 Are sets more effective than lists for this?

Sets or dictionaries make the act of lookup of a key fairly cheap.  In
the two-pass approach, the first pass can use a dictionary to accumulate
the number of times a certain record's key has occurred.

Note that, because your file is so large, the dictionary probably
shouldn't accumulate the whole mass of information that we've seen so
far: instead, it's sufficient to record the information we need to
recognize a duplicate.
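
A rough sketch of that counting pass, written for current Python 3. It assumes the record key is the result name up to the first |, which is what Scott said he needs; extract_key and count_keys are hypothetical names, and collections.Counter stands in for the plain dictionary.

```python
from collections import Counter

def extract_key(record):
    # Assumption: the key is everything before the first "|"
    return record.split("|", 1)[0]

def count_keys(lines):
    """Count how often each result key occurs, storing only the keys."""
    counts = Counter()
    for line in lines:
        if line.startswith("ENSE"):  # skip headers and blank lines
            counts[extract_key(line)] += 1
    return counts

sample = [
    "This is the number 1  query tested.",
    "ENSE1098330.2|ENSG0013573.6|ENST0350437.2 assembly=N...72  1e-12",
    "ENSE1160046.1|ENSG0013573.6|ENST0251758.3 assembly=N...72  1e-12",
    "ENSE1404464.1|ENSG0013573.6|ENST0228264.4 assembly=N...72  1e-12",
    "ENSE1160046.1|ENSG0013573.6|ENST0290818.3 assembly=N...72  1e-12",
]
counts = count_keys(sample)
print([key for key, n in counts.items() if n > 1])
```

Note that this counts repeats inside a single query as well as across queries; telling the two apart would mean remembering which query each key was last seen in.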


If you have more questions, please feel free to ask!



Re: [Tutor] sorting a 2 gb file

2005-01-25 Thread Max Noel
On Jan 25, 2005, at 23:40, Danny Yoo wrote:
In pseudocode, this will look something like:
###
hints = identifyDuplicateRecords(filename)
displayDuplicateRecords(filename, hints)
###

My data set the below is taken from is over 2.4 gb so speed and memory
considerations come into play.
Are sets more effective than lists for this?
Sets or dictionaries make the act of lookup of a key fairly cheap.  In
the two-pass approach, the first pass can use a dictionary to accumulate
the number of times a certain record's key has occurred.

Note that, because your file is so large, the dictionary probably
shouldn't accumulate the whole mass of information that we've seen so
far: instead, it's sufficient to record the information we need to
recognize a duplicate.
	However, the first pass will consume a lot of memory. Considering the 
worst-case scenario where each record only appears once, you'll find 
yourself with the whole 2GB file loaded into memory.
	(or do you have a smarter way to do this?)

-- Max
maxnoel_fr at yahoo dot fr -- ICQ #85274019
Look at you hacker... A pathetic creature of meat and bone, panting 
and sweating as you run through my corridors... How can you challenge a 
perfect, immortal machine?



Re: [Tutor] ascii encoding

2005-01-25 Thread Luis N
Ok, urllib.quote worked just fine, and of course so did urllib.pathname2url.

I should have run a dir() on urllib. Those functions don't appear in
http://docs.python.org/lib/module-urllib.html

Now, how might one go about calculating the New York time off-set from
GMT? The server is in the U.S. but time.localtime() is giving me GMT.


Re: [Tutor] Read file line by line

2005-01-25 Thread Danny Yoo


On Tue, 25 Jan 2005, Gilbert Tsang wrote:

 Hey you Python coders out there:

 Being a Python newbie, I have this question while trying to write a
 script to process lines from a text file line-by-line:

 #!/usr/bin/python
 fd = open( "test.txt" )
 content = fd.readline()
 while (content != "" ):
     content.replace( "\n", "" )
     # process content
     content = fd.readline()

 1. Why is assignment-and-test in one line not allowed in Python?
 For example, while ((content = fd.readline()) != ""):


Hi Gilbert, welcome aboard!

Python's design is to make statements like assignment stand out in the
source code.  This is different from Perl, C, and several other languages,
but I think it's the right thing in Python's case.  By making it a
statement, we can visually scan by eye for assignments with ease.


There's nothing that really technically prevents us from doing an
assignment as an expression, but Python's language designer decided that
it encouraged a style of programming that made code harder to maintain.
By making it a statement, it removes the possibility of making a mistake
like:

###
if ((ch = getch()) = 'q') { ... }
###



There are workarounds that try to reintroduce assignment as an expression:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/202234

but we strongly recommend you don't use it.  *grin*




 2. I know Perl is different, but there's just no equivalent of while
 ($line = <A_FILE>) { } ?

Python's 'for' loop has built-in knowledge about iterable objects, and
that includes files.  Try using:

for line in file:
    ...

which should do the trick.
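
A small sketch of that loop applied to the original script, written for current Python 3. The helper name stripped_lines is made up, and io.StringIO stands in for the real test.txt. One thing it makes visible: strings are immutable, so content.replace("\n", "") returns a new string and has no effect unless the result is kept.

```python
import io

def stripped_lines(fd):
    """Yield each line of an open file without its trailing newline."""
    for line in fd:
        # rstrip returns a new string; the original line is left unchanged
        yield line.rstrip("\n")

fd = io.StringIO("first line\nsecond line\n")  # stand-in for open("test.txt")
for content in stripped_lines(fd):
    print(repr(content))
```

The loop ends by itself at end of file, so no comparison against the empty string is needed.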


Hope this helps!



Re: [Tutor] Read file line by line

2005-01-25 Thread Danny Yoo


 There's nothing that really technically prevents us from doing an
 assignment as an expression, but Python's language designer decided that
 it encouraged a style of programming that made code harder to maintain.
 By making it a statement, it removes the possibility of making a mistake
 like:

 ###
 if ((ch = getch()) = 'q') { ... }
 ###

hmmm.  This doesn't compile.  Never mind, I screwed up.  *grin*


But the Python FAQ does have an entry about this topic, if you're
interested:

http://python.org/doc/faq/general.html#why-can-t-i-use-an-assignment-in-an-expression



Re: [Tutor] sorting a 2 gb file

2005-01-25 Thread Danny Yoo


On Tue, 25 Jan 2005, Max Noel wrote:

  My data set the below is taken from is over 2.4 gb so speed and
  memory considerations come into play.
 
  Are sets more effective than lists for this?
 
  Sets or dictionaries make the act of lookup of a key fairly cheap.
  In the two-pass approach, the first pass can use a dictionary to
  accumulate the number of times a certain record's key has occurred.
 
  Note that, because your file is so large, the dictionary probably
  shouldn't accumulate the whole mass of information that we've seen
  so far: instead, it's sufficient to record the information we need to
  recognize a duplicate.

   However, the first pass will consume a lot of memory. Considering
 the worst-case scenario where each record only appears once, you'll find
 yourself with the whole 2GB file loaded into memory.
   (or do you have a smarter way to do this?)


Hi Max,

My assumptions are that each record consists of some identifying string
key that's associated to some value.  How are we deciding that two
records are talking about the same thing?


I'm hoping that the set of unique keys isn't itself very large.  Under
this assumption, we can do something like this:

###
from sets import Set
def firstPass(f):
    """Returns a set of the duplicate keys in f."""
    seenKeys = Set()
    duplicateKeys = Set()
    for record in f:
        key = extractKey(record)
        if key in seenKeys:
            duplicateKeys.add(key)
        else:
            seenKeys.add(key)
    return duplicateKeys
###
###

where we don't store the whole record into memory, but only the 'key'
portion of the record.

And if the number of unique keys is small enough, this should be fine
enough to recognize duplicate records.  So on the second passthrough, we
can display the duplicate records on-the-fly.  If this assumption is not
true, then we need to do something else.  *grin*
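
The second pass is only described in words above. A sketch of both passes together, written for current Python 3 with the built-in set rather than the sets module, could look like this. The names extract_key, first_pass and second_pass are hypothetical, the key is assumed to be the text before the first |, and io.StringIO stands in for the real file.

```python
import io

def extract_key(record):
    # Assumption: the key is everything before the first "|"
    return record.split("|", 1)[0]

def first_pass(f):
    """Return the set of keys that occur more than once in f."""
    seen, duplicates = set(), set()
    for record in f:
        key = extract_key(record)
        if key in seen:
            duplicates.add(key)
        else:
            seen.add(key)
    return duplicates

def second_pass(f, duplicates):
    """Yield, in file order, every record whose key is a known duplicate."""
    for record in f:
        if extract_key(record) in duplicates:
            yield record.rstrip("\n")

f = io.StringIO("a|1\nb|2\na|3\n")
duplicates = first_pass(f)
f.seek(0)  # rewind before reading the file a second time
print(list(second_pass(f, duplicates)))  # ['a|1', 'a|3']
```

Only the keys are held in memory; the records themselves are streamed twice from disk.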

One possibility might be to implement an external sorting mechanism:

http://www.nist.gov/dads/HTML/externalsort.html


But if we're willing to do an external sort, then we're already doing
enough work that we should really consider using a DBMS.  The more
complicated the data management becomes, the more attractive it becomes to
use a real database to handle these data management issues.  We're trying
to solve a problem that is already solved by a real database management
system.


Talk to you later!



Re: [Tutor] ascii encoding

2005-01-25 Thread Max Noel
On Jan 26, 2005, at 00:50, Luis N wrote:
Ok, urllib.quote worked just fine, and of course so did 
urllib.pathname2url.

I should have run a dir() on urllib. Those functions don't appear in
http://docs.python.org/lib/module-urllib.html
Now, how might one go about calculating the New York time off-set from
GMT? The server is in the U.S. but time.localtime() is giving me GMT.
	time.timezone gives you, I think, the offset between your current 
timezone and GMT. However, being myself in the GMT zone, I don't know 
exactly if the returned offset is positive or negative (it returns 0 
here, which makes sense :D ).
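
On the sign question: the time module documentation describes time.timezone as the offset of the local non-DST timezone in seconds west of UTC, so it is positive in the US and negative east of Greenwich. A small sketch of the arithmetic, with utc_offset_hours as a made-up helper name:

```python
import time

def utc_offset_hours(seconds_west):
    """Convert a time.timezone-style value (seconds west of UTC)
    into hours ahead of UTC (negative means behind UTC)."""
    return -seconds_west / 3600

# New York standard time is 5 hours behind GMT, i.e. 18000 seconds west
print(utc_offset_hours(18000))          # -5.0
# Whatever this machine is configured for (0 on a server set to GMT)
print(utc_offset_hours(time.timezone))
```

time.timezone ignores daylight saving time; time.altzone is the corresponding value while DST is in effect.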

-- Max
maxnoel_fr at yahoo dot fr -- ICQ #85274019
Look at you hacker... A pathetic creature of meat and bone, panting 
and sweating as you run through my corridors... How can you challenge a 
perfect, immortal machine?



Re: [Tutor] ascii encoding

2005-01-25 Thread Luis N
In other words I have to do some arithmetic:

>>> import time
>>> time.timezone
0

The server is located in Dallas, Texas.


On Wed, 26 Jan 2005 15:44:48 +1300, Tony Meyer [EMAIL PROTECTED] wrote:
  time.timezone gives you, I think, the offset between
  your current timezone and GMT. However, being myself in the GMT zone,
  I don't know exactly if the returned offset is positive or negative
  (it returns 0 here, which makes sense :D ).
 
 Whether or not it's positive or negative depends on which side of GMT/UTC
 you are, of course :)  Note that the result is in seconds, too:
 
 >>> import time
 >>> time.timezone
 -43200
 >>> time.timezone/60/60
 -12
 
 (I'm in NZ, 12 hours ahead of GMT/UTC).
 
 =Tony.Meyer
 



[Tutor] Should this be a list comprehension or something?

2005-01-25 Thread Terry Carroll
The following Python code works correctly; but I can't help but wonder if 
my for loop is better implemented as something else: a list comprehension 
or something else more Pythonic.

My goal here is not efficiency of the code, but efficiency in my Python 
thinking; so I'll be thinking, for example, "ah, this should be a list 
comprehension" instead of a knee-jerk reaction to use a for loop.

Comments?

The point of the code is to take a sequence of objects, each object 
representing an amount of water with a given mass and temperature, and to 
return another object that represents all the water ideally combined.  The 
formulae for the combined mass and temp are respectively:

 combined mass = M1 + M2 + M3  (duh)
 combined temp = ((M1*T1) + (M2*T2) + (M3*T3)) / (M1 + M2 + M3)

Here's my code:

class Water:
    def __init__(self, WaterMass, WaterTemperature):
        self.mass = WaterMass
        self.temperature = WaterTemperature
    def __repr__(self):
        return ("%.2f, %.2f" % (self.mass, self.temperature))

def CombineWater(WaterList):
    totalmass=0
    numerator = 0; denominator = 0
    for WaterObject in WaterList:
        totalmass += WaterObject.mass
        numerator += WaterObject.mass * WaterObject.temperature
    return Water(totalmass, numerator/totalmass)


Example use:


w1 = Water(50,0)
w2 = Water(50,100)
w3 = Water(25,50)

print CombineWater((w1,w2,w3))


prints, as expected: 125.00, 50.00





Re: [Tutor] ascii encoding

2005-01-25 Thread Max Noel
On Jan 26, 2005, at 02:56, Luis N wrote:
In other words I have to do some arithmetic:
>>> import time
>>> time.timezone
0
The server is located in Dallas, Texas.
	Which means it's not properly configured. On UNIX systems, to 
configure the timezone, you must adjust /etc/localtime so that it's a 
symlink that points to the appropriate timezone in /usr/share/zoneinfo.
	The exact layout of the /usr/share/zoneinfo folder is probably 
implementation-specific, but for example, here's how it is on my Mac OS 
X box:

[EMAIL PROTECTED] ~]% ls -l /etc/localtime
lrwxr-xr-x  1 root  wheel  33 25 Jan 18:58 /etc/localtime -> /usr/share/zoneinfo/Europe/London
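
If changing /etc/localtime is not an option (no admin rights on the server, say), a per-process alternative on Unix is to set the TZ environment variable and call time.tzset(). This is a sketch under assumptions, not a description of how that server is set up: the helper name local_offset_seconds is made up, and the POSIX-style TZ string for US Central time is an example value.

```python
import os
import time

def local_offset_seconds(tz):
    """Set TZ for this process only and return the new time.timezone.

    time.tzset() exists only on Unix, so the call is guarded.
    """
    os.environ["TZ"] = tz
    if hasattr(time, "tzset"):
        time.tzset()
    return time.timezone

# Example value: US Central time (Dallas) written as a POSIX TZ rule
offset = local_offset_seconds("CST6CDT,M3.2.0,M11.1.0")
print(offset / 3600)  # hours west of UTC, 6.0 where tzset is available
```

This affects only the running Python process; other programs on the machine still see the system setting.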

-- Max
maxnoel_fr at yahoo dot fr -- ICQ #85274019
Look at you hacker... A pathetic creature of meat and bone, panting 
and sweating as you run through my corridors... How can you challenge a 
perfect, immortal machine?



Re: [Tutor] Should this be a list comprehension or something?

2005-01-25 Thread Max Noel
On Jan 26, 2005, at 03:17, Terry Carroll wrote:
My goal here is not efficiency of the code, but efficiency in my Python
thinking; so I'll be thinking, for example, ah, this should be a list
comprehension instead of a knee-jerk reaction to use a for loop.
Comments?
The point of the code is to take a sequence of objects, each object
representing an amount of water with a given mass and temperature, and to
return another object that represents all the water ideally combined.  The
formulae for the combined mass and temp are respectively:

 combined mass = M1 + M2 + M3  (duh)
 combined temp = ((M1*T1) + (M2*T2) + (M3*T3)) / (M1 + M2 + M3)
Here's my code:

class Water:
    def __init__(self, WaterMass, WaterTemperature):
        self.mass = WaterMass
        self.temperature = WaterTemperature
    def __repr__(self):
        return ("%.2f, %.2f" % (self.mass, self.temperature))

def CombineWater(WaterList):
    totalmass=0
    numerator = 0; denominator = 0
    for WaterObject in WaterList:
        totalmass += WaterObject.mass
        numerator += WaterObject.mass * WaterObject.temperature
    return Water(totalmass, numerator/totalmass)

Well, you can do this with list comprehensions, yeah:

totalmass = sum([WaterObject.mass for WaterObject in WaterList])
totaltemp = sum([WaterObject.mass * WaterObject.temperature for WaterObject in WaterList]) / totalmass
return Water(totalmass, totaltemp)

	Doesn't seem that much more Pythonic to me. I find it about as 
readable as your code, but someone who isn't used to list 
comprehensions will find that weird-looking. However, someone who uses 
functional programming languages a lot (Lisp, Scheme, Haskell, ML...) 
will be familiar with that.

	The actual pros of that method are that it's a functional approach and 
that it has fewer lines than your approach (you can even reduce it to a 
one-liner by adding a third list comprehension, but at that point it 
starts to look ugly).
	As for the cons, as I said, it may seem less readable than the 
original version to the inexperienced; and chances are it's slower 
than the original version, since it walks WaterList twice and builds two 
temporary lists where the original loop makes a single pass.

	In any case, when in doubt, do what you think will be easier to 
maintain.
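
For comparison, here is a self-contained sketch of the same idea using generator expressions, which avoid building the temporary lists. It is written for current Python 3; the lower-case names combine_water, mass and temperature are a restyling chosen for the example, not Terry's original identifiers.

```python
class Water:
    def __init__(self, mass, temperature):
        self.mass = mass
        self.temperature = temperature

    def __repr__(self):
        return "%.2f, %.2f" % (self.mass, self.temperature)

def combine_water(waters):
    """Combine several Water objects into one, mass-weighting the temperature."""
    waters = list(waters)  # allow any iterable, since it is walked twice
    total_mass = sum(w.mass for w in waters)
    weighted = sum(w.mass * w.temperature for w in waters)
    return Water(total_mass, weighted / total_mass)

# (50*0 + 50*100 + 25*50) / (50 + 50 + 25) = 6250 / 125 = 50
print(combine_water([Water(50, 0), Water(50, 100), Water(25, 50)]))  # 125.00, 50.00
```

An empty sequence would divide by zero here, just as in the original loop.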

-- Max
maxnoel_fr at yahoo dot fr -- ICQ #85274019
Look at you hacker... A pathetic creature of meat and bone, panting 
and sweating as you run through my corridors... How can you challenge a 
perfect, immortal machine?



[Tutor] how to plot a graph

2005-01-25 Thread jrlen balane
i'm going to use now the matplotlib in plotting a graph.

i'm currently using python 2.3(enthought edition) on win 2000/xp.
i'm using boa constructor on the GUI part.
 
i am using an MDIParentFrame. one of the child frame will be used for
the table part. then another child frame will be used to show the
graph, how am i going to do this? will i just import the child frame
containing the tables and then i'll be able to just get the data from
the table and use it to plot a graph?
how am i going to assign to a variable each input to the table? 
can you please show me a sample code to do this?
i'm a little lost since i'm a bit new to python.

also, how am i going to assign to a variable anything that a user
inputs to a wxTextCtrl?

any help would greatly be appreciated. thanks and more power