performance of script to write very long lines of random chars

2013-04-10 Thread gry
Dear pythonistas,
   I am writing a tiny utility to produce a file consisting of a
specified number of lines of a given length of random ascii
characters.  I am hoping to find a more time- and memory-efficient way
that is still fairly simple, clear, and _pythonic_.

I would like to have something that I can use at both extremes of
data:

   32M chars per line * 100 lines
or
   5 chars per line * 1e8 lines.

E.g., the output of bigrand.py for 10 characters, 2 lines might be:

gw2+M/5t.
S[[db/l?Vx

I'm using python 2.7.0 on linux.  I need to use only out-of-the box
modules, since this has to work on a bunch of different computers.
At this point I'm especially concerned with the case of a few very
long lines, since that seems to use a lot of memory, and take a long
time.
Characters are a slight subset of the printable ascii's, specified in
the examples below.  My first naive try was:

from sys import stdout
import random
nchars = 3200
rows = 10
avail_chrs = ('0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
              '!#$%\'()*+,-./:;=?@[\\]^_`{}')

def make_varchar(nchars):
    return ''.join([random.choice(avail_chrs) for i in range(nchars)])

for l in range(rows):
    stdout.write(make_varchar(nchars))
    stdout.write('\n')

This version used around 1.2GB resident/1.2GB virtual of memory for
3min 38sec.


My second try uses much less RAM, but more CPU time, and seems rather,
umm, un-pythonic (the array module always seems a little
un-pythonic...)

from sys import stdout
from array import array
import random
nchars = 3200
rows = 10
avail_chrs = ('0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
              '!#$%\'()*+,-./:;=?@[\\]^_`{}')
a = array('c', 'X' * nchars)

for l in range(rows):
    for i in xrange(nchars):
        a[i] = random.choice(avail_chrs)
    a.tofile(stdout)
    stdout.write('\n')

This version using array took 4 min 29 sec, using 34MB resident/110MB
virtual. So, it needs much less memory than the first attempt, but is a
bit slower.  Can someone suggest better code?  And help me understand
the performance issues here?
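
One way to keep memory bounded for the few-very-long-lines case is to build each line in fixed-size chunks instead of all at once. The following is only a sketch, written for Python 3.6+ (where random.choices exists) rather than the 2.7 used above; the name write_random_lines and the chunk size are illustrative assumptions, and no timings are claimed for it:

```python
import random
import sys

AVAIL_CHRS = ('0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
              '!#$%\'()*+,-./:;=?@[\\]^_`{}')

def write_random_lines(out, nchars, rows, chunk=65536, rng=random):
    """Write `rows` lines of `nchars` random characters each to `out`.

    Each line is built at most `chunk` characters at a time.
    """
    for _ in range(rows):
        remaining = nchars
        while remaining > 0:
            n = min(chunk, remaining)
            # choices() draws n characters (with replacement) in one call
            out.write(''.join(rng.choices(AVAIL_CHRS, k=n)))
            remaining -= n
        out.write('\n')

if __name__ == '__main__':
    write_random_lines(sys.stdout, nchars=10, rows=2)
```

The same function covers both extremes (few long lines, many short lines); only the chunk size would need tuning.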

-- George
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: performance of script to write very long lines of random chars

2013-04-10 Thread gry
On Apr 10, 9:52 pm, Michael Torrie torr...@gmail.com wrote:
 On 04/10/2013 07:21 PM, gry wrote:

  from sys import stdout
  from array import array
  import random
  nchars = 3200
  rows = 10
  avail_chrs = ('0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
                '!#$%\'()*+,-./:;=?@[\\]^_`{}')
  a = array('c', 'X' * nchars)

  for l in range(rows):
      for i in xrange(nchars):
          a[i] = random.choice(avail_chrs)
      a.tofile(stdout)
      stdout.write('\n')

  This version using array took 4 min, 29 sec, using 34MB resident/110
  virtual. So, much smaller than the first attempt, but a bit slower.
  Can someone suggest a better code?  And help me understand the
  performance issues here?

 Why are you using an array?  Why not just rely on the OS to buffer the
 output?  Just write your characters straight to stdout instead of
 placing them in an array.

 At that point I believe this program will be as fast as is possible in
 Python.

Appealing idea, but it's slower than the array solution: 5 min 13 sec
vs. 4 min 30 sec for the array:

for l in range(rows):
    for i in xrange(nchars):
        stdout.write(random.choice(avail_chrs))
    stdout.write('\n')


os.urandom does look promising -- I have to have full control over the
charset, but urandom is very fast at generating big random strings...
stay tuned...
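
A rough sketch of how os.urandom output could be mapped onto a restricted character set (Python 3; the function name urandom_chars is made up for illustration). Bytes that would bias the result are discarded, so every character stays equally likely:

```python
import os

def urandom_chars(n, charset):
    """Return n characters drawn uniformly from charset using os.urandom."""
    size = len(charset)
    if not 0 < size <= 256:
        raise ValueError('charset must have between 1 and 256 characters')
    # Bytes >= limit are rejected so that `b % size` is unbiased.
    limit = 256 - (256 % size)
    out = []
    while len(out) < n:
        need = n - len(out)
        raw = os.urandom(need * 2)  # over-ask a little to cover rejections
        out.extend(charset[b % size] for b in raw if b < limit)
    return ''.join(out[:n])
```

Whether this beats random.choice for the line lengths in question would have to be measured.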


tiny script has memory leak

2012-05-14 Thread gry
sys.version --> '2.6 (r26:66714, Feb 21 2009, 02:16:04) \n[GCC 4.3.2
[gcc-4_3-branch revision 141291]]'
I thought this script would be very lean and fast, but with a large
value for n (like 15), it uses 26G of virtual memory, and things
start to crumble.

#!/usr/bin/env python
'''write a file of random integers.  args are: file-name how-many'''
import sys, random

f = open(sys.argv[1], 'w')
n = int(sys.argv[2])
for i in xrange(n):
    print >>f, random.randint(0, sys.maxint)
f.close()

What's using so much memory?
What would be a better way to do this?  (aside from checking arg
values and types, I know...)


Re: Python: Deleting specific words from a file.

2011-09-12 Thread gry
On Sep 9, 2:04 am, Terry Reedy tjre...@udel.edu wrote:
 On 9/8/2011 9:09 PM, papu wrote:



  Hello, I have a data file (un-structed messy file) from which I have
  to scrub specific list of words (delete words).

  Here is what I am doing but with no result:

  infile = "messy_data_file.txt"
  outfile = "cleaned_file.txt"

  delete_list = ["word_1","word_2","word_n"]
  new_file = []
  fin=open(infile,"")
  fout = open(outfile,"w+")
  for line in fin:
       for word in delete_list:
           line.replace(word, "")
       fout.write(line)
  fin.close()
  fout.close()

 If you have very many words (and you will need all possible forms of
 each word if you do exact matches), the following (untested and
 incomplete) should run faster.

 delete_set = {"word_1","word_2","word_n"}
 ...
 for line in fin:
      for word in line.split():
          if word not in delete_set:
              fout.write(word) # also write space and nl.

 Depending on what your file is like, you might be better with
 re.split('(\W+)', line). An example from the manual:
   >>> re.split('(\W+)', '...words, words...')
   ['', '...', 'words', ', ', 'words', '...', '']

 so all non-word separator sequences are preserved and written back out
 (as they will not match delete set).

 --
 Terry Jan Reedy

re.sub is handy too:

import re
delete_list = ('the', 'rain', 'in', 'spain')
# \b and the (?:...) group make the alternation match whole words only
regex = re.compile(r'\b(?:' + '|'.join(delete_list) + r')\b')
infile = 'messy'
with open(infile, 'r') as f:
    for l in f:
        print regex.sub('', l),
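
The re.split idea quoted above can be sketched as follows (Python 3; scrub_line is a hypothetical helper name). Because the separator pattern is in a capturing group, the non-word runs come back in the result and are written out unchanged:

```python
import re

def scrub_line(line, delete_set):
    """Drop every word found in delete_set, keeping separators as-is."""
    pieces = re.split(r'(\W+)', line)  # words and separators alternate
    return ''.join(p for p in pieces if p not in delete_set)
```

Note that the spaces around a deleted word are both kept, so removed words leave a double space behind.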


replace random matches of regexp

2011-09-08 Thread gry
[Python 2.7]
I have a body of text (~1MB) that I need to modify.   I need to look
for matches of a regular expression and replace a random selection of
those matches with a new string.  There may be several matches on any
line, and a random selection of them should be replaced.  The
probability of replacement should be adjustable.  Performance is not
an issue.  E.g: if I have:

SELECT max(PUBLIC.TT.I) AS SEL_0 FROM (SCHM.T RIGHT OUTER JOIN
PUBLIC.TT ON (SCHM.T.I IS NULL)) WHERE (NOT(NOT((power(PUBLIC.TT.F,
PUBLIC.TT.F) = cast(ceil(( SELECT 22 AS SEL_0FROM
(PUBLIC.TT AS PUBLIC_TT_0 JOIN PUBLIC.TT AS PUBLIC_TT_1 ON (ceil(0.46)
=sin(PUBLIC_TT_1.F)))WHERE ((zeroifnull(PUBLIC_TT_0.I) =
sqrt((0.02 + PUBLIC_TT_1.F))) OR

I might want to replace '(max|min|cos|sqrt|ceil)' with 'public.\1', but
only with probability 0.7.  I looked and looked for some computed
thing in re's that I could stick an expression into, but could not find
one (for good reasons, I know).
Any ideas how to do this?  I would go for simple, even if it's wildly
inefficient, though elegance is always admired...
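
One possible approach, sketched in Python 3: re.sub accepts a function as the replacement, and that function can decide per match whether to substitute. The helper name replace_some and its rng parameter are assumptions made for this illustration:

```python
import random
import re

def replace_some(pattern, repl, text, prob, rng=random):
    """Replace each match of pattern with repl with probability prob."""
    regex = re.compile(pattern)

    def maybe(match):
        if rng.random() < prob:
            return match.expand(repl)  # expands \1 etc. for this match
        return match.group(0)          # leave the match untouched

    return regex.sub(maybe, text)
```

For the case above this would be called as replace_some(r'\b(max|min|cos|sqrt|ceil)\b', r'public.\1', sql, 0.7).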


Re: replace random matches of regexp

2011-09-08 Thread gry
On Sep 8, 3:51 pm, gry georgeryo...@gmail.com wrote:
To elaborate (always give an example of desired output...), I would hope
to get something like:

SELECT public.max(PUBLIC.TT.I) AS SEL_0 FROM (SCHM.T RIGHT OUTER JOIN
PUBLIC.TT ON (SCHM.T.I IS NULL)) WHERE (NOT(NOT((power(PUBLIC.TT.F,
PUBLIC.TT.F) = cast(ceil(( SELECT 22 AS SEL_0FROM
(PUBLIC.TT AS PUBLIC_TT_0 JOIN PUBLIC.TT AS PUBLIC_TT_1 ON
(public.ceil(0.46)
=public.sin(PUBLIC_TT_1.F)))WHERE ((zeroifnull(PUBLIC_TT_0.I)
=
public.sqrt((0.02 + PUBLIC_TT_1.F))) OR

notice the 'ceil' on the third line did not get changed.


list comprehension to do os.path.split_all ?

2011-07-28 Thread gry
[python 2.7] I have a (linux) pathname that I'd like to split
completely into a list of components, e.g.:
   '/home/gyoung/hacks/pathhack/foo.py'  -->  ['home', 'gyoung',
'hacks', 'pathhack', 'foo.py']

os.path.split gives me a tuple of dirname,basename, but there's no
os.path.split_all function.

I expect I can do this with some simple loop, but I have such faith in
the wonderfulness of list comprehensions, that it seems like there
should be a way to use them for an elegant solution of my problem.
I can't quite work it out.  Any brilliant ideas?   (or other elegant
solutions to the problem?)
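
For simple paths like the one above, a list comprehension over str.split may be enough. A small sketch (Python 3; split_all is a made-up name, and it makes no attempt to handle drive letters or other platform quirks):

```python
import os

def split_all(path, sep=os.sep):
    """Split path into its non-empty components."""
    # Empty strings come from the leading, trailing, or doubled separators.
    return [part for part in path.split(sep) if part]
```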

-- George


[issue12523] 'str' object has no attribute 'more' [/usr/lib/python3.2/asynchat.py|initiate_send|245]

2011-07-09 Thread Gry

New submission from Gry gryll...@gmail.com:

Asynchat push() function has a bug which prevents it from functioning.

This code worked fine with Python 2.

---
# https://github.com/jstoker/BasicBot
import asynchat,asyncore,socket
class asynchat_bot(asynchat.async_chat):
    def __init__(self, host, port):
        asynchat.async_chat.__init__(self)
        self.create_socket(socket.AF_INET,socket.SOCK_STREAM)
        self.set_terminator('\r\n')
        self.data=''
        self.remote=(host,port)
        self.connect(self.remote)

    def handle_connect(self):
        self.push('USER BasicBot 8 %s :BasicBot! '
                  'http://github.com/jstoker/BasicBot\r\nNICK testbot\r\n'
                  % self.remote[0])

    def get_data(self):
        r=self.data
        self.data=''
        return r
    def collect_incoming_data(self, data):
        self.data+=data
    def found_terminator(self):
        data=self.get_data()
        if data[:4] == 'PING':
            self.push('PONG %s' % data[5:]+'\r\n')
        if '001' in data:
            self.push('JOIN #bots\r\n')
        if '~hi' in data:
            self.push('PRIVMSG #bots :hi.\r\n')
if __name__ == '__main__':
    asynchat_bot('127.0.0.1',16667)
    asyncore.loop()
---


In Python 3 however, the exception follows:


---
~/tests/BasicBot$ python3 asynchat_bot.py
error: uncaptured python exception, closing channel __main__.asynchat_bot 
connected at 0xb70078ac (class 'AttributeError':'str' object has no 
attribute 'more' [/usr/lib/python3.2/asyncore.py|write|89] 
[/usr/lib/python3.2/asyncore.py|handle_write_event|462] 
[/usr/lib/python3.2/asynchat.py|handle_write|194] 
[/usr/lib/python3.2/asynchat.py|initiate_send|245])
~/tests/BasicBot$ python3 -V
Python 3.2
~/tests/BasicBot$
---

A comment from Stackoverflow on why it happens:

---
The error seems to be raised in 
/usr/lib/python3.2/asynchat.py|initiate_send|245.

def initiate_send(self):
    while self.producer_fifo and self.connected:
        first = self.producer_fifo[0]
        ...
        try:
            data = buffer(first, 0, obs)
        except TypeError:
            data = first.more()  # <--- here

Seems like somebody put a string in self.producer_fifo instead of an
asynchat.simple_producer, which is the only class in async*.py with a more()
method.

--
components: None
messages: 140073
nosy: Gry
priority: normal
severity: normal
status: open
title: 'str' object has no attribute 'more' 
[/usr/lib/python3.2/asynchat.py|initiate_send|245]
type: crash
versions: Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12523
___



Re: CPython on the Web

2011-01-04 Thread gry
On Jan 4, 1:11 am, John Nagle na...@animats.com wrote:
 On 1/1/2011 11:26 PM, azakai wrote:

  Hello, I hope this will be interesting to people here: CPython running
  on the web,

 http://syntensity.com/static/python.html

  That isn't a new implementation of Python, but rather CPython 2.7.1,
  compiled from C to JavaScript using Emscripten and LLVM. For more
  details on the conversion process, see http://emscripten.org

On loading, I get "script stack space quota is exhausted" under
Firefox 3.5.12, under Linux.
Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.12) Gecko/20100907
Fedora/3.5.12-1.fc12 Firefox/3.5.12


performance of tight loop

2010-12-13 Thread gry
[python-2.4.3, rh CentOS release 5.5 linux, 24 xeon cpu's, 24GB ram]
I have a little data generator that I'd like to go faster... any
suggestions?
maxint is usually 9223372036854775808 (max 64-bit int), but could
occasionally be 99.
width is usually 500 or 1600, rows ~ 5000.

from random import randint

def row(i, wd, mx):
    first = ['%d' % i]
    rest =  ['%d' % randint(1, mx) for i in range(wd - 1)]
    return first + rest
...
while True:
    print "copy %s from stdin direct delimiter ',';" % table_name
    for i in range(i,i+rows):
        print ','.join(row(i, width, maxint))
    print '\.'



regex help: splitting string gets weird groups

2010-04-08 Thread gry
[ python3.1.1, re.__version__='2.2.1' ]
I'm trying to use re to split a string into (any number of) pieces of
these kinds:
1) contiguous runs of letters
2) contiguous runs of digits
3) single other characters

e.g.   '555tHe-rain.in#=1234'   should give:   ['555', 'tHe', '-', 'rain',
'.', 'in', '#', '=', '1234']
I tried:
>>> re.match('^(([A-Za-z]+)|([0-9]+)|([-.#=]))+$',
...          '555tHe-rain.in#=1234').groups()
('1234', 'in', '1234', '=')

Why is '1234' repeated in two groups?  And why doesn't 'tHe' appear as a
group?  Is my regexp illegal somehow and confusing the engine?

I *would* like to understand what's wrong with this regex, though if
someone has a neat other way to do the above task, I'm also interested
in suggestions.


Re: regex help: splitting string gets weird groups

2010-04-08 Thread gry
On Apr 8, 3:40 pm, MRAB pyt...@mrabarnett.plus.com wrote:

...
 Group 1 and group 4 match '='.
 Group 1 and group 3 match '1234'.

 If a group matches then any earlier match of that group is discarded,
Wow, that makes this much clearer!  I wonder if this behaviour
shouldn't be mentioned in some form in the python docs?
Thanks much!

 so:

 Group 1 finishes with '1234'.
 Group 2 finishes with 'in'.
 Group 3 finishes with '1234'.
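
The behaviour described above can be reproduced in a few lines (Python 3 sketch; the variable names are illustrative). A repeated group only reports its last match, whereas findall returns every piece:

```python
import re

s = '555tHe-rain.in#=1234'

# Each group inside the repeated (...)+ keeps only its final match.
m = re.match(r'^(([A-Za-z]+)|([0-9]+)|([-.#=]))+$', s)
last_captures = m.groups()

# findall walks the string and collects every non-overlapping match.
pieces = re.findall(r'[A-Za-z]+|[0-9]+|[-.#=]', s)
```

Here last_captures is the same tuple shown in the original question, and joining pieces gives back the input string.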



Re: regex help: splitting string gets weird groups

2010-04-08 Thread gry
     s='555tHe-rain.in#=1234'
     import re
     r=re.compile(r'([a-zA-Z]+|\d+|.)')
     r.findall(s)
    ['555', 'tHe', '-', 'rain', '.', 'in', '#', '=', '1234']
This is nice and simple and has the invertible property that Patrick
mentioned above.  Thanks much!


generators/iterators: filtered random choice

2006-09-15 Thread gry
I want a function (or callable something) that returns a random
word meeting a criterion.  I can do it like:

def random_richer_word(word):
    '''find a word having a superset of the letters of word'''
    if len(set(word)) == 26: raise WordTooRichException, word
    while True:
        w = random.choice(words)
        if set(w) - set(word):  # w has letters not present in word
            return w

This seems like a perfect application for generators or iterators,
but I can't quite see how.  Any suggestions?
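
One way this might look as a generator (Python 3 sketch; richer_words and its parameters are assumed names). It filters the candidates once and then yields random picks forever. Unlike the loop above, it tests for a strict superset of the letters, which is what the docstring describes:

```python
import random

def richer_words(word, words, rng=random):
    """Yield random words whose letters are a strict superset of word's."""
    base = set(word)
    candidates = [w for w in words if set(w) > base]
    if not candidates:
        return  # nothing qualifies: the generator is simply empty
    while True:
        yield rng.choice(candidates)
```

A caller would take one word with next(richer_words('cat', words)), or several with itertools.islice.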



Re: parameter files

2006-09-13 Thread gry
Russ wrote:
 I have a python module (file) that has a set of parameters associated
 with it. Let's say the module is called code.py. I would like to keep
 the parameter assignments in a separate file so that I can save a copy
 for each run without having to save the entire code.py file. Let's
 say the parameter file is called parameters.py.

 Normally, code.py would simply import the parameters.py file. However,
 I don't want the parameters to be accessible to any other file that
 imports the code.py file. To prevent such access, I preface the name of
 each parameter with an underscore. But then the parameters aren't even
 visible in code.py! So I decided to use execfile instead of import so
 the parameters are visible.

 That solved the problem, but I am just wondering if there is a better
 and/or more standard way to handle a situation like this. Any
 suggestions? Thanks.

I would try a configuration file, instead of a python module.
See ConfigParser:
http://docs.python.org/lib/module-ConfigParser.html.
You can save values for each run in a separate [section].
Execfile is a pretty big hammer for this.
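
A minimal sketch of the one-section-per-run idea, using the Python 3 spelling of the module (configparser). The section name, the parameter names, and the helper load_run are all invented for illustration:

```python
import configparser

SAMPLE = """
[run_2006_09_13]
gain = 0.75
iterations = 200
label = baseline
"""

def load_run(text, section):
    """Return the parameters of one run as a dict with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    sec = parser[section]
    return {
        'gain': sec.getfloat('gain'),
        'iterations': sec.getint('iterations'),
        'label': sec.get('label'),
    }
```

In real use the text would come from a file via parser.read(filename), and the values stay private to whichever module loads them.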

-- George



Re: Seeking regex optimizer

2006-06-19 Thread gry

Kay Schluehr wrote:
 I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a
 regular expression sx from it, such that sx.match(s) yields a SRE_Match
 object when s starts with an s_i for one i in [0,...,n].  There might
 be relations between those strings: s_k.startswith(s_1) - True or
 s_k.endswith(s_1) - True. An extreme case would be ls = ['a', 'aa',
 ...,'...ab']. For this reason SRE_Match should provide the longest
 possible match.

In a very similar case I used a simple tree of dictionaries, one node
per letter, to represent the strings.
This naturally collapses cases like ['a','aa','aaa'].  Then a recursive
function finds the desired prefix.  This was WAY faster than the re
module for large n (tradeoff point for me was n~1000).  It requires a
bit more coding, but I think it is the natural data structure for this
problem.
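
A sketch of such a tree of dictionaries (Python 3). This is a reconstruction of the idea, not the code referred to above: the names are invented, and the lookup is written as a loop rather than recursively:

```python
END = None  # sentinel key marking "a stored string ends here"

def build_trie(strings):
    """Build nested dicts, one level per character."""
    root = {}
    for s in strings:
        node = root
        for ch in s:
            node = node.setdefault(ch, {})
        node[END] = True
    return root

def longest_prefix(trie, text):
    """Return the longest stored string that text starts with, or None."""
    node = trie
    best = '' if END in node else None
    for i, ch in enumerate(text):
        if ch not in node:
            break
        node = node[ch]
        if END in node:
            best = text[:i + 1]  # remember the longest hit so far
    return best
```

Lookup cost depends only on the length of the text being matched, not on how many strings are stored.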

As others have suggested, you should first try the most naive
implementation before making a hard optimization problem out of this.

 Is there a Python module able to create an optimized regex rx from ls
 for the given constraints?
 
 Regards,
 Kay



Re: Thinking like CS problem I can't solve

2006-05-23 Thread gry
Alex Pavluck wrote:
 Hello.  On page 124 of "Thinking like a Computer Scientist" there is
 an exercise to take the following code and with the use of TRY: /
 EXCEPT: handle the error.  Can someone help me out?  Here is the code:

 def inputNumber(n):
     if n == 17:
         raise 'BadNumberError: ', '17 is off limits.'
     else:
         print n, 'is a nice number'
     return n

 inputNumber(17)

Yikes!  It's a very bad idea to use string literals as exceptions.
Use one of the classes from the 'exceptions' module, or derive
your own from one of them.  E.g.:

class BadNum(ValueError):
    pass
def inputNumber(n):
    if n == 17:
        raise BadNum('17 is off limits')
    else:
        print n, 'is a nice number'

try:
    inputNumber(17)
except BadNum, x:
    print 'Uh Oh!', x

Uh Oh! 17 is off limits


See:
   http://docs.python.org/ref/try.html#try
especially the following bit:

...the clause matches the exception if the resulting object is
``compatible'' with the exception. An object is compatible with an
exception if it is either the object that identifies the exception, or
(for exceptions that are classes) it is a base class of the
exception,... Note that the object identities must match, i.e. it must
be the same object, not just an object with the same value.

Identity of string literals is a *very* slippery thing.  Don't
depend on it.  Anyway, python 2.5 gives a deprecation
warning if a string literal is used as an exception.



Re: --version?

2006-05-02 Thread gry
I agree.   The --version option has become quite a de-facto standard
in the linux world.  In my sys-admin role, I can blithely run
  initiate_global_thermonuclear_war --version
to find what version we have, even if I don't know what it does...

python --version

would be a very helpful addition. (Keep the -V too, if you like it
:-)

-- George



Re: best way to determine sequence ordering?

2006-04-28 Thread gry
index is about the best you can do with the structure you're using.
If you made the items instances of a class, then you could define a
__cmp__ member, which would let you do:

a=Item('A')
b=Item('B')
if a < b: something

The Item class could use any of various means to maintain order
information. If there are not too many values, it could have a
dictionary storing an integer for the order:

class Item(object):
    def __init__(self, value):
        self.val = value
        self.order = dict(C=0, A=1, D=2, B=3)
    def __cmp__(self, other):
        return cmp(self.order[self.val], self.order[other.val])

If you don't care about performance, or you find it clearer, just use:
        self.order = ['C', 'A', 'D', 'B']
and
    def __cmp__(self, other):
        return cmp(self.order.index(self.val),
                   self.order.index(other.val))
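
Python 3 dropped __cmp__ and cmp(), so there the same idea needs rich comparison methods. A hedged sketch using functools.total_ordering (the ORDER table is the example ordering from above; everything else is illustrative):

```python
from functools import total_ordering

@total_ordering
class Item:
    # Rank of each allowed value; shared by all instances.
    ORDER = {'C': 0, 'A': 1, 'D': 2, 'B': 3}

    def __init__(self, value):
        self.val = value

    def __eq__(self, other):
        return self.ORDER[self.val] == self.ORDER[other.val]

    def __lt__(self, other):
        return self.ORDER[self.val] < self.ORDER[other.val]

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash(self.val)
```

total_ordering fills in <=, > and >= from the two methods given, so sorted() and direct comparisons both work.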


-- George Young



Re: list.clear() missing?!?

2006-04-13 Thread gry
A perspective that I haven't seen raised here is inheritance.
I often say
mylist = []
if I'm done with the current contents and just want a fresh list.

But the cases where I have really needed list.clear [and laboriously
looked for it and ended up with
   del l[:]
] were when the object was my own class that inherits from list, adding
some state and other functionality.  Of course I *could* have added my
own 'clear' function member, but I *wanted* it to behave like a
standard python list in its role of maintaining a sequence of
processing steps.
So, I end up doing
   del current_process[:]
which, IMO, looks a bit queer, especially when the current_process
object is a fairly elaborate data object.



Re: converting lists to strings to lists

2006-04-12 Thread gry
Read about string split and join.  E.g.:
l = '0.87 0.25 0.79'
floatlist = [float(s) for s in l.split()]

In the other direction:
floatlist = [0.87, 0.25, 0.79004]
outstring = ' '.join([str(f) for f in floatlist])

If you need to control the precision (i.e. suppress the 4), read
about the string formatting operator %.
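
As a sketch of the precision control mentioned above (Python 3; the function names are illustrative), the % operator can take the number of decimals as an argument via '%.*f':

```python
def floats_to_string(values, precision=2):
    """Join floats with spaces, each rounded to `precision` decimals."""
    return ' '.join('%.*f' % (precision, v) for v in values)

def string_to_floats(text):
    """Parse whitespace-separated numbers back into a list of floats."""
    return [float(s) for s in text.split()]
```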

-- George Young



Re: Sorting a list of objects by multiple attributes

2006-04-10 Thread gry
For multiple keys the form is quite analogous:

   L.sort(key=lambda i: (i.whatever, i.someother, i.anotherkey))

I.e., just return a tuple with the keys in order from your lambda.
Such tuples sort nicely.  
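
A small worked example of the tuple key (Python 3; the Employee record and its fields are invented for illustration). operator.attrgetter with several names builds the same kind of tuple:

```python
from collections import namedtuple
from operator import attrgetter

Employee = namedtuple('Employee', 'dept name age')

staff = [
    Employee('ops', 'Zed', 30),
    Employee('dev', 'Amy', 41),
    Employee('dev', 'Bob', 29),
    Employee('ops', 'Al', 30),
]

# Sort by department, then age, then name.
by_lambda = sorted(staff, key=lambda e: (e.dept, e.age, e.name))
by_getter = sorted(staff, key=attrgetter('dept', 'age', 'name'))
```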

-- George Young



Re: how to pipe to variable of a here document

2006-04-10 Thread gry
http://www.python.org/doc/topics/database/



Re: Characters contain themselves?

2006-04-07 Thread gry
In fact, not just characters, but strings contain themselves:

>>> 'abc' in 'abc'
True

This is a very nice (i.e. clear and concise) shortcut for:

>>> 'the rain in spain stays mainly'.find('rain') != -1
True

Which I always found contorted and awkward.

Could you be a bit more concrete about your complaint?

-- George
[Thanks, I did enjoy looking up the Axiom of Foundation!]



Re: glob and curly brackets

2006-04-07 Thread gry
This would indeed be a nice feature.
The glob module is only 75 lines of pure python.  Perhaps you would
like to enhance it?  Take a look.



Re: don't understand popen2

2006-03-22 Thread gry

Martin P. Hellwig wrote:
 Hi all,

 I was doing some popen2 tests so that I'm more comfortable using it.
 I wrote a little python script to help me test that (testia.py):

 -
 someline = raw_input("something:")

 if someline == 'test':
     print("yup")
 else:
     print("nope")
 -

 And another little thing that does its popen2 stuff:

 -
 import popen2

 std_out, std_in = popen2.popen2("testia.py")

 x=std_out.readline()
 print(x)

 std_in.writelines("notgood")

 x=std_out.readline()
 print(x)
 -

 Now what I expected was that I got the return on the first line:
 "something:" and on the second "nope", but instead of that I got:

 
 something:
 Traceback (most recent call last):
   File "F:\coding\pwSync\popen_test\popen_test.py", line 8, in ?
     std_in.writelines("notgood")
 IOError: [Errno 22] Invalid argument
  

 I played around a bit with flush, write and the order of first writing
 and then reading, the best I can get is no error but still not the
 expected output. I googled a bit copied some examples that also worked
 on my machine, reread the manual and the only conclusion I have is that
 I don't even understand what I'm doing wrong.

 Would you please be so kind to explain my wrong doing?
 (python 2.4 + win32 extensions on XPProSP2)

>>> help(sys.stdin.writelines)
Help on built-in function writelines:

writelines(...)
    writelines(sequence_of_strings) -> None.  Write the strings to the
    file.

    Note that newlines are not added.  The sequence can be any iterable
    object producing strings. This is equivalent to calling write() for
    each string.

You gave it a single string, not a list (sequence) of strings.  Try
something like:
std_in.writelines(["notgood"])



Re: String functions: what's the difference?

2006-03-09 Thread gry
First, don't apologize for asking questions.  You read, you thought,
and you tested.  That's more than many people on this list do.  Bravo!

One suggestion: when asking questions here it's a good idea to always
briefly mention which version of python and what platform (linux,
windows, etc) you're using.  It helps us answer your questions more
effectively.

For testing performance the timeit module is great.  Try something
like:
  python -mtimeit -s 'import string;from myfile import isLower' "isLower('x')"

You didn't mention the test data, i.e. the character you're feeding to
isLower.
It might make a difference if the character is near the beginning or
end of the range.
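
The same kind of comparison can be run from inside Python with timeit.timeit. In this Python 3 sketch the three is_lower_* functions are stand-ins invented for illustration (the implementations under discussion are not shown in this message), and they only agree with each other for ASCII input:

```python
import string
import timeit

def is_lower_find(ch):
    return string.ascii_lowercase.find(ch) != -1

def is_lower_in(ch):
    return ch in string.ascii_lowercase

def is_lower_method(ch):
    return ch.islower()

def time_all(ch, number=100000):
    """Return {function name: seconds for `number` calls} for one input."""
    funcs = (is_lower_find, is_lower_in, is_lower_method)
    return {f.__name__: timeit.timeit(lambda: f(ch), number=number)
            for f in funcs}
```

Calling time_all('a') and time_all('z') would show whether position in the range matters for a given implementation.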

As to reasons to prefer one or another implementation, one *very*
important question is "which one is clearer?".  It may sound like a
minor thing, but when I'm accosted first thing in the
morning (pre-coffee) about a nasty urgent bug and sit down to pore over
code and face "string.find(string.lowercase, ch) != -1", I'm not happy.

Have fun with python!
-- George Young



pyparsing: crash on empty element

2006-03-06 Thread gry
[python 2.3.3, pyparsing 1.3]
I have:

def unpack_sql_array(s):
    # unpack a postgres array, e.g. {'w1','w2','w3'} into a list(str)
    import pyparsing as pp
    withquotes = pp.dblQuotedString.setParseAction(pp.removeQuotes)
    withoutquotes = pp.CharsNotIn(',{}')
    parser = pp.StringStart() + \
             pp.Literal('{').suppress() + \
             pp.delimitedList(withquotes | withoutquotes) + \
             pp.Literal('}').suppress() + \
             pp.StringEnd()
    return parser.parseString(s).asList()

which works beautifully, except on the input: {}.  How can I neatly
modify the parser to return an empty list in this case?
Yes, obviously, I could say
   if s=='{}': return []
It just seems like I'm missing some simple intrinsic way to get this
out of the parser.  I am hoping to become more skillful in using the
wonderful pyparsing module!

-- George Young



style question: how to delete all elements in a list

2006-03-06 Thread gry
Just curious about people's sense of style:

To delete all the elements of a list, should one do:

   lst[:] = []
or
   del(lst[:])

I seem to see the first form much more often in code, but
the second one seems more clearly *deleting* elements,
and less dangerously mistaken for the completely different:
   lst = [] 
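
The difference between emptying a list in place and rebinding the name can be seen with a second reference to the same list (Python 3 sketch; the variable names are illustrative):

```python
def clear_in_place(lst):
    """Remove all elements; every reference to lst sees the change."""
    del lst[:]

steps = ['clean', 'etch', 'rinse']
alias = steps
clear_in_place(steps)      # alias is emptied too: same list object

rebound = ['clean', 'etch']
other = rebound
rebound = []               # only the name moves; other keeps the old list
```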

What do you think?

-- George Young



Re: Proper class initialization

2006-03-02 Thread gry
Christoph Zwerschke wrote:
 Usually, you initialize class variables like that:

 class A:
     sum = 45

 But what is the proper way to initialize class variables if they are the
 result of some computation or processing as in the following silly
 example (representative for more):

 class A:
     sum = 0
     for i in range(10):
         sum += i

 The problem is that this makes any auxiliary variables (like i in this
 silly example) also class variables, which is not desired.

 Of course, I could call a function external to the class

 def calc_sum(n):
     ...

 class A:
     sum = calc_sum(10)

 But I wonder whether it is possible to put all this init code into one
 class initialization method, something like that:

 class A:

     @classmethod
     def init_class(self):
         sum = 0
         for i in range(10):
             sum += i
         self.sum = sum

     init_class()

 However, this does not work, I get
 TypeError: 'classmethod' object is not callable

 Is there another way to put an initialization method for the class A
 somewhere *inside* the class A?

Hmm, the meta-class hacks mentioned are cool, but for this simple a
case how about just:

class A:
    def __init__(self):
        self.__class__.sum = self.calculate_sum()
    def calculate_sum(self):
        do_stuff
        return sum_value

Instead of __class__ you could say:
  A.sum = self.calculate_sum()
but that fails if you rename the class.  I believe either works fine
in case of classes derived from A.
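
A short Python 3 sketch of what happens with a derived class. The body of calculate_sum is a placeholder for the real computation, and B is a hypothetical subclass. With self.__class__ the attribute is stored on the class of the instance being created, whereas A.sum = ... would always store it on A:

```python
class A:
    def __init__(self):
        # Stored on the instance's own class, which may be a subclass.
        self.__class__.sum = self.calculate_sum()

    def calculate_sum(self):
        return sum(range(10))  # placeholder computation

class B(A):
    pass

b = B()  # sets B.sum; A itself has no 'sum' attribute yet
```

So the two spellings differ in where the value lives, which matters if subclasses override calculate_sum.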

-- George Young



Re: critique my code, please

2006-02-06 Thread gry
Just a few suggestions:

1) use consistent formatting, preferably something like:
  http://www.python.org/peps/pep-0008.html
  E.g.:
yesno = {0: "No", 1: "Yes", True: "Yes", False: "No"}

2) if (isinstance(self.random_seed,str)):
       s=s+"Random Seed: %s\n" % self.random_seed
   else:
       s=s+"Random Seed: %d\n" % self.random_seed
  is unnecessary, since %s handles any type.  Just say:
   s=s+"Random Seed: %s\n" % self.random_seed
  without any if statement (unless you need fancy numeric formatting).

3) I would strongly discourage using print statements (other than for
debugging) in a GUI program.  In my experience, users fire up the GUI
and close (or kill!) the parent tty window, ignoring any dire messages
printed on stdout or stderr.  In a GUI app, errors, warnings, and any
other messages should be in popup dialogs or in a message bar in the
main window.

4) If you want to be cute, you can use
  s += 'more text'
instead of
  s = s + 'more text'

I'm not a wx user so I can't comment on the GUI implementation.
Good luck!

-- George



Re: Converting date to milliseconds since 1-1-70

2006-01-24 Thread gry
NateM wrote:
 How do I convert any given date into a milliseconds value that
 represents the number of milliseconds that have passed since January 1,
 1970 00:00:00.000 GMT?
 Is there an easy way to do this like Date in java?
 Thanks,
 Nate

The main module for dates and times is datetime; so

 import datetime
 t=datetime.datetime.now()
 print t
2006-01-24 15:13:35.012755

To get at the epoch value, i.e. seconds since 1/1/1970, use the
time module:

 import time
 print time.mktime(t.timetuple())
1138133615.0

Now just add in the microseconds:
 epoch=time.mktime(t.timetuple())+(t.microsecond/1000000.)
 print epoch
1138133615.01

Use the % formatting operator to display more resolution:
 print '%f' % epoch
1138133615.012755

Note that the floating point division above is not exact and could
possibly mangle the last digits.

Another way to get this data is the datetime.strftime member:

 print t.strftime('%s.%%06d') % t.microsecond
1138133615.012755

This gets you a string, not a number object.  Converting the string to
a number again risks inaccuracy in the last digits:
 print float( '1138133615.012755')
1138133615.0127549
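
Since the question asked for milliseconds, here is a minimal sketch that
stays in integer arithmetic and so avoids the float rounding shown above.
It is written in Python 3 syntax and assumes the datetime is naive and
already expressed in UTC (the examples above use local time via
time.mktime); the name to_epoch_millis is made up for illustration.

```python
import calendar
import datetime


def to_epoch_millis(dt):
    """Milliseconds since 1970-01-01 00:00:00 UTC for a naive UTC datetime."""
    # timegm interprets the time tuple as UTC, unlike time.mktime.
    seconds = calendar.timegm(dt.timetuple())
    # Integer math only: whole seconds plus whole milliseconds.
    return seconds * 1000 + dt.microsecond // 1000


millis = to_epoch_millis(datetime.datetime(1970, 1, 2, 0, 0, 0, 250000))
```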



Re: check to see if value can be an integer instead of string

2006-01-17 Thread gry
[EMAIL PROTECTED] wrote:
 Hello there,
 i need a way to check to see if a certain value can be an integer. I
 have looked at is int(), but what is comming out is a string that may
 be an integer. i mean, it will be formatted as a string, but i need to
 know if it is possible to be expressed as an integer.
The int builtin function never returns any value but an integer.

 like this

 var = some var passed to my script
 if var can be an integer :
 do this
 else:
 change it to an integer and do something else with it.

Be careful about thinking "change it to an integer" in python.  That's
not what happens.  The int builtin function looks at its argument and,
if possible, creates a new integer object that it thinks was represented
by the argument, and returns that integer object.

 whats the best way to do this ?

The pythonic way might be something like:

var=somestring
try:
   do_something(int(var))  # just try to convert var to integer and proceed
except ValueError:
   do_something_else(var)  # conversion failed; use the raw string value

The int builtin function tries to make an integer based on whatever
argument is supplied.  If it cannot make an integer from the argument,
it raises a ValueError exception.  Don't be afraid of using exceptions
in this sort of situation.  They are pretty fast, and ultimately clearer
than a lot of extra if tests.
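
A self-contained version of the try/except pattern, as a sketch.  The
function name classify and the tuple it returns are made up for
illustration; they stand in for do_something and do_something_else.

```python
def classify(var):
    """Return ('integer', n) if var converts to an int, else ('text', var)."""
    try:
        number = int(var)
    except ValueError:
        # Conversion failed; keep the raw string value.
        return ("text", var)
    return ("integer", number)


results = [classify(s) for s in ["42", " 7 ", "4x", ""]]
```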

But do please always (unless you *really*really* know what you're
doing!) use a qualified except clause.  E.g.

try:
   stuff
except SomeParticularException:
   other_stuff

Not:
try:
   stuff
except:
   other_stuff

The bare except clause invites all kinds of silent, unexpected bad
behavior.
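
To see the kind of silent failure meant here, consider this small
sketch.  Both functions contain the same deliberate typo (imt instead of
int); the function names are made up for illustration.

```python
def convert_bare(text):
    try:
        return imt(text)   # deliberate typo for int
    except:                # swallows the NameError along with everything else
        return None


def convert_qualified(text):
    try:
        return imt(text)   # same deliberate typo
    except ValueError:     # only conversion failures are handled
        return None


hidden = convert_bare("42")  # None, although "42" is a perfectly good integer
```

With the bare except the typo goes unnoticed and every input looks like
a failed conversion; with the qualified clause the NameError surfaces
the first time the function is called.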



Re: UNIX timestamp from a datetime class

2005-12-06 Thread gry

John Reese wrote:
 Hi.

  import time, calendar, datetime
  n= 1133893540.874922
  datetime.datetime.fromtimestamp(n)
 datetime.datetime(2005, 12, 6, 10, 25, 40, 874922)
  lt= _
  datetime.datetime.utcfromtimestamp(n)
 datetime.datetime(2005, 12, 6, 18, 25, 40, 874922)
  gmt= _

 So it's easy to create datetime objects from so-called UNIX timestamps
 (i.e. seconds since Jan 1, 1970 UTC).  Is there any way to get a UNIX
 timestamp back from a datetime object besides the following
 circumlocutions?

 d=datetime.datetime.fromtimestamp(1133893540.874922)
 epoch = int(d.strftime('%s'))
 usec = d.microsecond
 epoch + (usec / 1000000.0)
1133893540.874922



Re: testing '192.168.1.4' is in '192.168.1.0/24' ?

2005-10-24 Thread gry
There was just recently announced -- iplib-0.9:
http://groups.google.com/group/comp.lang.python.announce/browse_frm/thread/e289a42714213fb1/ec53921d1545bf69#ec53921d1545bf69

 It appears to be pure python and has facilities for dealing with
netmasks. (v4 only).
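
For readers on a newer Python, here is a minimal sketch of the test in
the subject line.  It does not use iplib; it assumes Python 3.3 or
later, where the standard library has an ipaddress module.  The function
name in_network is made up for illustration.

```python
import ipaddress


def in_network(address, network):
    """True if the dotted-quad address lies inside the CIDR network."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


inside = in_network('192.168.1.4', '192.168.1.0/24')
```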

-- George



Re: dictionnaries and lookup tables

2005-10-11 Thread gry

[EMAIL PROTECTED] wrote:
 Hello,

 I am considering using dictionnaries as lookup tables e.g.

 D={0.5:3.9,1.5:4.2,6.5:3}

 and I would like to have a dictionnary method returning the key and
 item of the dictionnary whose key is smaller than the input of the
 method (or <=, >, >=) but maximal (resp. maximal, minimal, minimal) eg.:

 D.smaller(3.0)
 (1.5,4.2)
 D.smaller(11.0)
 (6.5,3)
 D.smaller(-1.0)
 None (or some error message)

 Now, I know that dictionnaries are stored in a non-ordered fashion in
 python but they are so efficient in recovering values (at least wrt
 lists) that it suggests me that internally there is some ordering. I
 might be totally wrong because I don't know how the hashing is really
 done. Of course I would use such methods in much larger tables. So is
 this possible or should I stick to my own class with O(log2(N))
 recovery time?
...

I believe that to do this efficiently, you want some kind of tree, e.g.
B-tree, RB-tree, AVL-tree.  You could try the AVL tree from:
   ftp://squirl.nightmare.com/pub/python/python-ext/avl/avl-2.0.tar.gz
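
If the table is built once and then only queried, a sorted list plus the
standard bisect module is a simpler alternative to a tree (this is a
swapped-in technique, not the AVL module above).  A minimal sketch; the
class name SortedLookup is made up, and inserting new keys later would
cost O(N) each, which is where a real tree wins.

```python
import bisect


class SortedLookup(object):
    """Read-only lookup table supporting 'largest key smaller than x'."""

    def __init__(self, mapping):
        self._keys = sorted(mapping)
        self._values = [mapping[k] for k in self._keys]

    def smaller(self, x):
        # bisect_left returns the index of the first key >= x,
        # so the key just before it is the largest key < x.
        i = bisect.bisect_left(self._keys, x)
        if i == 0:
            return None
        return (self._keys[i - 1], self._values[i - 1])


D = SortedLookup({0.5: 3.9, 1.5: 4.2, 6.5: 3})
```

Each query is O(log N), the same order as the tree.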



Re: Newbie Question

2005-08-19 Thread gry
Yes, eval of data from a file is rather risky.  Suppose someone gave
you a file containing somewhere in the middle:
...
22,44,66,88,asd,asd,23,43,55
os.system('rm -rf *')
33,47,66,88,bsd,bsd,23,99,88
...

This would delete all the files in your directory!

The csv module mentioned above is the tool of choice for this task,
especially if there are strings that could contain quotes or commas.
Doing this right is not at all easy.  If you really want to roll your
own, and the data is KNOWN to be fixed and very restricted, you can do
something like:

myfile contains:
13,2,'the rain',2.33
14,2,'in spain',2.34

for l in open('myfile'):
    x,y,comment,height = l.split(',')
    x=int(x)
    y=int(y)
    height=float(height)  # heights like 2.33 are not integers
    comment=comment.strip("' ")  # strip spaces and quotes from front and back

but beware this will break if the comment contains commas.
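
For comparison, a minimal sketch of the csv approach in Python 3 syntax.
It reads from an in-memory string so it runs as-is; the sample data and
the name read_rows are made up for illustration.  Setting quotechar to a
single quote tells the reader how the comment field is quoted, so a
comma inside the quotes no longer breaks the split.

```python
import csv
import io

SAMPLE = "13,2,'the rain',2.33\n14,2,'in spain, mostly',2.34\n"


def read_rows(text):
    """Parse rows of x,y,'comment',height into typed tuples."""
    rows = []
    for x, y, comment, height in csv.reader(io.StringIO(text), quotechar="'"):
        rows.append((int(x), int(y), comment, float(height)))
    return rows


rows = read_rows(SAMPLE)
```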

-- George



Re: stdin - stdout

2005-08-19 Thread gry
import sys
for l in sys.stdin:
    sys.stdout.write(l)

-- George



Re: Catching stderr output from graphical apps

2005-08-10 Thread gry
Python 2.3.3, Tkinter.__version__'$Revision: 1.177 $'

Hmm, the error window pops up with appropriate title, but contains no
text.
I stuck an unbuffered write to a log file in ErrorPipe.write and got
only one line: Traceback (most recent call last):$

Any idea what's wrong?

-- George



Re: Retaining an object

2005-08-09 Thread gry
sysfault wrote:
 Hello, I have a function which takes a program name, and I'm using
 os.popen() to open that program via the syntax: os.popen('pidof var_name',
 'r'), but as you know var_name is not expanded within single quotes, I
 tried using double quotes, and such, but no luck. I was looking for a way
 to have var_name expanded without getting a syntax error by ommiting the
 surrounding quotes. I need to use a variable, it's the argument to a
 function.

Use the string format operator %:

var_name='magick'
os.popen('pidof %s' % var_name, 'r')

this results in running:
pidof magick
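
A small sketch of the same substitution, wrapped in a function since the
name arrives as a function argument.  The function name
build_pidof_command is made up; quoting the argument with shlex.quote is
an extra precaution that is not in the snippet above, and it assumes
Python 3.3 or later.

```python
import shlex


def build_pidof_command(program_name):
    """Return the shell command line that looks up the program's PIDs."""
    # shlex.quote leaves simple names alone and quotes anything unusual.
    return 'pidof %s' % shlex.quote(program_name)


command = build_pidof_command('magick')
```

The resulting string can be handed to os.popen(command, 'r') as above.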

You should really read through the Python tutorial to get basic stuff
like this:

http://docs.python.org/tut/tut.html



Re: socket programming

2005-07-20 Thread gry
What I have done in similar circumstances is put in a random sleep
between connections to fool the server's load manager.  Something like:

.import random
.import time
.min_pause,max_pause = (5.0, 10.0) #seconds
.while True:
.   time.sleep(random.uniform(min_pause, max_pause))
.   do_connection_and_query_stuff()

It works for me.  Just play with the pause parameters until it fails
and add a little.

-- George



Re: suggestions invited

2005-06-23 Thread gry
Aditi wrote:
 hi all...i m a software engg. student completed my 2nd yr...i have been
 asked to make a project during these summer vacations...and hereby i
 would like to invite some ideas bout the design and implementation of
 an APPLICATION MONITORING SYSTEMi have to start from scrach so
 please tell me how to go bout it rite from the beggining this is the
 first time i m making a project of this complexity...
 i have to make a system used by the IT department of a company which
 contains 31 applications and their details which are being used in a
 company ...the details are...
 Application, sub application, category, platform, language, version,
 IT owner, functional owner, remarks, source code, documentation,
 last updated dates
 i want to design a system such that it lets the it employee enter the
 name of the application and gives him all the details about it...please
 suggest an appropriate design and the language which you think would be
 best to use...as i have enouf time with me and i can learn a new
 language as well...i currently know c and c++...your advise is welcomed
 Aditi

I suggest you first learn a bit of python: go to www.python.org and
download/install the current release; go through the online tutorial:
http://docs.python.org/tut/tut.html .

Then you might look at XML as a means for storing the data.  XML is
structured, readable without special software (very helpful for
debugging), and easy to use for simple data.  Try the xml module from
http://pyxml.sourceforge.net/topics/download.html
[look at the demos for simple usage].  Don't be intimidated by complex
formal definitions of XML; what you need is not hard to use.
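
To give a feel for how little is needed, here is a minimal sketch of
storing one application record and looking it up by name.  It uses
xml.etree.ElementTree from the standard library of later Python versions
rather than PyXML, and the element names, the sample record, and the
function name lookup are all made up for illustration.

```python
import xml.etree.ElementTree as ET

CATALOG = """<applications>
  <application name="payroll">
    <platform>linux</platform>
    <language>python</language>
    <version>1.2</version>
  </application>
</applications>"""


def lookup(xml_text, name):
    """Return the details of the named application as a dict, or None."""
    root = ET.fromstring(xml_text)
    for app in root.findall("application"):
        if app.get("name") == name:
            # Each child element becomes one tag -> text entry.
            return dict((child.tag, child.text) for child in app)
    return None


details = lookup(CATALOG, "payroll")
```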

-- George



Re: regexp for sequence of quoted strings

2005-05-27 Thread gry
PyParsing rocks!  Here's what I ended up with:

def unpack_sql_array(s):
    import pyparsing as pp
    withquotes = pp.dblQuotedString.setParseAction(pp.removeQuotes)
    withoutquotes = pp.CharsNotIn(',')
    parser = pp.StringStart() + \
             pp.Word('{').suppress() + \
             pp.delimitedList(withquotes ^ withoutquotes) + \
             pp.Word('}').suppress() + \
             pp.StringEnd()
    return parser.parseString(s).asList()

unpack_sql_array('{the,dog\'s,"foo,"}')
['the', "dog's", 'foo,']

[[Yes, this input is not what I stated originally.  Someday, when I
reach a higher plane of existence, I will post a *complete* and
*correct* query to usenet...]]

Does the above seem fragile or questionable in any way?
Thanks all for your comments!

-- George



regexp for sequence of quoted strings

2005-05-25 Thread gry
I have a string like:
 {'the','dog\'s','bite'}
or maybe:
 {'the'}
or sometimes:
 {}

[FYI: this is postgresql database array field output format]

which I'm trying to parse with the re module.
A single quoted string would, I think, be:
 r"\{'([^']|\\')*'\}"

but how do I represent a *sequence* of these separated
by commas?  I guess I can artificially tack a comma on the
end of the input string and do:

 r"\{('([^']|\\')*',)\}"

but that seems like an ugly hack...

I want to end up with a python array of strings like:

['the', "dog's", 'bite']

Any simple clear way of parsing this in python would be
great; I just assume that re is the appropriate technique.
Performance is not an issue.
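
One way to avoid the trailing-comma hack, sketched under the assumption
that every element is single-quoted and that a backslash escapes the
next character: match the items one at a time with findall instead of
matching the whole sequence in one pattern.  The names ITEM and
unpack_quoted_array are made up for illustration.

```python
import re

# One quoted item: a quote, then any run of ordinary characters or
# backslash-escaped characters, then the closing quote.
ITEM = re.compile(r"'((?:[^'\\]|\\.)*)'")


def unpack_quoted_array(s):
    """Turn "{'a','b\\'c'}" style text into a list of plain strings."""
    body = s.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected braces around the array: %r" % (s,))
    # Drop the backslash from each escaped character.
    return [re.sub(r"\\(.)", r"\1", item) for item in ITEM.findall(body[1:-1])]


words = unpack_quoted_array(r"{'the','dog\'s','bite'}")
```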

-- George



Re: Hacking the scope to pieces

2005-05-24 Thread gry

Hugh Macdonald wrote:
 We're starting to version a number of our python modules here, and I've
 written a small function that assists with loading the versioned
 modules...

 A module would be called something like: myModule_1_0.py

 In anything that uses it, though, we want to be able to refer to it
 simply as 'myModule', with an environment variable (MYMODULE_VERSION
 - set to 1.0) that defines the version.

Another technique that you might want to consider is to have an
explicit "require" call in the code, instead of an external environment
variable.  The python gtk interface, pygtk, is used like so:

import pygtk
pygtk.require('1.5')
import gtk

-- or

import pygtk
pygtk.require('2.0')
import gtk

I imagine you could eliminate the extra import gtk step, by clever
coding of the import hook.  You can find pygtk at:

http://ftp.gnome.org/pub/GNOME/sources/pygtk/2.4/pygtk-2.4.1.tar.gz
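
A rough sketch of what such a require call could look like for the
myModule_1_0 naming scheme.  This is not how pygtk implements it; the
function name require and the trick of registering the versioned module
under the plain name in sys.modules are assumptions made for
illustration (Python 2.7 or later, for importlib).

```python
import importlib
import sys


def require(name, version):
    """Import name_<version> and make a plain 'import name' resolve to it."""
    versioned = "%s_%s" % (name, version.replace(".", "_"))
    module = importlib.import_module(versioned)
    # The import statement consults sys.modules first, so any later
    # "import name" anywhere in the program picks up this version.
    sys.modules[name] = module
    return module
```

A script would then call require("myModule", "1.0") once, followed by an
ordinary import myModule, with no need to pass globals() around.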


 I've written a module called 'moduleLoader' with the following function
 in:

 def loadModule(module, version, v = globals()):
   import compiler
   loadStr = "import %s_%s as %s" % (module, version.replace(".", "_"),
                                     module)
   eval(compiler.compile(loadStr, "/tmp/%s_%s_errors.txt" % (module,
        version.replace(".", "_")), "single"))
   v[module] = vars()[module]


 The ideal situation with this would be to be able, in whatever script,
 to have:

 import moduleLoader
 moduleLoader.loadModule("myModule", os.getenv("MODULE_VERSION"))


 However, this doesn't work. The two options that do work are:

 import moduleLoader
 moduleLoader.loadModule("myModule", os.getenv("MODULE_VERSION"),
 globals())


 import moduleLoader
 moduleLoader.loadModule("myModule", os.getenv("MODULE_VERSION"))
 from moduleLoader import myModule


 What I'm after is a way of moduleLoader.loadModule working back up the
 scope and placing the imported module in the main global scope. Any
 idea how to do this?
 
 
 --
 Hugh Macdonald



module exports a property instead of a class -- Evil?

2005-04-29 Thread gry
I often find myself wanting an instance attribute that can take on only
a few fixed symbolic values. (This is less functionality than an enum,
since there are no *numbers* associated with the values).  I do want
the thing to fiercely object to assignments or comparisons with
inappropriate values.  My implementation below gets me:

.import mode
.class C(object):
.   status = mode.Mode('started', 'done', 'on-hold')
.
.c=C()
.c.status = 'started'
.c.status = 'stated' #Exception raised
.if c.status == 'done': something
.if c.status == 'stated': #Exception raised
.if c.status.done: something  #simpler and clearer than string compare
.if c.status < 'done': something # Mode arg strings define ordering

I would appreciate comments on the overall scheme, as well as about the
somewhat sneaky (I think) exporting of a property-factory instead of a
class.  My intent is to provide a simple clear interface to the client
class (C above), but I don't want to do something *too* fragile or
confusing...
(I'd also welcome a better name than Mode...)

-- mode.py --
class _Mode:  #internal use only, not exported.
    def __init__(self, *vals):
        if [v for v in vals if not isinstance(v, str)]:
            raise ValueError, 'Mode values must be strings'
        else:
            self.values = list(vals)

    def set(self, val):
        if val not in self.values:
            raise ValueError, 'bad value for Mode: %s' % val
        else:
            self.state = val

    def __cmp__(self, other):
        if other in self.values:
            return cmp(self.values.index(self.state),
                       self.values.index(other))
        else:
            raise ValueError, 'bad value for Mode comparison'

    def __getattr__(self, name):
        if name in self.values:
            return self.state == name
        else:
            raise AttributeError, 'no such attribute: %s' % name


def Mode(*vals): # *function* returning a *property*, not a class.
    m = _Mode(*vals)
    def _insert_mode_get(self):
        return m
    def _insert_mode_set(self, val):
        m.set(val)
    return property(_insert_mode_get, _insert_mode_set)
---

Thanks,
George



Re: module exports a property instead of a class -- Evil?

2005-04-29 Thread gry
Hmm, I had no idea that property was a class.  It's listed in the
library reference manual under builtin-functions.  That will certainly
make things neater.  Thanks!
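
A rough sketch of what subclassing property might look like.  It covers
only the validated assignment, not the comparisons or the c.status.done
shortcut, and it keeps the state on each instance under a private
attribute name (an assumption made here so that two objects do not share
one value).

```python
class Mode(property):
    """Property that accepts only a fixed set of string values."""

    def __init__(self, *values):
        if [v for v in values if not isinstance(v, str)]:
            raise ValueError('Mode values must be strings')
        self.values = values
        # Hypothetical per-instance storage slot, unique to this Mode.
        self._slot = '_mode_%d' % id(self)
        property.__init__(self, self._get, self._set)

    def _get(self, obj):
        return getattr(obj, self._slot, None)

    def _set(self, obj, value):
        if value not in self.values:
            raise ValueError('bad value for Mode: %r' % (value,))
        setattr(obj, self._slot, value)


class C(object):
    status = Mode('started', 'done', 'on-hold')


c = C()
c.status = 'started'
```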

-- George



Re: Inelegant

2005-04-14 Thread gry
I sometimes use the implicit literal string concatenation:

def SomeFunction():
   if SomeCondition:
  MyString = 'The quick brown fox ' \
 'jumped over the ' \
 'lazy dog'
  print MyString

SomeFunction()
The quick brown fox jumped over the lazy dog


It looks pretty good, I think.  One could use triple quotes too, if the
string contains quotes.
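
A variant of the same idea, as a sketch in Python 3 syntax: inside
parentheses the adjacent literals are still concatenated at compile
time, and no backslashes are needed.  The function and variable names
are just lower-case versions of the ones above.

```python
def some_function(some_condition=True):
    if some_condition:
        # Adjacent string literals are joined into one string.
        my_string = ('The quick brown fox '
                     'jumped over the '
                     'lazy dog')
        return my_string
    return ''


message = some_function()
```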

-- George



Re: problem with the logic of read files

2005-04-12 Thread gry

[EMAIL PROTECTED] wrote:
 I am new to python and I am not in computer science. In fact I am a
biologist and I ma trying to learn python. So if someone can help me, I
will appreciate it.
 Thanks


 #!/cbi/prg/python/current/bin/python
 # -*- coding: iso-8859-1 -*-
 import sys
 import os
 from progadn import *

 ab1seq = raw_input(Entrez le répertoire où sont les fichiers à
analyser: ) or None
 if ab1seq == None :
 print Erreur: Pas de répertoire! \n
 \nAu revoir \n
 sys.exit()

 listrep = os.listdir(ab1seq)
 #print listrep

 extseq=[]

 for f in listrep:
## Minor -- this is better said as:  if f.endswith(".Seq"):
  if f[-4:]==".Seq":
  extseq.append(f)
 # print extseq

 for x in extseq:
  f = open(x, r)
## seq=... discards previous data and refers only to that just read.
## It would be simplest to process each file as it is read:
@@ seq=f.read()
@@ checkDNA(seq)
  seq=f.read()
  f.close()
  s=seq

 def checkDNA(seq):
     """Return a list of the characters that do not conform to IUPAC."""

     junk=[]
     for c in range (len(seq)):
         if seq[c] not in iupac:
             junk.append([seq[c],c])
             #print junk
             print "WARNING: character %s found at position %s" % (seq[c],c)
     if junk == []:
         indinv=range(len(seq))
         indinv.reverse()
         resultat=""
         for i in indinv:
             resultat +=comp[seq[i]]
         return resultat

 seq=checkDNA(seq)
 print seq

# The program segment you posted did not define comp or iupac,
# so it's a little hard to guess how it's supposed to work.  It would
# be helpful if you gave a concise description of what you want the
# program to do, as well as a brief sample of input data.
# I hope this helps!  -- George
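
A minimal sketch of the "process each file as it is read" suggestion, in
Python 3 syntax.  COMP and IUPAC are hypothetical stand-ins, since the
post never showed comp or iupac, and the function names are made up for
illustration.  The file name is also joined to the directory, which the
posted loop did not do.

```python
import os

# Hypothetical stand-ins: the original post did not define these.
COMP = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
IUPAC = set(COMP)


def check_dna(seq):
    """Return the reverse complement, or None if seq has unknown characters."""
    junk = [(c, i) for i, c in enumerate(seq) if c not in IUPAC]
    if junk:
        return None
    return "".join(COMP[c] for c in reversed(seq))


def process_directory(dirname):
    """Check every .Seq file inside the loop, one file at a time."""
    results = {}
    for name in sorted(os.listdir(dirname)):
        if name.endswith(".Seq"):
            with open(os.path.join(dirname, name)) as f:
                seq = f.read().strip()
            # The work happens here, before the next file replaces seq.
            results[name] = check_dna(seq)
    return results
```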

 #I got the following ( as you see only one file is proceed by the
function even if more files is in extseq

 ['B1-11_win3F_B04_04.ab1.Seq']
 ['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq']
 ['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq',
'B1-18_win3F_D04_08.ab1.Seq']
 ['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq',
'B1-18_win3F_D04_08.ab1.Seq', 'B1-18_win3R_E04_10.ab1.Seq']
 ['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq',
'B1-18_win3F_D04_08.ab1.Seq', 'B1-18_win3R_E04_10.ab1.Seq',
'B1-19_win3F_F04_12.ab1.Seq']
 ..
 ['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq',
'B1-18_win3F_D04_08.ab1.Seq', 'B1-18_win3R_E04_10.ab1.Seq',
'B1-19_win3F_F04_12.ab1.Seq', 'B1-19_win3R_G04_14.ab1.Seq',
'B90_win3F_H04_16.ab1.Seq', 'B90_win3R_A05_01.ab1.Seq',
'DL2-11_win3F_H03_15.ab1.Seq', 'DL2-11_win3R_A04_02.ab1.Seq',
'DL2-12_win3F_F03_11.ab1.Seq', 'DL2-12_win3R_G03_13.ab1.Seq',
'M7757_win3F_B05_03.ab1.Seq', 'M7757_win3R_C05_05.ab1.Seq',
'M7759_win3F_D05_07.ab1.Seq', 'M7759_win3R_E05_09.ab1.Seq',
'TCR700-114_win3F_H05_15.ab1.Seq', 'TCR700-114_win3R_A06_02.ab1.Seq',
'TRC666-100_win3F_F05_11.ab1.Seq', 'TRC666-100_win3R_G05_13.ab1.Seq']

 after this listing my programs proceed only the last element of this
listing (TRC666-100_win3R_G05_13.ab1.Seq)


NNTCCCGAAGTGTCCCAGAGCAAATAAATGGACCCGTAGAATACTTGAACGTGTAATCTCAAA



Re: formatting file

2005-04-06 Thread gry
SPJ wrote:
 I am new to python hence posing this question.
 I have a file with the following format:

 test1   1.1-1   installed
 test1   1.1-1   update
 test2   2.1-1   installed
 test2   2.1-2   update

 I want the file to be formatted in the following way:

 test1   1.1-1   1.1-2
 test2   2.1-1   2.1-2

For data that has a clear tree structure with keys, a quick solution
is often a dictionary, or dictionary of dictionaries.  The setdefault
idiom below is very handy for this sort of thing.  The test name
"test1" is the key to the top dict.  The operation "update" is the key
to the sub dictionary.  Setdefault returns the dict for the specified
test, or a new dict if there is none.

.d={}
.for line in open('tests.txt'):
.    test,version,operation = line.split()
.    d.setdefault(test,{})[operation] = version

.for test,ops in d.items():
.    print test, ops['installed'], ops['update']
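
The same idiom as a self-contained sketch in Python 3 syntax, reading
from a list of lines instead of a file so it runs as-is.  The function
name summarize and the sample data (with the second line corrected to
1.1-2) are assumptions for illustration.

```python
def summarize(lines):
    """Collapse 'test version operation' lines into one line per test."""
    d = {}
    for line in lines:
        test, version, operation = line.split()
        # setdefault returns the existing sub-dict, or installs a new one.
        d.setdefault(test, {})[operation] = version
    return ["%s %s %s" % (test, ops['installed'], ops['update'])
            for test, ops in sorted(d.items())]


DATA = [
    "test1 1.1-1 installed",
    "test1 1.1-2 update",
    "test2 2.1-1 installed",
    "test2 2.1-2 update",
]
report = summarize(DATA)
```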

[BTW, your given test data appears to have been
 wrong: "test1 1.1-1 update"; please be careful not to waste
 the time of people who try to help...]

-- George



Re: shuffle the lines of a large file

2005-03-07 Thread gry
As far as I can tell, what you ultimately want is to be able to extract
a random (representative?) subset of sentences.  Given the huge size
of data, I would suggest not randomizing the file, but randomizing
accesses to the file.  E.g. (sorry for off-the-cuff pseudo python):
[adjust 8192 == 2**13 to your disk block size]
. while True:
.     byteno = random.randint(0,length_of_file)
.     #align to disk block to avoid unnecessary IO
.     byteno = (byteno >> 13) << 13  #zero out the bottom 13 bits
.     f.seek(byteno) #set the file pointer to a random position
.     bytes = f.read(8192) #read one block
.     sentences = bytes.splitlines()[1:-1] #omit ends with partial lines
.     do_something(sentences)

If you only need 1000 sentences, use only one sentence from each block;
if you need 1M, then use them all.
[I hope I understood your problem]
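
A runnable version of the pseudo python, as a sketch in Python 3 syntax.
It assumes a non-empty file with one sentence per line and a block size
of 8192 bytes; the function name sample_block_lines is made up for
illustration.

```python
import os
import random

BLOCK_BITS = 13
BLOCK_SIZE = 1 << BLOCK_BITS  # 8192 bytes; adjust to the disk block size


def sample_block_lines(path, rng=random):
    """Return the complete lines found in one randomly chosen block."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        byteno = rng.randrange(size)
        # Align to a block boundary by zeroing the bottom 13 bits.
        byteno = (byteno >> BLOCK_BITS) << BLOCK_BITS
        f.seek(byteno)
        block = f.read(BLOCK_SIZE)
    # The first and last pieces may be partial lines, so drop them.
    return block.splitlines()[1:-1]
```

Each call costs one seek and one block read, regardless of file size.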

-- george



Re: converting time tuple to datetime struct

2005-03-03 Thread gry
[your %b is supposed to be the abbreviated month name, not the
number.  Try %m]

In [19]: datetime.datetime(*time.strptime("20-3-2005","%d-%m-%Y")[:6])
Out[19]: datetime.datetime(2005, 3, 20, 0, 0)
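
The same conversion wrapped in a function, as a small sketch; the name
parse_day_month_year is made up for illustration.  The first six fields
of the struct_time are year, month, day, hour, minute and second, which
is exactly what the datetime constructor takes.

```python
import datetime
import time


def parse_day_month_year(text):
    """Convert a 'day-month-year' string with a numeric month to a datetime."""
    fields = time.strptime(text, "%d-%m-%Y")
    return datetime.datetime(*fields[:6])


when = parse_day_month_year("20-3-2005")
```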

Cheers,
   George



Re: PyAC 0.1.0

2005-03-01 Thread gry

Premshree Pillai wrote:
 PyAC 0.1.0 (http://sourceforge.net/projects/pyac/)

 * ignores non-image files
 * optional arg is_ppt for ordering presentation images (eg.,
 Powerpoint files exported as images)
 * misc fixes

 Package here:
http://sourceforge.net/project/showfiles.php?group_id=106998&package_id=115396&release_id=309010

A few suggestions:

   Always include in the announcement a brief description of what the
software does -- most people will not bother to track down the link to
decide if the package is of interest to them.

   Also mention any dependencies not included in the standard
installation, e.g. "requires PIL".

   You should probably look at the distutils
(http://docs.python.org/dist/dist.html) module for a clean
platform-independent way for users to install your module.  Editing
paths in the source code is not too cool ;-).

And thanks for contributing to the python community!  Have fun!



Re: Initializing subclasses of tuple

2005-03-01 Thread gry
To inherit from an immutable class, like string or tuple, you need to
use the __new__ member, not __init__.  See, e.g.:

http://www.python.org/2.2.3/descrintro.html#__new__
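
A minimal sketch of the pattern; the class name Point and its two
arguments are made up for illustration.  The tuple's contents have to be
supplied in __new__, because by the time __init__ runs the immutable
object already exists.

```python
class Point(tuple):
    """A two-element tuple with named access to its items."""

    def __new__(cls, x, y):
        # Build the tuple here; __init__ would be too late to set contents.
        return super(Point, cls).__new__(cls, (x, y))

    @property
    def x(self):
        return self[0]

    @property
    def y(self):
        return self[1]


p = Point(1, 2)
```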



Re: Postgres COPY Command with python 2.3 pg

2005-02-17 Thread gry
import pg
db = pg.DB('bind9', '192.168.192.2', 5432, None, None, 'named', None)
db.query('create temp table fffz(i int,t text)')
db.query('copy fffz from stdin')
db.putline("3\t'the'")
db.putline("4\t'rain'")
db.endcopy()
db.query('commit')

Note that multiple columns must be separated by tabs ('\t') (unless you
specify copy mytable with delimiter ...).



Re: change windows system path from cygwin python?

2004-12-16 Thread gry
The _winreg api looks helpful; unfortunately, I'm trying to stick to
what can be got from the cygwin install -- no _winreg.  Simplicity of
installation is quite important.  (I'm using cygwin to get the xfree86
X-server, which is the whole point of this exercise.)

I have found the cygwin command-line "regtool" for munging the
registry, so I plan to use that via os.popen.
Thanks all for pointing me to the right place in the registry!



change windows system path from cygwin python?

2004-12-15 Thread gry
[Windows XP Pro, cygwin python 2.4, *nix hacker, windows newbie]

I want to write some kind of install script for my python app that
will add c:\cygwin\usr\bin to the system path.  I don't want
to walk around to 50 PC's and twiddle through the GUI to:

My Computer -- Control Panel -- System -- Advanced -- Environment


How can a python, or even a .bat script modify the system PATH?
It doesn't appear to be in the registry.
