[ANNOUNCE] Twisted 10.2.0 Released

2010-11-30 Thread Glyph Lefkowitz
Twisted 10.2.0, the third Twisted release of 2010, has emerged from the 
mysterious depths of Twisted Matrix Labs, as so many releases before it.  
Survivors of the release process - what few there were of them - have been 
heard to claim that this version is awesome, even more robust, fun-sized 
and oven fresh.

Crossing several things that shouldn't ought to be, including the streams and 
the rubicon, I have assumed the triple responsibilities of feature author, 
project leader, *and* release manager for 10.2: with this dark and terrible 
power - a power which no man ought to wield alone - I have wrought a release 
which contains many exciting new features, including:

- A plug-in API for adding new types of endpoint descriptions. 
http://tm.tl/4695
- A new, simpler, substantially more robust CoreFoundation reactor.  
http://tm.tl/1833
- Improvements to the implementation of Deferred which should both improve 
performance
  and fix certain runtime errors with long callback chains. 
http://tm.tl/411
- Deferred.setTimeout is (finally) gone.  To quote the author of this 
change:
  A new era of peace has started.  http://tm.tl/1702
- NetstringReceiver is substantially faster. http://tm.tl/4378

And, of course, nearly one hundred smaller bug fixes, documentation updates, 
and general improvements.  See the NEWS file included in the release for more 
details.

Look upon our Twisted, ye mighty, and make your network applications 
event-driven: get it now, from:

http://twistedmatrix.com/

... or simply install the 'Twisted' package from PyPI.

Many thanks to Christopher Armstrong, for his work on release-automation tools 
that made this possible; to Jonathan Lange, for thoroughly documenting the 
process and thereby making my ascent to the throne of release manager possible; 
and to Jean-Paul Calderone, for his tireless maintenance of our build and test 
infrastructure as well as his help with the release.

Most of all, thanks to everyone who contributed a patch, reported a bug or 
reviewed a ticket for 10.2.  Not including those already thanked, there are 41 
of you, so it would be a bit tedious to go through everyone, but you know who 
you are and we absolutely couldn't do it without you!  Thanks a ton!

-- 
http://mail.python.org/mailman/listinfo/python-announce-list

Support the Python Software Foundation:
http://www.python.org/psf/donations/


Possible to determine number of rows affected by a SQLite update or delete command?

2010-11-30 Thread python
Is there a cursor or connection property that returns the number
of rows affected by a SQLite update or delete command?

Or, if we want this information, do we have to pre-query our
database for a count of records that will be affected by an
operation?

Thank you,
Malcolm
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Possible to determine number of rows affected by a SQLite update or delete command?

2010-11-30 Thread Kushal Kumaran
On Tue, Nov 30, 2010 at 2:29 PM,  pyt...@bdurham.com wrote:
 Is there a cursor or connection property that returns the number of rows
 affected by a SQLite update or delete command?


The cursor has a rowcount attribute.  The documentation of the sqlite3
module says the implementation is quirky.  You might take a look at
it and see if it fits your needs.
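
For example, a minimal sketch of checking rowcount after a DELETE (the
table and data below are made up for illustration):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE t (id INTEGER)")
    cur.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
    cur.execute("DELETE FROM t WHERE id > 1")
    print(cur.rowcount)  # 2 rows affected; may be -1 in the quirky cases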

 Or, if we want this information, do we have to pre-query our database for a
 count of records that will be affected by an operation?


-- 
regards,
kushal
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Marc Christiansen
Dan Stromberg drsali...@gmail.com wrote:
 I've got a couple of programs that read filenames from stdin, and then
 open those files and do things with them.  These programs sort of do
 the *ix xargs thing, without requiring xargs.
 
 In Python 2, these work well.  Irrespective of how filenames are
 encoded, things are opened OK, because it's all just a stream of
 single byte characters.
 
 In Python 3, I'm finding that I have encoding issues with characters
 with their high bit set.  Things are fine with strictly ASCII
 filenames.  With high-bit-set characters, even if I change stdin's
 encoding with:
 
       import io
       STDIN = io.open(sys.stdin.fileno(), 'r', encoding='ISO-8859-1')
 
 ...even with that, when I read a filename from stdin with a
 single-character Spanish n~, the program cannot open that filename
 because the n~ is apparently internally converted to two bytes, but
 remains one byte in the filesystem.  I decided to try ISO-8859-1 with
 Python 3, because I have a Java program that encountered a similar
 problem until I used en_US.ISO-8859-1 in an environment variable to
 set the JVM's encoding for stdin.
 
 Python 2 shows the n~ as 0xf1 in an os.listdir('.').  Python 3 with an
 encoding of ISO-8859-1 wants it to be 0xc3 followed by 0xb1.
 
 Does anyone know what I need to do to read filenames from stdin with
 Python 3.1 and subsequently open them, when some of those filenames
 include characters with their high bit set?
 
 TIA!

Try using sys.stdin.buffer instead of sys.stdin. It gives you bytes
instead of strings. Also use byte literals instead of string literals for
paths, i.e. os.listdir(b'.').
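
A minimal sketch of that approach, assuming one filename per line on stdin
(untested, just to show the idea):

    import sys

    for raw in sys.stdin.buffer:          # raw bytes, no decoding
        name = raw.rstrip(b"\n")
        with open(name, "rb") as f:       # open() accepts bytes filenames
            data = f.read()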

Marc
-- 
http://mail.python.org/mailman/listinfo/python-list


Memory issues when storing as List of Strings vs List of List

2010-11-30 Thread OW Ghim Siong

Hi all,

I have a big file, 1.5GB in size, with about 6 million lines of 
tab-delimited data. I have to perform some filtration on the data and 
keep only the good data. After filtration, I have about 5.5 million rows 
remaining. As you might have already guessed, I have to read them in batches, 
and I did so using .readlines(1). After reading each batch, I 
split each line (a string) into a list using .split("\t") and 
then check several conditions; if all conditions are 
satisfied, I store the list into a matrix.


The code is as follows:
-Start--
a = open("bigfile")
matrix = []
while True:
    lines = a.readlines(1)
    for line in lines:
        data = line.split("\t")
        if several_conditions_are_satisfied:
            matrix.append(data)
    print "Number of lines read:", len(lines), "matrix.__sizeof__:", matrix.__sizeof__()

    if len(lines) == 0:
        break
-End-

Results:
Number of lines read: 461544 matrix.__sizeof__: 1694768
Number of lines read: 449840 matrix.__sizeof__: 3435984
Number of lines read: 455690 matrix.__sizeof__: 5503904
Number of lines read: 451955 matrix.__sizeof__: 6965928
Number of lines read: 452645 matrix.__sizeof__: 8816304
Number of lines read: 448555 matrix.__sizeof__: 9918368

Traceback (most recent call last):
MemoryError

The peak memory usage according to the task manager is over 2GB, which results 
in the memory error.


However, if I modify the code to store a list of strings rather than 
a list of lists, by changing the append statement above to 
matrix.append("\t".join(data)), then I do not run out of memory.


Results:
Number of lines read: 461544 matrix.__sizeof__: 1694768
Number of lines read: 449840 matrix.__sizeof__: 3435984
Number of lines read: 455690 matrix.__sizeof__: 5503904
Number of lines read: 451955 matrix.__sizeof__: 6965928
Number of lines read: 452645 matrix.__sizeof__: 8816304
Number of lines read: 448555 matrix.__sizeof__: 9918368
Number of lines read: 453455 matrix.__sizeof__: 12552984
Number of lines read: 432440 matrix.__sizeof__: 14122132
Number of lines read: 432921 matrix.__sizeof__: 15887424
Number of lines read: 464259 matrix.__sizeof__: 17873376
Number of lines read: 450875 matrix.__sizeof__: 20107572
Number of lines read: 458552 matrix.__sizeof__: 20107572
Number of lines read: 453261 matrix.__sizeof__: 22621044
Number of lines read: 413456 matrix.__sizeof__: 22621044
Number of lines read: 166464 matrix.__sizeof__: 25448700
Number of lines read: 0 matrix.__sizeof__: 25448700

In this case, the peak memory according to the task manager is about 1.5 GB.

Does anyone know why there is such a big difference in memory usage when 
storing the matrix as a list of lists versus storing it as a list of 
strings? According to __sizeof__, though, the values are the same whether 
it is stored as a list of lists or as a list of strings. Is 
there any method by which I can store all the info as a list of lists? I 
have tried creating a matrix of equivalent size and it only uses 
35MB of memory, but I am not sure why, when using the code above, the 
memory usage shoots up so fast and exceeds 2GB.


Any advice is greatly appreciated.

Regards,
Jinxiang
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Peter Otten
Dan Stromberg wrote:

 I've got a couple of programs that read filenames from stdin, and then
 open those files and do things with them.  These programs sort of do
 the *ix xargs thing, without requiring xargs.
 
 In Python 2, these work well.  Irrespective of how filenames are
 encoded, things are opened OK, because it's all just a stream of
 single byte characters.

I think you're wrong. The filenames' encoding as they are read from stdin 
must be the same as the encoding used by the file system. If the file system 
expects UTF-8 and you feed it ISO-8859-1 you'll run into errors.

You always have to know either

(a) both the file system's and stdin's actual encoding, or 
(b) that both encodings are the same.

If byte strings work you are in situation (b) or just lucky. I'd guess the 
latter ;)
 
 In Python 3, I'm finding that I have encoding issues with characters
 with their high bit set.  Things are fine with strictly ASCII
 filenames.  With high-bit-set characters, even if I change stdin's
 encoding with:
 
 import io
 STDIN = io.open(sys.stdin.fileno(), 'r', encoding='ISO-8859-1')

I suppose you can handle (b) with

STDIN = sys.stdin.buffer

or

STDIN = io.TextIOWrapper(sys.stdin.buffer,
 encoding=sys.getfilesystemencoding())

in Python 3. I'd prefer the latter because it makes your assumptions 
explicit. (Disclaimer: I'm not sure whether I'm using the io API as Guido 
intended it)

 ...even with that, when I read a filename from stdin with a
 single-character Spanish n~, the program cannot open that filename
 because the n~ is apparently internally converted to two bytes, but
 remains one byte in the filesystem.  I decided to try ISO-8859-1 with
 Python 3, because I have a Java program that encountered a similar
 problem until I used en_US.ISO-8859-1 in an environment variable to
 set the JVM's encoding for stdin.
 
 Python 2 shows the n~ as 0xf1 in an os.listdir('.').  Python 3 with an
 encoding of ISO-8859-1 wants it to be 0xc3 followed by 0xb1.
 
 Does anyone know what I need to do to read filenames from stdin with
 Python 3.1 and subsequently open them, when some of those filenames
 include characters with their high bit set?
 
 TIA!


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Memory issues when storing as List of Strings vs List of List

2010-11-30 Thread Ulrich Eckhardt
OW Ghim Siong wrote:
 I have a big file 1.5GB in size, with about 6 million lines of
 tab-delimited data.

How many fields are there on each line?

 I have to perform some filtration on the data and 
 keep the good data. After filtration, I have about 5.5 million data left
 remaining. As you might already guessed, I have to read them in batches
 and I did so using .readlines(1).

I'd have guessed differently. Typically, I would say that you read one line,
apply whatever operation you want to it and then write out the result. At
least that is the typical operation of filtering.

 a=open(bigfile)

I guess you are on MS Windows. There, you have different handling of textual
and non-textual files with regards to the handling of line endings.
Generally, using non-textual as input is easier, because it doesn't require
any translations. However, textual input is the default, therefore:

  a = open("bigfile", "rb")

Or, even better:

 with open("bigfile", "rb") as a:

to make sure the file is closed correctly and in time.

 matrix=[]
 while True:
 lines = a.readlines(1)
 for line in lines:

I believe you could do

  for line in a:
  # use line here

 data=line.split("\t")

Question here: How many elements does each line contain? And what is their
content? The point is that each object has its overhead, and if the content
is just e.g. an integral number or a short string, the ratio of interesting
content to overhead is rather bad! Compare this to storing a longer string
with just the overhead of a single string object instead; the difference
should be obvious.

 However, if I modify the code, to store as a list of string rather than
 a list of list by changing the append statement stated above to
 matrix.append("\t".join(data)), then I do not run out of memory.

You already have the result of that join:

  matrix.append(line)

 Does anyone know why is there such a big difference memory usage when
 storing the matrix as a list of list, and when storing it as a list of
 string? According to __sizeof__ though, the values are the same whether
 storing it as a list of list, or storing it as a list of string.

I can barely believe that. How are you using __sizeof__? Why aren't you
using sys.getsizeof() instead? Are you aware that the size of a list
doesn't include the size for its content (even though it grows with the
number of elements), while the size of a string does?


 Is there any methods how I can store all the info into a list of list? I
 have tried creating such a matrix of equivalent size and it only uses
 35mb of memory but I am not sure why when using the code above, the
 memory usage shot up so fast and exceeded 2GB.

The size of an empty list is 20 here, plus 4 per element (makes sense on a
32-bit machine), excluding the elements themselves. That means that you
have around 6.4M elements (25448700/4). These take around 25MB of memory,
which is roughly what you are probably seeing. The point is that your 35MB don't
include any content, probably just a single interned integer or None, so
that all elements of your list are the same and only require memory once.
In your real-world application that is obviously not so.

My suggestions:
1. Find out what exactly is going on here, in particular why our
interpretations of the memory usage differ.
2. Redesign your code to really use a filtering design, i.e. don't keep the
whole data in memory.
3. If you still have memory issues, take a look at the array module, which
should make storage of large arrays a bit more efficient (a rough sketch
follows below).
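
A rough sketch of suggestion 3, assuming one of the columns holds
floating-point numbers (the column index and typecode are made up):

    from array import array

    values = array("d")                    # compact C-double storage
    with open("bigfile") as f:
        for line in f:
            data = line.split("\t")
            if several_conditions_are_satisfied:   # placeholder from the thread
                values.append(float(data[2]))      # hypothetical numeric column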


Good luck!

Uli

-- 
Domino Laser GmbH
Managing Director: Thorsten Föcking, Hamburg District Court HR B62 932

-- 
http://mail.python.org/mailman/listinfo/python-list


ANNOUNCE: NHI1-0.10, PLMK-1.8 und libmsgque-4.8

2010-11-30 Thread Andreas Otto
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Dear User,


ANNOUNCE:Major Feature Release


  libmsgque: Application-Server-Toolkit for
 C, C++, JAVA, C#, Go, TCL, PERL, PHP, PYTHON, RUBY, VB.NET
  PLMK:  Programming-Language-Microkernel
  NHI1:  Non-Human-Intelligence #1


STATEMENT
=

It takes 2 years
and a team of qualified software developers
to implement a new programming language,
but it takes only 2 weeks to add a micro-kernel
- - aotto1968


SUMMARY
===


Add support for the programming language Go from Google


LINKS
=

  UPDATE - PLMK definition
   
http://openfacts2.berlios.de/wikien/index.php/BerliosProject:NHI1_-_TheKernel
  ChangeLog:
http://nhi1.berlios.de/theLink/changelog.htm
  libmsgque including PHP documentation:
http://nhi1.berlios.de/theLink/index.htm
  NHI1:
http://nhi1.berlios.de/
  DOWNLOAD:
http://developer.berlios.de/projects/nhi1/
  Go man pages:
reference: gomsgqueref.n
tutorial:  gomsgquetut.n



mfg, Andreas Otto (aotto1968)
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.15 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJM9OZsAAoJEGTcPijNG3/A+qwH/1WT3K8619eLzQ78dylS623r
qrZtHXRxieD+4GIBgkU7KbNu+LGztxasLW9upafmmF2mGcWtIFuiOEJtw6MJM+07
0X7elXM5WZkXK65dbLE5bbSfO0DHw5T6aIweogA3zjcjDbB3rSC/T6WIlZB4HNYh
nBj9xC6WMP7s/jEjs4i5FCRT6gTRzDDJbR+SXqNEEYc/z8wVKPUDfpU/6JGxl9MV
rPSUsO+YdZX0XI7+imiUYSVyt+kniL3C36kGON/qGDahscoQYFS6GdoI5XDzI0c+
jN7Q2Ecrphd5F5G/2plNLbVy4mPVd9k/I8VjXMaHLm+skT2Z4Zt7aF29A1FFw68=
=/O74
-END PGP SIGNATURE-
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Memory issues when storing as List of Strings vs List of List

2010-11-30 Thread Peter Otten
OW Ghim Siong wrote:

 Hi all,
 
 I have a big file 1.5GB in size, with about 6 million lines of
 tab-delimited data. I have to perform some filtration on the data and
 keep the good data. After filtration, I have about 5.5 million data left
 remaining. As you might already guessed, I have to read them in batches
 and I did so using .readlines(1). After reading each batch, I
 will split the line (in string format) to a list using .split("\t") and
 then check several conditions, after which if all conditions are
 satisfied, I will store the list into a matrix.
 
 The code is as follows:
 -Start--
 a = open("bigfile")
 matrix = []
 while True:
     lines = a.readlines(1)
     for line in lines:
         data = line.split("\t")
         if several_conditions_are_satisfied:
             matrix.append(data)
     print "Number of lines read:", len(lines), "matrix.__sizeof__:", matrix.__sizeof__()
     if len(lines) == 0:
         break
 -End-

As Ulrich says, don't use readlines(), use

for line in a:
   ... 

that way you have only one line in memory at a time instead of the huge 
lines list.

 Results:
 Number of lines read: 461544 matrix.__sizeof__: 1694768
 Number of lines read: 449840 matrix.__sizeof__: 3435984
 Number of lines read: 455690 matrix.__sizeof__: 5503904
 Number of lines read: 451955 matrix.__sizeof__: 6965928
 Number of lines read: 452645 matrix.__sizeof__: 8816304
 Number of lines read: 448555 matrix.__sizeof__: 9918368
 
 Traceback (most recent call last):
 MemoryError
 
 The peak memory usage at the task manager is  2GB which results in the
 memory error.
 
 However, if I modify the code, to store as a list of string rather than
 a list of list by changing the append statement stated above to
 matrix.append(\t.join(data)), then I do not run out of memory.
 
 Results:
 Number of lines read: 461544 matrix.__sizeof__: 1694768
 Number of lines read: 449840 matrix.__sizeof__: 3435984
 Number of lines read: 455690 matrix.__sizeof__: 5503904
 Number of lines read: 451955 matrix.__sizeof__: 6965928
 Number of lines read: 452645 matrix.__sizeof__: 8816304
 Number of lines read: 448555 matrix.__sizeof__: 9918368
 Number of lines read: 453455 matrix.__sizeof__: 12552984
 Number of lines read: 432440 matrix.__sizeof__: 14122132
 Number of lines read: 432921 matrix.__sizeof__: 15887424
 Number of lines read: 464259 matrix.__sizeof__: 17873376
 Number of lines read: 450875 matrix.__sizeof__: 20107572
 Number of lines read: 458552 matrix.__sizeof__: 20107572
 Number of lines read: 453261 matrix.__sizeof__: 22621044
 Number of lines read: 413456 matrix.__sizeof__: 22621044
 Number of lines read: 166464 matrix.__sizeof__: 25448700
 Number of lines read: 0 matrix.__sizeof__: 25448700
 
 In this case, the peak memory according to the task manager is about 1.5
 GB.
 
 Does anyone know why is there such a big difference memory usage when
 storing the matrix as a list of list, and when storing it as a list of
 string? According to __sizeof__ though, the values are the same whether
 storing it as a list of list, or storing it as a list of string. Is

sizeof gives you the shallow size of the list, basically the memory to 
hold C pointers to the items in the list. A better approximation for the 
total size of a list of lists of string is

>>> from sys import getsizeof as sizeof
>>> matrix = [["alpha", "beta"], ["gamma", "delta"]]
>>> sizeof(matrix), sum(sizeof(row) for row in matrix), sum(sizeof(entry)
... for row in matrix for entry in row)
(88, 176, 179)
>>> sum(_)
443

As you can see the outer list requires only a small portion of the total 
memory, and its relative size will decrease as the matrix grows.

The above calculation may still be wrong because some of the strings could 
be identical. Collapsing identical strings into a single object is also a 
way to save memory if you have a significant number of repetitions. Try

matrix = []
with open(...) as f:
    for line in f:
        data = line.split("\t")
        if ...:
            matrix.append(map(intern, data))

to see whether it sufficiently reduces the amount of memory needed.

 there any methods how I can store all the info into a list of list? I
 have tried creating such a matrix of equivalent size and it only uses
 35mb of memory but I am not sure why when using the code above, the
 memory usage shot up so fast and exceeded 2GB.
 
 Any advice is greatly appreciated.
 
 Regards,
 Jinxiang

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: remote control firefox with python

2010-11-30 Thread Hans-Peter Jansen
On Sunday 28 November 2010, 16:22:33 News123 wrote:
 Hi,


 I wondered whether there is a simpe way to
 'remote' control fire fox with python.


 With remote controlling I mean:
 - enter a url in the title bar and click on it
 - create a new tab
 - enter another url click on it
 - save the html document of this page
 - Probably the most difficult one: emulate a click or 'right click'
 on a certain button or link of the current page.
 - other interesting things would be to be able to enter the master
   password from a script
 - to enable disable proxy settings while running.

 The reason why I want to stay within Firefox and not use any other
 'mechanize' frame work is, that the pages I want to automate might
 contain a lot of javascript for the construction of the actual page.

If webkit-based rendering is an option (since its javascript engine is 
respected by web developers nowadays..), you might want to check out  
PyQt, based on current versions of Qt. It provides very easy access to 
a full featured web browser engine without sacrificing low level 
details. All your requirements are provided easily (if you're able to 
grok the Qt documentation, e.g. ignore all C++ clutter, you're set).
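
As a rough illustration (PyQt4 module and signal names as I remember them;
treat the details as assumptions, not a tested recipe):

    import sys
    from PyQt4.QtCore import QUrl
    from PyQt4.QtGui import QApplication
    from PyQt4.QtWebKit import QWebView

    app = QApplication(sys.argv)
    view = QWebView()
    view.load(QUrl("http://www.example.com/"))

    def on_load_finished(ok):
        # The page's JavaScript has run by now; save the rendered HTML.
        html = unicode(view.page().mainFrame().toHtml())
        open("page.html", "wb").write(html.encode("utf-8"))
        app.quit()

    view.loadFinished.connect(on_load_finished)
    app.exec_()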

I've transcoded all available QtWebKit examples to python lately, 
available here:

http://www.riverbankcomputing.com/pipermail/pyqt/2010-November/028614.html

The attachment is a tar.bz2 archive, btw.

Clicking is achieved by:

webelement.evaluateJavaScript("""
    var event = document.createEvent('MouseEvents');
    event.initEvent('click', true, true);
    this.dispatchEvent(event);
""")

Cheers,
Pete
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: TDD in python

2010-11-30 Thread Roy Smith
In article 
58fe3680-21f5-42f8-9341-e069cbb88...@r19g2000prm.googlegroups.com,
 rustom rustompm...@gmail.com wrote:

 Looking around I found this:
 http://bytes.com/topic/python/answers/43330-unittest-vs-py-test
 where Raymond Hettinger no less says quite unequivocally that he
 prefers test.py to builtin unittest
 because it is not so heavy-weight
 
 Is this the general consensus nowadays among pythonistas?
 [Note: I tend to agree, but I've no experience, so I'm asking]

Both frameworks have their fans; I doubt you'll find any consensus.

Pick one, learn it, and use it.  What's important is that you write 
tests, write lots of tests, and write good tests.  Which framework you 
use is a detail.
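
For what it's worth, a minimal unittest sketch (the tested function is made
up) to show how little ceremony is really required:

    import unittest

    def square(x):
        return x * x

    class TestSquare(unittest.TestCase):
        def test_square_of_three(self):
            self.assertEqual(square(3), 9)

    if __name__ == "__main__":
        unittest.main()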
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Albert Hopkins
On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote:
Dan Stromberg wrote:
 
  I've got a couple of programs that read filenames from stdin, and
then
  open those files and do things with them.  These programs sort of do
  the *ix xargs thing, without requiring xargs.
  
  In Python 2, these work well.  Irrespective of how filenames are
  encoded, things are opened OK, because it's all just a stream of
  single byte characters.
 
 I think you're wrong. The filenames' encoding as they are read from
stdin 
 must be the same as the encoding used by the file system. If the file
system 
 expects UTF-8 and you feed it ISO-8859-1 you'll run into errors.
 
 I think this is wrong.  In Unix there is no concept of filename
encoding.  Filenames can have any arbitrary set of bytes (except '/' and
'\0').   But the filesystem itself neither knows nor cares about
encoding.

You always have to know either
 
 (a) both the file system's and stdin's actual encoding, or 
 (b) that both encodings are the same.
 
 
If this is true, then I think that is the wrong thing to do in Python 3.  Any
language should be able to deal with the filenames that the host OS
allows.

Anyway, going on with the OP.. can you open stdin so that you can accept
arbitrary bytes instead of strings and then open using the bytes as the
filename? I don't have that much experience with Python3 to say for
sure.

-a


-- 
http://mail.python.org/mailman/listinfo/python-list


How does GC affect generator context managers?

2010-11-30 Thread Jason
I've been reading through the docs for contextlib and PEP 343, and
came across this:

Note that we're not guaranteeing that the finally-clause is
executed immediately after the generator object becomes unused,
even though this is how it will work in CPython.

...referring to context managers created via the
contextlib.contextmanager decorator containing cleanup code in a
finally clause. While I understand that Python-the-language does not
specify GC semantics, and different implementations can do different
things with that, what I don't get is how GC even relates to a context
manager created from a generator.

As I understood it, when the with block exits, the __exit__() method
is called immediately. This calls the next() method on the underlying
generator, which forces it to run to completion (and raise a
StopIteration), which includes the finally clause... right?
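
For reference, a minimal sketch of the kind of generator-based context
manager being discussed (acquire/release are hypothetical placeholders):

    from contextlib import contextmanager

    @contextmanager
    def managed_resource():
        resource = acquire()
        try:
            yield resource
        finally:
            release(resource)   # the cleanup the quoted note is about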

— Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


nike shoes , fashi on clothes ; brand hand bags

2010-11-30 Thread SA sada
 Dear customers, thank you for your support of our company.
Here, there's good news to tell you: The company recently
launched a number of new fashion items! ! Fashionable
and welcome everyone to come buy. If necessary, please
plut: http://www.vipshops.org ==

1) More pictures available on our website (= http://www.vipshops.org
)
2) Many colors available .
3) Perfect quality,
4) 100% safe door to door delivery,
Best reputation , Best services
Posted: 4:13 pm on November 21st
-- 
http://mail.python.org/mailman/listinfo/python-list


how to go on learning python

2010-11-30 Thread Xavier Heruacles
I'm basically a C/C++ programmer who recently came to Python for some web
development. Using Django and JavaScript I suppose I can develop a web
application now. But I often feel I'm not good at Python. I don't know much
about generators, descriptors and decorators (although I can use some of them
to accomplish things, I don't think I'm capable of knowing their
internals). I find my code ugly, and it seems nearly everything has already
been done by the libraries. When I want to do something, I just find some
library or module and then just finish the work. So I'm a bit tired of
just doing this kind of high-level scripting, only to find myself a bad
programmer. My question is: after one has coded some kind of basic app, how
can one keep on learning programming with Python?
Do some more interesting projects? Read more general books about
programming? or...?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Needed: Real-world examples for Python's Cooperative Multiple Inheritance

2010-11-30 Thread Sol Toure
Most of the examples presented here could use the decorator pattern instead,
especially the window system.


On Mon, Nov 29, 2010 at 5:27 PM, Gregory Ewing
greg.ew...@canterbury.ac.nzwrote:

 Paul Rubin wrote:

  The classic example though is a window system, where you have a window
 class, and a scroll bar class, and a drop-down menu class, etc. and
 if you want a window with a scroll bar and a drop-down menu, you inherit
 from all three of those classes.


 Not in any GUI library I've ever seen. Normally there would
 be three objects involved in such an arrangement, a Window,
 a ScrollBar and a DropDownMenu, connected to each other in
 some way.

 --
 Greg

 --
 http://mail.python.org/mailman/listinfo/python-list




-- 
http://www.afroblend.com
African news as it happens.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How does GC affect generator context managers?

2010-11-30 Thread Duncan Booth
Jason jason.hee...@gmail.com wrote:

 As I understood it, when the with block exits, the __exit__() method
 is called immediately. This calls the next() method on the underlying
 generator, which forces it to run to completion (and raise a
 StopIteration), which includes the finally clause... right?
 
That is true if the with block exits, but if the with block (or 
try..finally block) contains yield you have a generator. In that case 
if you simply drop the generator on the floor the cleanup at the end of the 
with will still happen, but maybe not until the generator is garbage 
collected.

def foo():
    with open("foo") as foo:
        for line in foo:
            yield line

...

bar = foo()
print bar.next()
del bar # May close the file now or maybe later...

   


-- 
Duncan Booth http://kupuguy.blogspot.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Programming games in historical linguistics with Python

2010-11-30 Thread Dax Bloom
Hello,

Following a discussion that began 3 weeks ago I would like to ask a
question regarding substitution of letters according to grammatical
rules in historical linguistics. I would like to automate the
transformation of words according to complex rules of phonology and
integrate that script in a visual environment.
Here follows the previous thread:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/3c55f9f044c3252f/fe7c2c82ecf0dbf5?lnk=gstq=evolutionary+linguistics#fe7c2c82ecf0dbf5

Is there a way to refer to vowels and consonants as a subcategory of
text? Is there a function to remove all vowels? How should one create
and order the dictionary file for the rules? How to chain several
transformations automatically from multiple rules? Finally can anyone
show me what existing python program or phonological software can do
this?
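
For the remove-all-vowels part, a minimal sketch with the re module (the
vowel set is just an illustration, not a phonologically complete one):

    import re

    def remove_vowels(word):
        return re.sub(r"[aeiou]", "", word, flags=re.IGNORECASE)

    print(remove_vowels("linguistics"))   # -> lngstcs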

What function could tag syllables, the word nucleus and the codas? How
easy is it to bridge this with a more visual environment where
interlinear, aligned text can be displayed with Greek notations and
braces as usual in the phonology textbooks?

Best regards,

Dax Bloom
-- 
http://mail.python.org/mailman/listinfo/python-list


Help: problem in setting the background colour ListBox

2010-11-30 Thread ton ph
Hi everyone ,
 I have a requirement to display my data in a textCtrl-like widget, but
I need the data in each row to be clickable,
so that when I click the data it fires an event and gives me
the selected data value. After a long
search I found ListBox to be perfect for my use, but when I try to set the
background colour to the colour required by my
application, I am not able to do so, though I am able to set the
foreground colour.
Hope someone will guide me in solving my problem.
Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Memory issues when storing as List of Strings vs List of List

2010-11-30 Thread Tim Chase

On 11/30/2010 04:29 AM, OW Ghim Siong wrote:

a = open("bigfile")
matrix = []
while True:
    lines = a.readlines(1)
    for line in lines:
        data = line.split("\t")
        if several_conditions_are_satisfied:
            matrix.append(data)
    print "Number of lines read:", len(lines), "matrix.__sizeof__:", matrix.__sizeof__()
    if len(lines) == 0:
        break


As others have mentiond, don't use .readlines() but use the 
file-object as an iterator instead.  This can even be rewritten 
as a simple list-comprehension:


  from csv import reader
  matrix = [data
            for data
            in reader(file('bigfile.txt', 'rb'), delimiter='\t')
            if several_conditions_are_satisfied(data)
            ]

Assuming that you're throwing away most of the data (the final 
matrix fits well within memory, even if the source file doesn't).


-tkc



--
http://mail.python.org/mailman/listinfo/python-list


Re: Python 2.7.1

2010-11-30 Thread Antoine Pitrou
On Mon, 29 Nov 2010 15:11:28 -0800 (PST)
Spider matt...@cuneiformsoftware.com wrote:
  2.7 includes many features that were first released in Python 3.1. The 
  faster io module ...
 
 I understand that I/O in Python 3.0 was slower than 2.x (due to quite
 a lot of the code being in Python rather than C, I gather), and that
 this was fixed up in 3.1. So, io in 3.1 is faster than in 3.0.
 
 Is it also true that io is faster in 2.7 than 2.6? That's what the
 release notes imply, but I wonder whether that comment has been back-
 ported from the 3.1 release notes, and doesn't actually apply to 2.7.

The `io` module, which was backported from 3.1/3.2, is faster than in
2.6, but that's not what is used by default in 2.x when calling e.g.
open() or file() (you'd have to use io.open() instead).

So, as you suspect, the speed of I/O in 2.7 hasn't changed. The `io`
module is available in 2.6/2.7 so that you can experiment with some 3.x
features without switching, and in this case it's much faster than 2.6.
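
For example, to opt into the backported implementation explicitly in 2.7
(a minimal sketch; the file name is made up):

    import io

    # io.open() returns the new, faster file objects backported from 3.x;
    # the builtin open()/file() in 2.x still use the old implementation.
    with io.open("data.txt", "r", encoding="utf-8") as f:
        for line in f:
            pass  # lines arrive as unicode here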

Regards

Antoine.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Antoine Pitrou
On Mon, 29 Nov 2010 21:52:07 -0800 (PST)
Yingjie Lan lany...@yahoo.com wrote:
 --- On Tue, 11/30/10, Dan Stromberg drsali...@gmail.com wrote:
  In Python 3, I'm finding that I have encoding issues with
  characters
  with their high bit set.  Things are fine with strictly
  ASCII
  filenames.  With high-bit-set characters, even if I
  change stdin's
  encoding with:
 
 Co-ask. I have also had problems with file names in
 Chinese characters with Python 3. I unzipped the 
 turtle demo files into the desktop folder (of
 course, the word 'desktop' is in Chinese, it is
 a windows XP system, localization is Chinese), then
 all in a sudden some of the demos won't work
 anymore. But if I move it to a folder whose 
 path contains only english characters, everything
 comes back to normal.

Can you try the latest 3.2alpha4 (*) and check if this is fixed?
If not, then could you please open a bug on http://bugs.python.org ?

(*) http://python.org/download/releases/3.2/

Thank you

Antoine.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Memory issues when storing as List of Strings vs List of List

2010-11-30 Thread Antoine Pitrou
On Tue, 30 Nov 2010 18:29:35 +0800
OW Ghim Siong o...@bii.a-star.edu.sg wrote:
 
 Does anyone know why is there such a big difference memory usage when 
 storing the matrix as a list of list, and when storing it as a list of 
 string?

That's because any object has a fixed overhead (related to metadata and
allocation), so storing a matrix line as a sequence of several objects
rather than a single string makes the total overhead larger,
especially when the payload of each object is small.

If you want to mitigate the issue, you could store your lines as tuples
rather than lists, since tuples have a smaller memory footprint:

matrix.append(tuple(data))

 According to __sizeof__ though, the values are the same whether 
 storing it as a list of list, or storing it as a list of string.

As mentioned by others, __sizeof__ only gives you the size of the
container, not the size of the contained values (which is where the
difference is here).

Regards

Antoine.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to go on learning python

2010-11-30 Thread Alice Bevan–McGregor

Howdy Xavier!

[Apologies for the length of this; I didn't expect to write so much!]

I've been a Python programmer for many years now (having come from a 
PHP, Perl, C, and Pascal background) and I'm constantly learning new 
idioms and ways of doing things that are more Pythonic; cleaner, more 
efficient, or simply more beautiful.  I learn by coding, rather than by 
reading books, taking lectures, or sitting idly watching screencasts.  
I constantly try to break the problems I come up with in my head into 
smaller and smaller pieces, then write the software for those pieces in 
as elegant a method as possible.


Because of my "turtles all the way down" design philosophy, a lot of my 
spare time projects have no immediate demonstrable benefit; I code them 
for fun!  I have a folder full of hundreds of these little projects, 
the vast majority of which never see a public release.  I also collect 
little snippets of code that I come across[1] or write, and often 
experiment with performance tests[2] of small Python snippets.


Often I'll assign myself the task of doing something far outside my 
comfort zone; a recent example is writing a HTTP/1.1 web server.  I had 
no idea how to do low-level socket programming in Python, let alone how 
HTTP actually worked under-the-hood, and because my goal wasn't 
(originally) to produce a production-quality product for others it gave 
me the freedom to experiment, rewrite, and break things in as many ways 
as I wanted.  :)  I had people trying to convince me that I shouldn't 
re-invent the wheel ("just use Twisted!") though they misunderstood 
the reason for my re-invention: to learn.


It started as a toy 20-line script to dump a static HTTP/1.0 response 
on each request and has grown into a ~270 line fully HTTP/1.1 
compliant, ultra-performant multi-process HTTP server rivalling pretty 
much every other pure-Python web server I've tested.  (I still don't 
consider it production ready, though.)  Progressive enhancement as I 
came up with and implemented ideas meant that sometimes I had to 
rewrite it from scratch, but I'm quite proud of the result and have 
learned far more than I expected in the process.


While I don't necessarily study books on Python, I did reference HTTP: 
The Definitive Guide and many websites in developing that server, and I 
often use the Python Quick Reference[3] when I zone out and forget 
something basic or need to find something more advanced.


In terms of understanding how Python works, or how you can use certain 
semantics (or even better, why you'd want to!) Python Enhancement 
Proposals (PEPs) can be an invaluable resource.  For example, PEP 
318[4] defines what a decorator is, why they're useful, how they work, 
and how you can write your own.  Pretty much everything built into 
Python after Python 2.0 was first described, reasoned, and discussed in 
a PEP.


If you haven't seen this already, the Zen of Python[5] (a PEP) has many 
great guidelines.  I try to live and breathe the Zen.


So that's my story: how I learn to improve my own code.  My motto, 
"re-inventing the wheel, every time", is the short version of the 
above.  Of course, for commercial work I don't generally spend so much 
time on the nitty-gritty details; existing libraries are there for a 
reason, and, most of the time, Getting Things Done™ is more important 
than linguistic purity!  ;)


— Alice.

[1] https://github.com/GothAlice/Random/
[2] https://gist.github.com/405354
[3] http://rgruet.free.fr/PQR26/PQR2.6.html
[4] http://www.python.org/dev/peps/pep-0318/
[5] http://www.python.org/dev/peps/pep-0020/


--
http://mail.python.org/mailman/listinfo/python-list


Iran slams Wiki-release as US psywar - WIKILEAKS is replacing those BIN LADEN communiques of CIA (the global ELITE) intended to threaten MASSES

2010-11-30 Thread small Pox
Iran slams Wiki-release as US psywar - WIKILEAKS is replacing those
BIN LADEN communiques of CIA (the global ELITE) intended to threaten
MASSES

CIA is the criminal agency of the global elite.

They want to destroy the middle class from the planet and also create
a global tyranny of a police state.

http://presstv.ir/detail/153128.html
http://presstv.ir/detail/153128.html
http://presstv.ir/detail/153128.html

Iran slams Wiki-release as US psywar
Mon Nov 29, 2010 12:56PM
Share | Email | Print
Iran's President Mahmoud Ahmadinejad has questioned the recent
'leaked' documents published by Wikileaks website, describing them as
part of a US psychological warfare.


In response to a question by Press TV on Monday over the whistleblower
website's leaks, President Mahmoud Ahmadinejad said let me first
correct you. The material was not leaked, but rather released in an
organized effort.

The US administration releases documents and makes a judgment based
on them. They are mostly like a psychological warfare and lack legal
basis, President Ahmadinejad told reporters on Monday.

The documents will certainly have no political effects. Nations are
vigilant today and such moves will have no impact on international
relations, the Iranian chief executive added at the press briefing in
Tehran.

President Ahmadinejad stressed that the Wikileaks game is not even
worth a discussion and that no one would waste their time analysing
them.

The countries in the region are like friends and brothers and these
acts of mischief will not affect their relations, he concluded.

Talks with the West

The president announced that aside from Brazil and Turkey a number of
other countries may take part in the new round of talks between Iran
and the P5+1 -- Britain, China, France, Russia, the US, plus Germany.

Human rights

They (Western powers) trample on the dignity of man, their identity
and real freedom. They infringe all of these and then they call it
human rights, Ahmadinejad said.

Earlier this month, the UN General Assembly's Third Committee accused
Iran of violating human rights regulations.

The 118-member Non-Aligned Movement and the 57-member Organization of
the Islamic Conference have condemned the resolution against the
Islamic Republic.

In 2005, the human rights [issue] got a new mechanism in the United
Nations ... human rights was pushed away and human rights was used for
political manipulation, Secretary General of Iran's High Council for
Human Rights Mohammed Javad Larijani told Press TV following the vote
on the resolution.

This is while the United Nations Human Rights Council reviewed the US
human rights record for the first time in its history. The council
then issued a document making 228 suggestions to the US to improve its
rights record.

IAEA 'leak'

The president said that Iran has always had a positive relationship
with the International Atomic Energy Agency but criticized the UN
nuclear agency for caving under pressure from the masters of power
and wealth.

The president said due to this pressure the IAEA has at times adopted
unfair and illegal stances against the Islamic Republic.

Their recent one (IAEA report) is better than the previous ones and
is closer to the truth but still all the facts are not reflected, he
added. Of course the latest report also has shortcomings, for example
all [of Iran's nuclear] information has been released and these are
secret and confidential documents belonging to the country.

Ahmadinejad said since Iran was following a policy of nuclear
transparency, it did not care about the leaks, but called the move
'illegal.

New world order

The world needs order … an order in which different people form
different walks of life enjoy equal rights and proper dignity, the
president said in his opening speech before taking questions form
Iranian and foreign journalist.

The president added that the world was already on the path to setting
up this order.

Iran isolation

When asked to comment on the US and Western media claims that Iran has
become highly isolated in the region despite an active diplomacy with
Persian Gulf littoral states, the president said the remarks were part
of the discourse of hegemony.

In the hegemonic discourse, it seems that concepts and words take on
different meanings than those offered by dictionaries, Ahmadinejad
said.

When they say they have isolated Iran, it means that they themselves
are isolated and when they say Iran is economically weak, it means
that it has strengthened, the president reasoned.

When they say there is a dictatorship somewhere, it means that country
is really chosen by the people and vise a versa, the president further
noted, adding, I do not want to name names.

ZHD/HGH/SF/MMN/MB

Re: remote control firefox with python

2010-11-30 Thread baloan
On Nov 28, 4:22 pm, News123 news1...@free.fr wrote:
 Hi,

 I wondered whether there is a simpe way to
 'remote' control fire fox with python.

 With remote controlling I mean:
 - enter a url in the title bar and click on it
 - create a new tab
 - enter another url click on it
 - save the html document of this page
 - Probably the most difficult one: emulate a click or 'right click' on a
 certain button or link of the current page.
 - other interesting things would be to be able to enter the master
         password from a script
 - to enable disable proxy settings while running.

 The reason why I want to stay within Firefox and not use any other
 'mechanize' frame work is, that the pages I want to automate might
 contain a lot of javascript for the construction of the actual page.

 Thanks in advance for any pointers ideas.

I have had some good experience with Sikuli.

http://sikuli.org/

Regards, Andreas
bal...@gmail.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using property() to extend Tkinter classes but Tkinter classes are old-style classes?

2010-11-30 Thread Giacomo Boffi
Terry Reedy tjre...@udel.edu writes:

 On 11/28/2010 3:47 PM, pyt...@bdurham.com wrote:
 I had planned on subclassing Tkinter.Toplevel() using property() to wrap
 access to properties like a window's title.
 After much head scratching and a peek at the Tkinter.py source, I
 realized that all Tkinter classes are old-style classes (even under
 Python 2.7).
 1. Is there a technical reason why Tkinter classes are still old-style
 classes?

 To not break old code. Being able to break code by upgrading all
 classes in the stdlib was one of the reasons for 3.x.

In 3.x, are Tkinter classes still derived from old-style classes?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Peter Otten
Albert Hopkins wrote:

 On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote:
 Dan Stromberg wrote:
 
  I've got a couple of programs that read filenames from stdin, and
 then
  open those files and do things with them.  These programs sort of do
  the *ix xargs thing, without requiring xargs.
  
  In Python 2, these work well.  Irrespective of how filenames are
  encoded, things are opened OK, because it's all just a stream of
  single byte characters.
 
 I think you're wrong. The filenames' encoding as they are read from stdin
 must be the same as the encoding used by the file system. If the file 
 system expects UTF-8 and you feed it ISO-8859-1 you'll run into errors.
 
 I think this is wrong.  In Unix there is no concept of filename
 encoding.  Filenames can have any arbitrary set of bytes (except '/' and
 '\0').   But the filesystem itself neither knows nor cares about
 encoding.

I think you misunderstood what I was trying to say. If you write a list of 
filenames into files.txt, and use an encoding (ISO-8859-1, say) other than 
that used by the shell to display file names (on Linux typically UTF-8 these 
days) and then write a Python script exist.py that reads filenames and 
checks for the files' existence, 

$ python3 exist.py < files.txt

will report that a file

b'\xe4\xf6\xfc.txt' 

doesn't exist. The user looking at his editor with the encoding set to 
ISO-8859-1 seeing the line

äöü.txt

and then going to the console typing

$ ls
äöü.txt

will be confused even though everything is working correctly. 
The system may be shuffling bytes, but the user thinks in codepoints and 
sometimes assumes that codepoints and bytes are the same.

 You always have to know either
 
 (a) both the file system's and stdin's actual encoding, or
 (b) that both encodings are the same.
 
 
 If this is true, then I think that it is wrong to do in Python3.  Any
 language should be able to deal with the filenames that the host OS
 allows.
 
 Anyway, going on with the OP.. can you open stdin so that you can accept
 arbitrary bytes instead of strings and then open using the bytes as the
 filename? 

You can access the underlying stdin.buffer that feeds you the raw bytes with 
no attempt to shoehorn them into codepoints. You can use filenames that are 
not valid in the encoding that the system uses to display filenames:

$ ls
$ python3
Python 3.1.1+ (r311:74480, Nov  2 2009, 15:45:00)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> with open(b"\xe4\xf6\xfc.txt", "w") as f:
...     f.write("hello\n")
...
6

$ ls
???.txt

 I don't have that much experience with Python3 to say for sure.

Me neither.

Peter

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using property() to extend Tkinter classes but Tkinter classes are old-style classes?

2010-11-30 Thread Hans Mulder

Giacomo Boffi wrote:

Terry Reedy tjre...@udel.edu writes:


On 11/28/2010 3:47 PM, pyt...@bdurham.com wrote:

I had planned on subclassing Tkinter.Toplevel() using property() to wrap
access to properties like a window's title.
After much head scratching and a peek at the Tkinter.py source, I
realized that all Tkinter classes are old-style classes (even under
Python 2.7).
1. Is there a technical reason why Tkinter classes are still old-style
classes?

To not break old code. Being able to break code by upgrading all
classes in the stdlib was one of the reasons for 3.x.


In 3.x, are Tkinter classes still derived by old-style classes?


3.x does not provide old-style classes.

Oh, and the name Tkinter was changed to tkinter: all modules in the
standard library have lower case names in 3.x.

HTH,

-- HansM
--
http://mail.python.org/mailman/listinfo/python-list


Re: Using property() to extend Tkinter classes but Tkinter classes are old-style classes?

2010-11-30 Thread Robert Kern

On 11/30/10 11:00 AM, Giacomo Boffi wrote:

Terry Reedytjre...@udel.edu  writes:


On 11/28/2010 3:47 PM, pyt...@bdurham.com wrote:

I had planned on subclassing Tkinter.Toplevel() using property() to wrap
access to properties like a window's title.
After much head scratching and a peek at the Tkinter.py source, I
realized that all Tkinter classes are old-style classes (even under
Python 2.7).
1. Is there a technical reason why Tkinter classes are still old-style
classes?


To not break old code. Being able to break code by upgrading all
classes in the stdlib was one of the reasons for 3.x.


In 3.x, are Tkinter classes still derived by old-style classes?


No.

[~]$ python3
Python 3.1.2 (r312:79360M, Mar 24 2010, 01:33:18)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tkinter
>>> tkinter.Tk.mro()
[<class 'tkinter.Tk'>, <class 'tkinter.Misc'>, <class 'tkinter.Wm'>, <class 'object'>]



--
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco

--
http://mail.python.org/mailman/listinfo/python-list


C struct to Python

2010-11-30 Thread Eric Frederich
I am not sure how to proceed.
I am writing a Python interface to a C library.
The C library uses structures.
I was looking at the struct module but struct.unpack only seems to
deal with data that was packed using struct.pack or some other buffer.
All I have is the struct itself, a pointer in C.
Is there a way to unpack directly from a memory address?

Right now on the C side of things I can create a buffer of the struct
data like so...

MyStruct ms;
unsigned char buffer[sizeof(MyStruct) + 1];
memcpy(buffer, &ms, sizeof(MyStruct));
return Py_BuildValue("s#", buffer, sizeof(MyStruct));

Then on the Python side I can unpack it using struct.unpack.

I'm just wondering if I need to jump through these hoops of packing it
on the C side or if I can do it directly from Python.

Thanks,
~Eric
-- 
http://mail.python.org/mailman/listinfo/python-list


Almost free iPod

2010-11-30 Thread iGet
I know nothing is ever free and that is true.  However, you can get
things really cheap.  Two offers I am working on right now are: (Copy
and Paste link into your web browser)

A Free iPod 64gb - http://www.YouriPodTouch4free.com/index.php?ref=6695331

Here is how it works:

You click on one of the links above, select the item you want, then
enter your email in the sign-up section.The next page it will ask
you if you want to do the offer as referral or points, I would suggest
referral. Now it is going to take you to your main page.  Here you
will need to complete a level A offer or 50 points in level B
offers.

Now you may have the question, is this legit.  Surf the internet about
these sites and you will find out that they are legit.  I will not
lie; it is hard to get the referrals needed to get the items.

A suggestion is try joining the Freebie Forums.  There are several
people at these forums doing the same thing we are doing and this may
help you get some referrals quicker.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: C struct to Python

2010-11-30 Thread geremy condra
On Tue, Nov 30, 2010 at 10:57 AM, Eric Frederich
eric.freder...@gmail.com wrote:
 I am not sure how to proceed.
 I am writing a Python interface to a C library.
 The C library uses structures.
 I was looking at the struct module but struct.unpack only seems to
 deal with data that was packed using struct.pack or some other buffer.
 All I have is the struct itself, a pointer in C.
 Is there a way to unpack directly from a memory address?

 Right now on the C side of things I can create a buffer of the struct
 data like so...

     MyStruct ms;
     unsigned char buffer[sizeof(MyStruct) + 1];
     memcpy(buffer, &ms, sizeof(MyStruct));
     return Py_BuildValue("s#", buffer, sizeof(MyStruct));

 Then on the Python side I can unpack it using struct.unpack.

 I'm just wondering if I need to jump through these hoops of packing it
 on the C side or if I can do it directly from Python.

 Thanks,
 ~Eric

ctypes[0] sounds like a possible solution, although if you're already
writing a C extension it might be better practice to just write a
Python object that wraps your C struct appropriately. If you're not
wedded to the C extension, though, I've had very good luck writing C
interfaces with with ctypes and a few useful decorators [1], [2].
Others prefer Cython[3], which I like for speed but which sometimes
seems to get in my way when I'm trying to interface with existing
code. There's a good, if somewhat dated, overview of a few other
strategies here[4].

Geremy Condra

[0]: http://docs.python.org/library/ctypes.html
[1]: http://code.activestate.com/recipes/576734-c-struct-decorator/
[2]: http://code.activestate.com/recipes/576731/
[3]: http://www.cython.org/
[4]: http://www.suttoncourtenay.org.uk/duncan/accu/integratingpython.html
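
As a rough illustration of the ctypes route mentioned above (the struct
fields, library name and function are all hypothetical):

    import ctypes

    class MyStruct(ctypes.Structure):
        _fields_ = [
            ("count", ctypes.c_int),
            ("value", ctypes.c_double),
        ]

    lib = ctypes.CDLL("libmylib.so")                   # hypothetical library
    lib.get_struct.restype = ctypes.POINTER(MyStruct)  # hypothetical function

    p = lib.get_struct()
    print(p.contents.count, p.contents.value)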
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Martin v. Loewis
 Does anyone know what I need to do to read filenames from stdin with
 Python 3.1 and subsequently open them, when some of those filenames
 include characters with their high bit set?

If your files on disk use file names encoded in iso-8859-1, don't set
your locale to a UTF-8 locale (as you apparently do), but set it to
a locale that actually matches the encoding that you use.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


[Q] get device major/minor number

2010-11-30 Thread Thomas Portmann
Hello all,

In a script I would like to extract all device info from a block or
character device. The stat function gives me most of the info
(mode, timestamp, user and group id, ...); however, I did not find how
to get the device's major and minor numbers. Of course I could do it by
calling an external program, but is it possible to stay within Python?

In the example below, I would like to get the major (8) and minor (0,
1, 2) numbers of /dev/sda{,1,2}. How can I get them?

u...@host:~$ ls -l /dev/sda /dev/sda1 /dev/sda2
brw-rw---- 1 root disk 8, 0 Nov 30 19:10 /dev/sda
brw-rw---- 1 root disk 8, 1 Nov 30 19:10 /dev/sda1
brw-rw---- 1 root disk 8, 2 Nov 30 19:10 /dev/sda2
u...@host:~$ python3.1 -c 'import os
for el in ["", "1", "2"]: print(os.stat("/dev/sda" + el));'
posix.stat_result(st_mode=25008, st_ino=1776, st_dev=5, st_nlink=1,
st_uid=0, st_gid=6, st_size=0, st_atime=1291140641,
st_mtime=1291140640, st_ctime=1291140640)
posix.stat_result(st_mode=25008, st_ino=1780, st_dev=5, st_nlink=1,
st_uid=0, st_gid=6, st_size=0, st_atime=1291140644,
st_mtime=1291140641, st_ctime=1291140641)
posix.stat_result(st_mode=25008, st_ino=1781, st_dev=5, st_nlink=1,
st_uid=0, st_gid=6, st_size=0, st_atime=1291140644,
st_mtime=1291140641, st_ctime=1291140641)

Thanks


Tom
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [Q] get device major/minor number

2010-11-30 Thread Dan M
On Tue, 30 Nov 2010 21:09:14 +0100, Thomas Portmann wrote:

 Hello all,
 
 In a script I would like to extract all device infos from block or
 character device. The stat function gives me most of the infos (mode,
 timestamp, user and group id, ...), however I did not find how to get
 the devices major and minor numbers. Of course I could do it by calling
 an external program, but is it possible to stay within python?
 
 In the example below, I would like to get the major (8) and minor (0, 1,
 2) numbers of /dev/sda{,1,2}. How can I get them?

I think the os.major() and os.minor() calls ought to do what you want.

>>> import os
>>> s = os.stat('/dev/sda1')
>>> os.major(s.st_rdev)
8
>>> os.minor(s.st_rdev)
1
>>>

d...@dan:~$ ls -l /dev/sda1
brw-rw---- 1 root disk 8, 1 2010-11-18 05:41 /dev/sda1

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Memory issues when storing as List of Strings vs List of List

2010-11-30 Thread Ben Finney
OW Ghim Siong o...@bii.a-star.edu.sg writes:

 I have a big file 1.5GB in size, with about 6 million lines of
 tab-delimited data. I have to perform some filtration on the data and
 keep the good data. After filtration, I have about 5.5 million data
 left remaining. As you might already guessed, I have to read them in
 batches and I did so using .readlines(1).

Why do you need to handle the batching in your code? Perhaps you're not
aware that a file object is already an iterator for the lines of text in
the file.

 After reading each batch, I will split the line (in string format) to
 a list using .split(\t) and then check several conditions, after
 which if all conditions are satisfied, I will store the list into a
 matrix.

As I understand it, you don't need a line after moving to the next. So
there's no need to maintain a manual buffer of lines at all; please
explain if there is something additional requiring a huge buffer of
input lines.

 The code is as follows:
 -Start--
 a=open("bigfile")
 matrix=[]
 while True:
     lines = a.readlines(1)
     for line in lines:
         data=line.split("\t")
         if several_conditions_are_satisfied:
             matrix.append(data)
     print "Number of lines read:", len(lines), "matrix.__sizeof__:", matrix.__sizeof__()
     if len(lines)==0:
         break
 -End-

Using the file's native line iterator::

    infile = open("bigfile")
    matrix = []
    for line in infile:
        record = line.split("\t")
        if several_conditions_are_satisfied:
            matrix.append(record)

 Results:
 Number of lines read: 461544 matrix.__sizeof__: 1694768
 Number of lines read: 449840 matrix.__sizeof__: 3435984
 Number of lines read: 455690 matrix.__sizeof__: 5503904
 Number of lines read: 451955 matrix.__sizeof__: 6965928
 Number of lines read: 452645 matrix.__sizeof__: 8816304
 Number of lines read: 448555 matrix.__sizeof__: 9918368

 Traceback (most recent call last):
 MemoryError

If you still get a MemoryError, you can use the ‘pdb’ module
<URL:http://docs.python.org/library/pdb.html> to debug it interactively.

Another option is to catch the MemoryError and construct a diagnostic
message similar to the one you had above::

    import sys

    infile = open("bigfile")
    matrix = []
    for line in infile:
        record = line.split("\t")
        if several_conditions_are_satisfied:
            try:
                matrix.append(record)
            except MemoryError:
                matrix_len = len(matrix)
                sys.stderr.write(
                    "len(matrix): %(matrix_len)d\n" % vars())
                raise

 I have tried creating such a matrix of equivalent size and it only
 uses 35mb of memory but I am not sure why when using the code above,
 the memory usage shot up so fast and exceeded 2GB.

 Any advice is greatly appreciated.

With large data sets, and the manipulation and computation you will
likely be wanting to perform, it's probably time to consider the NumPy
library <URL:http://numpy.scipy.org/> which has much more powerful array
types, part of the SciPy library <URL:http://www.scipy.org/>.
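
For illustration only, a sketch of what that NumPy route can look like when
the interesting fields are numeric; "bigfile" and the column indices are
placeholders, not taken from the original post:

    import numpy as np

    # pull just the needed tab-separated columns into one 2-D float array
    data = np.genfromtxt("bigfile", delimiter="\t", usecols=(0, 3))

    # the filter becomes a single boolean mask instead of a per-line loop,
    # and no list of lists has to be kept in memory
    mask = (data[:, 0] > 0) & (data[:, 1] < 100)
    matrix = data[mask]
    print(matrix.shape)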

-- 
 \“[It's] best to confuse only one issue at a time.” —Brian W. |
  `\  Kernighan, Dennis M. Ritchie, _The C programming language_, 1988 |
_o__)  |
Ben Finney
-- 
http://mail.python.org/mailman/listinfo/python-list


SAX unicode and ascii parsing problem

2010-11-30 Thread goldtech
Hi,

I'm trying to parse an xml file using SAX. About half-way through a
file I get this error:

Traceback (most recent call last):
  File "C:\Python26\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 325, in RunScript
    exec codeObject in __main__.__dict__
  File "E:\sc\b2.py", line 58, in <module>
    parser.parse(open(r'ppb5.xml'))
  File "C:\Python26\Lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "C:\Python26\Lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "C:\Python26\Lib\xml\sax\expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "C:\Python26\Lib\xml\sax\expatreader.py", line 304, in end_element
    self._cont_handler.endElement(name)
  File "E:\sc\b2.py", line 51, in endElement
    d.write(csv+"\n")
UnicodeEncodeError: 'ascii' codec can't encode characters in position
146-147: ordinal not in range(128)

I'm using ActivePython 2.6. I'm trying to figure out the simplest fix.
If there's a Python way to just take the source XML file and convert/
process it so this will not happen - that would be best. Or should I
just update to Python 3?

I tried this but nothing changed; I thought this might convert it and
then I'd parse the new file - didn't work:

uc = open(r'E:\sc\ppb4.xml').read().decode('utf8')
ascii = uc.decode('ascii')
mex9 = open( r'E:\scrapes\ppb5.xml', 'w' )
mex9.write(ascii)

Again I'm looking for something simple even if it's a few more lines of
code... or upgrade(?)

Thanks, appreciate any help.
mex9.close()
-- 
http://mail.python.org/mailman/listinfo/python-list


get a free domain , free design , and free host

2010-11-30 Thread mohammed_a_o
get a free domain , free design , and free host

http://freedesignandhost.co.cc/

get a free domain , free design , and free host


http://freedesignandhost.co.cc/free-design.php


http://freedesignandhost.co.cc/free-host.php


http://freedesignandhost.co.cc/free-domain.php
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [Q] get device major/minor number

2010-11-30 Thread Thomas Portmann
On Tue, Nov 30, 2010 at 9:18 PM, Dan M d...@catfolks.net wrote:
 On Tue, 30 Nov 2010 21:09:14 +0100, Thomas Portmann wrote:

 In the example below, I would like to get the major (8) and minor (0, 1,
 2) numbers of /dev/sda{,1,2}. How can I get them?

 I think the os.major() and os.minor() calls ought to do what you want.

 >>> import os
 >>> s = os.stat('/dev/sda1')
 >>> os.major(s.st_rdev)
 8
 >>> os.minor(s.st_rdev)
 1

Thank you very much Dan, this is exactly what I was looking for.


Tom
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [Q] get device major/minor number

2010-11-30 Thread Dan M
On Tue, 30 Nov 2010 21:35:43 +0100, Thomas Portmann wrote:

 Thank you very much Dan, this is exactly what I was looking for.
 
 
 Tom

You're very welcome.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SAX unicode and ascii parsing problem

2010-11-30 Thread Steve Holden
On 11/30/2010 3:43 PM, goldtech wrote:
 Hi,
 
 I'm trying to parse an xml file using SAX. About half-way through a
 file I get this error:
 
 Traceback (most recent call last):
    File "C:\Python26\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 325, in RunScript
      exec codeObject in __main__.__dict__
    File "E:\sc\b2.py", line 58, in <module>
      parser.parse(open(r'ppb5.xml'))
    File "C:\Python26\Lib\xml\sax\expatreader.py", line 107, in parse
      xmlreader.IncrementalParser.parse(self, source)
    File "C:\Python26\Lib\xml\sax\xmlreader.py", line 123, in parse
      self.feed(buffer)
    File "C:\Python26\Lib\xml\sax\expatreader.py", line 207, in feed
      self._parser.Parse(data, isFinal)
    File "C:\Python26\Lib\xml\sax\expatreader.py", line 304, in end_element
      self._cont_handler.endElement(name)
    File "E:\sc\b2.py", line 51, in endElement
      d.write(csv+"\n")
  UnicodeEncodeError: 'ascii' codec can't encode characters in position
  146-147: ordinal not in range(128)
 
 I'm using ActivePython 2.6. I trying to figure out the simplest fix.
 If there's a Python way to just take the source XML file and covert/
 process it so this will not happen - that would be best. Or should I
 just update to Python 3 ?
 
 I tried this but nothing changed, I thought this might convert it and
 then I'd paerse the new file - didn't work:
 
 uc = open(r'E:\sc\ppb4.xml').read().decode('utf8')
 ascii = uc.decode('ascii')
 mex9 = open( r'E:\scrapes\ppb5.xml', 'w' )
 mex9.write(ascii)
 
 Again I'm looking for something simple even it's a few more lines of
 codes...or upgrade(?)
 
 Thanks, appreciate any help.
 mex9.close()

I'm just as stumped as I was when you first asked this question 13
minutes ago. ;-)

regards
 Steve

-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
PyCon 2011 Atlanta March 9-17   http://us.pycon.org/
See Python Video!   http://python.mirocommunity.org/
Holden Web LLC http://www.holdenweb.com/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IMAP support

2010-11-30 Thread pakalk
Please, give me an example of raw query to IMAP server?

And why do you focus on Nevermind is so ekhm... nevermind... ??
Cannot you just help?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SAX unicode and ascii parsing problem

2010-11-30 Thread goldtech
snip...

 I'm just as stumped as I was when you first asked this question 13
 minutes ago. ;-)

 regards
  Steve

snip...

Hi Steve,

Think I found it, for example:

line = 'my big string'
line.encode('ascii', 'ignore')

I processed the problem strings during parsing with this and it works
now. Got this from:

http://stackoverflow.com/questions/2365411/python-convert-unicode-to-ascii-without-errors


Best, Lee

:^)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SAX unicode and ascii parsing problem

2010-11-30 Thread Justin Ezequiel
can't check right now but are you sure it's the parser and not
this line
d.write(csv+"\n")
that's failing?
what is d?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IMAP support

2010-11-30 Thread Adam Tauno Williams
On Tue, 2010-11-30 at 13:03 -0800, pakalk wrote: 
 Please, give me an example of raw query to IMAP server?

http://www.devshed.com/c/a/Python/Python-Email-Libraries-part-2-IMAP/2/

I'm not certain what you mean by raw query.

 And why do you focus on Nevermind is so ekhm... nevermind... ??
 Cannot you just help?

This list does suffer from a case of attitude.  Most programming
forums have that; Python attitude has its own special flavor.

-- 
http://mail.python.org/mailman/listinfo/python-list


Reading by positions plain text files

2010-11-30 Thread javivd
Hi all,

Sorry, newbie question:

I have a database in a plain text file (could be .txt or .dat, it's the
same) that I need to read in python in order to do some data
validation. In other files I read this kind of data with the split()
method, reading line by line. But split() relies on a separator
character (I think... all I know is that it works OK).

I have a case now in which another file has been provided (besides the
database) that tells me in which column of the file every variable is,
because there isn't any blank or tab character that separates the
variables; they are stuck together. This second file specifies the
variable name and its position:


VARIABLE NAME   POSITION (COLUMN) IN FILE
var_name_1  123-123
var_name_2  124-125
var_name_3  126-126
..
..
var_name_N  512-513 (last positions)

How can I read this so that each position in the file is associated with
each variable name?

Thanks a lot!!

Javier

-- 
http://mail.python.org/mailman/listinfo/python-list


How to initialize each multithreading Pool worker with an individual value?

2010-11-30 Thread Valery Khamenya
Hi,

multithreading.pool Pool has a promising initializer argument in its
constructor.
However it doesn't look possible to use it to initialize each Pool's
worker with some individual value (I wish to be wrong here).

So, how do you initialize each multithreading Pool worker with
individual values?

The typical use case might be a connection pool, say, of 3 workers,
where each of 3 workers has its own TCP/IP port.

from multiprocessing.pool import Pool

def port_initializer(_port):
    global port
    port = _port

def use_connection(some_packet):
    global _port
    print "sending data over port # %s" % port

if __name__ == "__main__":
    ports = ((4001, 4002, 4003), )
    p = Pool(3, port_initializer, ports) # oops... :-)
    some_data_to_send = range(20)
    p.map(use_connection, some_data_to_send)


best regards
--
Valery A.Khamenya
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IMAP support

2010-11-30 Thread pakalk
On 30 Lis, 22:26, Adam Tauno Williams awill...@whitemice.org wrote:
 On Tue, 2010-11-30 at 13:03 -0800, pakalk wrote:
  Please, give me an example of raw query to IMAP server?

 http://www.devshed.com/c/a/Python/Python-Email-Libraries-part-2-IMAP/2/

 I'm not certain what you mean by raw query.

m = imap()
m.query('UID SORT ...') # etc.

Thanks for link :)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Reading by positions plain text files

2010-11-30 Thread Tim Harig
On 2010-11-30, javivd javiervan...@gmail.com wrote:
 I have a case now in wich another file has been provided (besides the
 database) that tells me in wich column of the file is every variable,
 because there isn't any blank or tab character that separates the
 variables, they are stick together. This second file specify the
 variable name and his position:

 VARIABLE NAME POSITION (COLUMN) IN FILE
 var_name_1123-123
 var_name_2124-125
 var_name_3126-126
 ..
 ..
 var_name_N512-513 (last positions)

I am unclear on the format of these positions.  They do not look like
what I would expect from absolute references in the data.  For instance,
123-123 may only contain one byte??? which could change for different
encodings and how you mark line endings.  Frankly, the use of the
word "columns" in the header suggests that the data *is* separated by
line endings rather than absolute position and the position refers to
the line number. In which case, you can use splitlines() to break up
the data and then address the proper line by index.  Nevertheless,
you can use file.seek() to move to an absolute offset in the file,
if that really is what you are looking for.
-- 
http://mail.python.org/mailman/listinfo/python-list


Catching user switching and getting current active user from root on linux

2010-11-30 Thread mpnordland
I have a situation where I need to be able to get the current active
user, and catch user switching, e.g. user1 locks the screen, leaves the
computer, user2 comes and logs on.
Basically, when there is any type of user switch my script needs to
know.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Catching user switching and getting current active user from root on linux

2010-11-30 Thread Tim Harig
On 2010-11-30, mpnordland mpnordl...@gmail.com wrote:
 I have situation where I need to be able to get the current active
 user, and catch user switching eg user1 locks screen, leaves computer,
 user2 comes, and logs on.
 basically, when there is any type of user switch my script needs to
 know.

Well you could use inotify to trigger on any changes to /var/log/wtmp.
When a change is detected, you could check for deltas in the output of
"who -a" to figure out what has changed since the last time wtmp triggered.
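
A rough, stdlib-only sketch of that delta idea, polling who(1) instead of
using inotify so it stays self-contained; the interval and output format
are arbitrary:

    import subprocess
    import time

    def sessions():
        # one line per login session, as reported by who(1)
        out = subprocess.Popen(["who"], stdout=subprocess.PIPE).communicate()[0]
        return set(out.decode("utf-8", "replace").splitlines())

    def watch(interval=2.0):
        previous = sessions()
        while True:
            time.sleep(interval)
            current = sessions()
            for line in current - previous:
                print("session started: %s" % line)
            for line in previous - current:
                print("session ended:   %s" % line)
            previous = current

    if __name__ == "__main__":
        watch()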
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Catching user switching and getting current active user from root on linux

2010-11-30 Thread James Mills
On Wed, Dec 1, 2010 at 8:54 AM, Tim Harig user...@ilthio.net wrote:
 Well you could use inotify to trigger on any changes to /var/log/wtmp.
 When a change is detected, you could check of deltas in the output of who
 -a to figure out what has changed since the last time wtmp triggered.

This is a good idea and you could also
make use of the following library:

http://pypi.python.org/pypi?:action=search&term=utmp&submit=search

cheers
James

-- 
-- James Mills
--
-- Problems are solved by method
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to go on learning python

2010-11-30 Thread Terry Reedy

On 11/30/2010 9:37 AM, Xavier Heruacles wrote:

I'm basically a c/c++ programmer and recently come to python for some
web development. Using django and javascript I'm afraid I can develop
some web application now. But often I feel I'm not good at python. I
don't know much about generators, descriptors and decorators(although I
can use some of it to accomplish something, but I don't think I'm
capable of knowing its internals). I find my code ugly, and it seems
near everything are already gotten done by the libraries. When I want to
do something, I just find some libraries or modules and then just finish
the work. So I'm a bit tired of just doing this kind of high level
scripting, only to find myself a bad programmer. Then my question is
after one coded some kind of basic app, how one can keep on learning
programming using python?
Do some more interesting projects? Read more general books about
programming? or...?


You can use both your old C skills and new Python skills by helping to 
develop Python by working on issues on the tracker bugs.python.org. If 
you are interested but need help getting started, ask.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: Reading by positions plain text files

2010-11-30 Thread MRAB

On 30/11/2010 21:31, javivd wrote:

Hi all,

Sorry, newbie question:

I have database in a plain text file (could be .txt or .dat, it's the
same) that I need to read in python in order to do some data
validation. In other files I read this kind of files with the split()
method, reading line by line. But split() relies on a separator
character (I think... all I know is that it's work OK).

I have a case now in wich another file has been provided (besides the
database) that tells me in wich column of the file is every variable,
because there isn't any blank or tab character that separates the
variables, they are stick together. This second file specify the
variable name and his position:


VARIABLE NAME   POSITION (COLUMN) IN FILE
var_name_1  123-123
var_name_2  124-125
var_name_3  126-126
..
..
var_name_N  512-513 (last positions)

How can I read this so each position in the file it's associated with
each variable name?


It sounds like a similar problem to this:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/53e6f41bfff6/123422d510187dc3?show_docid=123422d510187dc3
--
http://mail.python.org/mailman/listinfo/python-list


Re: Programming games in historical linguistics with Python

2010-11-30 Thread Vlastimil Brom
2010/11/30 Dax Bloom bloom@gmail.com:
 Hello,

 Following a discussion that began 3 weeks ago I would like to ask a
 question regarding substitution of letters according to grammatical
 rules in historical linguistics. I would like to automate the
 transformation of words according to complex rules of phonology and
 integrate that script in a visual environment.
 Here follows the previous thread:
 http://groups.google.com/group/comp.lang.python/browse_thread/thread/3c55f9f044c3252f/fe7c2c82ecf0dbf5?lnk=gstq=evolutionary+linguistics#fe7c2c82ecf0dbf5

 Is there a way to refer to vowels and consonants as a subcategory of
 text? Is there a function to remove all vowels? How should one create
 and order the dictionary file for the rules? How to chain several
 transformations automatically from multiple rules? Finally can anyone
 show me what existing python program or phonological software can do
 this?

 What function could tag syllables, the word nucleus and the codas? How
 easy is it to bridge this with a more visual environment where
 interlinear, aligned text can be displayed with Greek notations and
 braces as usual in the phonology textbooks?

 Best regards,

 Dax Bloom
 --
 http://mail.python.org/mailman/listinfo/python-list


Hi,
as far as I know, there is no predefined function or library for
distinguishing vowels or consonants, but these can be simply
implemented individually according to the exact needs.

e.g. regular expressions can be used here: to remove vowels, the code
could be (example from the command prompt):

>>> import re
>>> re.sub(r"(?i)[aeiouy]", "", "This is a SAMPLE TEXT")
'Ths s  SMPL TXT'


See http://docs.python.org/library/re.html
or
http://www.regular-expressions.info/
for the regexp features.

You may eventually try the new development version, regex, which adds
many interesting new features and removes some limitations:
http://bugs.python.org/issue2636

In some cases regular expressions aren't really appropriate or may
become too complicated.
Sometimes a parsing library like pyparsing may be a more adequate tool:
http://pyparsing.wikispaces.com/

If the rules are simple enough, that they can be formulated for single
characters or character clusters with a regular expression, you can
model the phonological changes as a series of replacements with
matching patterns and the respective replacement patterns.

For character-wise matching and replacing the regular expressions are
very effective; using lookarounds
http://www.regular-expressions.info/lookaround.html
even some combinatorics for conditional changes can be expressed;
however, i would find some more complex conditions, suprasegmentals,
morpheme boundaries etc. rather difficult to formalise this way...
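
As a small illustration of chaining such replacements, one rule after
another (the rules below are invented toy examples, not real sound laws):

    import re

    RULES = [
        (r"(?i)k(?=[ei])", "ch"),   # "palatalisation" before front vowels
        (r"(?i)[aeiou]$", ""),      # loss of a final vowel
        (r"(?i)th", "d"),           # "th" becomes "d"
    ]

    def apply_rules(word, rules=RULES):
        for pattern, replacement in rules:
            word = re.sub(pattern, replacement, word)
        return word

    print(apply_rules("kethe"))     # -> 'ched' with the toy rules above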

hth,
  vbr
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Dan Stromberg
On Tue, Nov 30, 2010 at 11:47 AM, Martin v. Loewis mar...@v.loewis.de wrote:
 Does anyone know what I need to do to read filenames from stdin with
 Python 3.1 and subsequently open them, when some of those filenames
 include characters with their high bit set?

 If your files on disk use file names encoded in iso-8859-1, don't set
 your locale to a UTF-8 locale (as you apparently do), but set it to
 a locale that actually matches the encoding that you use.

 Regards,
 Martin


It'd be great if all programs used the same encoding on a given OS,
but at least on Linux, I believe historically filenames have been
created with different encodings.  IOW, if I pick one encoding and go
with it, filenames written in some other encoding are likely to cause
problems.  So I need something for which a filename is just a blob
that shouldn't be monkeyed with.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Dan Stromberg
On Tue, Nov 30, 2010 at 7:19 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Mon, 29 Nov 2010 21:52:07 -0800 (PST)
 Yingjie Lan lany...@yahoo.com wrote:
 --- On Tue, 11/30/10, Dan Stromberg drsali...@gmail.com wrote:
  In Python 3, I'm finding that I have encoding issues with
  characters
  with their high bit set.  Things are fine with strictly
  ASCII
  filenames.  With high-bit-set characters, even if I
  change stdin's
  encoding with:

 Co-ask. I have also had problems with file names in
 Chinese characters with Python 3. I unzipped the
 turtle demo files into the desktop folder (of
 course, the word 'desktop' is in Chinese, it is
 a windows XP system, localization is Chinese), then
 all in a sudden some of the demos won't work
 anymore. But if I move it to a folder whose
 path contains only english characters, everything
 comes back to normal.

 Can you try the latest 3.2alpha4 (*) and check if this is fixed?
 If not, then could you please open a bug on http://bugs.python.org ?

 (*) http://python.org/download/releases/3.2/

 Thank you

 Antoine.

I have the same problem using 3.2alpha4: the word man~ana (6
characters long) in a filename causes problems (I'm catching the
exception and skipping the file for now) despite using what I believe
is an 8-bit, all 256-bytes-are-characters encoding: iso-8859-1.  'not
sure if you wanted both of us to try this, or Yingjie alone though.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Dan Stromberg
On Tue, Nov 30, 2010 at 9:53 AM, Peter Otten __pete...@web.de wrote:
 $ ls
 $ python3
 Python 3.1.1+ (r311:74480, Nov  2 2009, 15:45:00)
 [GCC 4.4.1] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> with open(b"\xe4\xf6\xfc.txt", "w") as f:
 ...     f.write("hello\n")
 ...
 6

 $ ls
 ???.txt

This sounds like a strong prospect for how to get things working (I
didn't realize open would accept a bytes argument for the filename),
but I'm also interested in whether reading filenames from stdin and
subsequently opening them is supposed to just work given a suitable
encoding - like with Java which also uses unicode strings.  In Java,
I'm told that ISO-8859-1 is supposed to guarantee a roundtrip
conversion.
-- 
http://mail.python.org/mailman/listinfo/python-list


Change one list item in place

2010-11-30 Thread Gnarlodious
This works for me:

def sendList():
    return ["item0", "item1"]

def query():
    l = sendList()
    return ["Formatting only {0} into a string".format(l[0]), l[1]]

query()


However, is there a way to bypass the

l=sendList()

and change one list item in-place? Possibly a list comprehension
operating on a numbered item?

-- Gnarlie
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to initialize each multithreading Pool worker with an individual value?

2010-11-30 Thread Dan Stromberg
On Tue, Nov 30, 2010 at 1:35 PM, Valery Khamenya khame...@gmail.com wrote:
 Hi,

 multithreading.pool Pool has a promissing initializer argument in its
 constructor.
 However it doesn't look possible to use it to initialize each Pool's
 worker with some individual value (I'd wish to be wrong here)

 So, how to initialize each multithreading Pool worker with the
 individual values?

 The typical use case might be a connection pool, say, of 3 workers,
 where each of 3 workers has its own TCP/IP port.

 from multiprocessing.pool import Pool

 def port_initializer(_port):
    global port
    port = _port

 def use_connection(some_packet):
    global _port
    print "sending data over port # %s" % port

 if __name__ == "__main__":
    ports=((4001,4002, 4003), )
    p = Pool(3, port_initializer, ports) # oops... :-)
    some_data_to_send = range(20)
    p.map(use_connection, some_data_to_send)

Using an initializer with multiprocessing is something I've never tried.

I have used queues with multiprocessing though, and I believe you
could use them, at least as a fallback plan if you prefer to get the
initializer to work.

If you create in the parent a queue in shared memory (multiprocessing
facilitates this nicely), and fill that queue with the values in your
ports tuple, then you could have each child in the worker pool extract
a single value from this queue so each worker can have its own, unique
port value.

HTH
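
A minimal sketch of that queue-based idea (ports and payloads are
placeholders): each worker pulls one value from a shared queue inside the
initializer, so every process in the pool ends up with its own port.

    from multiprocessing import Manager
    from multiprocessing.pool import Pool

    port = None   # filled in per worker by the initializer

    def port_initializer(port_queue):
        global port
        port = port_queue.get()   # each worker takes a different port

    def use_connection(packet):
        return "packet %r handled by worker on port %s" % (packet, port)

    if __name__ == "__main__":
        manager = Manager()
        port_queue = manager.Queue()
        for p in (4001, 4002, 4003):
            port_queue.put(p)
        pool = Pool(3, port_initializer, (port_queue,))
        for result in pool.map(use_connection, range(10)):
            print(result)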
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Nobody
On Mon, 29 Nov 2010 21:26:23 -0800, Dan Stromberg wrote:

 Does anyone know what I need to do to read filenames from stdin with
 Python 3.1 and subsequently open them, when some of those filenames
 include characters with their high bit set?

Use bytes rather than str. Everywhere. This means reading names from
sys.stdin.buffer (which is a binary stream) rather than sys.stdin (which
is a text stream). If you pass a bytes to an I/O function (e.g. open()),
it will just pass the bytes directly to the OS without any decoding.

But really, if you're writing *nix system utilities, you should probably
stick with Python 2.x until the end of time. Using 3.x will just make life
difficult for no good reason (e.g. in 3.x, os.environ also contains
Unicode strings).
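
For completeness, a minimal Python 3 sketch of the bytes-everywhere approach
described above; the loop and error handling are just an example:

    import sys

    for raw_line in sys.stdin.buffer:          # bytes, not str
        name = raw_line.rstrip(b"\n")
        if not name:
            continue
        try:
            with open(name, "rb") as f:        # bytes filename goes straight to the OS
                print(name, len(f.read()))
        except IOError as exc:
            print("could not open:", exc, file=sys.stderr)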

-- 
http://mail.python.org/mailman/listinfo/python-list


Intro to Python slides, was Re: how to go on learning python

2010-11-30 Thread Dan Stromberg
On Tue, Nov 30, 2010 at 6:37 AM, Xavier Heruacles xheruac...@gmail.com wrote:
 I'm basically a c/c++ programmer and recently come to python for some web
 development. Using django and javascript I'm afraid I can develop some web
 application now. But often I feel I'm not good at python. I don't know much
 about generators, descriptors and decorators(although I can use some of it
 to accomplish something, but I don't think I'm capable of knowing its
 internals). I find my code ugly, and it seems near everything are already
 gotten done by the libraries. When I want to do something, I just find some
 libraries or modules and then just finish the work. So I'm a bit tired of
 just doing this kind of high level scripting, only to find myself a bad
 programmer. Then my question is after one coded some kind of basic app, how
 one can keep on learning programming using python?
 Do some more interesting projects? Read more general books about
 programming? or...?
 --
 http://mail.python.org/mailman/listinfo/python-list

You could check out these slides from an Intro to Python talk I'm
giving tonight:

http://stromberg.dnsalias.org/~dstromberg/Intro-to-Python/

...perhaps especially the Further Resources section at the end.  The
Koans might be very nice for you, as might Dive Into Python.

BTW, if you're interested in Python and looking into Javascript anew,
you might look at Pyjamas.  It lets you write web apps in Python that
also run on a desktop; you can even call into Raphael from it.  Only
thing about it is it's kind of a young project compared to most Python
implementations.

PS: I mostly came from C too - knowing C can be a real advantage for a
Python programmer sometimes.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Nobody
On Tue, 30 Nov 2010 18:53:14 +0100, Peter Otten wrote:

 I think this is wrong.  In Unix there is no concept of filename
 encoding.  Filenames can have any arbitrary set of bytes (except '/' and
 '\0').   But the filesystem itself neither knows nor cares about
 encoding.
 
 I think you misunderstood what I was trying to say. If you write a list of 
 filenames into files.txt, and use an encoding (ISO-8859-1, say) other than 
 that used by the shell to display file names (on Linux typically UTF-8 these 
 days) and then write a Python script exist.py that reads filenames and 
 checks for the files' existence, 

I think you misunderstood.

In the Unix kernel, there aren't any encodings. Strings of bytes are
/just/ strings of bytes. A text file containing a list of filenames
doesn't /have/ an encoding. The filenames passed to API functions don't
/have/ an encoding.

This is why Unix filenames are case-sensitive: because there isn't any
case. The number 65 has no more in common with the number 97 than it
does with the number 255. The fact that 65 is the ASCII code for "A" while
97 is the ASCII code for "a" doesn't come into it. Case-insensitive
filenames require knowledge of the encoding in order to determine when
filenames are equivalent. DOS/Windows tried this and never really got it
right (it works fine on a standalone system, or within later versions of
a Windows-only ecosystem, but becomes a nightmare when files get
transferred between systems via older or non-Microsoft channels).

Python 3.x's decision to treat filenames (and environment variables) as
text even on Unix is, in short, a bug. One which, IMNSHO, will mean that
Python 2.x is still around when Python 4 is released.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Change one list item in place

2010-11-30 Thread MRAB

On 01/12/2010 01:08, Gnarlodious wrote:

This works for me:

def sendList():
    return ["item0", "item1"]

def query():
    l = sendList()
    return ["Formatting only {0} into a string".format(l[0]), l[1]]

query()


However, is there a way to bypass the

l=sendList()

and change one list item in-place? Possibly a list comprehension
operating on a numbered item?


There's this:

return ["Formatting only {0} into a string".format(x) if i == 0
    else x for i, x in enumerate(sendList())]


but that's too clever for its own good. Keep it simple. :-)
--
http://mail.python.org/mailman/listinfo/python-list


Re: Programming games in historical linguistics with Python

2010-11-30 Thread Gnarlodious
Have you considered entering all this data into an SQLite database?
You could do fast searches based on any features you deem relevant to
the phoneme. Using an SQLite editor application you can get started
building a database right away. You can add columns as you get the
inspiration, along with any tags you want. Putting it all in database
tables can really make chaotic linguistic data seem manageable.

My own linguistics project uses mostly SQLite and a number of
OrderedDict's based on .plist files. It is all working very nicely,
although I haven't tried to deal with any phonetics (yet).
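
For what it's worth, a tiny sqlite3 sketch of that suggestion; the table
layout and sample rows are invented for the example:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE phonemes (symbol TEXT, kind TEXT, voiced INTEGER)")
    conn.executemany("INSERT INTO phonemes VALUES (?, ?, ?)",
                     [("a", "vowel", 1), ("p", "consonant", 0),
                      ("b", "consonant", 1)])
    for row in conn.execute("SELECT symbol FROM phonemes "
                            "WHERE kind = 'consonant' AND voiced = 1"):
        print(row[0])               # -> b
    conn.close()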

-- Gnarlie
http://Sectrum.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Change one list item in place

2010-11-30 Thread Gnarlodious
Thanks.
Unless someone has a simpler solution, I'll stick with 2 lines.

-- Gnarlie
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Reading by positions plain text files

2010-11-30 Thread javivd
On Nov 30, 11:43 pm, Tim Harig user...@ilthio.net wrote:
 On 2010-11-30, javivd javiervan...@gmail.com wrote:

  I have a case now in wich another file has been provided (besides the
  database) that tells me in wich column of the file is every variable,
  because there isn't any blank or tab character that separates the
  variables, they are stick together. This second file specify the
  variable name and his position:

  VARIABLE NAME      POSITION (COLUMN) IN FILE
  var_name_1                 123-123
  var_name_2                 124-125
  var_name_3                 126-126
  ..
  ..
  var_name_N                 512-513 (last positions)

 I am unclear on the format of these positions.  They do not look like
 what I would expect from absolute references in the data.  For instance,
 123-123 may only contain one byte??? which could change for different
 encodings and how you mark line endings.  Frankly, the use of the
 world columns in the header suggests that the data *is* separated by
 line endings rather then absolute position and the position refers to
 the line number. In which case, you can use splitlines() to break up
 the data and then address the proper line by index.  Nevertheless,
 you can use file.seek() to move to an absolute offset in the file,
 if that really is what you are looking for.

I work in a survey research firm. The data I'm talking about has a lot
of 0-1 variables, meaning yes or no answers to a lot of questions, so only
one character position is needed (not a byte), explaining the 123-123
kind of positions for a lot of variables.

And no, MRAB, it's not a similar problem (at least from what I understood
of it). I have to associate the positions this file gives me with the
variable names this file gives me for those positions.

Thank you both and sorry for my English!

J
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread MRAB

On 01/12/2010 01:28, Nobody wrote:

On Tue, 30 Nov 2010 18:53:14 +0100, Peter Otten wrote:


I think this is wrong.  In Unix there is no concept of filename
encoding.  Filenames can have any arbitrary set of bytes (except '/' and
'\0').   But the filesystem itself neither knows nor cares about
encoding.


I think you misunderstood what I was trying to say. If you write a list of
filenames into files.txt, and use an encoding (ISO-8859-1, say) other than
that used by the shell to display file names (on Linux typically UTF-8 these
days) and then write a Python script exist.py that reads filenames and
checks for the files' existence,


I think you misunderstood.

In the Unix kernel, there aren't any encodings. Strings of bytes are
/just/ strings of bytes. A text file containing a list of filenames
doesn't /have/ an encoding. The filenames passed to API functions don't
/have/ an encoding.

This is why Unix filenames are case-sensitive: because there isn't any
case. The number 65 has no more in common with the number 97 than it
does with the number 255. The fact that 65 is the ASCII code for A while
97 is the ASCII code for a doesn't come into it. Case-insensitive
filenames require knowledge of the encoding in order to determine when
filenames are equivalent. DOS/Windows tried this and never really got it
right (it works fine on a standalone system, or within later versions of
a Windows-only ecosystem, but becomes a nightmare when files get
transferred between systems via older or non-Microsoft channels).

Python 3.x's decision to treat filenames (and environment variables) as
text even on Unix is, in short, a bug. One which, IMNSHO, will mean that
Python 2.x is still around when Python 4 is released.


If the filenames are to be shown to a user then there needs to be a
mapping between bytes and glyphs. That's an encoding. If different
users use different encodings then exchange of textual data becomes
difficult. That's where encodings which can be used globally come in.
By the time Python 4 is released I'd be surprised if Unix hadn't
standardised on a single encoding like UTF-8.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Reading by positions plain text files

2010-11-30 Thread MRAB

On 01/12/2010 02:03, javivd wrote:

On Nov 30, 11:43 pm, Tim Hariguser...@ilthio.net  wrote:

On 2010-11-30, javivdjaviervan...@gmail.com  wrote:


I have a case now in wich another file has been provided (besides the
database) that tells me in wich column of the file is every variable,
because there isn't any blank or tab character that separates the
variables, they are stick together. This second file specify the
variable name and his position:



VARIABLE NAME  POSITION (COLUMN) IN FILE
var_name_1 123-123
var_name_2 124-125
var_name_3 126-126
..
..
var_name_N 512-513 (last positions)


I am unclear on the format of these positions.  They do not look like
what I would expect from absolute references in the data.  For instance,
123-123 may only contain one byte??? which could change for different
encodings and how you mark line endings.  Frankly, the use of the
world columns in the header suggests that the data *is* separated by
line endings rather then absolute position and the position refers to
the line number. In which case, you can use splitlines() to break up
the data and then address the proper line by index.  Nevertheless,
you can use file.seek() to move to an absolute offset in the file,
if that really is what you are looking for.


I work in a survey research firm. the data im talking about has a lot
of 0-1 variables, meaning yes or no of a lot of questions. so only one
position of a character is needed (not byte), explaining the 123-123
kind of positions of a lot of variables.

and no, MRAB, it's not the similar problem (at least what i understood
of it). I have to associate the position this file give me with the
variable name this file give me for those positions.

thank you both and sorry for my english!


You just have to parse the second file to build a list (or dict)
containing the name, start position and end position of each variable:

variables = [(var_name_1, 123, 123), ...]

and then work through that list, extracting the data between those
positions in the first file and putting the values in another list (or
dict).

You also need to check whether the positions are 1-based or 0-based
(Python uses 0-based).
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to initialize each multithreading Pool worker with an individual value?

2010-11-30 Thread James Mills
On Wed, Dec 1, 2010 at 7:35 AM, Valery Khamenya khame...@gmail.com wrote:
 multithreading.pool Pool has a promissing initializer argument in its
 constructor.
 However it doesn't look possible to use it to initialize each Pool's
 worker with some individual value (I'd wish to be wrong here)

 So, how to initialize each multithreading Pool worker with the
 individual values?

 The typical use case might be a connection pool, say, of 3 workers,
 where each of 3 workers has its own TCP/IP port.

 from multiprocessing.pool import Pool

 def port_initializer(_port):
    global port
    port = _port

 def use_connection(some_packet):
    global _port
    print "sending data over port # %s" % port

 if __name__ == "__main__":
    ports=((4001,4002, 4003), )
    p = Pool(3, port_initializer, ports) # oops... :-)
    some_data_to_send = range(20)
    p.map(use_connection, some_data_to_send)

I assume you are talking about multiprocessing
despite you mentioning multithreading in the mix.

Have a look at the source code for multiprocessing.pool
and how the Pool object works and what it does
with the initializer argument. I'm not entirely sure it
does what you expect and yes documentation on this
is lacking...

cheers
James

-- 
-- James Mills
--
-- Problems are solved by method
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Reading by positions plain text files

2010-11-30 Thread Tim Chase

On 11/30/2010 08:03 PM, javivd wrote:

On Nov 30, 11:43 pm, Tim Hariguser...@ilthio.net  wrote:

VARIABLE NAME  POSITION (COLUMN) IN FILE
var_name_1 123-123
var_name_2 124-125
var_name_3 126-126
..
..
var_name_N 512-513 (last positions)



and no, MRAB, it's not the similar problem (at least what i understood
of it). I have to associate the position this file give me with the
variable name this file give me for those positions.


MRAB may be referring to my reply in that thread where you can do 
something like


  OFFSETS = 'offsets.txt'
  offsets = {}
  f = file(OFFSETS)
  f.next() # throw away the headers
  for row in f:
varname, rest = row.split()[:2]
# sanity check
if varname in offsets:
      print "[%s] in %s twice?!" % (varname, OFFSETS)
if '-' not in rest: continue
start, stop = map(int, rest.split('-'))
offsets[varname] = slice(start, stop+1) # 0-based offsets
#offsets[varname] = slice(start+1, stop+2) # 1-based offsets
  f.close()

  def do_something_with(data):
# your real code goes here
print data['var_name_2']

  for row in file('data.txt'):
data = dict((name, row[offsets[name]]) for name in offsets)
do_something_with(data)

There's additional robustness-checks I'd include if your 
offsets-file isn't controlled by you (people send me daft data).


-tkc




--
http://mail.python.org/mailman/listinfo/python-list


To Thread or not to Thread....?

2010-11-30 Thread Jack Keegan
Hi there,

I'm currently writing an application to control and take measurements during
an experiment. This is to be done on an embedded computer running XPe so I
am happy to have python available, although I am pretty new to it.
The application basically runs as a state machine, which transitions through
its states based on inputs read in from a set of general purpose
input/output (GPIO) lines. So when a certain line is pulled low/high, do
something and move to another state. All good so far and since I get through
main loop pretty quickly, I can just do a read of the GPIO lines on each
pass through the loop and respond accordingly.
However, in one of the states I have to start reading in, and storing frames
from a camera. In another, I have to start reading accelerometer data from
an I2C bus (which operates at 400kHz). I haven't implemented either yet but
I would imagine that, in the case of the camera data, reading a frame would
take a large amount of time as compared to other operations. Therefore, if I
just tried to read one (or one set of) frames on each pass through the loop
then I would hold up the rest of the application. Conversely, as the I2C bus
will need to be read at such a high rate, I may not be able to get the
required data rate I need even without the camera data. This naturally leads
me to think I need to use threads.
As I am no expert in either I2C, cameras, python or threading I thought I
would chance asking for some advice on the subject. Do you think I need
threads here or would I be better off using some other method. I was
previously toying with the idea of using generators to create weightless
threads (as detailed in
http://www.ibm.com/developerworks/library/l-pythrd.html) for reading the
GPIOs. Do you think this would work in this situation?
Another option would be to write separate programs, perhaps even in C, and
spawn these in the background when needed. I'm a little torn as to which way
to go. If it makes a difference, and in case you are wondering, I will
be interfacing to the GPIOs, cameras and I2C bus through a set of C DLLs
using Ctypes.

Any help or suggestions will be greatly appreciated,

Thanks very much,

Jack
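
Not an answer, just a sketch of the usual pattern for this kind of problem:
a background thread does the slow blocking reads and hands results to the
state-machine loop through a Queue, so the main loop never blocks.
read_frame() below is a placeholder for the real ctypes call.

    import threading
    import time
    try:
        import queue            # Python 3
    except ImportError:
        import Queue as queue   # Python 2

    frames = queue.Queue()

    def read_frame():
        # placeholder for the real blocking call into the camera DLL
        time.sleep(0.05)
        return b"frame"

    def camera_worker(stop_event):
        # background thread: blocking reads happen here, not in the main loop
        while not stop_event.is_set():
            frames.put(read_frame())

    def main_loop(iterations=100):
        stop = threading.Event()
        worker = threading.Thread(target=camera_worker, args=(stop,))
        worker.start()
        try:
            for _ in range(iterations):
                # ... read the GPIO lines and run the state machine here ...
                try:
                    frame = frames.get_nowait()   # never blocks the loop
                except queue.Empty:
                    frame = None
                if frame is not None:
                    pass    # store or process the frame
                time.sleep(0.001)
        finally:
            stop.set()
            worker.join()

    if __name__ == "__main__":
        main_loop()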
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Albert Hopkins
On Wed, 2010-12-01 at 02:14 +, MRAB wrote:
 If the filenames are to be shown to a user then there needs to be a
 mapping between bytes and glyphs. That's an encoding. If different
 users use different encodings then exchange of textual data becomes
 difficult.

That's presentation, that's separate.  Indeed, I have my user encoding
set to UTF-8, and if there is a filename that's not valid utf-8 then my
GUI (GNOME) will show "(invalid encoding)" and even allow me to rename it,
and my shell (bash) will show '?' next to the invalid characters (and
make it a little more challenging to rename ;)).  And I can freely copy
these invalid files across different (Unix) systems, because the OS
doesn't care about encoding.

But that's completely different from the actual name of the file.  Unix
doesn't care about presentation in filenames. It just cares about the
data.  There are no glyphs in Unix, only in the UI that runs on top
of it.

Or to put it another way, Unix's filename encoding is RAW-DATA.  It's
not textual data.  The fact that most filenames contain mainly
human-readable text is a convenient convention, but not required or
enforced by the OS.

  That's where encodings which can be used globally come in.
 By the time Python 4 is released I'd be surprised if Unix hadn't
 standardised on a single encoding like UTF-8. 

I have serious doubts about that.  At least in the Linux world the
kernel wants to stay out of encoding debates (except where it has to,
like Windows filesystems). But the point is that:

The world does not revolve around Python.  Unix filenames have been
encoding-agnostic long before Python was around.  If Python3 does not
support this then it's a regression on Python's part.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Change one list item in place

2010-11-30 Thread Steve Holden
On 11/30/2010 8:28 PM, MRAB wrote:
 On 01/12/2010 01:08, Gnarlodious wrote:
 This works for me:

 def sendList():
     return ["item0", "item1"]

 def query():
     l = sendList()
     return ["Formatting only {0} into a string".format(l[0]), l[1]]

 query()


 However, is there a way to bypass the

 l=sendList()

 and change one list item in-place? Possibly a list comprehension
 operating on a numbered item?

 There's this:
 
 return ["Formatting only {0} into a string".format(x) if i == 0 else
 x for i, x in enumerate(sendList())]
 
 but that's too clever for its own good. Keep it simple. :-)

I quite agree. That solution is so clever it would be asking for a fight
walking into a bar in Glasgow.

However, an unpacking assignment can make everything much more
comprehensible [pun intended] by removing the index operations. The
canonical solution would be something like:

def query():
    x, y = sendList()
    return ["Formatting only {0} into a string".format(x), y]

regards
 Steve
-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
PyCon 2011 Atlanta March 9-17   http://us.pycon.org/
See Python Video!   http://python.mirocommunity.org/
Holden Web LLC http://www.holdenweb.com/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Reading by positions plain text files

2010-11-30 Thread Tim Harig
On 2010-12-01, javivd javiervan...@gmail.com wrote:
 On Nov 30, 11:43 pm, Tim Harig user...@ilthio.net wrote:
 On 2010-11-30, javivd javiervan...@gmail.com wrote:

  I have a case now in wich another file has been provided (besides the
  database) that tells me in wich column of the file is every variable,
  because there isn't any blank or tab character that separates the
  variables, they are stick together. This second file specify the
  variable name and his position:

  VARIABLE NAME      POSITION (COLUMN) IN FILE
  var_name_1                 123-123
  var_name_2                 124-125
  var_name_3                 126-126
  ..
  ..
  var_name_N                 512-513 (last positions)

 I am unclear on the format of these positions.  They do not look like
 what I would expect from absolute references in the data.  For instance,
 123-123 may only contain one byte??? which could change for different
 encodings and how you mark line endings.  Frankly, the use of the
 world columns in the header suggests that the data *is* separated by
 line endings rather then absolute position and the position refers to
 the line number. In which case, you can use splitlines() to break up
 the data and then address the proper line by index.  Nevertheless,
 you can use file.seek() to move to an absolute offset in the file,
 if that really is what you are looking for.

 I work in a survey research firm. the data im talking about has a lot
 of 0-1 variables, meaning yes or no of a lot of questions. so only one
 position of a character is needed (not byte), explaining the 123-123
 kind of positions of a lot of variables.

Then file.seek() is what you are looking for; but, you need to be aware of
line endings and encodings as indicated.  Make sure that you open the file
using whatever encoding was used when it was generated or you could have
problems with multibyte characters affecting the offsets.
-- 
http://mail.python.org/mailman/listinfo/python-list


Regarding searching directory and to delete it with specific pattern.

2010-11-30 Thread Ramprakash Jelari thinakaran
Hi all,
I would like to search a list of directories matching a specific pattern
and delete them. How can I do it?
Example: in /home/jpr/ I have the following list of directories:
1.2.3-2, 1.2.3-10, 1.2.3-8. I would like to delete the directories other
than 1.2.3-10, which is the highest value.


Regards,
JPR.
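
A rough sketch of one way to do what is asked above; the base path and the
"1.2.3-N" naming scheme are taken from the example, and the comparison only
looks at the numeric suffix:

    import os
    import re
    import shutil

    base = "/home/jpr"
    pattern = re.compile(r"^1\.2\.3-(\d+)$")

    candidates = []
    for name in os.listdir(base):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(base, name)):
            candidates.append((int(match.group(1)), name))

    if candidates:
        candidates.sort()
        keep = candidates[-1][1]        # e.g. "1.2.3-10"
        for _, name in candidates[:-1]:
            shutil.rmtree(os.path.join(base, name))
        print("kept %s" % keep)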
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Martin v. Löwis
 It'd be great if all programs used the same encoding on a given OS,
 but at least on Linux, I believe historically filenames have been
 created with different encodings.  IOW, if I pick one encoding and go
 with it, filenames written in some other encoding are likely to cause
 problems.  So I need something for which a filename is just a blob
 that shouldn't be monkeyed with.

In that case, you should use byte strings as file names, not
character strings.

Regards,
Martin

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Martin v. Loewis
 The world does not revolve around Python.  Unix filenames have been
 encoding-agnostic long before Python was around.  If Python3 does not
 support this then it's a regression on Python's part.

Fortunately, Python 3 does support that.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Martin v. Loewis
 This sounds like a strong prospect for how to get things working (I
 didn't realize open would accept a bytes argument for the filename),
 but I'm also interested in whether reading filenames from stdin and
 subsequently opening them is supposed to just work given a suitable
 encoding - like with Java which also uses unicode strings.  In Java,
 I'm told that ISO-8859-1 is supposed to guarantee a roundtrip
 conversion.

It's the same in Python. However, as in Java, Python will *not*
necessarily use ISO-8859-1 when you pass a (Unicode) string to
open; instead, it will (as will Java) use your locale's encoding.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Watch the YANK CIA BUSTARDS Censorship live delete this post on KEY WIKILEAK REVELATIONS - RARE

2010-11-30 Thread small Pox
http://www.telegraph.co.uk/news/worldnews/northamerica/usa/8152326/WikiLeaks-release-Timeline-of-the-key-WikiLeaks-revelations.html

WikiLeaks release: Timeline of the key WikiLeaks revelations

 By Jon Swaine in New York  6:53PM GMT 22 Nov 2010

December 2007: Guantanamo Bay operating procedures

A US Army manual for soldiers at Camp Delta discloses that prisoners
were denied access to the Red Cross for up to four weeks and that
inmates could earn “special rewards”, including a roll of lavatory
paper, for good behaviour and co-operation.

September 2008: Sarah Palin's email account

Emails taken from the then-Republican Vice-Presidential candidate's
personal account suggest that she has been using it for official
business as Governor of Alaska. Doing so could have helped her avoid
having her communications subjected to state laws on the disclosure of
public records.

November 2008: BNP membership list

The names, addresses and occupations of more than 13,000 members of
the far-Right British party are released in one file. The list shows
that members include police officers, senior members of the military,
doctors and other professionals.

October 2009: Trafigura report

An internal study about the effects of dumping waste by the energy
trading company discloses that it used amateurish processes while
dumping gasoline on the Ivory Coast and probably would have left
dangerous sulphur compounds untreated

November 2009: Climategate emails

More than 1,000 emails sent between staff at the University of East
Anglia's Climate Research Unit appeared to show that scientists
distorted research to boost their argument that global warming was man-
made, causing an international media storm.

November 2009: September 11 pager messages

About half a million pager messages sent in New York City on September
11, 2001, tell the story of the 9/11 terrorist attacks through
individuals. Personal messages from people caught up in the carnage
emerge, prompting criticism from commentators who claim the leak is an
invasion of privacy.

April 2010: Apache helicopter attack on journalists

Video footage shows 15 people, including two people working for the
Reuters news agency, being gunned down by a US Army helicopter in
Iraq. The crew, who were not disciplined, mistook their targets'
camera equipment for weapons.

July 2010: Afghanistan war logs

Tens of thousands of classified US military documents tell of the
daily events of war in Afghanistan. The logs disclose that the Taliban
is receiving greater assistance from the Pakistani intelligence
services than was previously known and that the US runs a secret
assassination squad. They also raise questions over potential crimes
committed by coalition troops.

October 2010: Iraq war logs

Almost 400,000 classified US military documents recording the Iraq war
suggest that evidence of the torture of Iraqis by coalition troops was
ignored and record civilian deaths in more detail than was previously
known. More than 66,000 civilians suffered “violent deaths” between
2004 and the end of 2009, they show.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SAX unicode and ascii parsing problem

2010-11-30 Thread Stefan Behnel

goldtech, 30.11.2010 22:15:

Think I found it, for example:

line = 'my big string'
line.encode('ascii', 'ignore')

I processed the problem strings during parsing with this and it works
now.


That's not the right way of dealing with encodings, though. You should open 
the file with a well defined encoding (using codecs.open() or io.open() in 
Python >= 2.6), and then write the unicode strings into it just as you get 
them.


Stefan

--
http://mail.python.org/mailman/listinfo/python-list
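
A minimal sketch of what Stefan suggests, assuming Python 2.6+ and UTF-8 output 
(the file name and string are placeholders):

import io

# Open the output with an explicit encoding and write the unicode strings
# from the SAX handler unchanged, instead of forcing them through ASCII.
with io.open("output.txt", "w", encoding="utf-8") as out:
    out.write(u"my big string with a non-ASCII char: \u00e9\n")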


[issue9639] urllib2's AbstractBasicAuthHandler is limited to 6 requests

2010-11-30 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Grr.  Why wasn't this fix backported to the release maintenance branch before 
2.6.6 was released?  I've just had an application break as a result of 
upgrading from 2.6.5 to 2.6.6.

Oh well, too late now. :-(

</grumble>

--
nosy: +mark.dickinson

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9639
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9639] urllib2's AbstractBasicAuthHandler is limited to 6 requests

2010-11-30 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

Ouch. My mistake. I had not realized then that the code that actually broke 
things was merged into 2.6.x and had to be fixed too. :(

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9639
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9639] urllib2's AbstractBasicAuthHandler is limited to 6 requests

2010-11-30 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Ah well, it turned out to be fairly easy to work around, at least. :-)

Just in case any other urllib2 users have to deal with this in 2.6.6 (and also 
manage to find their way to this bug report :-):  it's easy to monkeypatch your 
way around the problem.  E.g.:

import sys
import urllib2

if sys.version_info[:2] == (2, 6) and sys.version_info[2] >= 6:
    def fixed_http_error_401(self, req, fp, code, msg, headers):
        url = req.get_full_url()
        response = self.http_error_auth_reqed('www-authenticate',
                                              url, req, headers)
        self.retried = 0
        return response

    urllib2.HTTPBasicAuthHandler.http_error_401 = fixed_http_error_401

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9639
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
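
For completeness, a hedged sketch of how the patched handler might be exercised; 
the URL and credentials are placeholders, not part of the original report:

import urllib2

password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "http://example.com/", "user", "secret")
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(password_mgr))

# With the monkeypatch applied, repeated authenticated requests no longer
# trip the retry limit that this issue describes under 2.6.6.
for _ in range(10):
    opener.open("http://example.com/protected").read()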



[issue10588] imp.find_module raises unexpected SyntaxError

2010-11-30 Thread Emile Anclin

New submission from Emile Anclin emile.anc...@logilab.fr:

Considering the following file:
$ cat pylint/test/input/func_unknown_encoding.py 
# -*- coding: IBO-8859-1 -*-
""" check correct unknown encoding declaration
"""

__revision__ = ''
$

When we try to find that module, imp.find_module raises SyntaxError:

>>> from imp import find_module
>>> find_module('func_unknown_encoding', None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SyntaxError: encoding problem: with BOM

It should be considered a bug, as stated by Brett Cannon:

 Considering these semantics changed between Python 2 and 3 w/o a
 discernable benefit (I would consider it a negative as finding a
 module should not be impacted by syntactic correctness; the full act
 of importing should be the only thing that cares about that), I would
 consider it a bug that should be filed.

--
messages: 122896
nosy: emile.anclin
priority: normal
severity: normal
status: open
title: imp.find_module raises unexpected SyntaxError
type: behavior
versions: Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10588
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
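
For callers affected by this today, a hedged workaround sketch (whether 
swallowing the SyntaxError is acceptable depends entirely on the caller; the 
helper name is made up):

import imp

def safe_find_module(name, path=None):
    # imp.find_module() can raise SyntaxError for a source file whose
    # encoding declaration it cannot parse (this issue); treat that the
    # same as "module exists but cannot be inspected".
    try:
        return imp.find_module(name, path)
    except SyntaxError:
        return None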



[issue9709] test_distutils warning: initfunc exported twice on Windows

2010-11-30 Thread Stefan Krah

Stefan Krah stefan-use...@bytereef.org added the comment:

Without the patch, you see the warning if test_build_ext is run in
verbose mode. With the patch, the warning disappears.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9709
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue3243] Support iterable bodies in httplib

2010-11-30 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

pitrou: actually that seems a bit suspect now... you need to handle 'data' 
differently depending on its type, and while you can determine the type by 
seeing which exceptions 'data' triggers, that doesn't seem like what 
exceptions are meant for.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10537] OS X IDLE 2.7rc1 from 64-bit installer hangs when you paste something.

2010-11-30 Thread Ned Deily

Ned Deily n...@acm.org added the comment:

More data points: using the 2.7.1 release source tarball, the problem is 
reproducible on 10.6 when dynamically linked to the Apple Tcl/Tk 8.5 and 
executing in either 64-bit or 32-bit mode.  It is not reproducible when using 
ActiveState Tcl/Tk 8.5.9, AS Tcl/Tk 8.4.19, or Apple Tcl/Tk 8.4 (none of which, 
of course, is available in 64-bit mode).

Unfortunately, the obvious workaround for the 64-bit/32-bit variant - building 
with one of the working 32-bit versions - does not result in a working IDLE.app 
or bin/idle since IDLE and its subprocesses are all launched in 64-bit mode 
(where possible) on 10.6.  For testing, it is possible to demonstrate 32-bit 
mode in a 64-/32- build with a properly built _tkinter.so by using the -n 
parameter, which causes IDLE to run with no subprocesses:
   arch -i386 /usr/local/bin/idle2.7 -n

Next step: see if the Issue6075 patches help with the Apple 8.5 Tk and, if not, 
add stuff to force both IDLE.app and bin/idle and their subprocesses to run 
only in 32-bit mode: probably either some more lipo-ing and/or adding 
posix_spawns.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10537
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10464] netrc module not parsing passwords containing #s.

2010-11-30 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

bumping...can someone review this? The reported bug seems valid enough.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10464
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10576] Add a progress callback to gcmodule

2010-11-30 Thread Kristján Valur Jónsson

Kristján Valur Jónsson krist...@ccpgames.com added the comment:

Hi, as I stated, the original patch was simply our original implementation.
Here is a new patch.  It is simpler:
1) It exposes a gc.callbacks list where users can register themselves, in the 
spirit of sys.meta_path.
2) One can have multiple callbacks.
3) Error handling is improved.
4) The callback is called with a phase argument, currently 0 for the start and 
1 for the end of a collection.

Let's start bikeshedding the calling signature.  I like having a single 
callback, since multiple callables are a nuisance to manage.

Once we agree, I'll post a patch for the documentation, and unittests.

--
Added file: http://bugs.python.org/file19884/gccallback2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10576
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10576] Add a progress callback to gcmodule

2010-11-30 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 Let's start bikeshedding the calling signature.  I like having a
 single callback, since multiple callables are a nuisance to manage.

IMO the callback should have a second argument as a dict containing
various statistics that we can expand over time. The generation number,
for example, should be present.

As for the phase number, if 0 means start and 1 means end, you can't
decently add another phase anyway (having 2 mean somewhere between 0
and 1 would be completely confusing).

PS : please don't use C++-style comments in your patch

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10576
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10576] Add a progress callback to gcmodule

2010-11-30 Thread Kristján Valur Jónsson

Kristján Valur Jónsson krist...@ccpgames.com added the comment:

You are right, Antoine.
How about a string and a dict?  The string can be "start" and "stop", and we 
can add interesting information to the dict as you suggest.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10576
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
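
A sketch of how the string-plus-dict signature discussed here could look from 
user code; gc.callbacks and the exact keys of the info dict are the proposal 
under discussion, not a committed interface, and the callback body is 
illustrative:

import gc
import time

_starts = {}

def gc_timer(phase, info):
    # phase is "start" or "stop"; info is a dict that can grow over time
    # (for example, the generation being collected).
    gen = info.get("generation")
    if phase == "start":
        _starts[gen] = time.time()
    elif phase == "stop":
        elapsed = time.time() - _starts.pop(gen, time.time())
        print("gc of generation %s took %.6f s" % (gen, elapsed))

gc.callbacks.append(gc_timer)
gc.collect()                     # callback fires once at the start, once at the end
gc.callbacks.remove(gc_timer)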



[issue3243] Support iterable bodies in httplib

2010-11-30 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 pitrou: actually that seems a bit suspect now... you need to handle
 'data' differently depending on its type,

Yes, but you can't know all appropriate types in advance, so it's better
to try and catch the TypeError.

I don't understand your changes in Lib/urllib/request.py. len(data) will
raise anyway.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue3243] Support iterable bodies in httplib

2010-11-30 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

I don't fully understand Lib/urllib/request.py either; I just ported it and ran 
the unittests. It seems that if you send an iterator through as 'data', you 
can't know the length in advance, and rather than let len(data) raise an 
exception, catlee thought it better to raise one that tells the user exactly 
why the code failed (i.e. because the user sent an iterator and there is no 
way to meaningfully determine its Content-Length).

As for catching exceptions vs. using isinstance: I thought about it for a 
while, and something like this feels right to me:

    try:
        self.sock.sendall(data)
    except TypeError:
        if isinstance(data, collections.Iterable):
            for d in data:
                self.sock.sendall(d)
        else:
            raise TypeError("data should be a bytes-like object or an "
                            "iterable, got %r" % type(data))


Anyway, calling iter(data) is equivalent to calling data.__iter__(), so 
catching the exception is equivalent to hasattr(data, '__iter__'), which is 
roughly the same as isinstance(data, collections.Iterable). So we try the most 
straightforward method (sending everything) and, if that fails, data is either 
an iterator or a wrong type.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10588] imp.find_module raises unexpected SyntaxError

2010-11-30 Thread Ron Adam

Changes by Ron Adam ron_a...@users.sourceforge.net:


--
nosy: +ron_adam

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10588
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue3243] Support iterable bodies in httplib

2010-11-30 Thread Davide Rizzo

Davide Rizzo sor...@gmail.com added the comment:

 len(data) will raise anyway.

No, it won't, if the iterable happens to be a sequence.

--
nosy: +davide.rizzo

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue3243] Support iterable bodies in httplib

2010-11-30 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

  len(data) will raise anyway.
 
 No, it won't, if the iterable happens to be a sequence.

Well, it seems the patch is confused between iterable and iterator. Only
iterators have a __next__, but they usually don't have a __len__.

The patch should really check for iterables, so it should use:

if isinstance(data, collections.Iterable):
    raise ValueError  # etc.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
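
To make the iterable/iterator distinction concrete, a small illustration 
(modern spelling shown; the thread's code uses collections.Iterable, which is 
the same ABC):

from collections import abc

chunks = [b"chunk1", b"chunk2"]      # a sequence: Iterable, has __len__
it = iter(chunks)                    # an iterator: Iterable and Iterator, no __len__

print(isinstance(chunks, abc.Iterable), isinstance(chunks, abc.Iterator))  # True False
print(isinstance(it, abc.Iterable), isinstance(it, abc.Iterator))          # True True
print(hasattr(chunks, "__len__"), hasattr(it, "__len__"))                  # True False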



[issue9873] urllib.parse: Allow bytes in some APIs that use string literals internally

2010-11-30 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

Committed in r86889

The docs changes should soon be live at:
http://docs.python.org/dev/library/urllib.parse.html

If anyone would like to suggest changes to the wording of the docs for post 
beta1, or finds additional corner cases that the new bytes handling can't cope 
with, feel free to create a new issue.

--
resolution:  -> accepted
stage: needs patch -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9873
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
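
To illustrate the bytes handling this change adds, a short hedged example 
against the documented urllib.parse API (Python 3.2+; the URL is a placeholder):

from urllib.parse import urlparse, quote_from_bytes, unquote_to_bytes

# Bytes in, bytes out: parsing a bytes URL yields bytes components.
parts = urlparse(b"http://example.com/caf\xc3\xa9?q=1")
print(parts.netloc, parts.path, parts.query)   # b'example.com' b'/caf\xc3\xa9' b'q=1'

# Arbitrary byte values round-trip through percent-encoding.
quoted = quote_from_bytes(b"caf\xc3\xa9")      # 'caf%C3%A9'
assert unquote_to_bytes(quoted) == b"caf\xc3\xa9"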



  1   2   >