Re: RSS feed creation?

2011-11-07 Thread Stefan Behnel

Stefan Behnel, 07.11.2011 08:22:

Dan Stromberg, 06.11.2011 21:00:

Is there an opensource Python tool for creating RSS feeds, that doesn't
require large dependencies?

I found feedformatter.py on pypi, but it seems a little old, and its sole
automated test gives a traceback.

Is there a better starting point?

(I'd of course prefer something that'll run on 3.x and 2.x, but will settle
for either)


I'd just go with ElementTree and builder.py.

http://effbot.org/zone/element-builder.htm

http://effbot.python-hosting.com/file/stuff/sandbox/elementlib/builder.py


Hmm, interesting, that last link doesn't seem to work for me anymore. 
Here's a copy, however:


http://svn.effbot.org/public/stuff/sandbox/elementlib/builder.py

There's also an extended version for lxml.etree:

https://raw.github.com/lxml/lxml/master/src/lxml/builder.py

You'll quickly see whether it works as expected with plain ET once you use it. 
It generally should.



Building an RSS-API on top of that is so trivial that it's not worth any
further dependencies anyway. Not sure if this version of builder.py runs in
Py3, but it certainly won't be hard to fix even if it doesn't currently.
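For illustration, a feed built with nothing but the stdlib really is short. A minimal sketch (Python 3 shown; the helper name `make_rss` and the two-tuple item format are made up for this example, the element names follow RSS 2.0):

```python
import xml.etree.ElementTree as ET

def make_rss(title, link, items):
    """Build a minimal RSS 2.0 document; items is a list of (title, link)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")

xml = make_rss("My feed", "http://example.com/",
               [("Post 1", "http://example.com/1")])
```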


Stefan

--
http://mail.python.org/mailman/listinfo/python-list


Re: Python lesson please

2011-11-07 Thread Peter Otten
gene heskett wrote:

> Greetings experts:
> 
> I just dl'd the duqu driver finder script from a link to NSS on /., and
> fixed enough of the tabs in it to make it run error-free.  At least python
> isn't having a litter of cows over the indentation now.
> 
> But it also runs instantly on linux.
> 
> This line looks suspect to me:
>  rootdir = sys.argv[1]
> 
> And I have a suspicion it is null on a linux box.
> 
> How can I fix that best?

Are you talking about this one?

https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatterns.py

With a current checkout I don't get any tab-related (or other) errors, so I 
would prefer to run the script as-is. Also, the README clearly states that 
you have to invoke it with

python DuquDriverPatterns.py ./directoryOfMalware

and the line you are quoting then puts the value "./directoryOfMalware" into 
the rootdir variable.

If you want to normalize the code to 4-space indents I recommend that you 
use

http://hg.python.org/cpython/file/bbc929bc2224/Tools/scripts/reindent.py

On Ubuntu (and probably any other Debian-based distro) you'll find a version 
of that in 

/usr/share/doc/python2.6/examples/Tools/scripts/reindent.py

or similar once you've installed the python-examples package.



Re: question about Tkinter delete

2011-11-07 Thread Peter Otten
Kristen Aw wrote:

> I don't understand why I get this error. I'm trying to delete the existing
> points, then redraw them after this bit of code to 'animate' my simulation.
> 
> def update(self, point1, point2):
>     # Deletes existing points
>     if self.point1:
>         self.w.delete(point1)
>         self.master.update_idletasks()
>     if self.point2:
>         self.w.delete(point2)
>         self.master.update_idletasks()
>     #draw new point
>     # . . .
> 
> The error message that I get is:
> . . . in update
> self.w.delete(point1)
>   File "C:\PYTHON26\LIB\LIB-TK\Tkinter.py", line 2181, in delete
> self.tk.call((self._w, 'delete') + args)
> TclError: invalid command name ".44593760"

Your snippet and your problem description are both a bit short to be sure. 
Is self.w a Tkinter.Canvas widget? It seems the canvas doesn't exist anymore 
when you're trying to delete objects on it. Here's a demonstration:

>>> import Tkinter as tk
>>> root = tk.Tk()
>>> canvas = tk.Canvas(root, height=100, width=100)
>>> canvas.delete("123")
>>> canvas.destroy()
>>> canvas.delete("123")
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 2184, in delete
self.tk.call((self._w, 'delete') + args)
_tkinter.TclError: invalid command name ".139675427025264"




Re: Python lesson please

2011-11-07 Thread gene heskett
On Monday, November 07, 2011 05:35:15 AM Peter Otten did opine:

> gene heskett wrote:
> > Greetings experts:
> > 
> > I just dl'd the duqu driver finder script from a link to NSS on /.,
> > and fixed enough of the tabs in it to make it run error-free.  At
> > least python isn't having a litter of cows over the indentation now.
> > 
> > But it also runs instantly on linux.
> > 
> > This line looks suspect to me:
> >  rootdir = sys.argv[1]
> > 
> > And I have a suspicion it is null on a linux box.
> > 
> > How can I fix that best?
> 
> Are you talking about this one?
> 
> https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatterns
> .py
 
Yes.  My save as renamed it, still has about 30k of tabs in it.  But I 
pulled it again, using the 'raw' link, saved it, no extra tabs.

But it still doesn't work for linux.  My python is 2.6.6

> With a current checkout I don't get any tab-related (or other) errors,
> so I would prefer to run the script as-is. Also, the README clearly
> states that you have to invoke it with
> 
> python DuquDriverPatterns.py ./directoryOfMalware
> 
> and the line you are quoting then puts the value "./directoryOfMalware"
> into the rootdir variable.

If only it would...  Using this version, the failure is silent and instant.  
Besides, the malware could be anyplace on the system.  But it needs to skip 
/dev since it hangs on the midi tree, /mnt and /media because they are not 
part of the running system even if disks are mounted there.
 
> If you want to normalize the code to 4-space indents I recommend that
> you use
> 
> http://hg.python.org/cpython/file/bbc929bc2224/Tools/scripts/reindent.py
 
Got it, where does it normally live? I apparently have a python-2.6.6 
install.

> On Ubuntu (and probably any other Debian-based distro) you'll find a
> version of that in
> 
PCLos is rpm based, lots of mandriva stuff in it.

> /usr/share/doc/python2.6/examples/Tools/scripts/reindent.py
>
Path does not exist.  Ends at /usr/share/doc
from there I have:
gene@coyote doc]$ ls|grep python
gimp-python-2.6.11/
gnome-python-gconf-2.28.1/
gnome-python-gnomeprint-2.32.0/
gnome-python-gtksourceview-2.32.0/
libxml2-python-2.7.8/
python-2.6.6/
python3-3.2.1/
python3-docs-3.2.1/
python-cairo-1.10.0/
python-configobj-4.7.2/
python-decorator-3.3.1/
python-docs-2.6.6/
python-enchant-1.5.3/
python-gobject-2.28.6/
python-gpgme-0.1/
python-gtksourceview-2.10.0/
python-libxml2dom-0.4.7/
python-lxml-2.2.8/
python-markupsafe-0.9.3/
python-notify-0.1.1/
python-paramiko-1.7.6/
python-paste-1.7.4/
python-pkg-resources-0.6c11/
python-psyco-1.6/
python-pybluez-0.18/
python-pycrypto-2.3/
python-pygments-1.3.1/
python-pytools-2011.3/
python-pyxml-0.8.4/
python-rhpl-0.212/
python-sexy-0.1.9/
python-simpletal-4.2/
python-sympy-0.6.7/
python-utmp-0.8/

The python-2.6.6 and 3.2.1 directories only contain a README.mdv

> or similar once you've installed the python-examples package.

On PCLos it doesn't even exist in the repo's.

Good links, thank you.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: 
"Elvis is my copilot."
-- Cal Keegan


logging: handle everything EXCEPT certain loggers

2011-11-07 Thread Gábor Farkas
hi,

is there a way to set up log-handlers in a way that they log records from
every logger, except certain ones?

basically i want the handler to handle everything, except log-records
that were generated by loggers from "something.*"
can this be done?

i tried to create filters, but the log-record does not have access to
its logger, so i cannot filter based on its "path".

right now the only idea i have is to setup a filter for the
"something.*" path, have it mark somehow the log-records,
and then create a filter on the global level, that will drop such
log-records. is there a simpler solution?

thanks,
gabor


Re: Python lesson please

2011-11-07 Thread Andreas Perstinger

On 2011-11-07 12:22, gene heskett wrote:

On Monday, November 07, 2011 05:35:15 AM Peter Otten did opine:

 Are you talking about this one?

 https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatterns
 .py


Yes.  My save as renamed it, still has about 30k of tabs in it.  But I
pulled it again, using the 'raw' link, saved it, no extra tabs.

But it still doesn't work for linux.  My python is 2.6.6


Go to the directory where you've downloaded the file and type:

python DuquDriverPatterns.py .

What output do you get?

Bye, Andreas


Re: Python lesson please

2011-11-07 Thread Dave Angel

On 11/07/2011 06:22 AM, gene heskett wrote:

On Monday, November 07, 2011 05:35:15 AM Peter Otten did opine:


Are you talking about this one?

https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatterns
.py


Yes.  My save as renamed it, still has about 30k of tabs in it.  But I
pulled it again, using the 'raw' link, saved it, no extra tabs.

But it still doesn't work for linux.  My python is 2.6.6

To start with, what's the md5 of the file you downloaded and are 
testing?  I get c4592a187f8f7880d3b685537e3bf9a5
from md5sum.  If you get something different, one of us changed the 
file, or you got it before today.


The whole tab issue is a red-herring in this case.  But I don't see how 
you can find 30k tabs in a thousand lines.  And if I were going to detab 
it, I'd pick 4 spaces, so the code doesn't stretch across the page.

python DuquDriverPatterns.py ./directoryOfMalware

and the line you are quoting then puts the value "./directoryOfMalware"
into the rootdir variable.

If only it would...  Using this version, the failure is silent and instant.
Besides, the malware could be anyplace on the system.  But it needs to skip
/dev since it hangs on the midi tree, /mnt and /media because they are not
part of the running system even if disks are mounted there.

First, run it on the current directory, and it should list the files in 
that directory:


I ran it in the directory I unzipped it into, so there are two files, 
the README and the source file itself.


$ python DuquDriverPatterns.py   .
Scanning ./README:
No match for pattern #0 on file named: README
No match for pattern #1 on file named: README
No match for pattern #2 on file named: README

etc.

The only way I can see to get NO output is to run it on an empty directory:
$mkdir junk
$ python DuquDriverPatterns.py   junk

As for skipping certain directories, we can deal with that as soon as 
you get proper behavior for any subtree of directories.


Have you tried adding a print ("Hello World " + rootdir) just before the

for root, subFolders, files in os.walk(rootdir):

line ?  Or putting a   print len(files)  just after it (indented, of 
course) ?


--

DaveA



Re: logging: handle everything EXCEPT certain loggers

2011-11-07 Thread Jean-Michel Pichavant

Gábor Farkas wrote:

hi,

is there a way to set up log-handlers in a way that they log records from
every logger, except certain ones?

basically i want the handler to handle everything, except log-records
that were generated by loggers from "something.*"
can this be done?

i tried to create filters, but the log-record does not have access to
its logger, so i cannot filter based on its "path".

right now the only idea i have is to setup a filter for the
"something.*" path, have it mark somehow the log-records,
and then create a filter on the global level, that will drop such
log-records. is there a simpler solution?

thanks,
gabor
  


Are you sure ?
LogRecord objects have a name attribute. You could do something like

return 'IdontWantYou' not in record.name

in your filter.

JM
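For completeness, a filter along those lines might look like this (a sketch; the logger name "something" is from the original question, the class name and everything else is made up):

```python
import logging

class ExcludeFilter(logging.Filter):
    """Drop records from the 'something' logger and all of its children."""
    def filter(self, record):
        # record.name is the dotted logger path, e.g. 'something.sub'
        return (record.name != "something"
                and not record.name.startswith("something."))

handler = logging.StreamHandler()
handler.addFilter(ExcludeFilter())
logging.getLogger().addHandler(handler)

logging.getLogger("app").warning("handled")            # passes the filter
logging.getLogger("something.sub").warning("dropped")  # rejected by the filter
```

Note that the prefix check uses "something." with the trailing dot, so an unrelated logger like "somethingelse" still gets through.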



Re: logging: handle everything EXCEPT certain loggers

2011-11-07 Thread Gábor Farkas
2011/11/7 Jean-Michel Pichavant :
> Gábor Farkas wrote:
>>
>> is there a way to set up log-handlers in a way that they log records from
>> every logger, except certain ones?
>>
>> i tried to create filters, but the log-record does not have access to
>> its logger, so i cannot filter based on its "path".
>
> Are you sure ?
> LogRecord objects have a name attribute. You could do something like
>
> return 'IdontWantYou' not in record.name
>
> in your filter.

d'oh .. thanks, i somehow overlooked the name attribute. it's exactly
what i need.

thanks,
gabor


Re: Python lesson please

2011-11-07 Thread Peter Otten
gene heskett wrote:

> On Monday, November 07, 2011 05:35:15 AM Peter Otten did opine:
> 
>> gene heskett wrote:
>> > Greetings experts:
>> > 
>> > I just dl'd the duqu driver finder script from a link to NSS on /.,
>> > and fixed enough of the tabs in it to make it run error-free.  At
>> > least python isn't having a litter of cows over the indentation now.
>> > 
>> > But it also runs instantly on linux.
>> > 
>> > This line looks suspect to me:
>> >  rootdir = sys.argv[1]
>> > 
>> > And I have a suspicion it is null on a linux box.
>> > 
>> > How can I fix that best?
>> 
>> Are you talking about this one?
>> 
>> https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatterns
>> .py
>  
> Yes.  My save as renamed it, still has about 30k of tabs in it.  But I
> pulled it again, using the 'raw' link, saved it, no extra tabs.
> 
> But it still doesn't work for linux.  My python is 2.6.6

Maybe the browser messes up things. Try installing git and then make a 
clone:

$ git clone git://github.com/halsten/Duqu-detectors

>> With a current checkout I don't get any tab-related (or other) errors,
>> so I would prefer to run the script as-is. Also, the README clearly
>> states that you have to invoke it with
>> 
>> python DuquDriverPatterns.py ./directoryOfMalware
>> 
>> and the line you are quoting then puts the value "./directoryOfMalware"
>> into the rootdir variable.
> 
> If only it would...  Using this version, the failure is silent and
> instant.

The actual code which comprises only the last 30 lines of the script looks 
like it is written by a newbie. Try replacing the bare except: with 
something noisy along the lines of

except Exception as e:
print e
continue

> Besides, the malware could be anyplace on the system.  But it needs to
> skip /dev since it hangs on the midi tree, /mnt and /media because they
> are not part of the running system even if disks are mounted there.

I don't think the script is meant to find malware on a running system. 
Rather you would mount a suspicious harddisk and pass the mountpoint to the 
script. Of course I'm only guessing...

>> or similar once you've installed the python-examples package.
> 
> On PCLos it doesn't even exist in the repo's.

Maybe it's in python's srpm, or in a python-dev.rpm or similar.
If all else fails you can download the source distribution from python.org 
at

http://www.python.org/download/releases/2.6.7/




read from file with mixed encodings in Python3

2011-11-07 Thread Jaroslav Dobrek
Hello,

in Python3, I often have this problem: I want to do something with
every line of a file. Like Python3, I presuppose that every line is
encoded in utf-8. If this isn't the case, I would like Python3 to do
something specific (like skipping the line, writing the line to
standard error, ...)

Like so:

try:
   
except UnicodeDecodeError:
  ...

Yet, there is no place for this construction. If I simply do:

for line in f:
print(line)

this will result in a UnicodeDecodeError if some line is not utf-8,
but I can't tell Python3 to stop:

This will not work:

for line in f:
try:
print(line)
except UnicodeDecodeError:
...

because the UnicodeDecodeError is caused in the "for line in f"-part.

How can I catch such exceptions?

Note that recoding the file before opening it is not an option,
because often files contain many different strings in many different
encodings.

Jaroslav


Re: read from file with mixed encodings in Python3

2011-11-07 Thread Dave Angel

On 11/07/2011 09:23 AM, Jaroslav Dobrek wrote:

Hello,

in Python3, I often have this problem: I want to do something with
every line of a file. Like Python3, I presuppose that every line is
encoded in utf-8. If this isn't the case, I would like Python3 to do
something specific (like skipping the line, writing the line to
standard error, ...)

Like so:

try:

except UnicodeDecodeError:
   ...

Yet, there is no place for this construction. If I simply do:

for line in f:
 print(line)

this will result in a UnicodeDecodeError if some line is not utf-8,
but I can't tell Python3 to stop:

This will not work:

for line in f:
 try:
 print(line)
 except UnicodeDecodeError:
 ...

because the UnicodeDecodeError is caused in the "for line in f"-part.

How can I catch such exceptions?

Note that recoding the file before opening it is not an option,
because often files contain many different strings in many different
encodings.

Jaroslav

A file with mixed encodings isn't a text file.  So open it with 'rb' 
mode, and use read() on it.  Find your own line-endings, since a given 
'\n' byte may or may not be a line-ending.


Once you've got something that looks like a line, explicitly decode it 
using utf-8.  Some invalid lines will give an exception and some will 
not.  But perhaps you've got some other gimmick to tell the encoding for 
each line.
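Dave's rb-plus-decode approach could be sketched like this (an illustration, not the OP's code; the name `robust_lines` and the latin-1 fallback are made-up choices — latin-1 never raises, which is what makes it a usable last resort):

```python
def robust_lines(path):
    """Yield decoded lines, trying UTF-8 first and falling back per line."""
    with open(path, "rb") as f:        # binary mode: no implicit decoding
        for raw in f:                  # caveat: a b'\n' byte may fall inside
            try:                       # multi-byte text in other encodings
                yield raw.decode("utf-8")
            except UnicodeDecodeError:
                # Here you could instead skip the line or log it to stderr.
                yield raw.decode("latin-1")
```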


--

DaveA



Re: read from file with mixed encodings in Python3

2011-11-07 Thread Peter Otten
Jaroslav Dobrek wrote:

> Hello,
> 
> in Python3, I often have this problem: I want to do something with
> every line of a file. Like Python3, I presuppose that every line is
> encoded in utf-8. If this isn't the case, I would like Python3 to do
> something specific (like skipping the line, writing the line to
> standard error, ...)
> 
> Like so:
> 
> try:
>
> except UnicodeDecodeError:
>   ...
> 
> Yet, there is no place for this construction. If I simply do:
> 
> for line in f:
> print(line)
> 
> this will result in a UnicodeDecodeError if some line is not utf-8,
> but I can't tell Python3 to stop:
> 
> This will not work:
> 
> for line in f:
> try:
> print(line)
> except UnicodeDecodeError:
> ...
> 
> because the UnicodeDecodeError is caused in the "for line in f"-part.
> 
> How can I catch such exceptions?
> 
> Note that recoding the file before opening it is not an option,
> because often files contain many different strings in many different
> encodings.

I don't see those files often, but I think they are all seriously broken. 
There's no way to recover the information from files with unknown mixed 
encodings. However, here's an approach that may sometimes work: 

>>> with open("tmp.txt", "rb") as f:
... for line in f:
... try:
... line = "UTF-8 " + line.decode("utf-8")
... except UnicodeDecodeError:
... line = "Latin-1 " + line.decode("latin-1")
... print(line, end="")
...
UTF-8 äöü
Latin-1 äöü
UTF-8 äöü




Re: Python ORMs Supporting POPOs and Substituting Layers in Django

2011-11-07 Thread John Gordon
In <415d875d-bc6d-4e69-bcf8-39754b450...@n18g2000vbv.googlegroups.com> Travis 
Parks  writes:

> Which web frameworks have people here used and which have they found
> to be: scalable, RAD compatible, performant, stable and/or providing
> good community support? I am really trying to get as much feedback as

I've used Django and it seems to be a very nice framework.  However I've
only done one project so I haven't delved too deeply.

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"



Re: Python lesson please

2011-11-07 Thread gene heskett
On Monday, November 07, 2011 10:38:32 AM Andreas Perstinger did opine:

> On 2011-11-07 12:22, gene heskett wrote:
> > On Monday, November 07, 2011 05:35:15 AM Peter Otten did opine:
> >>  Are you talking about this one?
> >>  
> >>  https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatt
> >>  erns .py
> > 
> > Yes.  My save as renamed it, still has about 30k of tabs in it.  But I
> > pulled it again, using the 'raw' link, saved it, no extra tabs.
> > 
> > But it still doesn't work for linux.  My python is 2.6.6
> 
> Go to the directory where you've downloaded the file and type:
> 
> python DuquDriverPatterns.py .
> 
> What output do you get?

Well now, I'll be dipped.  It scanned that directory, took it perhaps 15 
minutes, without finding anything.  So I gave it two dots & it's munching 
its way through the ../Mail/inbox now.  Why the hell can't it be given a 
valid absolute path without editing it directly into the rootdir = 
statement?

This may be a usable tool, but I think that before it was committed to a 
daily cron script, we would need some history as to where to look for such 
shenanigans as it's certainly not fast enough to turn it loose to scan the 
whole system on a daily basis.  This on a quad-core 2.1 GHz Phenom, 4 gigs 
of DRAM.

And I just found one of its Achilles heels, it is now stuck on a pipe file 
at /home/gene/.kde4/share/apps/kaffeine/dvbpipe:
prw--- 1 gene gene  0 Sep 24 18:50 dvbpipe.m2t|

And using no cpu.

I was going to ctl+c it but this is where, after several such, that it took 
the machine down yesterday. But it appears as only one process to htop (I 
keep a copy of it running as root here) and that killed it clean, no crash.

So, it needs an exception (or likely several) of file types to stay away 
from, starting with pipes like the above.  But I am not the one to carve 
that code as I have NDI how to go about writing a check stanza for that 
condition in python.
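For the record, one way such a check stanza could look (a sketch, not the script's actual code; `regular_files` and the skip set are illustrative): pruning the subfolder list in place keeps os.walk out of unwanted trees, and stat.S_ISREG filters out pipes, sockets and device nodes before anything tries to read them.

```python
import os
import stat

SKIP_DIRS = {"/dev", "/proc", "/sys", "/mnt", "/media"}

def regular_files(rootdir, skip_dirs=SKIP_DIRS):
    """Yield paths of regular files only, skipping the given directories."""
    for root, subfolders, files in os.walk(rootdir):
        # Prune in place so os.walk never descends into skipped subtrees.
        subfolders[:] = [d for d in subfolders
                         if os.path.join(root, d) not in skip_dirs]
        for name in files:
            path = os.path.join(root, name)
            if stat.S_ISREG(os.lstat(path).st_mode):
                yield path   # pipes, sockets, devices never get here
```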

Perhaps winderz does not have 'pipe' files so the authors never got caught 
out on this?  The only windows experience I have is the copy of xp that was 
on the lappy I bought back in 2005 or so to take with me when I am on the 
road (I am a broadcast engineer who gets sent here and there to "put out 
the fires" when the station is off the air.  Despite being retired for 9 
years now at 77 yo, my phone still rings occasionally)
I went straight from amigados-3.2 to redhat-5.0 in the late '90's, 
bypassing windows completely. I built the redhat machine from scratch.
The lappy's xp, used only for warranty testing, got overwritten by mandriva 
2008 when the warranty had expired.

You could call me anti-M$ I think.  :)

> Bye, Andreas

Thanks for listening, Andreas.

Now I wonder how to get a message back to the authors that it's broken in at 
least two aspects...

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: 



Re: Python lesson please

2011-11-07 Thread gene heskett
On Monday, November 07, 2011 11:30:45 AM Dave Angel did opine:
Back on the list..
> On 11/07/2011 06:22 AM, gene heskett wrote:
> > On Monday, November 07, 2011 05:35:15 AM Peter Otten did opine:
> > 
> > 
> >> Are you talking about this one?
> >> 
> >> https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatte
> >> rns .py
> > 
> > Yes.  My save as renamed it, still has about 30k of tabs in it.  But I
> > pulled it again, using the 'raw' link, saved it, no extra tabs.
> > 
> > But it still doesn't work for linux.  My python is 2.6.6
> 
> To start with, what's the md5 of the file you downloaded and are
> testing?  I get c4592a187f8f7880d3b685537e3bf9a5

[root@coyote Download]# md5sum DuquDriverPatterns.py
c4592a187f8f7880d3b685537e3bf9a5  DuquDriverPatterns.py, same as yours.

> from md5sum.  If you get something different, one of us changed the
> file, or you got it before today.
> 
> The whole tab issue is a red-herring in this case.  But I don't see how
> you can find 30k tabs in a thousand lines.  And if I were going to detab
> it, I'd pick 4 spaces, so the code doesn't stretch across the page.

Down toward the bottom of the file, the tab indentations were as high as 33 
leading tabs per line.  Each stanza of the data was tab indented 2 
additional tabs from the one above it in the original file.  30k was 
perhaps a poor SWAG, but 10 to 15k seems an entirely reasonable guess.
 
> > 
> > 
> >> python DuquDriverPatterns.py ./directoryOfMalware
> >> 
> >> and the line you are quoting then puts the value
> >> "./directoryOfMalware" into the rootdir variable.
> > 
> > If only it would...  Using this version, the failure is silent and
> > instant. Besides, the malware could be anyplace on the system.  But
> > it needs to skip /dev since it hangs on the midi tree, /mnt and
> > /media because they are not part of the running system even if disks
> > are mounted there.
> 
> First, run it on the current directory, and it should list the files in
> that directory:
> 
> I ran it in the directory I unzipped it into, so there are two files,
> the README and the source file itself.
> 
> $ python DuquDriverPatterns.py   .
> Scanning ./README:
> No match for pattern #0 on file named: README
> No match for pattern #1 on file named: README
> No match for pattern #2 on file named: README
> 
> etc.
> 
> The only way I can see to get NO output is to run it on an empty
> directory: $mkdir junk
> $ python DuquDriverPatterns.py   junk
> 
> As for skipping certain directories, we can deal with that as soon as
> you get proper behavior for any subtree of directories.
> 
> Have you tried adding a print ("Hello World " + rootdir) just before the
> 
> for root, subFolders, files in os.walk(rootdir):
> 
> line ?  Or putting a   print len(files)  just after it (indented, of
> course) ?

No, I did try to print the value of rootdir though, indented the same, and 
got a null printout, not even a line feed.

Thanks Dave.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: 
The older I grow, the less important the comma becomes.  Let the reader
catch his own breath.
-- Elizabeth Clarkson Zwart


Re: xml-rpc server on wine

2011-11-07 Thread Adam Tauno Williams
On Sat, 2011-11-05 at 05:50 -0700, pacopyc wrote:
> Hi, I have a XML-RPC server python running on VM Windows (on Linux)
> and a XML-RPC client python on Linux. Server and client have different
> IP address. I'd like migrate server on wine. How can communicate
> server and client? IP address is different or is the same?
> Can you help me?

Not really, this doesn't have much of anything to do with Python.  If
you run a network application on wine [assuming that even works] the
application will have the same IP/interface as any other application or
service running on the host.  Wine is not a 'virtualization' solution.




Re: RSS feed creation?

2011-11-07 Thread Adam Tauno Williams
On Mon, 2011-11-07 at 08:22 +0100, Stefan Behnel wrote:
> Dan Stromberg, 06.11.2011 21:00:
> > Is there an opensource Python tool for creating RSS feeds, that doesn't
> > require large dependencies?
> > I found feedformatter.py on pypi, but it seems a little old, and its sole
> > automated test gives a traceback.
> > Is there a better starting point?
> > (I'd of course prefer something that'll run on 3.x and 2.x, but will settle
> > for either)
> I'd just go with ElementTree and builder.py.
> http://effbot.org/zone/element-builder.htm

+1


RSS is just XML, just use the XML toolchain.

And use a feed validator to check what
you create.



Extracting elements over multiple lists?

2011-11-07 Thread JoeM
Howdy,

If I have a few lists like

a=[1,2,3,4,5]
b=["one", "two", "three", "four", "five"]
c=["cat", "dog", "parrot", "clam", "ferret"]

what is the most pythonic method of removing the first element from
all of the lists?

A list comprehension such as [arr[1:] for arr in a,b,c]
gives a single 2d list, which is not what I'm shooting for. Any
suggestions?


Re: Extracting elements over multiple lists?

2011-11-07 Thread John Gordon
In  JoeM 
 writes:

> a=[1,2,3,4,5]
> b=["one", "two", "three", "four", "five"]
> c=["cat", "dog", "parrot", "clam", "ferret"]

> what is the most pythonic method of removing the first element from
> all of the lists?

for arr in [a,b,c]:
  arr.pop(0)
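With the original sample lists, the in-place approach and the copying alternative look like this (a small sketch; `a2`, `b2`, `c2` are made-up names):

```python
a = [1, 2, 3, 4, 5]
b = ["one", "two", "three", "four", "five"]
c = ["cat", "dog", "parrot", "clam", "ferret"]

# In-place: mutate the original lists.
for arr in (a, b, c):
    del arr[0]          # like arr.pop(0), but without returning the element

# Copying: build new, shorter lists from whatever a, b, c currently hold.
a2, b2, c2 = (arr[1:] for arr in (a, b, c))
```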

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"



Re: Python ORMs Supporting POPOs and Substituting Layers in Django

2011-11-07 Thread John Gordon
In  John Gordon  writes:

> In <415d875d-bc6d-4e69-bcf8-39754b450...@n18g2000vbv.googlegroups.com> Travis 
> Parks  writes:

> > Which web frameworks have people here used and which have they found
> > to be: scalable, RAD compatible, performant, stable and/or providing
> > good community support? I am really trying to get as much feedback as

> I've used Django and it seems to be a very nice framework.  However I've
> only done one project so I haven't delved too deeply.

You are probably looking for more detail than "It's a nice framework" :-)

The database model in Django is powerful; it allows you to do queries in
native Python code without delving into backend SQL stuff.

I don't know how scalable/performant the database model is, as the one
project I worked on didn't deal with a ton of data.  (But I'd be surprised
if it had poor performance.)

The URL dispatcher provides a very nice and logical way to associate a
given URL with a given method call.

Community support is excellent.

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"



Re: Extracting elements over multiple lists?

2011-11-07 Thread Laurent Claessens

Le 07/11/2011 18:12, JoeM a écrit :

Howdy,

If I have a few lists like

a=[1,2,3,4,5]
b=["one", "two", "three", "four", "five"]
c=["cat", "dog", "parrot", "clam", "ferret"]

what is the most pythonic method of removing the first element from
all of the lists?


Do you want to remove the first item of each list, or to create new 
lists that contain the same as a,b,c but with one element less ?


Something like what you wrote :
[arr[1:] for arr in a,b,c]
will create *new* lists.


Assuming you don't want new lists, I would do :

a=[1,2,3,4,5]
b=["one", "two", "three", "four", "five"]
c=["cat", "dog", "parrot", "clam", "ferret"]

for x in [a,b,c]:
x.remove(x[0])

print a
print b
print c

I think that writing
>>> [x.remove(x[0]) for x in [a,b,c]]
instead of the for loop is cheating ... but it also does the job.

Have a good afternoon
Laurent



Re: Extracting elements over multiple lists?

2011-11-07 Thread JoeM
Thanks guys, I was just looking for a one line solution instead of a
for loop if possible. Why do you consider

[x.remove(x[0]) for x in [a,b,c]]

cheating? It seems compact and elegant enough for me.



Cheers


Re: Python lesson please

2011-11-07 Thread Dave Angel

On 11/07/2011 11:40 AM, gene heskett wrote:

On Monday, November 07, 2011 11:30:45 AM Dave Angel did opine:
Back on the list..

On 11/07/2011 06:22 AM, gene heskett wrote:

On Monday, November 07, 2011 05:35:15 AM Peter Otten did opine:



Are you talking about this one?

https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatterns.py


Yes.  My save as renamed it, still has about 30k of tabs in it.  But I
pulled it again, using the 'raw' link, saved it, no extra tabs.

But it still doesn't work for linux.  My python is 2.6.6


To start with, what's the md5 of the file you downloaded and are
testing?  I get c4592a187f8f7880d3b685537e3bf9a5


[root@coyote Download]# md5sum DuquDriverPatterns.py
c4592a187f8f7880d3b685537e3bf9a5  DuquDriverPatterns.py, same as yours.


from md5sum.  If you get something different, one of us changed the
file, or you got it before today.

The whole tab issue is a red-herring in this case.  But I don't see how
you can find 30k tabs in a thousand lines.  And if I were going to detab
it, I'd pick 4 spaces, so the code doesn't stretch across the page.


Down toward the bottom of the file, the tab indentations were as high as 33
leading tabs per line.  Each stanza of the data was tab indented 2
additional tabs from the one above it in the original file.  30k was
perhaps a poor SWAG, but 10 to 15k seems an entirely reasonable guess.

What program are you using to read the file and support that claim? 
Neither emacs nor gedit shows more than one leading tab on any line I 
looked at.  And if you set tabs to 4 columns, the file looks quite 
reasonable.  Doing a quick scan I see max of 5 tabs on any single line, 
and 1006 total.



maxtabs = 0
totaltabs = 0
f = open("DuquDriverPatterns.py", "r")
for line in f:
    cline = line.replace("\t", "")
    tabs = len(line) - len(cline)
    if tabs:
        print tabs
    maxtabs = max(maxtabs, tabs)
    totaltabs += tabs

print "max=", maxtabs
print "total=", totaltabs

python DuquDriverPatterns.py ./directoryOfMalware

and the line you are quoting then puts the value
"./directoryOfMalware" into the rootdir variable.


If only it would...  Using this version, the failure is silent and
instant.


The only way I've been able to make it "silent and instant" was to give 
it the name of an empty directory, or a typo representing no directory 
at all.




Besides, the malware could be anyplace on the system.  But
it needs to skip /dev since it hangs on the midi tree, /mnt and
/media because they are not part of the running system even if disks
are mounted there.


First, run it on the current directory, and it should list the files in
that directory:

I ran it in the directory I unzipped it into, so there are two files,
the README and the source file itself.

$ python DuquDriverPatterns.py   .
Scanning ./README:
No match for pattern #0 on file named: README
No match for pattern #1 on file named: README
No match for pattern #2 on file named: README

etc.

The only way I can see to get NO output is to run it on an empty
directory: $mkdir junk
$ python DuquDriverPatterns.py   junk

As for skipping certain directories, we can deal with that as soon as
you get proper behavior for any subtree of directories.

Have you tried adding a print ("Hello World " + rootdir) just before the

for root, subFolders, files in os.walk(rootdir):

line ?  Or putting a   print len(files)  just after it (indented, of
course) ?


No, I did try to print the value of rootdir though, indented the same, and
got a null printout, not even a line feed.



If you had put the print I suggested, it would at least print the words 
"Hello World".  Since it did not, you probably didn't actually add the 
line where I suggested.



Thanks Dave.

Cheers, Gene


In another message you said it doesn't work on absolute file paths.  But 
it does.  You can replace any relative directory name with the absolute 
version, and it won't change the behavior.  I suspect you were caught up 
by a typo for the absolute path string.



--

DaveA
--
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting elements over multiple lists?

2011-11-07 Thread John Gordon
In  JoeM 
 writes:

> Thanks guys, I was just looking for a one line solution instead of a
> for loop if possible. Why do you consider

> [x.remove(x[0]) for x in [a,b,c]]

> cheating? It seems compact and elegant enough for me.

I wouldn't call it cheating, but that solution does a fair bit of
unneccessary work (creating a list comprehension that is never used.)

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting elements over multiple lists?

2011-11-07 Thread Peter Otten
JoeM wrote:

> Thanks guys, I was just looking for a one line solution instead of a
> for loop if possible. Why do you consider
> 
> [x.remove(x[0]) for x in [a,b,c]]
> 
> cheating? It seems compact and elegant enough for me.

I think it's a misconception that you are avoiding the for-loop. You move it 
into [...] and declare it more elegant, but in reality you are creating a 
throwaway list of None-s. You are adding cruft to your code. 

That is not only superfluous, but also misleading. A simple for-loop like

for x in a, b, c:
del x[0]

on the other hand makes your intention crystal-clear.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting elements over multiple lists?

2011-11-07 Thread Jean-Michel Pichavant

JoeM wrote:

Thanks guys, I was just looking for a one line solution instead of a
for loop if possible. Why do you consider

[x.remove(x[0]) for x in [a,b,c]]

cheating? It seems compact and elegant enough for me.



Cheers
  
This is a one-liner, but since you asked for something *pythonic*, John's 
solution is the best imo:


for arr in [a,b,c]:
    arr.pop(0)

(Peter's "del" solution is quite close, but I find the 'del' statement 
tricky in python and will mislead many python newcomers)


JM
--
http://mail.python.org/mailman/listinfo/python-list


memory management

2011-11-07 Thread Juan Declet-Barreto
Hi,

Can anyone provide links or basic info on memory management, variable 
dereferencing, or the like? I have a script that traverses a file structure 
using os.walk and adds directory names to a list. It works for a small number 
of directories, but when I set it loose on a directory with thousands of 
dirs/subdirs, it crashes the DOS session and also the Python shell (when I run 
it from the shell).  This makes it difficult to figure out if the allocated 
memory or heap space for the DOS/shell session have overflown, or why it is 
crashing.

Juan Declet-Barreto
GIS Specialist, Information Technology Dept.
City of Mesa
Office: 480.644.4751
juan.declet-barr...@mesaaz.gov


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python lesson please

2011-11-07 Thread gene heskett
On Monday, November 07, 2011 02:43:11 PM Dave Angel did opine:

> On 11/07/2011 11:40 AM, gene heskett wrote:
> > On Monday, November 07, 2011 11:30:45 AM Dave Angel did opine:
> > Back on the list..
> > 
> >> On 11/07/2011 06:22 AM, gene heskett wrote:
> >>> On Monday, November 07, 2011 05:35:15 AM Peter Otten did opine:
> >>> 
> >>> 
>  Are you talking about this one?
>  
>  https://github.com/halsten/Duqu-detectors/blob/master/DuquDriverPatterns.py
> >>> 
> >>> Yes.  My save as renamed it, still has about 30k of tabs in it.  But
> >>> I pulled it again, using the 'raw' link, saved it, no extra tabs.
> >>> 
> >>> But it still doesn't work for linux.  My python is 2.6.6
> >> 
> >> To start with, what's the md5 of the file you downloaded and are
> >> testing?  I get c4592a187f8f7880d3b685537e3bf9a5
> > 
> > [root@coyote Download]# md5sum DuquDriverPatterns.py
> > c4592a187f8f7880d3b685537e3bf9a5  DuquDriverPatterns.py, same as
> > yours.
> > 
> >> from md5sum.  If you get something different, one of us changed the
> >> file, or you got it before today.
> >> 
> >> The whole tab issue is a red-herring in this case.  But I don't see
> >> how you can find 30k tabs in a thousand lines.  And if I were going
> >> to detab it, I'd pick 4 spaces, so the code doesn't stretch across
> >> the page.
> > 
> > Down toward the bottom of the file, the tab indentations were as high
> > as 33 leading tabs per line.  Each stanza of the data was tab
> > indented 2 additional tabs from the one above it in the original
> > file.  30k was perhaps a poor SWAG, but 10 to 15k seems an entirely
> > reasonable guess.
> 
> What program are you using to read the file and support that claim?

vim.  But remember, this first one started out as a copy/paste from the 
firefox-7.0.1 screen.

> Neither emacs nor gedit shows more than one leading tab on any line I
> looked.  And if you set tabs to 4 columns, the file looks quite
> reasonable.  Doing a quick scan I see max of 5 tabs on any single line,
> and 1006 total.

I have no tabs left in the operative code, the python interpreter was 
having a cow if even one was in that last 30-35 lines of code.
> 
> 
> maxtabs = 0
> totaltabs = 0
> f = open("DuquDriverPatterns.py", "r")
> for line in f:
> 
>  cline = line.replace("\t", "")
>  tabs = len(line) - len(cline)
>  if tabs:
>  print tabs
>  maxtabs = max(maxtabs, tabs)
>  totaltabs += tabs
> 
> print "max=", maxtabs
> print "total=", totaltabs
> 
> >>> 
> The only way I've been able to make it "silent and instant" was to give
> it the name of an empty directory, or a typo representing no directory
> at all.
> 
[...]
> >> line ?  Or putting a   print len(files)  just after it (indented, of
> >> course) ?
> > 
> > No, I did try to print the value of rootdir though, indented the same,
> > and got a null printout, not even a line feed.

Indented the same as the rootdir statement itself, which in python would 
seem to make it immediately sequential to the rootdir = statement.
 
> If you had put the print I suggested, it would at least print the words
> "Hello World".  Since it did not, you probably didn't actually add the
> line where I suggested.
> 
> > Thanks Dave.
> > 
> > Cheers, Gene
> 
> In another message you said it doesn't work on absolute file paths.  But
> it does.  You can replace any relative directory name with the absolute
> version, and it won't change the behavior.  I suspect you were caught up
> by a typo for the absolute path string.

I am gene, running as gene, what could be wrong with giving it /home/gene 
as the argument?

I have another dir in /home/amanda, that I build the alpha and beta amanda 
stuff in.  Let me try that.  And again, this works but I forgot about the 
.ccache directory, so it will take a while to finish.

Now, as a python lesson to me, I will do a blink compare between the 2 
files this evening & see what I munged.  ATM, I am working on a gunstock 
for as long as my feet and back can stand the standing, so sitting here is 
a 'break' from that.  Yeah, I occasionally call myself a JOAT. ;-)

Thanks Dave.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: 
Experience is that marvelous thing that enables you recognize a mistake
when you make it again.
-- Franklin P. Jones
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting elements over multiple lists?

2011-11-07 Thread Dave Angel

On 11/07/2011 01:01 PM, JoeM wrote:

Thanks guys, I was just looking for a one line solution instead of a
for loop if possible. Why do you consider

[x.remove(x[0]) for x in [a,b,c]]

cheating? It seems compact and elegant enough for me.



Cheers
Are you considering the possibility that two of these names might 
reference the same list?


a = [42, 44, 6, 19, 48]
b = a
c = b


for x in [a,b,c]:
    x.remove(x[0])

now a will have  [19,48] as its content.
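
If aliasing like this is a real possibility, one defensive sketch is to 
deduplicate by object identity before mutating; the id() bookkeeping below 
is purely illustrative, not a recommended idiom:

```python
a = [42, 44, 6, 19, 48]
b = a
c = b

assert a is b is c  # all three names refer to one list object

seen = set()
for x in [a, b, c]:
    if id(x) not in seen:   # mutate each distinct object only once
        seen.add(id(x))
        x.remove(x[0])

assert a == [44, 6, 19, 48]  # first item removed once, not three times
```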



--

DaveA

--
http://mail.python.org/mailman/listinfo/python-list


Re: memory management

2011-11-07 Thread Dave Angel


On 11/07/2011 02:43 PM, Juan Declet-Barreto wrote:

Hi,

Can anyone provide links or basic info on memory management, variable 
dereferencing, or the like? I have a script that traverses a file structure 
using os.walk and adds directory names to a list. It works for a small number 
of directories, but when I set it loose on a directory with thousands of 
dirs/subdirs, it crashes the DOS session and also the Python shell (when I run 
it from the shell).  This makes it difficult to figure out if the allocated 
memory or heap space for the DOS/shell session have overflown, or why it is 
crashing.

Juan Declet-Barreto
I don't have any reference to point you to, but CPython's memory 
management is really pretty simple.  However, it's important to tell us 
the build of Python, as there are several, with very different memory 
rules.  For example Jython, which is Python running in a Java VM, lets 
the java garbage collector handle things, and it's entirely different.


Likewise, the OS may be relevant.  You're using Windows-kind of 
terminology, but that doesn't prove you're on Windows, nor does it say 
what version.


Assuming 32 bit CPython 2.7 on XP, the principles are simple.  When an 
object is no longer accessible, it gets garbage collected*.   So if you 
build a list inside a function, and the only reference is from a 
function's local var, then the whole list will be freed when the 
function exits.  The mistakes many people make are unnecessarily using 
globals, and using lists when iterables would work just as well.
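
As a rough illustration of that last point (the byte counts are 
CPython-specific, so treat the comparison, not the absolute sizes, as the 
meaningful part): a generator stays small no matter how many items it will 
produce, while a list holds them all at once.

```python
import sys

squares_list = [n * n for n in range(100000)]   # all 100000 ints alive now
squares_gen = (n * n for n in range(100000))    # produces them one at a time

# The generator object is a fixed-size wrapper; the list grows with its data.
assert sys.getsizeof(squares_gen) < sys.getsizeof(squares_list)

# Consumed fully, both yield the same values.
assert sum(squares_gen) == sum(squares_list)
```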


The tool on XP to tell how much memory is in use is the task manager.  
As you point out, its hard to catch a short-running app in the act.  So 
you want to add a counter to your code (global), and see how high it 
gets when it crashes.  Then put a test in your code for the timer value, 
and do an "input" somewhat earlier.


At that point, see how much memory the program is actually using.

Now, when an object is freed, a new one of the same size is likely to 
immediately re-use the space.  But if they're all different sizes, it's 
somewhat statistical.  You might get fragmentation, for example.  When 
Python's pool is full, it asks the OS for more (perhaps using swap 
space), but I don't think it ever gives it back.  So your memory use is 
a kind of ceiling case.  That's why it's problematic to build a huge 
data structure, and then walk through it, then delete it.  The script 
will probably continue to show the peak memory use, indefinitely.


* (technically, this is ref counted.  When the ref reaches zero the 
object is freed.  Real gc is more lazy scanning)



--
http://mail.python.org/mailman/listinfo/python-list


Re: A Python script to put CTAN into git (from DVDs)

2011-11-07 Thread Jonathan Fine

On 06/11/11 20:28, Jakub Narebski wrote:


Note that for gitPAN each "distribution" (usually but not always
corresponding to single Perl module) is in separate repository.
The dependencies are handled by CPAN / CPANPLUS / cpanm client
(i.e. during install).


Thank you for your interest, Jakub, and also for this information.  With 
TeX there's a difficulty which Perl, I think, does not have.  With TeX we 
process documents, which may demand specific versions of packages. 
LaTeX users are concerned that moving on to a later version will cause 
documents to break.



Putting all DVD (is it "TeX Live" DVD by the way?) into single
repository would put quite a bit of stress to git; it was created for
software development (although admittedly of large project like Linux
kernel), not 4GB+ trees.


I'm impressed by how well git manages it.  It took about 15 minutes to 
build the 4GB tree, and it was disk speed rather than CPU which was the 
bottleneck.



Once you've done that, it is then possible and sensible to select
suitable interesting subsets, such as releases of a particular
package. Users could even define their own subsets, such as "all
resources needed to process this file, exactly as it processes on my
machine".


This could be handled using submodules, by having superrepository that
consist solely of references to other repositories by the way of
submodules... plus perhaps some administrativa files (like README for
whole CTAN, or search tool, or DVD install, etc.)

This could be the used to get for example contents of DVD from 2010.


We may be at cross purposes.  My first task is get the DVD tree into 
git, performing necessary transformations such as expanding zip files 
along the way.  Breaking the content into submodules can, I believe, be 
done afterwards.


With DVDs from several years it could take several hours to load 
everything into git.  For myself, I'd like to do that once, more or less 
as a batch process, and then move on to the more interesting topics. 
Getting the DVD contents into git is already a significant piece of work.


Once done, I can them move on to what you're interested in, which is 
organising the material.  And I hope that others in the TeX community 
will get involved with that, because I'm not building this repository 
just for myself.



But even though submodules (c.f. Subversion svn:external, Mecurial
forest extension, etc.) are in Git for quite a bit of time, it doesn't
have best user interface.


In addition, many TeX users have a TeX DVD.  If they import it into a
git repository (using for example my script) then the update from 2011
to 2012 would require much less bandwidth.


???


A quick way to bring your TeX distribution up to date is to do a delta 
with a later distribution, and download the difference.  That's what git 
does, and it does it well.  So I'm keen to convert a TeX DVD into a git 
repository, and then differences can be downloaded.



Finally, I'd rather be working within git that modified copy of the
ISO when doing the subsetting.  I'm pretty sure that I can manage to
pull the small repositories from the big git-CTAN repository.


No you cannot.  It is all or nothing; there is no support for partial
_clone_ (yet), and it looks like it is a hard problem.

Nb. there is support for partial _checkout_, but this is something
different.


From what I know, I'm confident that I can achieve what I want using 
git.  I'm also confident that my approach is not closing off any 
possible approaches.  But if I'm wrong you'll be able to say: I told you so.



Commit = tree + parent + metadata.


Actually, any number of parents, including none.  What metadata do I 
have to provide?  At this time nothing, I think, beyond that provided by 
the name of a reference (to the root of a tree).



I think you would very much want to have linear sequence of trees,
ordered via DAG of commits.  "Naked" trees are rather bad idea, I think.


As I recall the first 'commit' to the git repository for the Linux
kernel was just a tree, with a reference to that tree as a tag.  But
no commit.


That was a bad accident that there is a tag that points directly to a
tree of _initial import_, not something to copy.


Because git is a distributed version control system, anyone who wants to 
can create such a directed acyclic graph of commits.  And if it's useful 
I'll gladly add it to my copy of the repository.


best regards


Jonathan

--
http://mail.python.org/mailman/listinfo/python-list


RE: memory management

2011-11-07 Thread Juan Declet-Barreto
Well, I am using Python 2.5 (and the IDLE shell) in Windows XP, which ships 
with ESRI's ArcGIS. In addition, I am using some functions in the 
arcgisscripting Python geoprocessing module for geographic information systems 
(GIS) applications, which can complicate things. I am currently isolating 
standard library Python code (e.g., os.walk()) from the arcgisscripting module 
to evaluate in which module the environment crash is occurring. 

-Original Message-
From: Dave Angel [mailto:da...@dejaviewphoto.com] 
Sent: Monday, November 07, 2011 1:20 PM
To: Juan Declet-Barreto
Cc: python-list@python.org
Subject: Re: memory management


On 11/07/2011 02:43 PM, Juan Declet-Barreto wrote:
> Hi,
>
> Can anyone provide links or basic info on memory management, variable 
> dereferencing, or the like? I have a script that traverses a file structure 
> using os.walk and adds directory names to a list. It works for a small number 
> of directories, but when I set it loose on a directory with thousands of 
> dirs/subdirs, it crashes the DOS session and also the Python shell (when I 
> run it from the shell).  This makes it difficult to figure out if the 
> allocated memory or heap space for the DOS/shell session have overflown, or 
> why it is crashing.
>
> Juan Declet-Barreto
I don't have any reference to point you to, but CPython's memory management is 
really pretty simple.  However, it's important to tell us the build of Python, 
as there are several, with very different memory rules.  For example Jython, 
which is Python running in a Java VM, lets the java garbage collector handle 
things, and it's entirely different.

Likewise, the OS may be relevant.  You're using Windows-kind of terminology, 
but that doesn't prove you're on Windows, nor does it say what version.

Assuming 32 bit CPython 2.7 on XP, the principles are simple.  When an 
object is no longer accessible, it gets garbage collected*.   So if you 
build a list inside a function, and the only reference is from a function's 
local var, then the whole list will be freed when the function exits.  The 
mistakes many people make are unnecessarily using globals, and using lists when 
iterables would work just as well.

The tool on XP to tell how much memory is in use is the task manager.  
As you point out, its hard to catch a short-running app in the act.  So you 
want to add a counter to your code (global), and see how high it gets when it 
crashes.  Then put a test in your code for the timer value, and do an "input" 
somewhat earlier.

At that point, see how much memory the program is actually using.

Now, when an object is freed, a new one of the same size is likely to 
immediately re-use the space.  But if they're all different sizes, it's 
somewhat statistical.  You might get fragmentation, for example.  When Python's 
pool is full, it asks the OS for more (perhaps using swap space), but I don't 
think it ever gives it back.  So your memory use is a kind of ceiling case.  
That's why it's problematic to build a huge data structure, and then walk 
through it, then delete it.  The script will probably continue to show the peak 
memory use, indefinitely.

* (technically, this is ref counted.  When the ref reaches zero the object is 
freed.  Real gc is more lazy scanning)


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: memory management

2011-11-07 Thread Stefan Krah
Juan Declet-Barreto  wrote:
> Well, I am using Python 2.5 (and the IDLE shell) in Windows XP, which
> ships with ESRI's ArcGIS. In addition, I am using some functions in the
> arcgisscripting Python geoprocessing module for geographic information
> systems (GIS) applications, which can complicate things. I am currently
> isolating standard library Python code (e.g., os.walk()) from the
> arcgisscripting module to evaluate in which module the environment
> crash is occurring.

It might be a good idea to check if the problem also occurs with Python 2.7
since Python 2.5 is no longer maintained.


Stefan Krah


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: memory management

2011-11-07 Thread Dave Angel

On 11/07/2011 03:33 PM, Juan Declet-Barreto wrote:

Well, I am using Python 2.5 (and the IDLE shell) in Windows XP, which ships 
with ESRI's ArcGIS. In addition, I am using some functions in the 
arcgisscripting Python geoprocessing module for geographic information systems 
(GIS) applications, which can complicate things. I am currently isolating 
standard library Python code (e.g., os.walk()) from the arcgisscripting module 
to evaluate in which module the environment crash is occurring.
You top-posted.  In this mailing list, one should type new information 
after the quoted information, not before.


Perhaps a pre-emptive strike is in order.  On the assumption that it may 
be a memory problem, how about you turn the app inside out.  Instead of 
walking the entire tree, getting a list with all the paths, and then 
working on the list, how about doing the work on each file as you get 
it.  Or even make your own generator from os.walk, so the app can call 
your logic on each file, and never have all the file (name)s in memory 
at the same time.



Generator:

import os
import fnmatch

def filelist(top, criteria):
    # Walk the tree rooted at `top`, yielding matching file names one
    # at a time instead of holding every name in memory at once.
    # (fnmatch is one way to "apply some criteria" to each name.)
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if fnmatch.fnmatch(name, criteria):
                yield os.path.join(dirpath, name)


Now the main app can iterate through this "list" in the usual way:

for filename in filelist(top, "*.txt"):
    dosomething...



--

DaveA

--
http://mail.python.org/mailman/listinfo/python-list


all() is slow?

2011-11-07 Thread OKB (not okblacke)
I noticed this (Python 2.6.5 on Windows XP):

>>> import random, timeit
>>> def myAll(x):
...     for a in x:
...         if a not in (True, False):
...             return False
...     return True
>>> x = [random.choice([True, False]) for a in xrange(0, 500)]
>>> timeit.timeit('myAll(x)', 'from __main__ import myAll, x', number=10)
0: 9.7685158309226452
>>> timeit.timeit('all(a in (True, False) for a in x)', 'from __main__ import x', number=10)
1: 12.348196768024984
>>> x = [random.randint(0,100) for a in xrange(0, 500)]
>>> def myAll(x):
...     for a in x:
...         if not a <= 100:
...             return False
...     return True
>>> timeit.timeit('myAll(x)', 'from __main__ import myAll, x', number=10)
4: 2.8248207523582209
>>> timeit.timeit('all(a <= 100 for a in x)', 'gc.enable(); from __main__ import x', number=10)
5: 4.6433557896324942

What is the point of the all() function being a builtin if it's 
slower than writing a function to do the check myself?

-- 
--OKB (not okblacke)
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is
no path, and leave a trail."
--author unknown
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: python-based downloader (youtube-dl) missing critical feature ...

2011-11-07 Thread Prasad, Ramit
>Maybe Lbrtchx is one of the Sheldon Cooper's nicknames :o)
>
>JM
>
>PS : I have the feeling that my nerdy reference will fall flat...

Not completely ;)

Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423

--

This email is confidential and subject to important disclaimers and
conditions including on offers for the purchase or sale of
securities, accuracy and completeness of information, viruses,
confidentiality, legal privilege, and legal entity disclaimers,
available at http://www.jpmorgan.com/pages/disclosures/email.  
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: all() is slow?

2011-11-07 Thread Chris Rebert
On Mon, Nov 7, 2011 at 1:00 PM, OKB (not okblacke)
 wrote:

>        What is the point of the all() function being a builtin if it's
> slower than writing a function to do the check myself?

Regardless of whether it's slower (which I expect someone will be
along to debunk or explain shortly), do you really want to have to
write an additional boilerplate function or block of code /every
single time/ you want to do such a check? The runtime speed difference
is unlikely to be worth your time as a developer in many cases. And by
Murphy's Law, you *will* make errors writing these repetitive code
blocks (e.g. forget to negate the conditional), whereas reusing all()
makes that much less likely.

The trade-off is run-time speed for developer
productivity/convenience; Python tends to lean towards the latter (to
varying degrees).

Cheers,
Chris
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: memory management

2011-11-07 Thread Chris Angelico
On Tue, Nov 8, 2011 at 6:43 AM, Juan Declet-Barreto
 wrote:
>
> I have a script that traverses a file structure using os.walk and adds 
> directory names to a list. It works for a small number of directories, but 
> when I set it loose on a directory with thousands of dirs/subdirs, it crashes 
> the DOS session and also the Python shell (when I run it from the shell).

This seems a little unusual - it crashes more than its own process? If
you use Start... Run and type in "cmd" (or use the "Command Prompt"
icon, whereever that is), and invoke Python from there, it crashes cmd
as well as Python? If so, I'd be suspecting either some really weird
bug in the Python interpreter (try upgrading to 2.7), or a hardware
fault in your computer (try running MemTest or running it on a
different computer).

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: all() is slow?

2011-11-07 Thread david vierra
On Nov 7, 11:00 am, "OKB (not okblacke)"
 wrote:

>         What is the point of the all() function being a builtin if it's
> slower than writing a function to do the check myself?
>

But, you didn't write an all() function.  You wrote a more specialized
allBoolean() function. I think this comparison is more fair to the
builtin all():

>>> def myAll(x):
...     for a in x:
...         if not a: return False
...     return True
...
>>> timeit.timeit('myAll(a in (True, False) for a in x)', 'from __main__ import myAll, x', number=10)
14.510986388627998
>>> timeit.timeit('all(a in (True, False) for a in x)', 'from __main__ import x', number=10)
12.209779342432576
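
Another way to isolate that overhead is to time the same membership test 
driven by a generator expression versus a list comprehension; the numbers 
are machine-dependent, so treat this as a sketch, not authoritative figures:

```python
import timeit

setup = "x = [True] * 500"

# all() fed a generator expression: one generator resumption per item.
t_gen = timeit.timeit("all(a in (True, False) for a in x)", setup, number=1000)

# Same per-item test via a list comprehension: only the iteration
# machinery differs, so the timing gap approximates genexp overhead.
t_list = timeit.timeit("all([a in (True, False) for a in x])", setup, number=1000)

print(t_gen, t_list)
```

If the two timings differ, the gap approximates the generator-expression 
overhead that david's comparison isolates, rather than any cost in all() itself.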
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Python script to put CTAN into git (from DVDs)

2011-11-07 Thread Jakub Narebski
The following message is a courtesy copy of an article
that has been posted to comp.text.tex as well.

Jonathan Fine  writes:
> On 06/11/11 20:28, Jakub Narebski wrote:
> 
> > Note that for gitPAN each "distribution" (usually but not always
> > corresponding to single Perl module) is in separate repository.
> > The dependencies are handled by CPAN / CPANPLUS / cpanm client
> > (i.e. during install).
> 
> Thank you for your interest, Jakub, and also for this information.
> With TeX there's a difficulty which Perl, I think, does not have.  With
> TeX we process documents, which may demand specific versions of
> packages. LaTeX users are concerned that moving on to a later version
> will cause documents to break.

How you can demand specific version of package?

In the "\usepackage[options]{packages}[version]" LaTeX command the
[version] argument specifies the _minimal_ (oldest) version.  The same
is true for Perl "use Module VERSION LIST".
 
Nevertheless while with "use Module VERSION" / "use Module VERSION LIST"
you can request a minimal version of a Perl module, the META build-time spec 
can include a requirement for an exact version of a required package:

http://p3rl.org/CPAN::Meta::Spec

  Version Ranges
  ~~

  Some fields (prereq, optional_features) indicate the particular
  version(s) of some other module that may be required as a
  prerequisite. This section details the Version Range type used to
  provide this information.

  The simplest format for a Version Range is just the version number
  itself, e.g. 2.4. This means that *at least* version 2.4 must be
  present. To indicate that *any* version of a prerequisite is okay,
  even if the prerequisite doesn't define a version at all, use the
  version 0.

  Alternatively, a version range *may* use the operators < (less than),
  <= (less than or equal), > (greater than), >= (greater than or
  equal), == (equal), and != (not equal). For example, the
  specification < 2.0 means that any version of the prerequisite less
  than 2.0 is suitable.

  For more complicated situations, version specifications *may* be
  AND-ed together using commas. The specification >= 1.2, != 1.5, <
  2.0 indicates a version that must be *at least* 1.2, *less than* 2.0,
  and *not equal to* 1.5.
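
As a toy illustration of these AND-ed ranges (it compares versions as plain 
floats, which real CPAN clients do not do, and satisfies() is a hypothetical 
helper, not part of any actual tooling):

```python
import operator
import re

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

def satisfies(version, spec):
    """Return True if `version` meets every AND-ed clause in `spec`,
    e.g. ">= 1.2, != 1.5, < 2.0".  A bare number means "at least"."""
    for clause in spec.split(","):
        m = re.match(r"(<=|>=|==|!=|<|>)?\s*([\d.]+)", clause.strip())
        op, number = m.group(1) or ">=", float(m.group(2))
        if not OPS[op](version, number):
            return False
    return True

assert satisfies(1.3, ">= 1.2, != 1.5, < 2.0")
assert not satisfies(1.5, ">= 1.2, != 1.5, < 2.0")
assert not satisfies(2.0, ">= 1.2, != 1.5, < 2.0")
assert satisfies(2.4, "2.4")   # bare version: "at least 2.4"
```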

> > Putting all DVD (is it "TeX Live" DVD by the way?) into single
> > repository would put quite a bit of stress to git; it was created for
> > software development (although admittedly of large project like Linux
> > kernel), not 4GB+ trees.
> 
> I'm impressed by how well git manages it.  It took about 15 minutes to
> build the 4GB tree, and it was disk speed rather than CPU which was
> the bottleneck.

I still think that using a modified contrib/fast-import/import-zips.py
(or import-tars.perl, or import-directories.perl) would be a better
solution here...
 
[...]
> We may be at cross purposes.  My first task is get the DVD tree into
> git, performing necessary transformations such as expanding zip files
> along the way.  Breaking the content into submodules can, I believe,
> be done afterwards.

'reposurgeon' might help there... or might not.  Same with git-subtree
tool.

But now I understand that you are just building tree objects, and
creating references to them (with implicit ordering given by names,
I guess).  This is to be a start of further work, isn't it?

> With DVDs from several years it could take several hours to load
> everything into git.  For myself, I'd like to do that once, more or
> less as a batch process, and then move on to the more interesting
> topics. Getting the DVD contents into git is already a significant
> piece of work.
> 
> Once done, I can them move on to what you're interested in, which is
> organising the material.  And I hope that others in the TeX community
> will get involved with that, because I'm not building this repository
> just for myself.

[...]

> > > In addition, many TeX users have a TeX DVD.  If they import it into a
> > > git repository (using for example my script) then the update from 2011
> > > to 2012 would require much less bandwidth.
> >
> > ???
> 
> A quick way to bring your TeX distribution up to date is to do a delta
> with a later distribution, and download the difference.  That's what
> git does, and it does it well.  So I'm keen to convert a TeX DVD into
> a git repository, and then differences can be downloaded.

Here perhaps you should take a look at git-based 'bup' backup system.

Anyway, I am not sure whether git needs a DAG of commits to generate
deltas well, so that it can tell what you already have and what you
don't.  Trees alone might not be enough here. (!)
 
> > Commit = tree + parent + metadata.
> 
> Actually, any number of parents, including none.  What metadata do I
> have to provide?  At this time nothing, I think, beyond that provided
> by the name of a reference (to the root of a tree).

Metadata = commit message (here you can e.g. put the official name of
DVD), author and committer info (name, email, date and time, 

Re: A Python script to put CTAN into git (from DVDs)

2011-11-07 Thread Jonathan Fine

On 07/11/11 21:49, Jakub Narebski wrote:

[snip]


But now I understand that you are just building tree objects, and
creating references to them (with implicit ordering given by names,
I guess).  This is to be a start of further work, isn't it?


Yes, that's exactly the point, and my apologies if I was not clear enough.

I'll post again when I've finished the script and placed several 
years of DVDs into git.  Then the discussion will be more 
concrete: we have this tree, how do we make it more useful?


Thank you for your contributions, particularly telling me about gitpan.

--
Jonathan
--
http://mail.python.org/mailman/listinfo/python-list


Re: all() is slow?

2011-11-07 Thread Chris Angelico
On Tue, Nov 8, 2011 at 8:46 AM, david vierra  wrote:
> But, you didn't write an all() function.  You wrote a more specialized
> allBoolean() function. I think this comparison is more fair to the
> builtin all():

So really, it's not "all() is slow" but "function calls are slow".
Maybe it'd be worthwhile making an all-factory:

def my_all(code, lst):
    # exec() into an explicit namespace so this works on Python 3 as
    # well as Python 2 (exec can no longer create function-local names).
    ns = {}
    exec("""def tmp_all(x):
    for a in x:
        if not (""" + code + """): return False
    return True
""", ns)
    return ns['tmp_all'](lst)
timeit.timeit('my_all("a in (True, False)",x)','from __main__ import
my_all,x',number=10)

Bad code imho, but it _is_ faster than both the original and the builtin.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


ctypes accessing functions with double pointers

2011-11-07 Thread Eleftherios Garyfallidis
Hello,

Is it possible using ctypes to call C functions from a shared object
containing double pointers e.g. int foo(float **i) and if yes how?

Best wishes,
Eleftherios
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ctypes accessing functions with double pointers

2011-11-07 Thread Chris Rebert
On Mon, Nov 7, 2011 at 2:06 PM, Eleftherios Garyfallidis
 wrote:
> Hello,
>
> Is it possible using ctypes to call C functions from a shared object
> containing double pointers e.g. int foo(float **i) and if yes how?

(Untested conjecture; note that ctypes.byref() objects cannot be nested,
so ctypes.pointer() is needed to build the double pointer:)

import ctypes
# ...create ctypes_wrapped_foo...
the_float = ctypes.c_float(42.1)
float_ptr = ctypes.pointer(the_float)   # float *
i = ctypes.pointer(float_ptr)           # float **
result_integer = ctypes_wrapped_foo(i)
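For what it's worth, the double-pointer plumbing can be exercised without
the shared library; everything below is plain ctypes except the
hypothetical `libfoo.so`/`foo` names in the comments:

```python
import ctypes

# float ** as a ctypes type.
FloatPP = ctypes.POINTER(ctypes.POINTER(ctypes.c_float))

the_float = ctypes.c_float(42.1)
pp = ctypes.pointer(ctypes.pointer(the_float))   # build a float **

# With a real library you would declare the signature so ctypes can
# type-check the call (library and function names here are hypothetical):
# lib = ctypes.CDLL("./libfoo.so")
# lib.foo.argtypes = [FloatPP]
# lib.foo.restype = ctypes.c_int
# result = lib.foo(pp)

# Double dereference recovers the original value.
print(round(pp.contents.contents.value, 1))      # 42.1
```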

Cheers,
Chris
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: all() is slow?

2011-11-07 Thread Joshua Landau
See these all vs myAll tests:

%~> python all_test
0.5427970886230469
1.1579840183258057

3.3052260875701904
3.4992029666900635

3.303942918777466
1.7343430519104004

3.18320894241333
1.6191949844360352

In the first pair and the second pair, the pairs receive the same input.
The builtin all outperforms the user-defined.
In the second pair, the builtin receives a generator expression whereas
the function just loops directly. The generator has to be resumed once
per iteration, and this slows it down significantly.

The main use of "all" is ease, though, as mentioned before.
The second is speed when testing lists/generators that don't need to be
wrapped.
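A self-contained timing sketch of the effect described above: the builtin
all() over a generator expression pays per-item generator resumption costs
that an inlined loop avoids. (This only reconstructs the kind of comparison
in the unquoted all_test script; exact numbers vary per machine.)

```python
import timeit

x = [True, False] * 50000

def all_loop(lst):
    # Specialised check with the test inlined in the loop body.
    for a in lst:
        if a not in (True, False):
            return False
    return True

t_builtin = timeit.timeit(lambda: all(a in (True, False) for a in x), number=20)
t_inline = timeit.timeit(lambda: all_loop(x), number=20)
print(t_builtin > 0 and t_inline > 0)  # True; compare the two on your machine
```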

Note:
%~> pypy all_test
0.0657250881195
0.0579369068146

0.751952171326
0.657609939575

0.748466968536
0.0586581230164

0.801791906357
0.0550608634949

If speed is your concern, there are simple solutions.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting elements over multiple lists?

2011-11-07 Thread Terry Reedy

On 11/7/2011 1:22 PM, John Gordon wrote:

In  
JoeM  writes:


Thanks guys, I was just looking for a one line solution instead of a
for loop if possible. Why do you consider



[x.remove(x[0]) for x in [a,b,c]]



cheating? It seems compact and elegant enough for me.


It looks like incomplete code with 'somelists = ' or other context 
omitted. It saves no keypresses '[',...,SPACE,...,']' versus 
...,':',ENTER,TAB,... . (TAB with a decent Python aware editor.)



I wouldn't call it cheating, but that solution does a fair bit of
unneccessary work (creating a list comprehension that is never used.)


The comprehension (the code) is used, but the result is not. If the 
source iterator has a large number of items rather than 3, the throwaway 
list could become an issue. Example:


fin = open('source.txt')
fout = open('dest.txt', 'w')
for line in fin:
  fout.write(line.strip())
# versus
[fout.write(line.strip()) for line in fin]

If source.txt has 100 millions lines, the 'clever' code looks less 
clever ;=). Comprehensions are intended for creating collections (that 
one actually wants) and for normal Python coding are best used for that.
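A minimal illustration of the throwaway list Terry describes, with the
file I/O replaced by an in-memory list so it runs anywhere:

```python
lines = ["  a\n", "  b\n", "  c\n"]
out = []

# Loop form: no intermediate list is built.
for line in lines:
    out.append(line.strip())

# Comprehension-for-side-effects form: also appends, but additionally
# builds (and immediately discards) a list of None return values.
junk = [out.append(line.strip()) for line in lines]

print(junk)  # [None, None, None]
print(out)   # ['a', 'b', 'c', 'a', 'b', 'c']
```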


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: memory management

2011-11-07 Thread Terry Reedy

On 11/7/2011 3:47 PM, Stefan Krah wrote:

Juan Declet-Barreto  wrote:

Well, I am using Python 2.5 (and the IDLE shell) in Windows XP, which
ships with ESRI's ArcGIS. In addition, I am using some functions in the
arcgisscripting Python geoprocessing module for geographic information
systems (GIS) applications, which can complicate things. I am currently
isolating standard library Python code (e.g., os.walk()) from the
arcgisscripting module to evaluate in which module the environment
crash is occurring.


What *exactly* do you mean by "crash"?


It might be a good idea to check if the problem also occurs with Python 2.7
since Python 2.5 is no longer maintained.


And 2.7 has hundreds of more recent bug fixes. Just one could make a 
difference.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: Python lesson please

2011-11-07 Thread Terry Reedy

On 11/7/2011 11:30 AM, gene heskett wrote:


Perhaps winderz does not have 'pipe' files so the authors never got caught
out on this?


Last I know, Windows not only had no pipe files but also no real 
in-memory pipes. Maybe one or both of those has changed.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: Python lesson please

2011-11-07 Thread gene heskett
On Monday, November 07, 2011 07:34:05 PM Terry Reedy did opine:

> On 11/7/2011 11:30 AM, gene heskett wrote:
> > Perhaps winderz does not have 'pipe' files so the authors never got
> > caught out on this?
> 
> Last I know, Windows not only had no pipe files but also no real
> in-memory pipes. Maybe one or both of those has changed.

Sheesh..  How the heck do you get anything done on winderz then. :(

Answer not needed as they regularly reinvent the wheel because everything 
has to be self contained, poorly IMO.  Most of their wheels ride a bit 
rough. ;)

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: 
I believe that professional wrestling is clean and everything else in
the world is fixed.
-- Frank Deford, sports writer
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python ORMs Supporting POPOs and Substituting Layers in Django

2011-11-07 Thread Travis Parks
On Nov 7, 12:44 pm, John Gordon  wrote:
> In  John Gordon  writes:
>
> > In <415d875d-bc6d-4e69-bcf8-39754b450...@n18g2000vbv.googlegroups.com> 
> > Travis Parks  writes:
> > > Which web frameworks have people here used and which have they found
> > > to be: scalable, RAD compatible, performant, stable and/or providing
> > > good community support? I am really trying to get as much feedback as
> > I've used Django and it seems to be a very nice framework.  However I've
> > only done one project so I haven't delved too deeply.
>
> You are probably looking for more detail than "It's a nice framework" :-)
>
> The database model in Django is powerful; it allows you to do queries in
> native Python code without delving into backend SQL stuff.
>
> I don't know how scalable/performant the database model is, as the one
> project I worked on didn't deal with a ton of data.  (But I'd be surprised
> if it had poor performance.)
>
> The URL dispatcher provides a very nice and logical way to associate a
> given URL with a given method call.
>
> Community support is excellent.
>
> --
> John Gordon                   A is for Amy, who fell down the stairs
> gor...@panix.com              B is for Basil, assaulted by bears
>                                 -- Edward Gorey, "The Gashlycrumb Tinies"

I started the battle today. The "new guy" was trying to sell me on
CodeIgniter. I haven't looked at it, but it is PHP, so I really want
to avoid it. The good thing is that all of his "friends" have been
telling him to get into Python. I have been trying to convince him
that PHP isn't cut out for background services and is mostly a front-
end language. Python is much more geared towards hardcore data
processing. Why write the system in two languages?

I have been spending a lot of time looking at the Pyramid project: the
next generation of the Pylons project. It looks powerful, but it seems
to be a lot more complex than Django.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to mix-in __getattr__ after the fact?

2011-11-07 Thread Lie Ryan

On 10/31/2011 11:01 PM, dhyams wrote:


Thanks for all of the responses; everyone was exactly correct, and
obeying the binding rules for special methods did work in the example
above.  Unfortunately, I only have read-only access to the class
itself (it was a VTK class wrapped with SWIG), so I had to find
another way to accomplish what I was after.



As a big huge hack, you can always write a wrapper class:

class Wrapper(object):
def __init__(self, *args, **kwargs):
self.__object = MySWIGClass(*args, **kwargs)
def __getattr__(self, attr):
try:
return getattr(self.__object, attr)
except AttributeError:
...

--
http://mail.python.org/mailman/listinfo/python-list


easy_install doesn't install non-package *.py file

2011-11-07 Thread Makoto Kuwata
I've run into trouble with the easy_install command.

My package:

  README.rst
  setup.py
  foobar/
  foobar/__init__.py
  foobar/data/
  foobar/data/template.py

In the above example, 'foobar/data/template.py' is just a
template data file (= not a python module file).
(notice that 'foobar/data/__init__.py' doesn't exist.)

In this case, 'foobar/data/template.py' is NOT installed
when trying 'easy_install foobar'.
This is the trouble I ran into.

I found that:

* foobar.tar.gz created by 'python setup.py sdist' contains
  'foobar/data/template.py' correctly.
* foobar.egg created by 'python setup.py bdist_egg' doesn't contain
  the 'foobar/data/template.py' file.

Question: how can I force the easy_install command to
install 'foobar/data/template.py' (or other non-package *.py files)?
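One common answer (a sketch, assuming setuptools; untested against this
exact layout) is to declare the file as package data of the 'foobar'
package, since 'foobar/data' is not itself a package:

```python
# setup.py -- sketch; name/version are placeholders.
from setuptools import setup

setup(
    name="foobar",
    version="0.1.0",
    packages=["foobar"],
    # Without this, bdist_egg skips foobar/data/template.py because
    # foobar/data has no __init__.py and so is not scanned as a package.
    package_data={"foobar": ["data/*.py"]},
)
```

The sdist may already pick the file up through MANIFEST.in rules, which
would explain why the tarball contained it while the egg did not.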

--
regards,
makoto kuwata
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python lesson please

2011-11-07 Thread Cameron Simpson
On 07Nov2011 15:00, gene heskett  wrote:
| On Monday, November 07, 2011 02:43:11 PM Dave Angel did opine:
| > On 11/07/2011 11:40 AM, gene heskett wrote:
| > > Down toward the bottom of the file, the tab indentations were as high
| > > as 33 leading tabs per line.  Each stanza of the data was tab
| > > indented 2 additional tabs from the one above it in the original
| > > file.  30k was perhaps a poor SWAG, but 10 to 15k seems an entirely
| > > reasonable guess.
| > 
| > What program are you using to read the file and support that claim?
| 
| vim.  But remember, this first one started out as a copy/paste from the 
| firefox-7.0.1 screen.

I don't suppose you had autoident turned on?

I hate using cut/paste to fetch data; _always_ use a "download" link, or
use the browser's "save page as" facility.

But still, if your MD5 checksums now match...

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

Footnotes that extend to a second page are an abject failure of design.
- Bringhurst, _The Elements of Typographic Style_
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python ORMs Supporting POPOs and Substituting Layers in Django

2011-11-07 Thread Lie Ryan

On 11/08/2011 01:21 PM, Travis Parks wrote:

On Nov 7, 12:44 pm, John Gordon  wrote:

In  John Gordon  writes:


In<415d875d-bc6d-4e69-bcf8-39754b450...@n18g2000vbv.googlegroups.com>  Travis 
Parks  writes:

Which web frameworks have people here used and which have they found
to be: scalable, RAD compatible, performant, stable and/or providing
good community support? I am really trying to get as much feedback as

I've used Django and it seems to be a very nice framework.  However I've
only done one project so I haven't delved too deeply.


You are probably looking for more detail than "It's a nice framework" :-)

The database model in Django is powerful; it allows you to do queries in
native Python code without delving into backend SQL stuff.

I don't know how scalable/performant the database model is, as the one
project I worked on didn't deal with a ton of data.  (But I'd be surprised
if it had poor performance.)

The URL dispatcher provides a very nice and logical way to associate a
given URL with a given method call.

Community support is excellent.

--
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
 -- Edward Gorey, "The Gashlycrumb Tinies"


I started the battle today. The "new guy" was trying to sell me on
CodeIgniter. I haven't looked at it, but it is PHP, so I really want
to avoid it. The good thing is that all of his "friends" have been
telling him to get into Python. I have been trying to convince him
that PHP isn't cut out for background services and is mostly a front-
end language. Python is much more geared towards hardcore data
processing. Why write the system in two languages?

I have been spending a lot of time looking at the Pyramid project: the
next generation of the Pylons project. It looks powerful, but it seems
to be a lot more complex than Django.


CodeIgniter is a very fine framework, however it builds on top of a 
shitty excuse of a language called PHP.


I've found that Django has much better debugging tools; when a Django 
page produces an exception, it always produces a useful error page. 
I haven't been able to do the same in CodeIgniter (nor in any PHP 
framework I've used, I'm starting to think it's a language limitation); 
often when you have errors, PHP would just silently return empty or 
partial pages even with all the debugging flags on.


IMO, Python has a much nicer choice of built-in data structure for data 
processing. Python has a much more mature object-orientation, e.g. I 
prefer writing l.append(x) rather than array_push(l, x). I think these 
qualities are what make you think Python is much, much more suitable 
for data processing than PHP; and I wholeheartedly agree.


Database abstraction-wise, Django's ORM wins hands down against 
CodeIgniter's ActiveRecord. CodeIgniter's ActiveRecord is basically just 
a thin wrapper that papers over the quirks of various database engines. 
Django's ORM is a full blown ORM, it handles foreign key relationships 
in OO way. The only disadvantage of Django's ORM is that since it's 
written in Python, if you need to write a program working on the same 
database that doesn't use Django nor Python, then you'll have a problem 
since you'll have to duplicate the foreign key relationships.


With all the bashing of PHP, PHP do have a few advantages. PHP and 
CodeIgniter is much easier to set up and running than Django; and the 
ability to create a .php file and have it running without having to 
write the routing file is sometimes a bliss. And PHP are often used as 
their own templating language; in contrast with Django which uses a 
separate templating language. Having a full blown language as your 
templating language can be a double-edged sword, but it is useful 
nevertheless for experimental work.


IMO, while it is easier to get up and running in PHP, in the long run 
Python is much better in almost any other aspects.


--
http://mail.python.org/mailman/listinfo/python-list


overview on dao

2011-11-07 Thread Simeon Chaos
Dao is a functional logic solver (similar to lambdaProlog and Curry)
written in Python. The links related to dao are here:

pypi distribution and document: http://pypi.python.org/pypi/daot
code repository: https://github.com/chaosim/dao
dao groups on google: Group name: daot, Group home page:
http://groups.google.com/group/daot, Group email address:
d...@googlegroups.com
old stuff: http://code.google.com/p/daot (old, deprecated)
google+ pages: https://plus.google.com/112050694070234685790

Dao has the features such as

*   lisp style programming:

*  call/cc, block/return-from, unwind-protect, catch/throw;

Dao is implemented in continuation-passing style, so it is
natural to implement such features. I have also made a small improvement
to the trampoline technique, because it needs to coexist with the use of
the yield statement to implement unification and backtracking. The code
for evaluating lisp-style expressions is borrowed from the book "Lisp in
Small Pieces" by Christian Queinnec (École Polytechnique).

*  function and macro;

   The concept of function and macro in dao is more general
than in lisp, because we can define them with multiple rules and take
advantage of unification and backtracking.

*  eval, so called meta circular evaluation.

* prolog style programming:

*  logic variable and unify;

*   backtracking;

*   findall, repeat/fail, call, once, and so on;

*   cut.

At first, unification was implemented using exceptions, and
backtracking using two continuations (a success continuation and a
failure continuation), with code borrowed from PyPy's Prolog. Now the
failure continuation has been removed, and both unification and
backtracking are implemented with the 'yield' statement, which I
learned from Yield Prolog (http://yieldprolog.sourceforge.net). Dao
runs faster than before thanks to using yield, removing the class
definition of continuations, and not boxing atomic and list values
(compared to PyPy Prolog without translation or JIT). So far I do not
use PyPy's translation or JIT features for speedup, and all of the
tests in dao 0.7.3 run in about two minutes.

* many other useful builtins that simulate lisp and prolog
primitives.

* some builtins that cooperate with python.

* builtin parser, which is the most powerful parser I have seen.

  The parser in dao is basically a recursive descent parser with
backtracking, but it also supports direct or indirect left-recursive
rules by using memoization when needed. The programmer can toggle
memoization of any command that is not left recursive. The grammar in
dao is somewhat similar to DCG (definite clause grammar), but more
flexible. It has expressive power beyond context-free and
context-sensitive grammars and parsing expression grammars. Dao can
parse any object, not just text. Many builtin terminal and combinative
parsing primitives are provided.

  In dao, I have found and implemented an unrivalled technique for
uniting the parser and the evaluator through meta-circular evaluation.
So dao can be used to implement a programming language whose syntax is
dynamic, that is to say, the syntax can be defined on the fly by the
programmer. A little sample demonstrating this technique is given in
the files dao/samples/sexpression.py and dao/dao/tests/testsexpresson.py.

-

Dinpy: a child language born and live in python.

  Dinpy can be seen as syntax sugar for dao in python. It arose
accidentally while I was writing tests for dao. A detailed introduction
follows: I hated the many parentheses when writing tests for the 'let'
statement of lisp, so I replaced embedded tuples with a dict for the
bindings, and after a spark of inspiration the door to dinpy was open.
From this I learned a new method for inventing a language: build the
child language's syntax on the operators of the mother language.

--

I have written some Chinese documentation for dao, but little in
English, and it is not complete yet. With functional logic programming
and dynamic grammar on the shoulders of the great python, many
possibilities arise with dao, such as parsing, inventing embedded DSLs
with operator syntax, independent domain-specific or general languages,
text processing, natural language processing, expert systems,
artificial intelligence, web applications, and so on.

Now:

* I hope more people get to know and use dao. Or maybe something wrong
in dao prevents it from being used in real applications, and I hope to
find out what it is.

* Maybe someone has the interest and time to join in developing dao or
writing some documents or articles?

* More tests are always needed, and I hope to get bug reports from
other people.

* Benchmarks of dao, comparisons with similar packages, and so
on.

* I have a long todo list, I hope someone 

Re: Question about 'iterable cursors'

2011-11-07 Thread Lie Ryan

On 11/07/2011 05:04 PM, John Nagle wrote:

Realize that SQLite is not a high-performance multi-user database.
You use SQLite to store your browser preferences, not your customer
database.


I agree that SQLite is not multi-user; I disagree that SQLite is not a 
high-performance database. In single-user cases, SQLite should far 
outperform a client-server database engine, since it doesn't have 
the client-server overhead.
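A minimal sqlite3 sketch of that single-user, in-process usage: no server
round-trips, the database lives inside the application process (here an
in-memory database, so it runs anywhere):

```python
import sqlite3

conn = sqlite3.connect(":memory:")       # in-memory, single-process DB
conn.execute("CREATE TABLE prefs (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO prefs VALUES (?, ?)", ("theme", "dark"))
row = conn.execute("SELECT value FROM prefs WHERE key = ?", ("theme",)).fetchone()
print(row[0])  # dark
conn.close()
```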


--
http://mail.python.org/mailman/listinfo/python-list


simpler overview on dao: a functional logic solver with builtin parsing power, and dinpy, the sugar syntax for dao in python

2011-11-07 Thread Simeon Chaos
Dao is a functional logic solver (similar to lambdaProlog and Curry)
written in Python. The links related to dao are here:

pypi distribution and document: http://pypi.python.org/pypi/daot

code repository: https://github.com/chaosim/dao

dao groups on google: http://groups.google.com/group/daot,
d...@googlegroups.com

old stuff: http://code.google.com/p/daot (old, deprecated)

google+ pages: https://plus.google.com/112050694070234685790

Dao has the features such as

*   lisp style programming:

*  call/cc, block/return-from, unwind-protect, catch/throw;

*  function and macro;

*  eval, so called meta circular evaluation.

* prolog style programming:

*  logic variable and unify;

*   backtracking;

*   findall, repeat/fail, call, once, and so on;

*   cut.

* many other useful builtins that simulate lisp and prolog
primitives.

* some builtins that cooperate with python.

* builtin parser, which is the most powerful parser I have seen; it
supports the features below:

*   parameter grammar, similar to DCG (definite clause grammar),
but more flexible
*   dynamic grammar
*   left recursion
*   memoization of parsing results
*   many builtins, including terminal and combinative matchers.

-

Dinpy: a child language born and live in python.

  Dinpy can be seen as syntax sugar for dao in python. A piece
of code in dinpy is listed below:

parse_text(char(x1)+any(~char('b')+some(char(x1)))+eoi,
'ab'),

let( i<<0 ). do[ repeat, prin(i), ++i, iff(i<3).do[fail] ],

letr( a << fun(x) [ and_p(b(x),c(x)) ]
  [ d(x) ],
  b << fun(1) ['b1']
  (4) ['b4'],
  c << fun(4) ['c4'],
  d << fun(3) ['d3'],
 ).do[
 a(x), prin(x) ],

each(i)[1:3].
  loop[prin(i)],

i << 0,
loop[ i << i+1, prin(i)].when(i==3),

case(1).
  of(1)[prin(1)].
  of(2)[prin(2)]

--

Some Chinese documentation for dao has been written, but little in
English, and it is not complete yet.

With functional logic programming and dynamic grammar on the
shoulders of the great python, many possibilities arise with dao,
such as parsing, inventing embedded DSLs with operator syntax,
independent domain-specific or general languages, text processing,
natural language processing, expert systems, artificial intelligence,
web applications, and so on.

Now:

* I hope more people get to know and use dao. Or maybe something wrong
in dao prevents it from being used in real applications, and I hope to
find out what it is.

* Maybe someone has the interest and time to join in developing dao or
writing some documents or articles?

* More tests are always needed, and I hope to get bug reports from
other people.

* Benchmarks of dao, comparisons with similar packages, and so on.

* I have a long todo list; I hope someone else can join the dao project.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to mix-in __getattr__ after the fact?

2011-11-07 Thread Steven D'Aprano
On Tue, 08 Nov 2011 15:17:14 +1100, Lie Ryan wrote:

> On 10/31/2011 11:01 PM, dhyams wrote:
>>
>> Thanks for all of the responses; everyone was exactly correct, and
>> obeying the binding rules for special methods did work in the example
>> above.  Unfortunately, I only have read-only access to the class itself
>> (it was a VTK class wrapped with SWIG), so I had to find another way to
>> accomplish what I was after.
>>
>>
> As a big huge hack, you can always write a wrapper class:
> 
> class Wrapper(object):
>  def __init__(self, *args, **kwargs):
>  self.__object = MySWIGClass(*args, **kwargs)
>  def __getattr__(self, attr):
>  try:
>  return getattr(self.__object, attr)
>  except AttributeError:
>  ...


That's not a hack, that's a well-respected design pattern called 
Delegation.

http://rosettacode.org/wiki/Delegate
http://en.wikipedia.org/wiki/Delegation_pattern


In this case, you've implemented about half of automatic delegation:

http://code.activestate.com/recipes/52295

which used to be much more important in Python prior to the type/class 
unification in version 2.2.


To also delegate special dunder methods using new-style classes, see this:

http://code.activestate.com/recipes/252151
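A tiny runnable sketch of the delegation being discussed ('Wrapped'
stands in for the read-only SWIG class; all names here are hypothetical):

```python
class Wrapped:
    # Plays the role of the class we cannot modify.
    def greet(self):
        return "hello"

class Wrapper:
    def __init__(self, obj):
        self._obj = obj
    def __getattr__(self, attr):
        # Called only when normal lookup fails; delegate to the wrapped object.
        return getattr(self._obj, attr)
    def shout(self):
        # Behaviour added by the wrapper itself.
        return self._obj.greet().upper()

w = Wrapper(Wrapped())
print(w.greet())   # hello  (delegated)
print(w.shout())   # HELLO  (added by the wrapper)
```

Note that, as the recipe above explains, `__getattr__` on a new-style
class will not intercept dunder methods; those need explicit delegation.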



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python lesson please

2011-11-07 Thread gene heskett
On Tuesday, November 08, 2011 12:53:20 AM Cameron Simpson did opine:

> On 07Nov2011 15:00, gene heskett  wrote:
> | On Monday, November 07, 2011 02:43:11 PM Dave Angel did opine:
> | > On 11/07/2011 11:40 AM, gene heskett wrote:
> | > > Down toward the bottom of the file, the tab indentations were as
> | > > high as 33 leading tabs per line.  Each stanza of the data was
> | > > tab indented 2 additional tabs from the one above it in the
> | > > original file.  30k was perhaps a poor SWAG, but 10 to 15k seems
> | > > an entirely reasonable guess.
> | > 
> | > What program are you using to read the file and support that claim?
> | 
> | vim.  But remember, this first one started out as a copy/paste from
> | the firefox-7.0.1 screen.
> 
> I don't suppose you had autoident turned on?
> 
I think it is.  I gave up turning it off long ago because it was always on 
on the next launch.  Today I've forgotten how to turn it off.  Like hitting 
oneself in the head with a hammer, it feels so good when you stop.  :)

> I hate using cu/paste to fetch data; _always_ use a "download" link, or
> use the browser's "save page as" facility.

Which would have saved all the html codes too, this code was being 
displayed in a window of the main window.
 
> But still, if your MD5 checksums now match...

Not on that file, but on the next pull it was, and works now.  And on the 
first file, the blink compare disclosed I had some indentation wrong, and 
that there was a lowercase b in front of all the opening double quotes used 
that I didn't get originally.  I have no clue what this:
b"hex data" means to python.

> Cheers,


Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: 
To love is good, love being difficult.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python lesson please

2011-11-07 Thread Chris Angelico
On Tue, Nov 8, 2011 at 5:29 PM, gene heskett  wrote:
> Not on that file, but on the next pull it was, and works now.  And on the
> first file, the blink compare disclosed I had some indentation wrong, and
> that there was a lowercase b in front of all the opening double quotes used
> that I didn't get originally.  I have no clue what this:
>        b"hex data" means to python.

That's the repr() of a Bytes string (as opposed to a Unicode string)
in Python 3. If that's your only issue, I'd say you have it working
fine under Python 3; if there are other problems, try running it under
Python 2.7.
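A quick illustration of what that b"..." prefix means in Python 3:

```python
data = b"\x48\x65\x78"      # bytes: raw 8-bit data -- here the bytes of "Hex"
text = "Hex"                # str: Unicode text

print(type(data).__name__)  # bytes
print(data.decode("ascii")) # Hex   -- decoding turns bytes into text
print(text.encode("ascii")) # b'Hex' -- encoding goes the other way
```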

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting elements over multiple lists?

2011-11-07 Thread Laurent Claessens

Le 07/11/2011 19:01, JoeM a écrit :

Thanks guys, I was just looking for a one line solution instead of a
for loop if possible. Why do you consider

[x.remove(x[0]) for x in [a,b,c]]

cheating? It seems compact and elegant enough for me.


I have the feeling that it does not do what I'd expect just from 
seeing the line. It is a list comprehension, but the point is 
absolutely not to create a list.


I'd say it breaks the rule «Explicit is better than implicit.» and 
ignores «Special cases aren't special enough to break the rules.»


But well... could be a matter of taste; I prefer the loop.

Laurent
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python ORMs Supporting POPOs and Substituting Layers in Django

2011-11-07 Thread Chris Angelico
On Tue, Nov 8, 2011 at 4:09 PM, Lie Ryan  wrote:
> IMO, Python has a much nicer choice of built-in data structure for data
> processing. Python has a much more mature object-orientation, e.g. I prefer
> writing l.append(x) rather than array_push(l, x). I think these qualities
> are what makes you think Python is much, much more suitable for data
> processing than PHP; and I wholesomely agree.
>

Two more examples where Python's lists are superior to PHP's arrays.
Array literal syntax feels like a function call, but list literals are
slim and easy to use inside expressions (try creating a nested array
as a function argument - you'll get a forest of parens). Also,
dereferencing an array only works on an array variable - if you have a
function that returns an array, you can't dereference it directly:

$foo = func()[1];   # doesn't work
$foo = func(); $foo=$foo[1];  # works

I much prefer the "everything's an object" notion. C's array literals
are just as weird (although in C, you can directly dereference a
literal character array - "ABCDEFG"[note_idx] will give you a note
name as a char)... much easier when a variable name is just an
expression, a function call is an expression, a literal is an
expression, and you can work with them all the same way.
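For contrast, the Python side of this point: a call is just an expression,
so its result can be subscripted directly, with no temporary variable
needed (the names below are illustrative):

```python
def func():
    # Stand-in for any function returning a sequence.
    return ["a", "b", "c"]

foo = func()[1]   # direct subscript of the return value
print(foo)        # b
```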

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list