Re: Very strange issues with collections.Mapping

2018-01-19 Thread John Krukoff via Python-list
Have you ruled out the possibility that collections.Mapping has been
(perhaps temporarily) assigned to something else?
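
A rough sketch of how both this suggestion and the register-tracing idea
from the quoted message could be checked. It assumes Python 3, where the
ABC lives at collections.abc.Mapping (the collections.Mapping alias used in
this thread was removed in Python 3.10). The names MAPPING_AT_STARTUP,
register_calls, traced_register and Demo are made up for illustration.

```python
import abc
import collections.abc
import traceback

# Keep a reference taken at startup so a later reassignment can be detected.
MAPPING_AT_STARTUP = collections.abc.Mapping

# Hypothetical log of (abc_class, registered_class, stack_text) tuples.
register_calls = []
_original_register = abc.ABCMeta.register


def traced_register(cls, subclass):
    """Record who calls SomeABC.register(...), then do the real registration."""
    register_calls.append((cls, subclass, "".join(traceback.format_stack())))
    return _original_register(cls, subclass)


# Patch the metaclass so every ABC's register() goes through the tracer.
abc.ABCMeta.register = traced_register


class Demo(abc.ABC):
    """Throwaway ABC used only to show the tracer firing."""


Demo.register(list)

for abc_class, registered, stack in register_calls:
    print("%s.register(%s) called from:" % (abc_class.__name__, registered.__name__))
    print(stack)

# Checks to run at the point where the isinstance() test misbehaves:
print(collections.abc.Mapping is MAPPING_AT_STARTUP)
print(isinstance([], collections.abc.Mapping))
```

The patch has to be installed before the suspect libraries are imported,
otherwise an early register() call would go unrecorded.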

On Thu, Jan 18, 2018 at 2:37 PM, Jason Swails wrote:

> Hello!
>
> I am running into a very perplexing issue that is very rare, but creeps up
> and is crashing my app.
>
> The root cause of the issue comes down to the following check returning
> true:
>
> isinstance([], collections.Mapping)
>
> Obviously you can get this behavior if you register `list` as a subclass of
> the Mapping ABC, but I'm not doing that.  Because the issue is so rare (but
> still common enough that I need to address it), it's hard to reproduce in a
> bench test.
>
> What I am going to try is to essentially monkey-patch
> collections.Mapping.register with a method that dumps a stack trace
> whenever it's called at the time of initial import so I can get an idea of
> where this method could *possibly* be getting called with a list as its
> argument.
>
> The annoying thing here is that wherever the bug is happening, the crash
> happens *way* far away (in a third-party library).  I've also determined it
> as the root cause of two crashes that seem completely unrelated (except
> that the crash is caused by a list not behaving like a dict shortly after
> isinstance(obj, collections.Mapping) returns True).  These are the
> libraries I'm using:
>
> amqp
> billiard
> celery
> dj-database-url
> Django
> django-redis-cache
> enum34
> gunicorn
> kombu
> newrelic
> psycopg2
> pyasn1
> pytz
> redis
> requests
> rsa
> six
> vine
> voluptuous
>
> It's a web application, as you can probably tell.  The main reason I ask
> here is because I'm wondering if anybody has encountered this before and
> managed to hunt down which of these libraries is doing something naughty?
>
> Thanks!
> Jason
>
> --
> Jason M. Swails
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [ANN]VTD-XML 2.9

2010-08-20 Thread John Krukoff
On Thu, 2010-08-19 at 17:40 -0700, dontcare wrote:
 VTD-XML 2.9, the next generation XML Processing API for SOA and Cloud
 computing, has been released. Please visit  
 https://sourceforge.net/projects/vtd-xml/files/
 to download the latest version.
 
 * Strict Conformance
   # VTD-XML now fully conforms to XML namespace 1.0 spec
 * Performance Improvement
   # Significantly improved parsing performance for small XML files
 * Expand Core VTD-XML API
   # Adds getPrefixString(), and toNormalizedString2()
 * Cutting/Splitting
   # Adds getSiblingElementFragment()
 * A number of bug fixes and code enhancement including:
   # Fixes a bug for reading very large XML documents on some
 platforms
   # Fixes a bug in parsing processing instruction
   # Fixes a bug in outputAndReparse()

So, correct me if I'm wrong, but it doesn't look like this project even
has a python version. So, why is it on the python-announce list?

-- 
John Krukoff
Land Title Guarantee Company
jkruk...@ltgc.com

-- 
http://mail.python.org/mailman/listinfo/python-announce-list

Support the Python Software Foundation:
http://www.python.org/psf/donations/


Re: Best Pythonic Approach to Annotation/Metadata?

2010-07-15 Thread John Krukoff
On Thu, 2010-07-15 at 12:37 -0700, Sparky wrote:
snip
 the above is a good pythonic way to solve this problem? I am using
 2.6.

Hopefully a helpful correction, but if you're running on google app
engine, you're using python 2.5 on the google side irrespective of what
you're running for development.

-- 
John Krukoff jkruk...@ltgc.com
Land Title Guarantee Company



Re: Pretty printing with ElementTree

2010-07-09 Thread John Krukoff
On Fri, 2010-07-09 at 15:46 -0700, abhijeet thatte wrote:
 Hi,
 
 
 Does anyone know how to use pretty printing with ElementTree while
 generating xml files?
 We can use that with lxml. But I want to stick with ElementTree.
 
 
 Thanks,
 Abhijeet


It's pretty simple minded, but this recipe from the element tree
documentation may do what you want:
http://effbot.org/zone/element-lib.htm#prettyprint

-- 
John Krukoff jkruk...@ltgc.com
Land Title Guarantee Company



Re: is not operator?

2010-07-08 Thread John Krukoff
On Thu, 2010-07-08 at 13:10 -0700, sturlamolden wrote:
 What happens here? Does Python (2.6.5) have an "is not" operator?
 
 >>> a = 5
 >>> print (a is not False)
 True
 >>> print (a is (not False))
 False
 >>> print (not (a is False))
 True
 
 It seems "y is not x" fits well with spoken English, but it is also a
 bit surprising that "y is not x" does not mean "y is (not x)" but "not
 (y is x)". Why does Python reorder "is" and "not" operators, and what are
 the formal rules for this behavior?

Don't forget about the similar "not in", as in:

>>> 'a' not in 'abc'
False

This is probably the section of documentation you want:
http://docs.python.org/reference/expressions.html#notin
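
For anyone who wants to see this from the interpreter rather than the
reference manual, here is a small sketch using the standard ast module. It
shows that "is not" and "not in" each parse as a single comparison
operator, and that adding parentheses turns "not" into a separate unary
operator. The helper name comparison_operator is made up for this example.

```python
import ast


def comparison_operator(source):
    """Return the name of the first comparison operator node in an expression."""
    node = ast.parse(source, mode="eval").body
    return type(node.ops[0]).__name__


print(comparison_operator("a is not b"))   # IsNot
print(comparison_operator("a not in b"))   # NotIn

# With explicit parentheses, "not" is parsed as its own unary operation
# and the comparison itself is a plain "is".
inner = ast.parse("a is (not b)", mode="eval").body
print(type(inner.ops[0]).__name__, type(inner.comparators[0]).__name__)

a = 5
print(a is not False, a is (not False), not (a is False))
```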

-- 
John Krukoff jkruk...@ltgc.com
Land Title Guarantee Company



Re: Queue peek?

2010-03-02 Thread John Krukoff
On Tue, 2010-03-02 at 22:54 +0100, mk wrote:
snip
 No need to use synchro primitives like locks?
 
 I know that it may work, but that strikes me as somehow wrong... I'm 
 used to using things like Lock().acquire() and Lock().release() when 
 accessing shared data structures, whatever they are.
snip

This is one of those places where the GIL is a good thing, and makes
your life simpler. You could consider it that the interpreter does the
locking for you for such primitive operations, if you like.
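
As a rough illustration of the point above (CPython only, and only for
single operations such as list.append), several threads can append to a
shared list without an explicit lock and nothing is lost. This is a sketch,
not a general rule: compound updates such as "counter += 1" on shared state
still need a lock. The names shared and worker are made up for the example.

```python
import threading

shared = []  # plain list, deliberately not protected by a Lock


def worker(tag, count):
    for i in range(count):
        # A single append is one indivisible step as far as other
        # Python threads are concerned.
        shared.append((tag, i))


threads = [threading.Thread(target=worker, args=(n, 10000)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))  # 40000
```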
-- 
John Krukoff jkruk...@ltgc.com
Land Title Guarantee Company



Re: Overcoming python performance penalty for multicore CPU

2010-02-08 Thread John Krukoff
On Mon, 2010-02-08 at 01:10 -0800, Paul Rubin wrote:
 Stefan Behnel stefan...@behnel.de writes:
  Well, if multi-core performance is so important here, then there's a pretty
  simple thing the OP can do: switch to lxml.
 
  http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/
 
 Well, lxml uses libxml2, a fast XML parser written in C, but AFAIK it
 only works on well-formed XML.  The point of Beautiful Soup is that it
 works on all kinds of garbage hand-written legacy HTML with mismatched
 tags and other sorts of errors.  Beautiful Soup is slower because it's
 full of special cases and hacks for that reason, and it is written in
 Python.  Writing something that complex in C to handle so much
 potentially malicious input would be quite a lot of work to write at
 all, and very difficult to ensure was really safe.  Look at the many
 browser vulnerabilities we've seen over the years due to that sort of
 problem, for example.  But, for web crawling, you really do need to
 handle the messy and wrong HTML properly.
 

Actually, lxml has an HTML parser which does pretty well with the
standard level of broken one finds most often on the web. And, when it
falls down, it's easy to integrate BeautifulSoup as a slow backup for
when things go really wrong (as J Kenneth King mentioned earlier):

http://codespeak.net/lxml/lxmlhtml.html#parsing-html

At least in my experience, I haven't actually had to parse anything that
lxml couldn't handle yet, however.
-- 
John Krukoff jkruk...@ltgc.com
Land Title Guarantee Company



Re: Best library to make XSLT 2.0 transformation

2009-05-19 Thread John Krukoff
On Tue, 2009-05-19 at 13:42 +0200, Diez B. Roggisch wrote:
 wdveloper wrote:
 
  Hi there,
  
  I need to make xml transformation using XSLT 2.0 (since i want to use
  the powerful tag xsl:result-document to produce multiple files).
  In your experience, which kind of library out there is better?
 
 XSLT is a standard, so if you find a library that implements it, there
 shouldn't be much differences.
 
 Having said that, these days lxml2 is the craze, and it contains
 xslt-processing. It would be my first (and hopefully last) stop.
 
 Diez

While I firmly believe lxml is the best python XML library available,
being built on libxml2 means that it only supports XSLT 1.0. As far as I
know, if you want 2.0 support, you still need to be using one of the
Java XSLT processors.
-- 
John Krukoff jkruk...@ltgc.com
Land Title Guarantee Company



Re: Why can function definitions only use identifiers, and not attribute references or any other primaries?

2009-04-23 Thread John Krukoff
On Thu, 2009-04-23 at 12:26 -0300, Jeremy Banks wrote:
  Things like your suggestion are called syntactic-sugar  -- syntax that
  adds a convenience, but *no* new functionality.  Python has plenty of
  syntactic-sugars, and more will be added in the future.  To make an
  argument for such an addition, one would have to describe some compelling
  (and general) use cases in a well-argued PEP.  You're welcome to try, but be
  forewarned, most PEP's (especially syntax changing PEPs) don't fly far.
 
 Thank you very much for the feedback. I might throw something at
 Python-Ideas if I think I can come up with an adequate justification
 and don't come across a previous similar proposition (though if I do
 miss it I'm sure it will be pointed out to me fairly quickly). I fully
 appreciate the small chance of success, but it certainly couldn't hurt
 to give it a try.
 --
 http://mail.python.org/mailman/listinfo/python-list

You probably want to be searching for multi-line lambda to find the past
decade or so of this argument, as that's where most people who argued
for this came from. But, if you'd just like a bit of discouragement,
here's GvR arguing that there's just no good way to mix statements and
expressions in python:
http://www.artima.com/weblogs/viewpost.jsp?thread=147358

-- 
John Krukoff jkruk...@ltgc.com
Land Title Guarantee Company



Re: [ANN]: circuits-1.0b1 released!

2008-12-30 Thread John Krukoff
On Wed, 2008-12-31 at 09:44 +1000, James Mills wrote:
 Hi all,
 
 I'm pleased to announce the release of circuits-1.0b1

I'm curious, you've a number of comparisons to Twisted on your site FAQ
section, but this sounds like a much closer project to Kamaelia
(http://www.kamaelia.org/Home). Are these actually similar or am I
missing something important that differentiates circuits?

-- 
John Krukoff jkruk...@ltgc.com
Land Title Guarantee Company



Re: Class instantiation fails when passed in a file but work via line by line interpreter

2008-11-18 Thread John Krukoff
On Tue, 2008-11-18 at 10:45 -0800, Jeff Tchang wrote:
 Odd issue I am having with class instantiation on Python 2.5.2 (Windows).
 
 I have a custom module with a few classes in it. The module is named SAML.py.
 There is a copy of it in C:\Python25\Lib\site-packages\SAML.py.
 
 Basically when I try to run a python file that tries to create an
 instance of the class Subject I get this error:
 AttributeError: type object 'SAML' has no attribute 'Subject'
 
 In SAML.py I have the class...
 
 class Subject(object):
   ...
   ...
   etc
 
 However, when I run the same line by line by starting up python it works.
 
 >>> import SAML
 >>> subject = SAML.Subject([EMAIL PROTECTED],EMailAddress)
 >>> print subject
 <SAML.Subject object at 0x00C94770>
 
 I've double checked I am loading the correct module by the usage of the -v 
 flag.
 What else should I be checking?
 
 -Jeff
 --
 http://mail.python.org/mailman/listinfo/python-list

Random related question. If you're writing a SAML implementation have
you found an XML Signature implementation that works reliably from
python? I've had a hell of a time finding something that doesn't
segfault and is interoperable with .NET.

As far as your problem goes, I would guess that somewhere you've got a
class named SAML that's shadowing your module. See the difference in the
error messages:

Python 2.5.2 (r252, Oct 31 2008, 10:47:40) 
[GCC 4.1.2 (Gentoo 4.1.2 p1.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> type( os )
<type 'module'>
>>> os.doesnotexist
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'doesnotexist'
>>> class os( object ):
...  pass
... 
>>> os.doesnotexist
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'os' has no attribute 'doesnotexist'
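
The same comparison can be scripted. This is a sketch for Python 3, where
the wording of the module message differs slightly from the 2.5 session
above; it uses the json module rather than os purely so the example does
not clobber a module other code may need, and the helper name
missing_attribute_message is made up.

```python
import json


def missing_attribute_message(obj):
    """Return the AttributeError text raised by looking up a bogus attribute."""
    try:
        obj.doesnotexist
    except AttributeError as exc:
        return str(exc)


module_message = missing_attribute_message(json)


class json(object):  # a class with the same name now shadows the module
    pass


class_message = missing_attribute_message(json)

print(module_message)  # starts with "module"
print(class_message)   # starts with "type object"
```

If the error text begins with "type object", the name is bound to a class
at that point, not to the module.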

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: Python 2.5: wrong number of arguments given in TypeError for function argument aggregation (dictionary input vs the norm)

2008-10-30 Thread John Krukoff
On Fri, 2008-10-31 at 08:55 +1000, James Mills wrote:
 What you have discovered is not a bug :)
 
 cheers
 James
 

Are you sure? It looks like his complaint isn't that it doesn't work,
but that the error message is misleading.

With the setup:

Python 2.5.2 (r252:60911, Sep 22 2008, 12:08:38) 
[GCC 4.1.2 (Gentoo 4.1.2 p1.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def foo( a, b, c ):
...  pass
... 

Compare the error messages from:

>>> foo( **{ 'a' : 1, 'c' : 3 } )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() takes exactly 3 non-keyword arguments (1 given)

to the error message here:

>>> foo( **{ 'a' : 1, 'b' : 3 } )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() takes exactly 3 non-keyword arguments (2 given)

Is it even possible to get an error message in terms of required keyword
arguments? I seem to remember seeing a note about keyword only arguments
recently...
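
For comparison, here is a sketch of the same experiment on Python 3, which
is an assumption beyond the 2.5 session above. There the TypeError names
the arguments that are missing, and keyword-only arguments (declared after
a bare "*") get their own wording. The functions bar and
type_error_message are made up for the example.

```python
def foo(a, b, c):
    pass


def bar(*, a, b):  # a and b are keyword-only
    pass


def type_error_message(func, **kwargs):
    """Call func with the given keywords and return the TypeError text, if any."""
    try:
        func(**kwargs)
    except TypeError as exc:
        return str(exc)
    return None


missing_one = type_error_message(foo, a=1, c=3)
missing_two = type_error_message(foo, a=1)
missing_keyword_only = type_error_message(bar, a=1)

print(missing_one)
print(missing_two)
print(missing_keyword_only)
```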

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company




Re: Event-driven framework (other than Twisted)?

2008-10-01 Thread John Krukoff
You could take a look at this interesting looking server that popped up
on the mailing list a while back:

http://code.google.com/p/yield/

On Wed, 2008-10-01 at 01:01 -0700, Phillip B Oldham wrote:
 Are there any python event driven frameworks other than twisted?
 --
 http://mail.python.org/mailman/listinfo/python-list
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: python/xpath question..

2008-09-03 Thread John Krukoff
On Wed, 2008-09-03 at 13:36 -0700, bruce wrote:
 morning
 
 i apologize up front as this is really more of an xpath question..
 
 in my python, i'm using the xpath function to iterate/parse some html. i can
 do something like
 
 s=d.xpath("//tr/td/text()")
 count=len(s)
 
 and get the number of nodes that have text
 
 i can then do something like
 s=d.xpath("//tr/td")
 count2=len(s)
 
 and get the number of total nodes...
 by subtracting, i can get the number of nodes, without text.. is there an
 easier way??!!
 count2-count
 
 ie, if i have something like
 <tr>
 <td></td>
 <td>foo</td>
 </tr>
 
 is there a way to get the count that there is a single td node with
 text()=""
 
 thanks
 
 
 --
 http://mail.python.org/mailman/listinfo/python-list

Well, you could just do the test (and the count!) in the xpath
expression:

count( //tr/td[ text() != "" ] )

It sounds like you're not familiar with xpath? I would recommend the
O'Reilly XSLT book, it has an excellent introduction to xpath in chapter
3.
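
If a full XPath engine is not at hand, the same counts can be had with the
standard library. This is a stand-in sketch, not the XPath count() call
above: xml.etree.ElementTree only supports a subset of XPath (no count()
or text() tests), so the filtering is done in Python. The sample table is
the one from the quoted question, wrapped in a made-up <table> root.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<table><tr><td></td><td>foo</td></tr></table>")

# ElementTree's limited path syntax is enough to find every td under a tr.
cells = root.findall(".//tr/td")

# An element with no text content has .text set to None.
with_text = [td for td in cells if td.text]
without_text = [td for td in cells if not td.text]

print(len(cells), len(with_text), len(without_text))  # 2 1 1
```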
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: pickle passing client/server design

2008-08-22 Thread John Krukoff
On Fri, 2008-08-22 at 10:09 -0700, DwBear75 wrote:
 I am contemplating the need for a way to handle high speed data
 passing between two processes. One process would act as a queue that
 would 'buffer' data coming from another process. Seems that the
 easiest way to handle the data would be to just pass pickles. Further,
 I'm thinking that using a unix domain socket would make this a simple
 way to pass high volumes of pickles. Are there any examples of an
 architecture like these, where a process is a client, sending pickles
 to a server listening on a domain socket?
 
 I am thinking there would be a need to have a semaphore, and some
 ACK or NACK that the server process got the whole pickle.
 --
 http://mail.python.org/mailman/listinfo/python-list

Quick bit of advice, don't reinvent the wheel, use PYRO:
http://pyro.sourceforge.net/index.html
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: iterparse and unicode

2008-08-20 Thread John Krukoff
On Wed, 2008-08-20 at 15:36 -0700, George Sakkis wrote:
 It seems xml.etree.cElementTree.iterparse() is not unicode aware:
 
 >>> from StringIO import StringIO
 >>> from xml.etree.cElementTree import iterparse
 >>> s = u'<name>\u03a0\u03b1\u03bd\u03b1\u03b3\u03b9\u03ce\u03c4\u03b7\u03c2</name>'
 >>> for event,elem in iterparse(StringIO(s)):
 ...     print elem.text
 ...
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<string>", line 64, in __iter__
 UnicodeEncodeError: 'ascii' codec can't encode characters in position
 6-15: ordinal not in range(128)
 
 Am I using it incorrectly or it doesn't currently support unicode ?
 
 George
 --
 http://mail.python.org/mailman/listinfo/python-list

As iterparse expects an actual file as input, using a unicode string is
problematic. If you want to use iterparse, the simplest way would be to
encode your string before inserting it into the StringIO object, as so:

>>> for event,elem in iterparse(StringIO(s.encode('UTF8'))):
...     print elem.text
...

If you encode using UTF-8, you don't need to worry about the <?xml ... ?>
header bit as suggested previously, as UTF-8 is the default for XML.
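
A sketch of the same idea on Python 3, which is an assumption beyond the
2.x session in this thread: the string is encoded to UTF-8 bytes and
wrapped in io.BytesIO so that iterparse sees a binary file-like object.

```python
from io import BytesIO
from xml.etree.ElementTree import iterparse

s = "<name>\u03a0\u03b1\u03bd\u03b1\u03b3\u03b9\u03ce\u03c4\u03b7\u03c2</name>"

# Encode first, then let the parser decode the bytes itself.
texts = [elem.text for event, elem in iterparse(BytesIO(s.encode("utf-8")))]

print(texts)
```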

If you're using unicode extensively, you should consider using lxml, 
which implements the same interface as ElementTree, but handles unicode 
better (though it also doesn't run your example above without first 
encoding the string):
http://codespeak.net/lxml/parsing.html#python-unicode-strings

You may also find the target parser interface to be more accepting of 
unicode than iterparse, though it requires a different parsing interface:
http://codespeak.net/lxml/parsing.html#the-target-parser-interface

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company


Re: Create a process with a time to live

2008-08-15 Thread John Krukoff
On Fri, 2008-08-15 at 10:00 -0700, Carl J. Van Arsdall wrote:
 Hey python[],
 
 I want to create a process that would expire if it didn't complete in 
 a set amount of time.  I don't seem to see any timeout values to pass 
 os.system - does anyone know of a good way to do this? 
 
 So far all I can think of is some kind of watchdog thread - but that 
 seems like overkill to do such a simple thing.
 
 
 -Carl

Well, if you're on unix, the signal module is probably the easiest
method:

Python 2.5.2 (r252:60911, Jul 31 2008, 15:38:58) 
[GCC 4.1.2 (Gentoo 4.1.2 p1.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import signal
>>> signal.alarm( 1 )
0
>>> Alarm clock

(Interpreter exits)
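
The same effect can be watched from a parent process. This is a Unix-only
sketch written for Python 3 (an assumption beyond the 2.5 session above):
the child arms signal.alarm() on itself, and the parent sees it killed by
SIGALRM instead of running for its full sleep. The child_code string is a
made-up stand-in for the real work.

```python
import signal
import subprocess
import sys

# Child: give itself one second to live, then start something slow.
child_code = "import signal, time; signal.alarm(1); time.sleep(30)"

proc = subprocess.run([sys.executable, "-c", child_code])

# On Unix a negative return code means "terminated by that signal number".
killed_by_alarm = proc.returncode == -signal.SIGALRM
print(proc.returncode, killed_by_alarm)
```

On current Python versions subprocess.run() also accepts a timeout
argument, which may be simpler when the parent should enforce the limit.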

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: Digitally sign PDF files

2008-08-15 Thread John Krukoff
On Mon, 2008-08-11 at 14:13 -0700, haxier wrote:
 On 11 ago, 22:29, Hartmut Goebel [EMAIL PROTECTED] wrote:
 
   I'm developing an application with some reports and we're looking for
   advice. These reports should be openoffice.org .odf files, pdf files,
   and perhaps microsoft word files (.doc, .docx?) and must be digitally
   signed. Are there any libraries out there to ease these tasks?
 
  For signing you can use OpenSSL or the more complete M2crypto modules.
  But this is only the crypto part of the task.
 
 M2Crypto? I didn't know of it... surely I must check it.
 
 It's a very delicate component (security and reliability is a must)
 and don't know how openssl works in windows environments.
 
* Access to the local user certificate store, and read PEM or PKCS12
certificate files.
 
  If the certificate store is just a file, both packages can do this. If
  the store is some other format or maybe the Windows registry, some
  additional functions are required, but should be easy to implement.
 
 Certificates can be both: PKCS12 (.p12) files and under the windows
 certificate store.
 
 The best option could be some kind of thin wrapper around windows
 CryptoAPI, so access to hardware tokens and smartcard readers should
 be easy because under Linux everything seems tied to Mozilla NSS
 libraries.
 
   * Sign documents: as a binary stream, within an specific document
   (pdf, odt, doc)
 
  This is the hardest part of the task, since the signature has to be
  embedded into the document.
 
 OpenOffice.org uses XML DSIG (libxmlsec, libxml2) as stated here[1]
 but I can't find more than this[2] implementation/wrapper of libxmlsec
 
 PDF signing... I can't find something like iText for Python... I've
 found examples like this[3] based on Jython... perhaps I should look
 at jython because java 1.6 has full access to Windows CryptoAPI and
 full XML-DSIG support[4]
 
 IronPython could also be an interesting option for obvious reasons and
 there's and iText port for .NET
 
 Thanks
 
 [1] 
 http://marketing.openoffice.org/ooocon2004/presentations/friday/timmermann_digital_signatures.pdf
 [2] http://xmlsig.sourceforge.net/build.html
 [3] http://kelpi.com/script/00cd7c
 [4] 
 http://java.sun.com/javase/6/docs/technotes/guides/security/xmldsig/XMLDigitalSignature.html
 --
 http://mail.python.org/mailman/listinfo/python-list

A note on libxmlsec, there are also these python bindings available:
http://pyxmlsec.labs.libre-entreprise.org/index.php?section=examples

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: Replace Several Items

2008-08-13 Thread John Krukoff
On Wed, 2008-08-13 at 09:39 -0700, gjhames wrote:
 I wish to replace several characters in my string to only one.
 Example, "-", "." and "/" to nothing ""
 I did like that:
 my_string = my_string.replace("-", "").replace(".", "").replace("/",
 "").replace(")", "").replace("(", "")
 
 But I think it's a ugly way.
 
 What's the better way to do it?
 --
 http://mail.python.org/mailman/listinfo/python-list


The maketrans interface is a bit clunky, but this is what
string.translate is best at:

>>> import string
>>> '-./other'.translate( string.maketrans( '', '' ), '-./' )
'other'

It'd be interesting to see where it falls in the benchmarks, though.

It's worth noting that the interface for translate is quite different
for unicode strings.
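
To make that difference concrete, here is a sketch for Python 3, where
every str is a unicode string. There the characters to delete are passed
as the third argument to str.maketrans() and translate() takes only the
resulting table. The names delete_table and cleaned are made up.

```python
# Third argument: characters that should be removed outright.
delete_table = str.maketrans("", "", "-./")

cleaned = "-./other".translate(delete_table)
print(cleaned)  # other
```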
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company


Re: Psycho question

2008-08-08 Thread John Krukoff

On Fri, 2008-08-08 at 12:18 -0500, David C. Ullrich wrote:
 Curiously smug grin g is exactly how I'd planned on doing it
 before trying anything. The one thing that puzzles me about
 all the results is why // is so much slower than / inside
 that Psyco loop.
 
  Tim Delaney
 

One possible reason for the performance difference is that, as I
understand it, the psyco developer has moved on to working on pypy, and
probably isn't interested in keeping psyco updated and optimized for new
python syntax.

Somebody correct me if I'm wrong, but last I heard there's no
expectation of a python 3.0 compatible version of psyco, either.

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: Best practise implementation for equal by value objects

2008-08-06 Thread John Krukoff

On Wed, 2008-08-06 at 05:50 -0700, Slaunger wrote:
 Hi,
 
 I am new here and relatively new to Python, so be gentle:
 
 Is there a recommended generic implementation of __repr__ for objects
 equal by value to assure that eval(repr(x)) == x independent of which
 module the call is made from?
 
 Example:
 
 class Age:
 
     def __init__(self, an_age):
         self.age = an_age
 
     def __eq__(self, obj):
         return self.age == obj.age
 
     def __repr__(self):
         return self.__class__.__name__ + \
                "(%r)" % self.age
 
 age_ten = Age(10)
 print repr(age_ten)
 print eval(repr(age_ten))
 print eval(repr(age_ten)).age
 
 Running this gives
 
 Age(10)
 Age(10)
 10
 
 Exactly as I want to.
 
 The problem arises when the Age class is imported into another module
 in another package as then there is a package prefix and the above
 implementation of __repr__ does not work.
 
 I have then experimented with doing something like
 
     def __repr__(self):
         return self.__module__ + '.' + self.__class__.__name__ + \
                "(%r)" % self.age
 
 This seems to work when called from the outside, but not from the
 inside of the module. That is, if I rerun the script above with the
 module name prefixed to the representation I get the following error
 
 Traceback (most recent call last):
   File "valuetest.py", line 15, in <module>
     print eval(repr(age_ten))
 __main__.Age(10)
   File "<string>", line 1, in <module>
 NameError: name '__main__' is not defined
 
 This is pretty annoying.
 
 My question is: Is there a robust generic type of implementation of
 __repr__ which I can use instead?
 
 This is something I plan to reuse for many different Value classes, so
 I would like to get it robust.
 
 Thanks,
 Slaunger
 --
 http://mail.python.org/mailman/listinfo/python-list

Are you really sure this is what you want to do, and that a less tricky
serialization format such as that provided by the pickle module wouldn't
work for you?

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: when does the GIL really block?

2008-08-01 Thread John Krukoff

On Thu, 2008-07-31 at 18:27 -0700, Craig Allen wrote:
 I have followed the GIL debate in python for some time.  I don't want
 to get into the regular debate about if it should be gotten rid of
 (though I am curious about the status of that for Python 3)...
 personally I think I can do multi-threaded programming well, but I
 also see the benefits of a multiprocess approach. I'm not so
 egotistical that I don't realize perhaps my mt programming has not
 been right (though it worked and was debuggable) or more likely that
 doing it right I have avoided even trying some things people want mt
 programming to do... i.e. to do mt programming right you start to use
 queues a lot, inter-thread asynchronous, non-blocking, communication,
 which is essentially the multi-process approach, using IPC (except
 that the threads can see the same memory when, in your special case,
 you know that's ok. Given something like a reader-writer lock, this
 can have benefits... but again, whatever.
 
 My question is that given this problem, years ago before I started
 writing in python I wrote some short programs in python which could,
 in fact, busy both my CPUs.  In retrospect I assume I did not have
 code in my run function that causes a GIL lock... so I have done this
 again.
 
 I start two threads... I use gkrellm to watch my processors (dual
 processor machine).  If I merely print a number... both CPUS are
 getting 90% simultaneous loads. If I increment a counter and print it
 too, the same, and if I create a small list and sort it, the same. I
 did not expect this... I expected to see one processor pegged at
 around 100%, which should sometimes switch to the other processor.
 Granted, the same program in C/C++ would peg both processors at
 100%... but given that the overhead in the interpreter cannot explain
 the extra usage, I assume the code in my thread's run functions is
 actually executing non-serially.
 
 I assume this is because what I am doing does not require the GIL to
 be locked for a significant part of the time my code is running...
 what code could I put in my run function to see the behavior I
 expected?  What code could I put there to take advantage of the
 possibility that really the GIL is not locked enough to cause actual
 serialization of the threads...  anyone care to explain?
 --
 http://mail.python.org/mailman/listinfo/python-list

It's worth mentioning that the most common place for the python
interpreter to release the GIL is during I/O, which printing a number to
the screen certainly counts as. You might try again with a set of loops
that only increment, and don't print, and you may more obviously see the
GIL in action.
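
A sketch of the experiment suggested above, written for Python 3 (an
assumption beyond this 2008 thread): two loops that only increment, run
first one after the other and then in two threads. On a standard CPython
build with the GIL the threaded run is not expected to be meaningfully
faster, but the timings depend on the machine and build, so none are
claimed here. The names N, count_up, run_serial and run_threaded are made
up.

```python
import threading
import time

N = 300000  # small enough to finish quickly; raise it for clearer timings


def count_up(n, results, slot):
    """Pure-Python busy loop: no I/O, so no blocking call releases the GIL."""
    total = 0
    for _ in range(n):
        total += 1
    results[slot] = total


def run_serial():
    results = [0, 0]
    start = time.perf_counter()
    count_up(N, results, 0)
    count_up(N, results, 1)
    return results, time.perf_counter() - start


def run_threaded():
    results = [0, 0]
    threads = [threading.Thread(target=count_up, args=(N, results, i))
               for i in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


serial_results, serial_time = run_serial()
threaded_results, threaded_time = run_threaded()
print("serial:   %.3fs" % serial_time)
print("threaded: %.3fs" % threaded_time)
```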
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company



Re: seemingly simple list indexing problem

2008-07-30 Thread John Krukoff

On Wed, 2008-07-30 at 12:06 -0700, Tobiah wrote:
 On Mon, 28 Jul 2008 23:41:51 -0700, iu2 wrote:
 
  On Jul 29, 3:59 am, John Machin [EMAIL PROTECTED] wrote:
  On Jul 29, 8:10 am, John Krukoff [EMAIL PROTECTED] wrote:
 
 
 
 
 
   On Mon, 2008-07-28 at 16:24 -0500, Ervan Ensis wrote:
My programming skills are pretty rusty and I'm just learning Python so
this problem is giving me trouble.
 
I have a list like [108, 58, 68].  I want to return the sorted indices
of these items in the same order as the original list.  So I should
return [2, 0, 1]
 
For a list that's already in order, I'll just return the indices, i.e.
[56, 66, 76] should return [0, 1, 2]
 
Any help would be appreciated.
 
--
   http://mail.python.org/mailman/listinfo/python-list
 
   If your lists aren't so large that memory is an issue, this might be a
   good place for a variation of decorate, sort, undecorate.
 
   >>> listToSort = [ 108, 58, 68 ]
   >>> decorated = [ ( data, index ) for index, data in
   ...               enumerate( listToSort ) ]
   >>> decorated
   [(108, 0), (58, 1), (68, 2)]
   >>> result = [ None, ] * len( listToSort )
   >>> for sortedIndex, ( ignoredValue, originalIndex ) in enumerate( sorted( decorated ) ):
   ...     result[ originalIndex ] = sortedIndex
   ...
   >>> result
   [2, 0, 1]
 
  Simpliciter:
 
 
 
   >>> data = [99, 88, 77, 88, 66]
   >>> [x[1] for x in sorted(zip(data, xrange(len(data))))]
  [4, 2, 1, 3, 0]
 
  Use case? Think data == database table, result == index ...
  
  I think it is wrong, using this on my data returns the wrong result
   >>> data = [108, 58, 68, 108]
   >>> [x[1] for x in sorted(zip(data, xrange(len(data))))]
  [1, 2, 0, 3]
 
 It looked wrong to me at first, but I think this is correct.
 The two largest numbers were at positions 0 and 3.  They come
 last in the result.  58 is the smallest, and was at position
 1, which is the first in the result.
 
 
 
 ** Posted from http://www.teranews.com **
 --
 http://mail.python.org/mailman/listinfo/python-list

It's wrong for the OP's sample, at least:
>>> data = [ 108, 58, 68 ]
>>> [x[1] for x in sorted(zip(data, xrange(len(data))))]
[1, 2, 0]

Which should be [2, 0, 1], according to the OP. I don't think the
problem description makes it clear until you stare at the example a bit,
but he's asking for a list where each value is the index in the sorted
list of the value in that position in the original list. What you've got
here is the index in the original list of the value that would be in
that location when sorted.

Clear as mud?

Stealing the quite clever __getitem__ as indirect key idea, this is as
short a solution as I've got:

>>> listToSort = [ 108, 58, 68 ]
>>> result = [ None, ] * len( listToSort )
>>> for sortedIndex, originalIndex in enumerate(
...         sorted( range( len( listToSort ) ), key = listToSort.__getitem__ ) ):
...     result[ originalIndex ] = sortedIndex
...
>>> result
[2, 0, 1]

I haven't benchmarked, but I bet it'd be better overall to do the range
sort in place, instead of with sorted. I'd like to be able to get rid of
the intermediate result initialization, but I haven't found a method
that's sane. This'd be my entry for obfuscated one-liner, though :)

>>> listToSort = [ 108, 58, 68 ]
>>> import operator
>>> map( operator.itemgetter( 1 ), sorted( dict( ( originalIndex,
... sortedIndex ) for sortedIndex, originalIndex in
... enumerate( sorted( xrange( len( listToSort ) ), key =
... listToSort.__getitem__ ) ) ).items( ) ) )
[2, 0, 1]
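
For comparison, here is a rough sketch of the in-place variant speculated about above. It is written for Python 3 (so range rather than xrange), the function name sorted_positions is made up for illustration, and it has not been benchmarked. It returns both readings of the problem: the sort order (what the zip/sorted one-liner gives) and the sorted position of each original item (what the OP asked for).

```python
def sorted_positions(values):
    # Sort the list of indexes in place instead of building a second
    # list with sorted().
    order = list(range(len(values)))
    order.sort(key=values.__getitem__)

    # order[k] is the original index of the k-th smallest value.
    # Invert that mapping to get, for each original position, its
    # position in the sorted list.
    result = [None] * len(values)
    for sorted_index, original_index in enumerate(order):
        result[original_index] = sorted_index
    return order, result


order, positions = sorted_positions([108, 58, 68])
print(order)      # [1, 2, 0]
print(positions)  # [2, 0, 1]
```

Because list.sort is stable, repeated values keep their original relative order, so [108, 58, 68, 108] gives positions [2, 0, 1, 3].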


-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: seemingly simple list indexing problem

2008-07-30 Thread John Krukoff

On Wed, 2008-07-30 at 14:08 -0700, [EMAIL PROTECTED]
wrote:
 On 29 Jul., 01:05, Raymond Hettinger [EMAIL PROTECTED] wrote:
  [Ervan Ensis]
 
   I have a list like [108, 58, 68].  I want to return
   the sorted indices of these items in the same order
   as the original list.  So I should return [2, 0, 1]
 
  One solution is to think of the list indexes
  being sorted according to their corresponding
  values in the input array:
 
   s = [ 108, 58, 68 ]
   sorted(range(len(s)), key=s.__getitem__)
 
  [1, 2, 0]
 
 
 To get the desired output you have to apply it twice:
  sorted(range(len(s)), key=sorted(range(len(s)), 
  key=s.__getitem__).__getitem__)
 [2, 0, 1]
 
 Wolfram
 --
 http://mail.python.org/mailman/listinfo/python-list

Thanks, I knew I was missing something simpler.
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: seemingly simple list indexing problem

2008-07-28 Thread John Krukoff

On Mon, 2008-07-28 at 18:40 -0300, Guilherme Polo wrote:
 On Mon, Jul 28, 2008 at 6:24 PM, Ervan Ensis [EMAIL PROTECTED] wrote:
  My programming skills are pretty rusty and I'm just learning Python so this
  problem is giving me trouble.
 
  I have a list like [108, 58, 68].  I want to return the sorted indices of
  these items in the same order as the original list.  So I should return [2,
  0, 1]
 
 You could simply do this:
 
 a = [108, 58, 68]
 b = sorted(a)
 [b.index(c) for c in a]
 
 
  For a list that's already in order, I'll just return the indices, i.e. [56,
  66, 76] should return [0, 1, 2]
 
  Any help would be appreciated.
 
  --
  http://mail.python.org/mailman/listinfo/python-list
 
 
 
 

Which'll work fine, unless you end up with a repeated value such as:

a = [ 108, 58, 68, 108 ]

If you have to deal with that, would need a more complicated solution to
find the first free index slot of the available choices.
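
A minimal sketch of what "first free index slot" could look like, assuming Python 3 and hashable list items. The name first_free_positions is hypothetical, and bisect is my substitution for list.index to avoid rescanning the sorted list.

```python
import bisect

a = [108, 58, 68, 108]
b = sorted(a)

# The simple version hands both 108s the same slot.
naive = [b.index(c) for c in a]
print(naive)  # [2, 0, 1, 2]


def first_free_positions(values):
    ordered = sorted(values)
    next_free = {}  # value -> next unused slot among its duplicates
    result = []
    for value in values:
        slot = next_free.get(value)
        if slot is None:
            # First time we see this value: its leftmost slot in the
            # sorted list.
            slot = bisect.bisect_left(ordered, value)
        result.append(slot)
        next_free[value] = slot + 1
    return result


print(first_free_positions(a))  # [2, 0, 1, 3]
```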

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: seemingly simple list indexing problem

2008-07-28 Thread John Krukoff
On Mon, 2008-07-28 at 16:24 -0500, Ervan Ensis wrote:
 My programming skills are pretty rusty and I'm just learning Python so
 this problem is giving me trouble.
 
 I have a list like [108, 58, 68].  I want to return the sorted indices
 of these items in the same order as the original list.  So I should
 return [2, 0, 1]
 
 For a list that's already in order, I'll just return the indices, i.e.
 [56, 66, 76] should return [0, 1, 2]
 
 Any help would be appreciated.
 
 --
 http://mail.python.org/mailman/listinfo/python-list

If your lists aren't so large that memory is an issue, this might be a
good place for a variation of decorate, sort, undecorate.

>>> listToSort = [ 108, 58, 68 ]
>>> decorated = [ ( data, index ) for index, data in enumerate( listToSort ) ]
>>> decorated
[(108, 0), (58, 1), (68, 2)]
>>> result = [ None, ] * len( listToSort )
>>> for sortedIndex, ( ignoredValue, originalIndex ) in enumerate( sorted( decorated ) ):
...     result[ originalIndex ] = sortedIndex
...
>>> result
[2, 0, 1]

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: seemingly simple list indexing problem

2008-07-28 Thread John Krukoff

On Mon, 2008-07-28 at 15:00 -0700, Gary Herron wrote:
 Guilherme Polo wrote:
  On Mon, Jul 28, 2008 at 6:24 PM, Ervan Ensis [EMAIL PROTECTED] wrote:

  My programming skills are pretty rusty and I'm just learning Python so this
  problem is giving me trouble.
 
  I have a list like [108, 58, 68].  I want to return the sorted indices of
  these items in the same order as the original list.  So I should return [2,
  0, 1]
  
 
  You could simply do this:
 
  a = [108, 58, 68]
  b = sorted(a)
  [b.index(c) for c in a]

 
 Yuck.  Slow, and it fails if duplicate list elements exist.
 
 Also...  This looks like a beginner's programming assignment.  Let's
 let him try it himself.  We can offer help rather than full solutions if 
 he has specific Python questions.
 
 
 
 
 

  For a list that's already in order, I'll just return the indices, i.e. [56,
  66, 76] should return [0, 1, 2]
 
  Any help would be appreciated.
 
  --
  http://mail.python.org/mailman/listinfo/python-list
 
  
 
 
 

 
 --
 http://mail.python.org/mailman/listinfo/python-list

Sorry, problem was interesting to solve, so I may have jumped the gun. I
do wonder why OP was asking for this though, as now that you mention it
I can't think of a use case outside of a homework assignment.

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: seemingly simple list indexing problem

2008-07-28 Thread John Krukoff

On Mon, 2008-07-28 at 16:00 -0700, iu2 wrote:
 On Jul 29, 12:10 am, John Krukoff [EMAIL PROTECTED] wrote:
  On Mon, 2008-07-28 at 16:24 -0500, Ervan Ensis wrote:
   My programming skills are pretty rusty and I'm just learning Python so
   this problem is giving me trouble.
 
   I have a list like [108, 58, 68].  I want to return the sorted indices
   of these items in the same order as the original list.  So I should
   return [2, 0, 1]
 
   For a list that's already in order, I'll just return the indices, i.e.
   [56, 66, 76] should return [0, 1, 2]
 
   Any help would be appreciated.
 
   --
  http://mail.python.org/mailman/listinfo/python-list
 
  If your lists aren't so large that memory is an issue, this might be a
  good place for a variation of decorate, sort, undecorate.
 
   >>> listToSort = [ 108, 58, 68 ]
   >>> decorated = [ ( data, index ) for index, data in enumerate( listToSort ) ]
   >>> decorated
   [(108, 0), (58, 1), (68, 2)]
   >>> result = [ None, ] * len( listToSort )
   >>> for sortedIndex, ( ignoredValue, originalIndex ) in enumerate( sorted( decorated ) ):
   ...     result[ originalIndex ] = sortedIndex
   ...
   >>> result
   [2, 0, 1]
 
  --
  John Krukoff [EMAIL PROTECTED]
  Land Title Guarantee Company
 
 Inspired by your idea and the above one, here is another try:
 
  a0 = [108, 58, 68, 108, 58]
  a1 = [(x, y) for x, y in enumerate(a0)]

You know this line is a no-op, right?

  a1
 [(0, 108), (1, 58), (2, 68), (3, 108), (4, 58)]
  a2 = sorted(a1, lambda x, y: cmp(x[1], y[1]))

If you're going to do the unpacking here for the sort, just use
enumerate directly. Since this is a simple case, should use the key
parameter instead of the cmp parameter for speed. Can also use the
operator module to avoid a bit of lambda overhead:

>>> import operator
>>> a2 = sorted( enumerate( a0 ), key = operator.itemgetter( 1 ) )

  a2
 [(1, 58), (4, 58), (2, 68), (0, 108), (3, 108)]
  a3 = [a2.index(x) for x in a1]
  a3
 [3, 0, 2, 4, 1]
 
 The idea is to make each item unique (by making it a tuple), and then
 apply the naive solution.
 
 --
 http://mail.python.org/mailman/listinfo/python-list

Using index is still kinda evil, due to the quadratic time required
since you're rescanning half the list (on average) to find each index
value.
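
One way around the rescanning, sketched here for Python 3 and not benchmarked: since the (index, value) tuples are unique and hashable, build a dictionary from each tuple to its sorted position once, then look each item up directly. The variable names follow the quoted session; position is my own addition.

```python
import operator

a0 = [108, 58, 68, 108, 58]
a1 = list(enumerate(a0))
a2 = sorted(a1, key=operator.itemgetter(1))

# One pass to record where each tuple ended up after sorting.
position = {item: sorted_index for sorted_index, item in enumerate(a2)}

# Dictionary lookups replace the a2.index(x) scans.
a3 = [position[x] for x in a1]
print(a3)  # [3, 0, 2, 4, 1]
```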

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: Question by someone coming from C...

2008-06-10 Thread John Krukoff

On Wed, 2008-06-11 at 00:43 +0200, Christian Heimes wrote:
 John Krukoff wrote:
  Since you probably want access to these from many different places in
  your code, I find the simplest way is to create a logging module of your
  own (not called logging, obviously) and instantiate all of your loggers
  in that namespace, then import that one module as needed.
 
 No, don't do that. Simply do
 
 import logging
 log = logging.getLogger(some_name)
 
 The logging module takes care of the rest. The logging.getLogger()
 function creates a new logger *only* when the name hasn't been used yet.
 
 Christian
 
 --
 http://mail.python.org/mailman/listinfo/python-list

Nifty, I never noticed that function, thanks for pointing it out as
it'll make my application a bit simpler.
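
A quick check of that behaviour, as a sketch; the logger names "myapp.config" and "myapp.options" are invented for the example.

```python
import logging

# Two calls with the same name, possibly from different modules.
first = logging.getLogger("myapp.config")
second = logging.getLogger("myapp.config")
other = logging.getLogger("myapp.options")

print(first is second)  # True
print(first is other)   # False
```

So any module can call logging.getLogger with an agreed name and get the shared logger, without a separate module to hold the instances.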

Now, if they'd only add a syslog module that uses the libc syslog
interface (sure, it wouldn't be thread safe, but at least it'd be
portable to AIX, unlike the current one), I'll be a happy camper.
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: Question by someone coming from C...

2008-06-09 Thread John Krukoff
On Mon, 2008-06-09 at 15:02 -0700, Skye wrote:
 On Jun 9, 2:35 pm, Matimus [EMAIL PROTECTED] wrote:
  The only time to do that sort of thing (in python) is when interacting
  with something else that isn't written in Python though. In general,
  for logging, just use the standard logging module:
  http://docs.python.org/lib/module-logging.html
 
 Thanks!  It looks like subclassing the logging module would be a much
 better idea :)
 
 Skye
 
 --
 http://mail.python.org/mailman/listinfo/python-list

Really, the logging module is heavyweight enough, it's very doubtful
you need to add any functionality at all.

If you look at the documentation for logger objects, you'll see that you
can use a number of them, and set different logging levels for each one.
So, to get equivalent functionality to your bitfields, instead make a
separate logger for each category (config, options, blah...) and call
the object to log for that type.

i.e. config.debug( ... ), options.error( ... )

Since you probably want access to these from many different places in
your code, I find the simplest way is to create a logging module of your
own (not called logging, obviously) and instantiate all of your loggers
in that namespace, then import that one module as needed.

Perhaps you want one logging message to qualify as several different
types? That's the only place where I could see this being too verbose,
but even then it'd be easy to make a wrapper function that made that
easy given a collection of loggers.
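
Roughly what I have in mind, as a sketch in Python 3 syntax. The logger names and the log_to_all helper are made up for illustration; only the logging calls themselves come from the standard library.

```python
import logging

logging.basicConfig(level=logging.DEBUG,
                    format="%(name)s %(levelname)s %(message)s")

# One logger per category instead of one bit per category.
config = logging.getLogger("myapp.config")
options = logging.getLogger("myapp.options")

# Each category gets its own threshold.
config.setLevel(logging.DEBUG)
options.setLevel(logging.ERROR)


def log_to_all(loggers, level, msg, *args):
    """Send one message to several category loggers (hypothetical helper)."""
    for logger in loggers:
        logger.log(level, msg, *args)


config.debug("loaded %d settings", 12)       # emitted
options.debug("this one is filtered out")    # dropped, below ERROR
log_to_all([config, options], logging.ERROR, "bad value for %s", "timeout")
```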
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


RE: no inputstream?

2008-05-15 Thread John Krukoff
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:python-
 [EMAIL PROTECTED] On Behalf Of max
 Sent: Thursday, May 15, 2008 8:02 AM
 To: python-list@python.org
 Subject: Re: no inputstream?
 
 On May 15, 9:51 am, castironpi [EMAIL PROTECTED] wrote:
  On May 15, 8:37 am, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
 
   On Thu, 15 May 2008 06:08:35 -0700, max wrote:
i currently have locations of the mp3s in question as strings, which
works for parsing local files, but gives me a No such file or
directory error when it tries to process URLs.  it seems terribly
inefficient to download each mp3 just to get at that small tag data,
and i assume there's a way to do this with file() or open() or
something, i just can't get it to work.
 
   You can use `urllib2.urlopen()` to open URLs as files.  But if you
 deal
   with ID3 V1 tags you'll have to download the file anyway because those
 are
   in the last 128 bytes of an MP3 file.
 
   Ciao,
           Marc 'BlackJack' Rintsch
 
  Just don't import time.  What would you do with an autolocking timer,
  such as time.sleep( ) on a thread?  I am tongue tied in the presence
  of a lady.
 
 thanks guys.  i guess i just figured there'd be a way to get at those
 id3 bytes at the end without downloading the whole file.  if java can
 do this, seems like i should just stick with that implementation, no?
 --
 http://mail.python.org/mailman/listinfo/python-list

First off, ignore castironpi, it's a turing test failure.

Second, I'm curious as to how Java manages this. I'd think their streams
would have to be pretty magic to pull this off after the HTTP connection has
already been built.

Anyway, as I see it, this is more of an HTTP protocol question. I think what
you need to do is set the HTTP Range header to bytes=-128, see the urllib2
documentation for how. It's not that hard. Only down side is that not all
HTTP servers support the Range header, and it's an optional part of the HTTP
spec anyway. As far as I know it's the only way to get partial transfers,
though.
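
A rough sketch of the Range approach, written against Python 3's urllib.request (the successor to urllib2). The helper names and the example URL are made up, and fetch_tail is untested against a real server. The "TAG" marker check relies on ID3v1 tags being the last 128 bytes of the file and starting with those three bytes.

```python
import urllib.request


def tail_request(url, nbytes=128):
    """Build a request asking for only the last nbytes of the resource."""
    return urllib.request.Request(url, headers={"Range": "bytes=-%d" % nbytes})


def fetch_tail(url, nbytes=128):
    """Fetch the last nbytes, coping with servers that ignore Range."""
    with urllib.request.urlopen(tail_request(url, nbytes)) as response:
        data = response.read()
        # 206 means the server honoured the Range header; anything else
        # means we were sent the whole file and have to trim it ourselves.
        if response.status != 206:
            data = data[-nbytes:]
        return data


def has_id3v1(tail):
    """True if the 128-byte tail starts with the ID3v1 'TAG' marker."""
    return len(tail) == 128 and tail[:3] == b"TAG"


# No network access here, just a look at the header that would be sent.
request = tail_request("http://example.com/song.mp3")  # hypothetical URL
print(request.get_header("Range"))  # bytes=-128
```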

Also, are you sure you're dealing with v1 tags and not v2? Since v2 tags are
stored at the beginning (or sometimes end with v2.4) of the file. You might
be better off just opening the file with urllib2 and handing it off to
whatever id3 tag reading library you're using. As long as it's reasonably
smart, it should only download the part of the file it needs (which if the
tag happens to be v1, will be the whole file).

I'd love to know how Java handles all that automatically through a generic
stream interface, though.
--
John Krukoff
[EMAIL PROTECTED]

--
http://mail.python.org/mailman/listinfo/python-list


Re: no inputstream?

2008-05-15 Thread John Krukoff

On Thu, 2008-05-15 at 15:35 -0700, max wrote:
 On May 15, 6:18 pm, MRAB [EMAIL PROTECTED] wrote:
  On May 15, 9:00 pm, max [EMAIL PROTECTED] wrote:
 
   you're right, my java implementation does indeed parse for Id3v2
   (sorry for the confusion).  i'm using the getrawid3v2() method of this
   bitstream class (http://www.javazoom.net/javalayer/docs/docs0.4/
   javazoom/jl/decoder/Bitstream.html) to return an inputstream that then
   i buffer and parse.  apologies if i misrepresented my code!
 
   back to python, i wonder if i'm misusing the mutagen id3 module.  this
   brief tutorial (http://www.sacredchao.net/quodlibet/wiki/Development/
   Mutagen/Tutorial) leads me to believe that something like this might
   work:
 
   from mutagen.mp3 import MP3
   id3tags = MP3(urllib2.urlopen(URL))
 
   but this gives me the following TypeError: coercing to Unicode: need
   string or buffer, instance found.  does this mean i need to convert
   the file-like object that is returned by urlopen() into a unicode
   object?  if so, do i just decode() with 'utf-8', or is this more
   complex?  as of now, doing so gives me mostly No such file or
   directory errors, with a few HTTP 404s.
 
  [snip]
  I think it's expecting the path of the MP3 but you're giving it the
  contents.
 
 cool, so how do i give it the path, if not in the form of a URL
 string?  maybe this is obvious...
 --
 http://mail.python.org/mailman/listinfo/python-list

It doesn't look like you can, with mutagen. So, time to find a different
library that supports arbitrary file objects instead of only file paths.
I'd suggest starting here:
http://pypi.python.org/pypi?%3Aaction=search&term=id3&submit=search

Possibly one with actual documentation, since that would also be a step
up from mutagen.

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: no inputstream?

2008-05-15 Thread John Krukoff

On Thu, 2008-05-15 at 17:11 -0600, John Krukoff wrote:
 On Thu, 2008-05-15 at 15:35 -0700, max wrote:
  On May 15, 6:18 pm, MRAB [EMAIL PROTECTED] wrote:
   On May 15, 9:00 pm, max [EMAIL PROTECTED] wrote:
  
you're right, my java implementation does indeed parse for Id3v2
(sorry for the confusion).  i'm using the getrawid3v2() method of this
bitstream class (http://www.javazoom.net/javalayer/docs/docs0.4/
javazoom/jl/decoder/Bitstream.html) to return an inputstream that then
i buffer and parse.  apologies if i misrepresented my code!
  
back to python, i wonder if i'm misusing the mutagen id3 module.  this
brief tutorial (http://www.sacredchao.net/quodlibet/wiki/Development/
Mutagen/Tutorial) leads me to believe that something like this might
work:
  
from mutagen.mp3 import MP3
id3tags = MP3(urllib2.urlopen(URL))
  
but this gives me the following TypeError: coercing to Unicode: need
string or buffer, instance found.  does this mean i need to convert
the file-like object that is returned by urlopen() into a unicode
object?  if so, do i just decode() with 'utf-8', or is this more
complex?  as of now, doing so gives me mostly No such file or
directory errors, with a few HTTP 404s.
  
   [snip]
   I think it's expecting the path of the MP3 but you're giving it the
   contents.
  
  cool, so how do i give it the path, if not in the form of a URL
  string?  maybe this is obvious...
  --
  http://mail.python.org/mailman/listinfo/python-list
 
 It doesn't look like you can, with mutagen. So, time to find a different
 library that supports arbitrary file objects instead of only file paths.
 I'd suggest starting here:
 http://pypi.python.org/pypi?%3Aaction=search&term=id3&submit=search
 
 Possibly one with actual documentation, since that would also be a step
 up from mutagen.
 

After a bit of time looking around, looks like nearly all the python id3
modules expect to work with filenames, instead of file objects.

I can't vouch for it, and the documentation still looks sparse, but this
module at least looks capable of accepting a file object:
http://pypi.python.org/pypi/tagpy

Looks like it'd be a challenge to build if you're on windows, since it
depends on an external library.

Alternately, you could probably create a subclass of the mutagen stuff
that used an existing file object instead of opening a new one. No idea
what that might break, but seems like it would be worth a try.

As last ditch option, could write the first few kb of the file out to a
temp file and see if mutagen will load the partial file.
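
The last-ditch option might look something like this sketch (Python 3). The save_head name and the 16 kB default are arbitrary choices of mine, and whether mutagen accepts a truncated file is exactly the part that is untested. The BytesIO object stands in for whatever urlopen() returns.

```python
import io
import os
import tempfile


def save_head(stream, nbytes=16 * 1024, suffix=".mp3"):
    """Copy the first nbytes of a file-like object to a temp file.

    Returns the temp file's path; the caller is responsible for deleting it.
    """
    remaining = nbytes
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        while remaining > 0:
            chunk = stream.read(remaining)
            if not chunk:  # stream ended early
                break
            tmp.write(chunk)
            remaining -= len(chunk)
        return tmp.name


# Stand-in for a network stream; a real caller would pass the urlopen() result.
fake_download = io.BytesIO(b"ID3" + b"\x00" * 100)
path = save_head(fake_download, nbytes=10)
size = os.path.getsize(path)
# At this point the path could be handed to a filename-only tag library.
os.remove(path)
print(size)  # 10
```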

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: no inputstream?

2008-05-15 Thread John Krukoff

On Thu, 2008-05-15 at 17:32 -0600, John Krukoff wrote:
 On Thu, 2008-05-15 at 17:11 -0600, John Krukoff wrote:
  On Thu, 2008-05-15 at 15:35 -0700, max wrote:
   On May 15, 6:18 pm, MRAB [EMAIL PROTECTED] wrote:
On May 15, 9:00 pm, max [EMAIL PROTECTED] wrote:
   
 you're right, my java implementation does indeed parse for Id3v2
 (sorry for the confusion).  i'm using the getrawid3v2() method of this
 bitstream class (http://www.javazoom.net/javalayer/docs/docs0.4/
 javazoom/jl/decoder/Bitstream.html) to return an inputstream that then
 i buffer and parse.  apologies if i misrepresented my code!
   
 back to python, i wonder if i'm misusing the mutagen id3 module.  this
 brief tutorial (http://www.sacredchao.net/quodlibet/wiki/Development/
 Mutagen/Tutorial) leads me to believe that something like this might
 work:
   
 from mutagen.mp3 import MP3
 id3tags = MP3(urllib2.urlopen(URL))
   
 but this gives me the following TypeError: coercing to Unicode: need
 string or buffer, instance found.  does this mean i need to convert
 the file-like object that is returned by urlopen() into a unicode
 object?  if so, do i just decode() with 'utf-8', or is this more
 complex?  as of now, doing so gives me mostly No such file or
 directory errors, with a few HTTP 404s.
   
[snip]
I think it's expecting the path of the MP3 but you're giving it the
contents.
   
   cool, so how do i give it the path, if not in the form of a URL
   string?  maybe this is obvious...
   --
   http://mail.python.org/mailman/listinfo/python-list
  
  It doesn't look like you can, with mutagen. So, time to find a different
  library that supports arbitrary file objects instead of only file paths.
  I'd suggest starting here:
  http://pypi.python.org/pypi?%3Aaction=search&term=id3&submit=search
  
  Possibly one with actual documentation, since that would also be a step
  up from mutagen.
  
 
 After a bit of time looking around, looks like nearly all the python id3
 modules expect to work with filenames, instead of file objects.
 
 I can't vouch for it, and the documentation still looks sparse, but this
 module at least looks capable of accepting a file object:
 http://pypi.python.org/pypi/tagpy
 
 Looks like it'd be a challenge to build if you're on windows, since it
 depends on an external library.
 
 Alternately, you could probably create a subclass of the mutagen stuff
 that used an existing file object instead of opening a new one. No idea
 what that might break, but seems like it would be worth a try.
 
 As last ditch option, could write the first few kb of the file out to a
 temp file and see if mutagen will load the partial file.
 

Okay, now I'm officially spending too much time looking through this
stuff.

However, looks like the load method of the MP3 class is what you'd
want to override to change mutagen's file loading behavior. Probably
pass the URL as the filename, and take a cut & paste version of the
default load method from ID3FileType and change it to use urllib2 to
open it instead of a local file open.

Might work. Might not. No warranty express or implied.
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: no inputstream?

2008-05-15 Thread John Krukoff

On Thu, 2008-05-15 at 17:42 -0600, John Krukoff wrote:
 On Thu, 2008-05-15 at 17:32 -0600, John Krukoff wrote:
  On Thu, 2008-05-15 at 17:11 -0600, John Krukoff wrote:
   On Thu, 2008-05-15 at 15:35 -0700, max wrote:
On May 15, 6:18 pm, MRAB [EMAIL PROTECTED] wrote:
 On May 15, 9:00 pm, max [EMAIL PROTECTED] wrote:

  you're right, my java implementation does indeed parse for Id3v2
  (sorry for the confusion).  i'm using the getrawid3v2() method of 
  this
  bitstream class (http://www.javazoom.net/javalayer/docs/docs0.4/
  javazoom/jl/decoder/Bitstream.html) to return an inputstream that 
  then
  i buffer and parse.  apologies if i misrepresented my code!

  back to python, i wonder if i'm misusing the mutagen id3 module.  
  this
  brief tutorial 
  (http://www.sacredchao.net/quodlibet/wiki/Development/
  Mutagen/Tutorial) leads me to believe that something like this might
  work:

  from mutagen.mp3 import MP3
  id3tags = MP3(urllib2.urlopen(URL))

  but this gives me the following TypeError: coercing to Unicode: 
  need
  string or buffer, instance found.  does this mean i need to convert
  the file-like object that is returned by urlopen() into a unicode
  object?  if so, do i just decode() with 'utf-8', or is this more
  complex?  as of now, doing so gives me mostly No such file or
  directory errors, with a few HTTP 404s.

 [snip]
 I think it's expecting the path of the MP3 but you're giving it the
 contents.

cool, so how do i give it the path, if not in the form of a URL
string?  maybe this is obvious...
--
http://mail.python.org/mailman/listinfo/python-list
   
   It doesn't look like you can, with mutagen. So, time to find a different
   library that supports arbitrary file objects instead of only file paths.
   I'd suggest starting here:
   http://pypi.python.org/pypi?%3Aaction=search&term=id3&submit=search
   
   Possibly one with actual documentation, since that would also be a step
   up from mutagen.
   
  
  After a bit of time looking around, looks like nearly all the python id3
  modules expect to work with filenames, instead of file objects.
  
  I can't vouch for it, and the documentation still looks sparse, but this
  module at least looks capable of accepting a file object:
  http://pypi.python.org/pypi/tagpy
  
  Looks like it'd be a challenge to build if you're on windows, since it
  depends on an external library.
  
  Alternately, you could probably create a subclass of the mutagen stuff
  that used an existing file object instead of opening a new one. No idea
  what that might break, but seems like it would be worth a try.
  
  As last ditch option, could write the first few kb of the file out to a
  temp file and see if mutagen will load the partial file.
  
 
 Okay, now I'm officially spending too much time looking through this
 stuff.
 
 However, looks like the load method of the MP3 class is what you'd
 want to override to change mutagen's file loading behavior. Probably
 pass the URL as the filename, and take a cut & paste version of the
 default load method from ID3FileType and change it to use urllib2 to
 open it instead of a local file open.
 
 Might work. Might not. No warranty express or implied.

Hrm, damn, looks like you'd also have to create a custom ID3 class and
override load there too, since that gets called from the ID3FileType
load method. Definitely looks like work.
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


Re: i want to add a timeout to my code

2008-04-29 Thread John Krukoff

On Tue, 2008-04-29 at 14:47 -0700, maehhheeyy wrote:
 On Apr 17, 4:24 pm, Miki [EMAIL PROTECTED] wrote:
  On Apr 17, 1:10 pm, maehhheeyy [EMAIL PROTECTED] wrote:
 
   I want to add a timeout so that when I pull out my gps from my serial
   port, it would wait for a bit then loop and then see if it's there. I
   also want to add a print statement saying that there is no GPS device
   found. However when I run my code and unplug my serial port, my code
   will just hang until I plug it back in.
   This is my code right now:
 
   def GetGPS():
       data = []
       #Open com1: 9600,8,N,1
       fi = serial.Serial(0, timeout = 1)
       print '[gps module] SERIAL PORT OPEN ON COM1:'
 
   can anyone help me please? Thanks.
 
  http://docs.python.org/lib/node545.html
 
  HTH,
  --
  Miki [EMAIL PROTECTED] http://pythonwise.blogspot.com
 
 I tried the code onto my codes but what came out was that in the line
 signal.signal(signal.SIGALRM, handler), an AttributeError appeared
 reading that 'module' object has no attribute 'SIGALRM'
 --
 http://mail.python.org/mailman/listinfo/python-list

Are you writing your program on windows, or some other platform which is
not unix?
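
The reason for asking: signal.SIGALRM only exists on Unix-like platforms, so the alarm recipe can't work on Windows. Below is a sketch of a portable stand-in (Python 3) using a worker thread. call_with_timeout is a name I made up, and time.sleep stands in for a blocking serial read.

```python
import signal
import threading
import time

# False on Windows, where the signal module has no SIGALRM.
HAVE_SIGALRM = hasattr(signal, "SIGALRM")


def call_with_timeout(func, seconds, *args):
    """Run func(*args) in a worker thread; raise TimeoutError if it is slow.

    The worker is not killed on timeout; as a daemon thread it is simply
    abandoned, so this only suits calls that are safe to leave running.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args)
        except Exception as exc:
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(seconds)
    if thread.is_alive():
        raise TimeoutError("no answer within %s seconds" % seconds)
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


print(call_with_timeout(lambda: "fix acquired", 1.0))

timed_out = False
try:
    call_with_timeout(time.sleep, 0.05, 0.5)  # pretend read that blocks
except TimeoutError:
    timed_out = True
    print("no GPS device found")
```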

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

--
http://mail.python.org/mailman/listinfo/python-list


RE: convert xhtml back to html

2008-04-24 Thread John Krukoff
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:python-
 [EMAIL PROTECTED] On Behalf Of Tim Arnold
 Sent: Thursday, April 24, 2008 9:34 AM
 To: python-list@python.org
 Subject: convert xhtml back to html
 
 hi, I've got lots of xhtml pages that need to be fed to MS HTML Workshop
 to
 create  CHM files. That application really hates xhtml, so I need to
 convert
 self-ending tags (e.g. <br />) to plain html (e.g. <br>).
 
 Seems simple enough, but I'm having some trouble with it. regexps trip up
 because I also have to take into account 'img', 'meta', 'link' tags, not
 just the simple 'br' and 'hr' tags. Well, maybe there's a simple way to do
 that with regexps, but my simpleminded img[^(/)]+/ doesn't work. I'm
 not
 enough of a regexp pro to figure out that lookahead stuff.
 
 I'm not sure where to start now; I looked at BeautifulSoup and
 BeautifulStoneSoup, but I can't see how to modify the actual tag.
 
 thanks,
 --Tim Arnold
 
 
 --
 http://mail.python.org/mailman/listinfo/python-list


One method which wouldn't require much python code, would be to run the
XHTML through a simple identity XSL transform with the output method set to
HTML. It would have the benefit that you wouldn't have to worry about any of
the specifics of the transformation, though you would need an external
dependency.

As far as I know, both 4suite and lxml (my personal favorite:
http://codespeak.net/lxml/) support XSLT in python. 

It might work out fine for you, but mixing regexps and XML always seems to
work out badly in the end for me.
-
John Krukoff
[EMAIL PROTECTED]

--
http://mail.python.org/mailman/listinfo/python-list


[issue643841] New class special method lookup change

2008-04-22 Thread John Krukoff

John Krukoff [EMAIL PROTECTED] added the comment:

I've been following the py3k mailing list discussion for this issue, 
and wanted to add a note about the proposed solution described here:
http://mail.python.org/pipermail/python-3000/2008-April/013004.html

The reason I think this approach is valuable is that in all of the 
proxy classes I've written, I'm concerned about which behaviour of the 
proxied class I want to override, not which behaviour I want to keep. 
In other words, when I proxy something, my mental model has always 
been, okay, I want something that behaves just like X, except it does 
this (usually small bit) differently.

This is also why I expect my proxies to keep working the same when I 
change the proxied class, without having to go and update the proxy to 
also use the new behaviour.

So, yeah, very much in favor of a base proxy class in the standard 
library.
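
To illustrate the pain point, a small sketch in Python 3 (where every class is new-style); NaiveProxy and LenProxy are made-up names. Ordinary attribute access is forwarded by __getattr__, but implicit special method lookup goes to the type and skips it, so each special method has to be spelled out on the proxy class.

```python
class NaiveProxy:
    """Forwards ordinary attribute access to the proxied object."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called when normal lookup fails on the proxy itself.
        return getattr(self._target, name)


class LenProxy(NaiveProxy):
    """Same proxy, with one special method forwarded by hand."""

    def __len__(self):
        return len(self._target)


proxy = NaiveProxy([1, 2, 3])
proxy.append(4)  # ordinary method: forwarded via __getattr__

try:
    len(proxy)   # special method: looked up on the type, not forwarded
    naive_len_works = True
except TypeError:
    naive_len_works = False

print(naive_len_works)                 # False
print(len(LenProxy([1, 2, 3, 4])))     # 4
```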


Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue643841

___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



Re: Recurring patterns: Am I missing it, or can we get these added to the language?

2008-04-15 Thread John Krukoff

On Tue, 2008-04-15 at 11:51 -0700, Erich wrote:
 Hello all,
 
 Today I found myself once again defining two functions that I use all
 the time: nsplit and iterable.  These little helper functions of mine
 get used all the time when I work. I'm sick of having to define them
 (but am very good at it these days, less than 1 typo per function!).
 It leads me to the following questions
 
 1. Is this functionality already built in and I'm just missing it
 2. Is there some well known, good technique for these that I missed?
 3. Insert question I need to ask here (with a response)
 
 These are the functions w/ explanation:
 
 def nsplit(s,p,n):
     n -= 1
     l = s.split(p, n)
     if len(l) < n:
         l.extend([''] * (n - len(l)))
     return l
 
 This is like split() but returns a list of exactly length n. This is
 very useful when using unpacking, e.g.:
 x, y = nsplit('foo,bar,baz', ',', 2)
 
 def iterable(item, count_str=False):
     if not count_str and isinstance(item, str):
         return False
     try:
         iter(item)
     except:
         return False
     return True
 This is just simple boolean test for whether or not an object is
 iterable. I would like to see this in builtins, to mirror callable.
 The optional count_str adds flexibility for string handling, since
 sometimes I need to iterate over a string, but usually not. I
 frequently use it to simplify my case handling in this type of
 construct:
 
 def foo(bar):
     bar = bar if iterable(bar) else [bar]
     for x in bar:
 
 
 Thanks for feedback,
 Erich

As far as I know there is no built in function that does exactly what
you want. You can certainly simplify your nsplit function a bit, but as
mentioned, it's probably best just to create your own package and keep
your utility functions there.

It's worth noting that you almost certainly want to be doing
isinstance( item, basestring ) in your iterable function instead of
isinstance( item, str ), or things will get very surprising for you as
soon as you have to deal with a unicode string.
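
For readers on Python 3, where basestring no longer exists, the same idea
might look like the sketch below. Treating bytes as a string here is an
assumption of this example, not something from the original post, and the
bare except is narrowed to TypeError.

```python
def iterable(item, count_str=False):
    """Return True if item can be iterated over.

    Strings (and, by assumption, bytes) only count when count_str is True.
    """
    if not count_str and isinstance(item, (str, bytes)):
        return False
    try:
        iter(item)
    except TypeError:
        # iter() raises TypeError for objects that are not iterable.
        return False
    return True
```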

If you don't want the hassle of creating a separate package, and you're
only interested in having these functions be handy on your local python
install, you could also add them into your sitecustomize file as
described here:
http://docs.python.org/lib/module-site.html

On linux, that's as easy as creating a file
named /usr/lib/python2.5/sitecustomize.py that inserts whatever you want
into the __builtin__ module, and it'll be automatically imported
whenever you run python.

I'd doubt there's a case for getting this functionality added to the
language, as your use case seems pretty specific, and it's just not that
hard to write the function that does what you want to do.

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Recurring patterns: Am I missing it, or can we get these added to the language?

2008-04-15 Thread John Krukoff

On Tue, 2008-04-15 at 13:48 -0700, Jeffrey Froman wrote:
 Tim Chase wrote:
 def nsplit(s, delim=None, maxsplit=None):
     if maxsplit:
         results = s.split(delim, maxsplit)
         result_len = len(results)
         if result_len < maxsplit:
             results.extend([''] * (maxsplit - result_len))
         return results
     else:
         return s.split(delim)
 
 I'll add a couple more suggestions:
 
 1. Delay the test for maxsplit, as str.split() does the right thing if
 maxsplit is None.
 
 2. Use a generator to pad the list, to avoid interim list creation. This
 works fine, because list.extend() accepts any iterable. This also shortens
 the code a bit, because xrange() does the right thing in this case with
 negative numbers. For example:
 
 def nsplit(s, delim=None, maxsplit=None):
     results = s.split(delim, maxsplit)
     if maxsplit is not None:
         results.extend('' for i in xrange(maxsplit - len(results)))
     return results
 
 
 Jeffrey
 

Neither of these quite matches what the OP's nsplit function did, as his n
parameter (maxsplit here) actually specified the number of list items in
the result, not the number of splits to perform. That makes matching
the default split parameters kind of pointless: why bother doing all
this work to return a 0 item list in the default maxsplit = None case?
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Converting a tuple to a list

2008-04-08 Thread John Krukoff

On Wed, 2008-04-09 at 00:46 +0200, Gabriel Ibanez wrote:
 Gabriel Ibanez wrote:
  Hi all ..
 
  I'm trying to using the map function to convert a tuple to a list, without
  success.
 
  I would like to have a lonely line that performs the same as loop of the
  next script:
 
  ---
  # Converting tuple -> list
  
  tupla = ((1,2), (3,4), (5,6))
  
  print tupla
  
  lista = []
  for a in tupla:
      for b in a:
          lista.append(b)
  print lista
  ---
 
  Any idea ?
 
  Thanks ...
 
  # Gabriel
 
 
 list(tupla)
 
 would probably do it.
 
 regards
  Steve
 --
 Steve Holden+1 571 484 6266   +1 800 494 3119
 Holden Web LLC  http://www.holdenweb.com/
 
 
 --
 http://mail.python.org/mailman/listinfo/python-list
 
 
 
 
 That would just make a list of tuples, I think he wants [1, 2, 3, 4, 5, 6].
 
 Try:  l = [x for z in t for x in z]
 
 --Brian
 
 
 ---
 
 
 Thanks Steve and Brian,
 
 Brian: that is !!
 
 However, it's a bit difficult to understand now. I have read it several 
 times :)
 
 

Another solution using the itertools module:

>>> import itertools
>>> t = ( ( 1, 2 ), ( 3, 4 ), ( 5, 6 ) )
>>> list( itertools.chain( *t ) )
[1, 2, 3, 4, 5, 6]

Though the list part is probably unnecessary for most uses. The problem
gets interesting when you want to recursively flatten an iterable of
arbitrarily deeply nested iterables.
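
A minimal sketch of that recursive case, written for Python 3. The choice
to treat strings and bytes as leaves is an assumption of this example; the
post does not say how they should be handled.

```python
from collections.abc import Iterable


def flatten(items):
    """Yield the leaves of an arbitrarily nested iterable, depth first."""
    for item in items:
        # Strings are iterable too, and a one-character string iterates to
        # itself, so they are treated as leaves to avoid endless recursion.
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item


nested = ((1, 2), (3, (4, (5, 6))))
flat = list(flatten(nested))
```

Because flatten is a generator, the list() call is again only needed when
an actual list is wanted.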

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

-- 
http://mail.python.org/mailman/listinfo/python-list


[issue643841] New class special method lookup change

2008-04-01 Thread John Krukoff

John Krukoff [EMAIL PROTECTED] added the comment:

I assume when you say that the documentation has already been updated, 
you mean something other than what's shown at:
http://docs.python.org/dev/reference/datamodel.html#new-style-and-classic-classes
or
http://docs.python.org/dev/3.0/reference/datamodel.html#new-style-and-classic-classes ?

Both of those still claim not to be up to date in relation to new 
style classes, and the __getattr__ & __getattribute__ sections under 
special names make no reference to how magic methods are handled. 
Additionally, the Class Instances section under the type hierarchy 
makes mention of how attributes are looked up, but doesn't mention the 
new style differences even in the 3.0 documentation.

Sure, there's this note under Special Method Names: 
For new-style classes, special methods are only guaranteed to work if 
defined in an object’s class, not in the object’s instance dictionary. 

But that only helps you figure it out if you already know what the 
problem is, and it's hardly comprehensive.

I'm not arguing that this is something that's going to change, as we're 
way past the point of discussion on the implementation, but this looks 
far more annoying if you're looking at it from the perspective of 
proxying to container classes or numeric types in a generic fashion. My 
two use cases I've had to convert are for lazy initialization of an 
object and for building an object that uses RPC to proxy access to an 
object to a remote server.

In both cases, since they are generic proxies that once initialized are 
supposed to behave exactly like the proxied instance, the list of magic 
methods to pass along is ridiculously long. Sure, I have to handle 
__copy__ & __deepcopy__, and __getstate__ & __setstate__ to make sure 
that they return instances of the proxy rather than the proxied object, 
but other than that there are over 50 attributes to override for new 
style classes just to handle proxying to numeric and container types. 

It's hard to get fancy about it too, as I can't just dynamically add 
the needed attributes to my instances by looking at the object to be 
proxied; it really has to be a static list of everything that python 
supports on the class. Additionally, while metaclasses might help here, 
there's still the problem that while my old style proxy class has 
continued to work fine as magic attributes have been added over python 
revisions, my new style equivalent will have to be updated to work 
correctly as magic methods are added. Which, given the 2.x track, seems 
to happen pretty frequently. Some of the bugs from that would have been 
quite tricky to track down too, such as the __cmp__ augmentation with 
the rich comparison methods.
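
To make the "static list on the class" point concrete, here is a small
sketch of that kind of proxy for Python 3. The FORWARDED list is a
deliberately short, hypothetical sample (a real proxy would need many more
names), and Proxy and _make_forwarder are names invented for this example.

```python
# Hypothetical, deliberately short sample of special method names.
FORWARDED = ["__len__", "__getitem__", "__iter__", "__contains__", "__add__"]


def _make_forwarder(name):
    def forwarder(self, *args, **kwargs):
        # Look the special method up on the wrapped object and call it.
        return getattr(self._target, name)(*args, **kwargs)
    forwarder.__name__ = name
    return forwarder


class Proxy(object):
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only reached for ordinary attributes missing on the proxy itself.
        if name == "_target":
            raise AttributeError(name)
        return getattr(self._target, name)


# Special methods have to live on the class, so they are installed there.
for _name in FORWARDED:
    setattr(Proxy, _name, _make_forwarder(_name))

p = Proxy([1, 2, 3])
```

Any special method missing from the list silently stops working through
the proxy, which is exactly the maintenance burden described above.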

None of the solutions really seem ideal, or at least as good as what 
old style classes provided, which is why I was hoping for some support 
in the 3.0 standard library or the conversion tool.


Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue643841

___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue643841] New class special method lookup change

2008-03-28 Thread John Krukoff

John Krukoff [EMAIL PROTECTED] added the comment:

I was just bitten by this today in converting a proxy class from old style
to new style. The official documentation was of no help in discovering
that neither __getattr__ nor __getattribute__ is used to look up magic
attribute names. Even the link to New-style Classes off the
development documentation page is useless, as none of the resources
there (http://www.python.org/doc/newstyle/) mention the incompatible change.

This seems like an issue that is going to come up more frequently as
python 3000 pushes everyone to using only new style classes. It'd be
very useful if whatever conversion tool we get, or the python 3000
standard library, included a proxy class or metaclass that is able to
help with this conversion, such as this one:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252151

Though preferably with some knowledge of all existing magic names.
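
The behaviour in question is easy to reproduce. The following sketch, for
Python 3 (where every class is new style), uses a hypothetical Wrapper
class: ordinary attribute access is forwarded by __getattr__, but len()
looks for __len__ on the type and never consults __getattr__.

```python
class Wrapper(object):
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Called only when normal lookup on the instance and class fails.
        return getattr(self._target, name)


w = Wrapper([1, 2, 3])
w.append(4)  # ordinary attribute: forwarded through __getattr__

try:
    len(w)  # implicit special method lookup skips __getattr__
    forwarded = True
except TypeError:
    forwarded = False
```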

--
assignee:  -> georg.brandl
components: +Documentation -None
nosy: +georg.brandl, jkrukoff
versions: +Python 2.2.1, Python 2.2.2, Python 2.2.3, Python 2.3, Python 2.4, 
Python 2.5, Python 2.6


Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue643841

___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



Re: Does __import__ require a module to have a .py suffix?

2008-03-12 Thread John Krukoff
On Wed, 2008-03-12 at 09:22 -0700, mrstephengross wrote:
 Hi all. I've got a python file called 'foo' (no extension). I want to
 be able to load it as a module, like so:
 
   m = __import__('foo')
 
 However, the interpreter tells me No module named foo. If I rename
 it foo.py, I can indeed import it. Is the extension required? Is there
 any way to override that requirement?
 
 Thanks,
 --Steve

I recently solved a similar issue, importing from a string, with this
code:

 import imp
 service = imp.new_module( 'ServiceModule' )
 compiled = compile( '''...some code here...''', 'string', 'exec', 0, 1 )
 exec compiled in service.__dict__

You could probably shorten it for your needs by using execfile instead.
If it's not in the current directory, you'll probably run into some
issues with further imports not working as expected unless you set the
names/paths right.
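
For the original question (a source file with no .py suffix), the same
approach can be sketched in Python 3 as below. The helper name
load_module_from_path is made up for this example, and registering the
result in sys.modules is optional.

```python
import sys
import types


def load_module_from_path(name, path):
    """Execute the source file at path and return it as a module."""
    with open(path) as handle:
        source = handle.read()
    module = types.ModuleType(name)
    module.__file__ = path
    # Passing the real path to compile() keeps tracebacks readable.
    exec(compile(source, path, "exec"), module.__dict__)
    # Registering the module lets later "import name" statements find it.
    sys.modules[name] = module
    return module
```

Usage would be along the lines of m = load_module_from_path('foo', 'foo'),
after which m behaves like any imported module.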

-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: GC performance with lists

2007-09-04 Thread John Krukoff
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:python-
 [EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
 Sent: Tuesday, September 04, 2007 8:07 AM
 To: python-list@python.org
 Subject: GC performance with lists
 
 While working on some python wrapping, I've run into some problems
 where the GC seems to take an unreasonable amount of time to run. The
 code below is a demonstration:
 
 import gc
 #gc.disable()
 
 data = []
 for i in xrange(10):
 
     shortdata = []
     for j in range(57):
         mytuple = (j, i+1, i+2, i+3, i+4, i+5, i+6)
         shortdata.append(mytuple)
     data.extend(shortdata)
 
 print len(data)
 
 with gc disabled (the second line) the code runs in 15 seconds, with
 it enabled it runs in 2:15, or ~9x slower. I expected some gc
 overhead, but not an order of magnitude! Am I doing something
 obviously wrong in the above code?
 
 Thanks,
  ...Eric
 
 --
 http://mail.python.org/mailman/listinfo/python-list

The only real optimization I see for this is moving the common
subexpressions (i+1, i+2, etc...) out of the loop as previous poster
suggested. 

Something that looks like this, maybe:
data = []
append = data.append
for i in xrange( 10 ):
    shortdata = ( i+1, i+2, i+3, i+4, i+5, i+6 )
    for j in range( 57 ):
        append( ( j, ) + shortdata )

That'll help a little. I just checked the docs to be sure: collection is
triggered when the number of allocations minus the number of deallocations
goes over a certain threshold (700 by default). 

You do realize just what a massive data structure you're building here, too,
right? On my box, building this consumes about 750MB of memory. You're doing
a massive number of allocations to create it, too, approximately 40 million.
So, if the gc gets called every 700 allocations, you're spending a lot of
time in the gc, and it's got a huge amount of memory it's sweeping.

It sounds to me like you're triggering worst case behaviour for the gc, and
should either change the threshold values, or simply disable the gc before
the loop and reenable it after. A single gc run at the end probably won't be
so bad as all the intermediate ones, though my box has pretty severe issues
doing anything after creating this data structure as it starts swapping to
disc.
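
The "disable before the loop, reenable after" option could be sketched
like this for Python 3. The function name build_rows and its outer
parameter are placeholders of this example; the loop body mirrors the
code above.

```python
import gc


def build_rows(outer, inner=57):
    """Build the list of 7-tuples with the cyclic collector paused."""
    data = []
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        for i in range(outer):
            tail = (i + 1, i + 2, i + 3, i + 4, i + 5, i + 6)
            for j in range(inner):
                data.append((j,) + tail)
    finally:
        # Restore the previous collector state even if the loop raises.
        if was_enabled:
            gc.enable()
    return data
```

Remembering the previous state rather than unconditionally calling
gc.enable() avoids switching the collector on for callers that had
deliberately turned it off.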

My architectural suggestion would be to refactor this as an iterator if at
all possible, so as to avoid the massive allocation burden.
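
As a sketch of what that iterator refactoring might look like (Python 3;
iter_rows is a name invented here), a generator yields each tuple on
demand so the full list never has to exist at once:

```python
def iter_rows(outer, inner=57):
    """Lazily yield the same 7-tuples instead of storing them in a list."""
    for i in range(outer):
        tail = (i + 1, i + 2, i + 3, i + 4, i + 5, i + 6)
        for j in range(inner):
            yield (j,) + tail


# Consumers process one row at a time; only one tuple is alive per step.
row_count = sum(1 for _ in iter_rows(1000))
```

Whether this is usable depends on the consumer being able to work on the
rows as a stream, which the original post does not say.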

-
John Krukoff
[EMAIL PROTECTED]

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: 'Advanced' list comprehension? query

2007-08-08 Thread John Krukoff
[EMAIL PROTECTED] wrote:
 Hi,
 
 I'm playing around with list comprehension, and I'm trying to find the
 most aesthetic way to do the following:
 
 I have two lists:
 
 noShowList = ['one', 'two', 'three']
 
 myList = ['item one', 'item four', 'three item']
 
 I want to show all the items from 'myList' that do not contain any of
 the strings in 'noShowList'.
 
 i.e. 'item four'
 
 I can do it like this:
 
 def inItem(noShowList, listitem):
     return [x for x in noShowList if x in listitem]
 
 print [x for x in myList if not inItem(noShowList, x)]
 
 and I can do it (horribly) with:
 
 print [x for x in myList if not (lambda y, z:[i for i in y if i in z])
 (noShowList, x)]
 
 I can also print out the items that DO contain the 'noShowList'
 strings with:
 
 print [x for x in myList for y in noShowList if y in x]
 
 but I can't get the 'not' bit to work in the above line.
 
 Any ideas?
 Thanks!
 
 --
 http://mail.python.org/mailman/listinfo/python-list

So, conceptually speaking, you're dealing with two loops here, one over the
items to filter, and one over the items to check for substring matches. If
you want to do that with list comprehensions, I'd make it obvious that
there's two of them:

>>> [ listItem for listItem in myList if not [ noShow for noShow in
... noShowList if noShow in listItem ] ]
['item four']

This is a pretty good place for the functional programming tools though,
specifically filter,
http://docs.python.org/tut/node7.html#SECTION00713
, which gives a solution that looks like this:

>>> filter( lambda listItem : not [ noShow for noShow in noShowList if
... noShow in listItem ], myList )
['item four']

or using purely functional tools, like this:

>>> filter( lambda listItem : not sum( map( lambda noShow: noShow in
... listItem, noShowList ) ), myList )
['item four']

All these solutions have the problem that they're still less efficient than
the unwrapped for loop, like so:

>>> aFiltered = []
>>> for listItem in myList:
...     for noShow in noShowList:
...         if noShow in listItem:
...             break
...     else:
...         aFiltered.append( listItem )
... 
>>> aFiltered
['item four']

This is due to the list comprehensions testing all the possibilities, instead
of giving up on the first one found. You can jam that early break into the
functional approach using itertools, but it starts to look really ugly on
one line (requires 2.5 for if expression):

>>> list( itertools.ifilter( lambda listItem : True if len( list(
... itertools.takewhile( lambda test : not test, itertools.imap( lambda noShow:
... noShow in listItem, noShowList ) ) ) ) == len( noShowList) else False,
... myList ) )
['item four']

Which can be made to look much better by breaking the 'noShow in listItem'
test out into a separate function, and does have the advantage that by using
itertools.ifilter this is a lazy approach. There's got to be a better way to
do the test to see if takewhile bailed early than using len, though.

-
John Krukoff
[EMAIL PROTECTED]

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: 'Advanced' list comprehension? query

2007-08-08 Thread John Krukoff
Ah, yeah, any was the function I needed to simplify things and do the early
break. I really need to upgrade to 2.5.

>>> list( itertools.ifilter( lambda listItem : not any( itertools.imap(
... lambda noShow: noShow in listItem, noShowList ) ), myList ) )
['item four']
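
On Python 3, where itertools.ifilter and itertools.imap are gone, the same
early-exit test can be sketched with any() and a generator expression.
The helper name contains_any and the snake_case variable names are choices
of this example.

```python
no_show_list = ["one", "two", "three"]
my_list = ["item one", "item four", "three item"]


def contains_any(text, needles):
    """Return True as soon as one needle is found in text."""
    # any() stops consuming the generator at the first true value.
    return any(needle in text for needle in needles)


visible = [item for item in my_list if not contains_any(item, no_show_list)]
```

Replacing the outer list comprehension with a generator expression would
keep the lazy behaviour that ifilter provided.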

Thanks Jason!

-
John Krukoff
[EMAIL PROTECTED]


-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Reading multiline values using ConfigParser

2007-06-21 Thread John Krukoff
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:python-
 [EMAIL PROTECTED] On Behalf Of Phoe6
 Sent: Wednesday, June 20, 2007 8:51 PM
 To: python-list@python.org
 Subject: Re: Reading multiline values using ConfigParser
 
 On Jun 20, 10:35 pm, John Krukoff [EMAIL PROTECTED] wrote:
 
   Is there anyway, I can include multi-line value in the configfile? I
 
 
  Following the link to RFC 822 (http://www.faqs.org/rfcs/rfc822.html)
  indicates that you can spread values out over multiple lines as long as
  there is a space or tab character immediately after the CRLF.
 
 Thanks for the response. It did work!
 
 >>> config = ConfigParser()
 >>> config.read("Testcases.txt")
 ['Testcases.txt']
 >>> output = config.get("Information", "Testcases")
 >>> print output
 
 tct123
 tct124
 tct125
 >>> output
 '\ntct123\ntct124\ntct125'
 
 
 However, as I am going to provide Testcases.txt to be user editable,
 I cannot assume or ask users to provide value testcases surrounded by
 spaces. I got to figure out a workaround here.
 
 Thanks,
 Senthil
 
 --
 http://mail.python.org/mailman/listinfo/python-list

Sounds like you're stuck modifying ConfigParser to do what you want, or
writing your own configuration file parsing utilities.

From looking through the ConfigParser source, looks like all the parsing
work is inside the _read method, so shouldn't be too painful to make a
subclass that does what you want.

-
John Krukoff
[EMAIL PROTECTED]



-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Reading multiline values using ConfigParser

2007-06-20 Thread John Krukoff
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:python-
 [EMAIL PROTECTED] On Behalf Of Phoe6
 Sent: Wednesday, June 20, 2007 10:51 AM
 To: python-list@python.org
 Subject: Reading multiline values using ConfigParser
 
 Hi,
 I have a configfile, in fact, I am providing a configfile in the
 format:
 
 [Information]
 Name: Foo
 Author: Bar
 
 Testcases:
 tct123
 tct124
 tct101
 
 The last values is a multi-line.
 
 ConfigParser is unable to recognize a multi-line value and splits out
 error.
 
 C:\ATF-Tasks>python CreateTask.py
 Traceback (most recent call last):
   File "CreateTask.py", line 13, in ?
     config.read('TaskDetails.txt')
   File "C:\Python24\lib\ConfigParser.py", line 267, in read
     self._read(fp, filename)
   File "C:\Python24\lib\ConfigParser.py", line 490, in _read
     raise e
 ConfigParser.ParsingError: File contains parsing errors:
 TaskDetails.txt
 [line 15]: 'tct123\n'
 [line 16]: 'tct124\n'
 [line 17]: 'tct101\n'
 
 I am using ConfigParser in the following way:
 
 config = ConfigParser.ConfigParser()
 config.read('TaskDetails.txt')
 config.get("Information", "Testcases")
 
 Is there anyway, I can include multi-line value in the configfile? I
 was thinking of following option:value for a portion of the file and
 read the portion with multi-line as a normal file, but ConfigParser()
 is not allowing multi-line value itself.
 
 Any ideas/ Suggestions? :
 
 Ofcourse, throw away ConfigParser and use your own parser is there,
 but I would like to use that as the last option.
 
 Thanks,
 Senthil
 
 --
 http://mail.python.org/mailman/listinfo/python-list


Did you see the note in the docs about line continuations?

The configuration file consists of sections, led by a [section] header and
followed by name: value entries, with continuations in the style of RFC
822; name=value is also accepted.

Following the link to RFC 822 (http://www.faqs.org/rfcs/rfc822.html)
indicates that you can spread values out over multiple lines as long as
there is a space or tab character immediately after the CRLF. If my
interpretation is correct (I've never tried this), then this would be a
legal multi-line value.

Testcases:
 tct123
 tct124
 tct101

It looks like you'd just get this back as the string 'tct123 tct124 tct101'
though, so you'd have to split on whitespace to get the individual values.
And, well, if you want to support whitespace in your value names, I don't
think this will work at all.
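
A small sketch of reading such a continued value, using the Python 3
configparser module (the renamed ConfigParser). The SAMPLE text is made up
from the file in the question; splitting on whitespace recovers the
individual entries however the continuation lines happen to be joined.

```python
import configparser

# Continuation lines start with a space, as described above.
SAMPLE = (
    "[Information]\n"
    "Name: Foo\n"
    "Author: Bar\n"
    "Testcases:\n"
    " tct123\n"
    " tct124\n"
    " tct101\n"
)

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# split() with no argument breaks on any run of whitespace.
testcases = config.get("Information", "Testcases").split()
```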

-
John Krukoff
[EMAIL PROTECTED]



-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Questions about mathematical and statistical functionality in Python

2007-06-14 Thread John Krukoff
On Jun 14, 4:02 pm, Talbot Katz [EMAIL PROTECTED] wrote: 
 Greetings Pythoners!
 
 I hope you'll indulge an ignorant outsider.  I work at a financial
 software firm, and the tool I currently use for my research is R, a
 software environment for statistical computing and graphics.  R is
 designed with matrix manipulation in mind, and it's very easy to do
 regression and time series modeling, and to plot the results and test
 hypotheses.  The kinds of functionality we rely on the most are standard
 and robust versions of regression and principal component / factor
 analysis, bayesian methods such as Gibbs sampling and shrinkage, and
 optimization by linear, quadratic, newtonian / nonlinear, and genetic
 programming; frequently used graphics include QQ plots and histograms.
 In R, these procedures are all available as functions (some of them are
 in auxiliary libraries that don't come with the standard distribution,
 but are easily downloaded from a central repository).
 
 For a variety of reasons, the research group is considering adopting
 Python.  Naturally, I am curious about the mathematical, statistical, and
 graphical functionality available in Python.  Do any of you out there use
 Python in financial research, or other intense mathematical/statistical
 computation?  Can you compare working in Python with working in a package
 like R or S-Plus or Matlab, etc.?  Which of the procedures I mentioned
 above are available in Python?  I appreciate any insight you can provide.
 Thanks!
 
 --  TMK  --
 212-460-5430  home
 917-656-5351  cell
 
 
 --
 http://mail.python.org/mailman/listinfo/python-list

It is worth noting that there's a bridge available to allow python to
integrate cleanly with R, the Rpy project:
http://rpy.sourceforge.net/

Which should allow you to use python for whatever it is you need without
abandoning R for your mathematical/statistical work.

-
John Krukoff
[EMAIL PROTECTED]

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Accessing global namespace from module

2007-06-11 Thread John Krukoff
On Jun 11, 11:02 am, reubendb [EMAIL PROTECTED] wrote:
 Hello,
 I am new to Python. I have the following question / problem.
 I have a visualization software with command-line interface (CLI),
 which essentially is a Python (v. 2.5) interpreter with functions
 added to the global namespace. I would like to keep my own functions
 in a separate module and then import that module to the main script
 (that will be executed using the CLI interpreter). The problem is, I
 cannot access the functions in the global namespace of the main script
 from my module. Is there anyway to do that ?
 
 Here is an example of what I meant. The function AddPlot() and
 DrawPlots() are added to the global namespace by the software CLI. If
 I do this:
 
 mainscript.py:
 ---
 AddPlot("scatter", "coordinate")
 # set other things here
 DrawPlots()
 
 it works fine. But I want to be able to do this:
 
 myModule.py:
 --
 def defaultScatterPlot():
     AddPlot("scatter", "coordinate")
     #do other things
     DrawPlots()
 
 and then in mainscript.py:
 ---
 import myModule
 myModule.defaultScatterPlot()
 
 This won't work because myModule.py does not have access to AddPlot().
 How do I do something like this ?
 
 Thank you in advance for any help.
 RDB
 
 --
 http://mail.python.org/mailman/listinfo/python-list

Since the visualization software creator wasn't kind enough to bundle the
drawing functions up into a module for you, you can just do it yourself.

>>> import sys, new
>>> plotModule = new.module( 'plot' )
>>> plotModule.AddPlot = AddPlot
>>> plotModule.DrawPlots = DrawPlots
>>> sys.modules[ 'plot' ] = plotModule

Then, you can import your fake module from anywhere, and access its
contents.

>>> import plot
>>> plot
<module 'plot' (built-in)>
>>> plot.AddPlot
<function AddPlot at 0x0099E830>
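
The new module no longer exists in Python 3; a sketch of the same trick
with types.ModuleType follows. AddPlot and DrawPlots are stand-in
definitions here, since the real ones are injected by the visualization
software.

```python
import sys
import types


# Stand-ins for the functions the host application would provide.
def AddPlot(kind, variable):
    return ("AddPlot", kind, variable)


def DrawPlots():
    return "DrawPlots"


plot_module = types.ModuleType("plot")
plot_module.AddPlot = AddPlot
plot_module.DrawPlots = DrawPlots
# Once registered, "import plot" works from any module in the process.
sys.modules["plot"] = plot_module

import plot
```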

Hope that helps.

-
John Krukoff
[EMAIL PROTECTED]

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Accessing global namespace from module

2007-06-11 Thread John Krukoff
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:python-
 [EMAIL PROTECTED] On Behalf Of Reuben D.
 Budiardja
 Sent: Monday, June 11, 2007 7:19 PM
 To: python-list@python.org
 Subject: Re: Accessing global namespace from module
 
 On Monday 11 June 2007 17:10:03 Gabriel Genellina wrote:
  En Mon, 11 Jun 2007 17:29:35 -0300, reubendb [EMAIL PROTECTED]
 escribió:
   On Jun 11, 3:30 pm, Gabriel Genellina [EMAIL PROTECTED]
  
   wrote:
   En Mon, 11 Jun 2007 15:18:58 -0300, reubendb [EMAIL PROTECTED]
  
   escribió:
The problem is I don't define the functions AddPlot() and
 DrawPlots().
It's built into the python interpreter of the CLI version of the
program I mentioned, and they are defined on the main script. I
 load
the main script using something like software -cli -s
mainscript.py.
In the mainscript.py I import myModule, but of course myModule does
not have access to the functions defined in the global namespace of
mainscript.py.
  
   Don't you have some import statements at the top of mainscript.py
 that
   are
   responsible for bringing AddPlot and DrawPlots into the current
   namespace?
   Import the same things in your second module.
  
   No, I *don't* have any import statement mainscript.py. When using this
   software's CLI, AddPlot and DrawPlots are available to me
   automagically from mainscript.py. Hence my question: How do I make
   this available from other module. Is there any way at all ?
 
  Yes: create your own module on-the-fly, using the recipe posted earlier
 by
  John Krukoff.
  If there are many functions, try enumerating them all:
 
  import sys
  from types import ModuleType as module
 
  plotModule = module('plot')
  for key,value in globals().items():
      if key[:2] != '__':
          setattr(plotModule, key, value)
 
  sys.modules['plot'] = plotModule
 
 Great ! That seems to work, thanks a lot.
 One last question. Do I have to do this for every script I write, or can I
 put this into separate file and include it somehow ?
 I am going to have several mainscripts.py, and all is going to import
 myModule that will need access to this plots subroutine. It'll be great if
 I can put this trick on a single file that is included by the main scripts,
 to avoid violating DRY principle.
 
 Thanks for all the help.
 RDB
 --
 Reuben D. Budiardja
 --
 http://mail.python.org/mailman/listinfo/python-list

Well, an alternative way to access the main script namespace is using:

 import __main__
 __main__.AddPlot( blah, blah )

And so on, from within your imported file that you want to have muck about
in the main namespace. I've no idea if your custom application will set up
__main__ properly, but the documentation indicates that it should work the
same way for an embedded application.

You can probably substitute dir( __main__ ) for globals( ) in the above
script and have it set things up automatically.
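
A sketch of that substitution for Python 3 is below. It uses
vars(__main__) rather than dir(__main__), since dir() returns only names
and the values are needed as well; the function name
export_main_namespace is invented for this example, and it still assumes
the host application populates __main__ in the usual way.

```python
import sys
import types

import __main__


def export_main_namespace(module_name="plot"):
    """Copy the public names of the main script into an importable module."""
    module = types.ModuleType(module_name)
    # list() guards against the namespace changing while it is iterated.
    for key, value in list(vars(__main__).items()):
        if not key.startswith("__"):
            setattr(module, key, value)
    sys.modules[module_name] = module
    return module
```

Putting this function in the shared helper module and calling it once from
each main script would keep the trick in a single file.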

-
John Krukoff
[EMAIL PROTECTED]

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Any way to refactor this?

2007-04-13 Thread John Krukoff
Bruno Desthuilliers wrote:
 John Salerno a écrit :
  Setting aside, for the moment, the utility of this method or even if
  there's a better way, I'm wondering if this is an efficient way to do
  it. I admit, there was some copying and pasting, which is what prompts
  me to ask the question. Here's the method. (I hope it looks ok, because
  it looks really weird for me right now)
 
  def _create_3D_xhatches():
      for x in xrange(-axis_length, axis_length + 1):
          if x == 0: continue
          visual.cylinder(pos=(x,-hatch_length,0),
                          axis=(0,hatch_length*2,0), radius=hatch_radius)
          visual.cylinder(pos=(x,0,-hatch_length),
                          axis=(0,0,hatch_length*2), radius=hatch_radius)
          visual.cylinder(pos=(-hatch_length,x,0),
                          axis=(hatch_length*2,0,0), radius=hatch_radius)
          visual.cylinder(pos=(0,x,-hatch_length),
                          axis=(0,0,hatch_length*2), radius=hatch_radius)
          visual.cylinder(pos=(-hatch_length,0,x),
                          axis=(hatch_length*2,0,0), radius=hatch_radius)
          visual.cylinder(pos=(0,-hatch_length,x),
                          axis=(0,hatch_length*2,0), radius=hatch_radius)
 
  Since each call to cylinder requires a slightly different format, I
  figured I had to do it this way.
 
  From a purely efficiency POV, there are some obviously possible
 improvements. The first one is to alias visual.cylinder, so you save on
 lookup time. The other one is to avoid useless recomputation of
 -hatch_length and hatch_length*2.
 
 def _create_3D_xhatches():
     cy = visual.cylinder
     for x in xrange(-axis_length, axis_length + 1):
         if x == 0: continue
         b = -hatch_length
         c = hatch_length*2
         cy(pos=(x, b, 0), axis=(0, c, 0), radius=hatch_radius)
         cy(pos=(x, 0, b), axis=(0, 0, c), radius=hatch_radius)
         cy(pos=(b, x, 0), axis=(c, 0, 0), radius=hatch_radius)
         cy(pos=(0, x, b), axis=(0, 0, c), radius=hatch_radius)
         cy(pos=(b, 0, x), axis=(c, 0, 0), radius=hatch_radius)
         cy(pos=(0, b, x), axis=(0, c, 0), radius=hatch_radius)
 
 A second step would be to try and generate the permutations by code
 instead of writing them all by hand, but I suppose the order is
 significant...
 There's still an obvious pattern, which is that the position of 'c' in
 the axis tuple mirrors the position of 'b' in the pos tuple. There might
 be some way to use this to let the computer handle some part of the
 repetition...
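
The pattern noted above (the slot holding 'c' in the axis tuple is the same slot that holds 'b' in the pos tuple) can be turned into a small generator. This is only a sketch: `hatch_specs` is a hypothetical helper name, not something from the thread, and it uses `range` so that it also runs on Python 3. It happens to produce the six pairs in the same order as the hand-written calls.

```python
def hatch_specs(x, hatch_length):
    """Return the six (pos, axis) pairs for one tick position x.

    Hypothetical helper: slot i of pos holds x, slot j holds b, and the
    same slot j of axis holds c, mirroring the hand-written calls.
    """
    b = -hatch_length      # start offset of the hatch
    c = hatch_length * 2   # full length of the hatch
    specs = []
    for i in range(3):          # slot that holds x
        for j in range(3):      # slot that holds b in pos and c in axis
            if i == j:
                continue
            pos = [0, 0, 0]
            axis = [0, 0, 0]
            pos[i] = x
            pos[j] = b
            axis[j] = c
            specs.append((tuple(pos), tuple(axis)))
    return specs
```

Inside the original loop, each pair would then be passed on as `visual.cylinder(pos=pos, axis=axis, radius=hatch_radius)`.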
 
 My 2 cents...

Because it was fun, and the previous refactoring made the pattern easy to
see, here's how not to do this. (Requires Python 2.5 for the conditional
expression, i.e. the ternary operator.)

# From the ASPN cookbook
def permutations(L):
    # Note: consumes L (pops its first element on each recursive step).
    if len(L) <= 1:
        yield L
    else:
        a = [L.pop(0)]
        for p in permutations(L):
            for i in range(len(p)+1):
                yield p[:i] + a + p[i:]

def _create_3D_xhatches():
    for x in xrange(-axis_length, axis_length + 1):
        if x != 0:
            # Use None as placeholder for comparison.
            for pos in permutations([x, None, 0]):
                visual.cylinder(
                    pos = [-hatch_length if coord is None else coord
                           for coord in pos],
                    axis = [hatch_length * 2 if coord is None else 0
                            for coord in pos],
                    radius = hatch_radius)

Undoubtedly slower, nearly impossible to read, but with a minimum of code
duplication!

If I were answering this question for real, I'd suggest the same
"arguments as tuples" solution that was suggested earlier, i.e.:

def _create_3D_xhatches():
    for x in xrange(-axis_length, axis_length + 1):
        if x == 0:
            continue

        # Built inside the loop because every entry depends on x.
        list_of_kwargs = [
            {'pos': [x, -hatch_length, 0], 'axis': [0, 2 * hatch_length, 0]},
            {'pos': [x, 0, -hatch_length], 'axis': [0, 0, 2 * hatch_length]},
            # And so on for all permutations...
        ]

        for kwargs in list_of_kwargs:
            visual.cylinder(radius = hatch_radius, **kwargs)

This pulls out the two obviously common components: the function call and
the radius parameter.
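
For anyone who wants to try the keyword-argument approach without VPython installed, here is a minimal sketch. `fake_cylinder` and `create_hatches` are hypothetical names used only for illustration; the stand-in records its arguments instead of drawing anything, and the six entries are copied from the hand-written calls earlier in the thread. It uses `range` so it also runs on Python 3.

```python
calls = []

def fake_cylinder(pos, axis, radius):
    """Hypothetical stand-in for visual.cylinder: records its arguments."""
    calls.append((tuple(pos), tuple(axis), radius))

def create_hatches(axis_length, hatch_length, hatch_radius,
                   cylinder=fake_cylinder):
    b = -hatch_length
    c = 2 * hatch_length
    for x in range(-axis_length, axis_length + 1):
        if x == 0:
            continue
        # One dict per cylinder; only the arguments that differ live here.
        list_of_kwargs = [
            {'pos': [x, b, 0], 'axis': [0, c, 0]},
            {'pos': [x, 0, b], 'axis': [0, 0, c]},
            {'pos': [b, x, 0], 'axis': [c, 0, 0]},
            {'pos': [0, x, b], 'axis': [0, 0, c]},
            {'pos': [b, 0, x], 'axis': [c, 0, 0]},
            {'pos': [0, b, x], 'axis': [0, c, 0]},
        ]
        for kwargs in list_of_kwargs:
            # The shared argument is written once; ** unpacks the rest.
            cylinder(radius=hatch_radius, **kwargs)
```

Passing `cylinder=visual.cylinder` should give the real drawing behaviour, assuming it accepts the same `pos`, `axis` and `radius` keywords as in the calls above.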
-
John Krukoff
[EMAIL PROTECTED]




Re: some OT: how to solve this kind of problem in our program?

2006-12-26 Thread John Krukoff
On Tue, 2006-12-26 at 17:39 -0300, Gabriel Genellina wrote:
 At Monday 25/12/2006 21:24, Paul McGuire wrote:
 
 For example, for all the complexity in writing Sudoku solvers, there are
 fewer than 3.3 million possible permutations of 9 rows of the digits 1-9,
 and far fewer permutations that match the additional column and box
 constraints.  Why not just compute the set of valid solutions, and compare
 an input mask with these?
 
 Are you sure? There are 9! = 362880 possible rows of the digits 1-9;
 taking 9 of these at random gives about 10**50 possibilities. Of course
 just a few match the additional constraints. Maybe you can trivially
 reduce them (just looking for no dupes in the first column), but anyway
 it's a large number... (Or I'm wrong computing the possibilities...)
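
The arithmetic above is easy to check. This short sketch only counts unconstrained stacks of nine rows (no column or box rules applied), which is the quantity being discussed; the variable names are illustrative.

```python
import math

# Orderings of the digits 1-9 within a single row.
rows = math.factorial(9)

# Nine rows picked independently, ignoring column and box constraints.
unconstrained_grids = rows ** 9

print(rows)                           # 362880
print(len(str(unconstrained_grids)))  # number of decimal digits
```

The second value has 51 digits, i.e. it is a little over 10**50, so the "fewer than 3.3 million" figure cannot be a count of row arrangements.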
 
 

Fortunately, somebody has already written a paper on the subject:
http://www.afjarvis.staff.shef.ac.uk/sudoku/sudoku.pdf

It looks like the number is actually rather large, and I'd expect that
even with a specialized data structure for compression (probably some kind
of tree with bitwise digit packing?) the full set of solutions would not
fit in memory on any box I own.

I wonder whether loading that much data wouldn't be slower than solving
the puzzle.
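
A rough back-of-the-envelope estimate of the storage involved, as a sketch only. It assumes the count of valid grids usually quoted from that paper (Felgenhauer and Jarvis), which this thread does not state itself, and a naive packing of 4 bits per cell with no cleverer compression.

```python
# Assumption: count of valid 9x9 Sudoku grids as reported by
# Felgenhauer and Jarvis (not quoted anywhere in this thread).
valid_grids = 6670903752021072936960

# Naive packing: 81 cells at 4 bits each, rounded up to whole bytes.
bits_per_grid = 81 * 4
bytes_per_grid = (bits_per_grid + 7) // 8

total_bytes = valid_grids * bytes_per_grid
print(bytes_per_grid)               # 41
print("%.2e bytes" % total_bytes)   # order-of-magnitude storage estimate
```

Even if a tree structure shared most prefixes, this starts out so many orders of magnitude beyond ordinary RAM that the table-lookup approach looks impractical.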
-- 
John Krukoff [EMAIL PROTECTED]
Land Title Guarantee Company
