[Python-Dev] Is it safe to assume that Python 2.7 is always built with unicode support?

2012-04-26 Thread Stefano Taschini
Hello everyone,

I'm looking into issue 1065986 [1], and in order to submit a patch I need
to know whether I have to take into account the eventuality that cpython 2.7
be built without unicode support.

As far as I can see it is no longer possible to configure cpython 2.7 with
--disable-unicode as a consequence of the merge 59157:62babf456005 on 27
Feb 2010 of the commit 59153:8b2048bca33c of the same day.

Since I could not find any discussion on the topic leading explicitly to
this decision, I was wondering whether this is in fact an unintended
consequence of the check introduced in 59153:8b2048bca33c, which excludes
"no" from the acceptable values for configuring unicode support.

In conclusion, can you guys confirm that I don't have to worry that cpython
2.7 could be built with no unicode support? Or not?

If so, shouldn't it be properly documented, at least in Misc/NEWS ?

Bye,
Stefano

[1] http://bugs.python.org/issue1065986
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Implement PEP 412: Key-sharing dictionaries (closes #13903)

2012-04-26 Thread Kristján Valur Jónsson
Thanks.
Meanwhile, I blogged about tuning the dict implementation.
Preliminary testing seems to indicate that tuning it to conserve memory saves 
us 2Mb of wasted slots on the login screen.  No small thing on a PS3 system.
http://blog.ccpgames.com/kristjan/2012/04/25/optimizing-the-dict/
I wonder if we shouldn't make those factors into #defines as I did in my 2.7 
modifications, and even provide a "memory saving" predefine for embedders.
(Believe it or not, sometimes python performance is not an issue at all, but 
memory usage is.)

K

> -----Original Message-----
> From: Nick Coghlan [mailto:ncogh...@gmail.com]
> Sent: 24. apríl 2012 11:42
> To: Kristján Valur Jónsson
> Cc: R. David Murray; Antoine Pitrou; python-dev@python.org
> Subject: Re: [Python-Dev] cpython: Implement PEP 412: Key-sharing
> dictionaries (closes #13903)
> 
> On Tue, Apr 24, 2012 at 8:24 PM, Kristján Valur Jónsson
>  wrote:
> > Perhaps I should write about this on my blog.  Updating the memory
> > allocation macro layer in cPython for embedding is something I'd be
> > inclined to contribute, but it will involve a large amount of
> > bikeshedding, I'm sure :)
> 
> Trawl the tracker before you do - I'm pretty sure there's a patch (from the
> Nokia S60 port, IIRC) that adds a couple of macro definitions so that platform
> ports and embedding applications can intercept malloc() and free() calls.
> 
> It would be way out of date by now, but I seem to recall thinking it looked
> reasonable at a quick glance.
> 
> Cheers,
> Nick.
> 
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia




Re: [Python-Dev] cpython: Implement PEP 412: Key-sharing dictionaries (closes #13903)

2012-04-26 Thread Kristján Valur Jónsson


> -----Original Message-----
> From: "Martin v. Löwis" [mailto:mar...@v.loewis.de]
> 
> This is easy in a debug build, using sys.getobjects(). In a release build,
> you can use pympler:
> 
> start = pympler.muppy.get_size(pympler.muppy.get_objects())
> run_complicated_tests()
> end = pympler.muppy.get_size(pympler.muppy.get_objects())
> print "delta mem: %d" % (end-start)

Thanks for pointing out pympler to me.  Sounds like fun, I'll try it out.  
I should point out that gc.get_objects() also works, if you don't care about 
stuff like ints and floats.

Another reason why I like the runtime stats we have built in, however, is that 
they provide no query overhead.
You can query the current resource usage as often as you like and this is 
important in a running app.  We log python memory usage every second or so.

Cheers,

K
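
The gc.get_objects() variant mentioned above can be sketched as follows. This is a minimal illustration, not code from the thread: `tracked_size` and `run_workload` are hypothetical names, and the total only covers objects tracked by the cyclic garbage collector, so atomic objects such as ints and floats are not counted.

```python
import gc
import sys


def tracked_size():
    """Sum sys.getsizeof() over every object the cyclic GC tracks."""
    return sum(sys.getsizeof(obj) for obj in gc.get_objects())


def run_workload():
    # Hypothetical stand-in for the code whose memory use is of interest.
    return [{"index": i} for i in range(1000)]


start = tracked_size()
result = run_workload()  # keep a reference so the objects stay alive
end = tracked_size()
print("delta mem: %d" % (end - start))
```

Each call walks every tracked object, which is exactly the query overhead that built-in runtime counters avoid.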



Re: [Python-Dev] Is it safe to assume that Python 2.7 is always built with unicode support?

2012-04-26 Thread martin

> I'm looking into issue 1065986 [1], and in order to submit a patch I need
> to know whether I have to take into account the eventuality that cpython 2.7
> be built without unicode support.


It's intended (at least, it is *my* intention) that Python 2.7 can be built
without Unicode support, and it's a bug if that is not possible anymore.
Certain embedded configurations might want that.

That doesn't mean that the bug needs to be fixed; this can be deferred until
somebody actually requests that bug being fixed, or better, until somebody
contributes a patch to do so.

However, it *does* mean that we shouldn't further break the feature, at least
not knowingly.

OTOH, it's clear that certain functionality cannot work if Unicode is
disabled, so it may be acceptable if pydoc breaks in such a configuration.

Regards,
Martin
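
For code that has to keep working on such a build, one defensive pattern is to probe for the `unicode` builtin rather than assume it exists. The sketch below is only an illustration and is not taken from the issue 1065986 patch; `HAVE_UNICODE` and `is_text` are hypothetical names.

```python
# Probe for the ``unicode`` builtin instead of assuming it exists.
# The name is undefined on a Python 2 build without Unicode support,
# and also on Python 3, where ``str`` is already a Unicode type.
try:
    unicode
except NameError:
    HAVE_UNICODE = False
else:
    HAVE_UNICODE = True


def is_text(obj):
    """Return True if obj is a string of any type this build provides."""
    if HAVE_UNICODE:
        return isinstance(obj, (str, unicode))
    return isinstance(obj, str)


print(is_text("abc"))
```

Code written this way degrades to byte strings only, instead of failing with a NameError at import time.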




Re: [Python-Dev] sys.implementation

2012-04-26 Thread Barry Warsaw
On Apr 25, 2012, at 11:31 PM, Eric Snow wrote:

>The proposal of adding sys.implementation has come up a couple times
>over the last few years. [1][2]  While the reaction has been
>overwhelmingly positive, nothing has come of it.  I've created a
>tracker issue and a patch:
>
>http://bugs.python.org/issue14673
>
>The patch adds a struct sequence that holds ("name" => "CPython",
>"version" => sys.version_info).  If later needs dictate more fields,
>we can cross that bridge then.
>
>Are there any objections?  Considering the positive reaction and the
>scope of the addition, does this need a PEP?

It's somewhat of a corner case, but I think a PEP couldn't hurt.  The
rationale section would be useful, at least.

-Barry


Re: [Python-Dev] [Python-checkins] cpython: Close #10142: Support for SEEK_HOLE/SEEK_DATA

2012-04-26 Thread Benjamin Peterson
2012/4/26 jesus.cea :
> http://hg.python.org/cpython/rev/86dc014cdd74
> changeset:   76570:86dc014cdd74
> user:        Jesus Cea 
> date:        Thu Apr 26 16:39:35 2012 +0200
> summary:
>  Close #10142: Support for SEEK_HOLE/SEEK_DATA
>
> files:
>  Doc/library/io.rst       |   5 +
>  Doc/library/os.rst       |   4 
>  Lib/_pyio.py             |  12 +++-
>  Lib/os.py                |   1 +
>  Lib/test/test_posix.py   |  20 
>  Misc/NEWS                |   2 ++
>  Modules/_io/bufferedio.c |  21 ++---
>  Modules/posixmodule.c    |   7 +++
>  8 files changed, 60 insertions(+), 12 deletions(-)
>
>
> diff --git a/Doc/library/io.rst b/Doc/library/io.rst
> --- a/Doc/library/io.rst
> +++ b/Doc/library/io.rst
> @@ -291,6 +291,11 @@
>       .. versionadded:: 3.1
>          The ``SEEK_*`` constants.
>
> +      .. versionadded:: 3.3
> +         Some operating systems could support additional values, like
> +         :data:`os.SEEK_HOLE` or :data:`os.SEEK_DATA`. The valid values
> +         for a file could depend on it being open in text or binary mode.
> +

Why are they only listed in "os" and not "io"?

>    .. method:: seekable()
>
>       Return ``True`` if the stream supports random access.  If ``False``,
> diff --git a/Doc/library/os.rst b/Doc/library/os.rst
> --- a/Doc/library/os.rst
> +++ b/Doc/library/os.rst
> @@ -992,6 +992,10 @@
>    Parameters to the :func:`lseek` function. Their values are 0, 1, and 2,
>    respectively. Availability: Windows, Unix.
>
> +   .. versionadded:: 3.3
> +      Some operating systems could support additional values, like

"Some operating systems may support" is better. (That applies to other
parts of the docs, too.)

> +      :data:`os.SEEK_HOLE` or :data:`os.SEEK_DATA`.
> +

Since we're explicitly listing which ones we support, it would be nice
to explain what they do.



-- 
Regards,
Benjamin
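
As a rough illustration of what the two constants do: with os.lseek(), SEEK_DATA moves to the next offset at or after the given position that contains data, and SEEK_HOLE moves to the next hole, with the end of the file counting as an implicit hole. The sketch below assumes nothing about the committed patch; `first_hole` is a hypothetical helper, and it returns None where the platform or file system does not support the request.

```python
import os
import tempfile


def first_hole(path):
    """Return the offset of the first hole in the file, or None if unsupported."""
    if not hasattr(os, "SEEK_HOLE"):
        return None  # the platform does not define the constant
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_HOLE)
    except OSError:
        return None  # the file system rejected the request
    finally:
        os.close(fd)


# Write a small file without holes, then ask where its first hole starts.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 4096)
size = os.path.getsize(tmp.name)
hole = first_hole(tmp.name)
os.remove(tmp.name)
print(hole)
```

For a file with no holes, a supporting system is expected to report the end of the file.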


Re: [Python-Dev] Is it safe to assume that Python 2.7 is always built with unicode support?

2012-04-26 Thread Stefano Taschini
Understood.

May I suggest that http://bugs.python.org/issue8767 be reopened, to make
things clear?

Stefano


On 26 April 2012 16:01,  wrote:

> I'm looking into issue 1065986 [1], and in order to submit a patch I need
>> to know whether I have to take into account the eventuality that cpyhon
>> 2.7
>> be built without unicode support.
>>
>
> It's intended (at least, it is *my* intention) that Python 2.7 can be built
> without Unicode support, and it's a bug if that is not possible anymore.
> Certain embedded configurations might want that.
>
> That doesn't mean that the bug needs to be fixed; this can be deferred
> until
> somebody actually requests that bug being fixed, or better, until somebody
> contributes a patch to do so.
>
> However, it *does* mean that we shouldn't further break the feature, at
> least
> not knowingly.
>
> OTOH, it's clear that certain functionality cannot work if Unicode is
> disabled,
> so it may be acceptable if pydoc breaks in such a configuration.
>
> Regards,
> Martin
>


Re: [Python-Dev] Is it safe to assume that Python 2.7 is always built with unicode support?

2012-04-26 Thread R. David Murray
On Thu, 26 Apr 2012 17:07:46 +0200, Stefano Taschini  wrote:
> May I suggest that http://bugs.python.org/issue8767 be reopened, to make
> things clear?

Done.

--David

PS: we prefer no top-posting on this list.  It makes it far easier
to retain just enough context to make a message stand on its own
when properly edited.


Re: [Python-Dev] sys.implementation

2012-04-26 Thread Eric Snow
On Thu, Apr 26, 2012 at 8:31 AM, Barry Warsaw  wrote:
> On Apr 25, 2012, at 11:31 PM, Eric Snow wrote:
>>Are there any objections?  Considering the positive reaction and the
>>scope of the addition, does this need a PEP?
>
> It's somewhat of a corner case, but I think a PEP couldn't hurt.  The
> rationale section would be useful, at least.
>
> -Barry

Yeah, I'm finding little bits and pieces that would be nice to have
recorded in one place.  I'll get something up in the next couple days.

-eric


Re: [Python-Dev] Assigning copyright...

2012-04-26 Thread stefan brunthaler
Hello Mark,

> A URL for the code repository (with an open-source license),
> so code can be reviewed.
> It is hard to review and update a giant patch.

OK, I took Nick's advice to heart and created a fork from the official
cpython mirror on bitbucket. You can view the patched code
(branch: inca-only) at the following URL:
https://bitbucket.org/sbrunthaler/cpython-inline-caching

Since it is a fork, it contains the usual LICENSE from Python.

Regarding Eric's hint: It seems that this agreement needs to be signed
and mailed. Can I sign/scan and email it to somebody? (Or should I
wait until there is a decision regarding a potential integration?) The
way I understood Guido's last message it is best to use Apache 2
license without retaining my own copyright. I am perfectly fine with
that but am not sure if using the fork with sub-directories including
the official LICENSE takes care of that. Obviously, I don't have too
much experience in this area, so if I am missing something blatantly
obvious, I apologize beforehand...

Best,
--stefan


Re: [Python-Dev] Assigning copyright...

2012-04-26 Thread Martin v. Löwis

> Regarding Eric's hint: It seems that this agreement needs to be signed
> and mailed. Can I sign/scan and email it to somebody?


Yes, see

http://www.python.org/psf/contrib/

Regards,
Martin


[Python-Dev] Changes in html.parser may cause breakage in client code

2012-04-26 Thread Vinay Sajip
Following recent changes in html.parser, the Python 3 port of Django I'm working
on has started failing while parsing HTML.

The reason appears to be that Django uses some module-level data in html.parser,
for example tagfind, which is a regular expression pattern. This has changed
recently (Ezio changed it in ba4baaddac8d).

Now tagfind (and other such patterns) are not marked as private (though not
documented), but should they be? The following script (tagfind.py):

import html.parser as Parser

data = ''

m = Parser.tagfind.match(data, 1)
print('%r -> %r' % (Parser.tagfind.pattern, data[1:m.end()]))

gives different results on 3.2 and 3.3:

$ python3.2 tagfind.py
'[a-zA-Z][-.a-zA-Z0-9:_]*' -> 'select'
$ python3.3 tagfind.py
'([a-zA-Z][-.a-zA-Z0-9:_]*)(?:\\s|/(?!>))*' -> 'select '

The trailing space later causes a mismatch with the end tag, and leads to the
errors. Django's use of the tagfind pattern is in a subclass of HTMLParser, in
an overridden parse_starttag method.

Do we need to indicate more strongly that data like tagfind are private? Or has
the change introduced inadvertent breakage, requiring a fix in Python?

Regards,

Vinay Sajip
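
The difference between the two patterns can be reproduced with the `re` module alone, independently of the installed html.parser. This is a minimal sketch: the two pattern strings are copied from the output above, while the input string and the variable names are assumptions made for illustration.

```python
import re

# Pattern reported for 3.2 and pattern reported for 3.3, as quoted above.
old_tagfind = re.compile(r'[a-zA-Z][-.a-zA-Z0-9:_]*')
new_tagfind = re.compile(r'([a-zA-Z][-.a-zA-Z0-9:_]*)(?:\s|/(?!>))*')

data = '<select name="choice">'  # hypothetical input, not from the thread

old_match = old_tagfind.match(data, 1)
new_match = new_tagfind.match(data, 1)

print(repr(data[1:old_match.end()]))  # 'select'
print(repr(data[1:new_match.end()]))  # 'select ' (whitespace is consumed too)
print(repr(new_match.group(1)))       # 'select'
```

With the newer pattern the tag name alone is available as group 1, so code that slices up to m.end() picks up the trailing whitespace.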



Re: [Python-Dev] Changes in html.parser may cause breakage in client code

2012-04-26 Thread Georg Brandl
On 26.04.2012 21:10, Vinay Sajip wrote:
> Following recent changes in html.parser, the Python 3 port of Django I'm
> working on has started failing while parsing HTML.
> 
> The reason appears to be that Django uses some module-level data in
> html.parser, for example tagfind, which is a regular expression pattern.
> This has changed recently (Ezio changed it in ba4baaddac8d).
> 
> Now tagfind (and other such patterns) are not marked as private (though not
> documented), but should they be? The following script (tagfind.py):
> 
> import html.parser as Parser
> 
> data = ''
> 
> m = Parser.tagfind.match(data, 1)
> print('%r -> %r' % (Parser.tagfind.pattern, data[1:m.end()]))
> 
> gives different results on 3.2 and 3.3:
> 
> $ python3.2 tagfind.py
> '[a-zA-Z][-.a-zA-Z0-9:_]*' -> 'select'
> $ python3.3 tagfind.py
> '([a-zA-Z][-.a-zA-Z0-9:_]*)(?:\\s|/(?!>))*' -> 'select '
> 
> The trailing space later causes a mismatch with the end tag, and leads to the
> errors. Django's use of the tagfind pattern is in a subclass of HTMLParser, in
> an overridden parse_starttag method.
> 
> Do we need to indicate more strongly that data like tagfind are private? Or
> has the change introduced inadvertent breakage, requiring a fix in Python?

Since it's a module level constant without a leading underscore, IMO it was
okay for Django to use it, even if not documented.

In this case, especially since we actually have evidence of someone using the
constant, I would keep it as-is and use a new (underscored, this time) name for
the new pattern.

And yes, I think that we do need to indicate private-ness of module-level data.

Georg



Re: [Python-Dev] Changes in html.parser may cause breakage in client code

2012-04-26 Thread Guido van Rossum
On Thu, Apr 26, 2012 at 12:10 PM, Vinay Sajip  wrote:
> Following recent changes in html.parser, the Python 3 port of Django I'm
> working on has started failing while parsing HTML.
>
> The reason appears to be that Django uses some module-level data in
> html.parser, for example tagfind, which is a regular expression pattern.
> This has changed recently (Ezio changed it in ba4baaddac8d).
>
> Now tagfind (and other such patterns) are not marked as private (though not
> documented), but should they be? The following script (tagfind.py):
>
>    import html.parser as Parser
>
>    data = ''
>
>    m = Parser.tagfind.match(data, 1)
>    print('%r -> %r' % (Parser.tagfind.pattern, data[1:m.end()]))
>
> gives different results on 3.2 and 3.3:
>
>    $ python3.2 tagfind.py
>    '[a-zA-Z][-.a-zA-Z0-9:_]*' -> 'select'
>    $ python3.3 tagfind.py
>    '([a-zA-Z][-.a-zA-Z0-9:_]*)(?:\\s|/(?!>))*' -> 'select '
>
> The trailing space later causes a mismatch with the end tag, and leads to the
> errors. Django's use of the tagfind pattern is in a subclass of HTMLParser, in
> an overridden parse_starttag method.
>
> Do we need to indicate more strongly that data like tagfind are private? Or
> has the change introduced inadvertent breakage, requiring a fix in Python?

I think both. Looks like it wasn't meant to be exported. But it should
have been marked as such. And I think it would behoove us to reduce
random failures in important 3rd party libraries by keeping the old
version around (but mark it as deprecated with an explaining comment,
and submit a Django fix to stop using it).

Also the module should be updated to use _tagfind internally (and
likewise for other accidental exports).

Traditionally we've been really lax about this stuff. We should strive
to improve and clarify the exact boundaries of our APIs better.

-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] package imports, sys.path and os.chdir()

2012-04-26 Thread Christian Tismer

Howdy,

I have a small problem/observation with imports.

I have several packages to import, which works all fine, as long
as the packages are imported from directories found on the installed
site-packages, via .pth etc.

The only problem is the automatically prepended empty string in sys.path.
Depending on where I start my application, the values stored
in package.__file__ and package.__path__ are absolute or relative
paths.

So, if my pwd is the directory that contains my top-level modules,
even though sys.path contains correct absolute entries for that, in this
case the '' entry wins.

Assume this:

<- cwd is here
   moda
   modb

>>> import moda

Some code happens to chdir away, and later some code does

>>> from moda import modb

Since the __path__ entry is now a relative path, this second import fails.

Although it is not recommended practice to leave the working directory
changed, I don't see why this is so. When a module is imported, would it
not be better to always make __file__ and __path__ absolute?

I see the module path, hidden by the '' entry not as a feature but
an undesired side-effect.

No big deal and easy to work around, I just would like to understand why.

cheers -- chris

--
Christian Tismer :^)
tismerysoft GmbH : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 :*Starship* http://starship.python.net/
14482 Potsdam: PGP key ->  http://pgp.uni-mainz.de
work +49 173 24 18 776  mobile +49 173 24 18 776  fax n.a.
PGP 0x57F3BF04   9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
  whom do you want to sponsor today?   http://www.stackless.com/
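
One possible form of the workaround alluded to above is to make every sys.path entry absolute before any package is imported and before any chdir() happens. This is a sketch under that assumption, not necessarily the workaround the author uses; `absolutize_sys_path` is a hypothetical name.

```python
import os
import sys


def absolutize_sys_path():
    """Replace relative sys.path entries ('' included) with absolute paths."""
    sys.path[:] = [
        os.path.abspath(entry) if isinstance(entry, str) else entry
        for entry in sys.path
    ]


# Call this early, while the current directory is still the intended one.
absolutize_sys_path()
print(all(os.path.isabs(p) for p in sys.path if isinstance(p, str)))
```

The trade-off is that the '' entry no longer follows the current working directory, which is the behaviour being questioned here.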



Re: [Python-Dev] Changes in html.parser may cause breakage in client code

2012-04-26 Thread Nick Coghlan
On Fri, Apr 27, 2012 at 5:21 AM, Guido van Rossum  wrote:
> Traditionally we've been really lax about this stuff. We should strive
> to improve and clarify the exact boundaries of our APIs better.

Yeah, I must admit in my own projects these days I habitually mark all
module level and class level names with a leading underscore until I
make a conscious decision to make them part of the relevant public
API. I also do this for any new helper attributes and
functions/methods I add to the stdlib.

One key catalyst for this was when PJE pointed out a bug years ago in
the behaviour of the -m switch that meant I had to introduce a *new*
helper function to runpy, because runpy.run_module was public, and I
needed to change the signature in a backwards incompatible way to fix
the bug (and thus the current runpy._run_module_as_main hook was
born).

When I use dir() and help() as much as I do to explore unfamiliar
APIs, I feel obliged to make sure that introspecting my own code
accurately reflects which names are part of the public API and which
are just implementation details.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] package imports, sys.path and os.chdir()

2012-04-26 Thread Nick Coghlan
On Fri, Apr 27, 2012 at 7:30 AM, Christian Tismer  wrote:
> No big deal and easy to work around, I just would like to understand why.

I don't like it either and want to change it, but I'm also not going
to mess with it until the importlib bootstrapping is fully integrated
and stable.

For the moment, there's a workaround in runpy to ensure at least
__main__.__file__ is always absolute (even when using the -m switch).
Longer term, I'd like to see __file__ and __path__ entries
guaranteed to be *always* absolute, even when they're imported
relative to the current working directory.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] sys.implementation

2012-04-26 Thread Larry Hastings

On 04/25/2012 10:31 PM, Eric Snow wrote:

> The patch adds a struct sequence that holds ("name" => "CPython",
> "version" => sys.version_info).  If later needs dictate more fields,
> we can cross that bridge then.


My one bit of bike-shedding: I don't think it's desirable that this 
object be iterable.  Therefore I suggest you don't use struct sequence.



//arry/


Re: [Python-Dev] Changes in html.parser may cause breakage in client code

2012-04-26 Thread Ezio Melotti

Hi,

On 26/04/2012 22.10, Vinay Sajip wrote:

> Following recent changes in html.parser, the Python 3 port of Django I'm working
> on has started failing while parsing HTML.
>
> The reason appears to be that Django uses some module-level data in html.parser,
> for example tagfind, which is a regular expression pattern. This has changed
> recently (Ezio changed it in ba4baaddac8d).


html.parser doesn't use any private _name, so I was considering only the
documented names to be part of the public API.  Several methods are marked
with an "# internal" comment, but that's not visible unless you go read
the source code.



> Now tagfind (and other such patterns) are not marked as private (though not
> documented), but should they be? The following script (tagfind.py):
>
>  import html.parser as Parser
>
>  data = ''
>
>  m = Parser.tagfind.match(data, 1)
>  print('%r -> %r' % (Parser.tagfind.pattern, data[1:m.end()]))
>
> gives different results on 3.2 and 3.3:
>
>  $ python3.2 tagfind.py
>  '[a-zA-Z][-.a-zA-Z0-9:_]*' -> 'select'
>  $ python3.3 tagfind.py
>  '([a-zA-Z][-.a-zA-Z0-9:_]*)(?:\\s|/(?!>))*' -> 'select '
>
> The trailing space later causes a mismatch with the end tag, and leads to the
> errors. Django's use of the tagfind pattern is in a subclass of HTMLParser, in
> an overridden parse_starttag method.


Django shouldn't override parse_starttag (internal and undocumented), 
but just use handle_starttag (public and documented).

I see two possible reasons why it's overriding parse_starttag:
 1) Django is working around an HTMLParser bug.  In this case the bug
could have been fixed (leading to the breakage of the now-useless
workaround), and you may now be able to use the original
parse_starttag and get the correct result.  If it is indeed working
around a bug and the bug is still present, you should report it upstream.
 2) Django is implementing an additional feature.  Depending on what 
exactly the code is doing you might want to open a new feature request 
on the bug tracker. For example the original parse_starttag sets a 
self.lasttag attribute with the correct name of the last tag parsed.  
Note however that both parse_starttag and self.lasttag are internal and 
shouldn't be used directly (but lasttag could be exposed and documented 
if people really think that it's useful).



> Do we need to indicate more strongly that data like tagfind are private? Or has
> the change introduced inadvertent breakage, requiring a fix in Python?


I'm not sure that reverting the regex, deprecating all the exposed
internal names, and adding/using internal _names instead is a good idea at
this point.  This will cause more breakage, and it would require an
extensive renaming.  I can add notes to the documentation/docstrings and
specify what's private and what's not though.
OTOH, if this specific fix is not released yet I can still do something 
to limit/avoid the breakage.


Best Regards,
Ezio Melotti


> Regards,
>
> Vinay Sajip
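
The documented route recommended above, overriding handle_starttag instead of parse_starttag, can be sketched as follows. `TagCollector` and the sample markup are hypothetical and are not Django's code.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record start tags through the documented handle_starttag hook."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # tag arrives lowercased, attrs as a list of (name, value) pairs.
        self.start_tags.append((tag, attrs))


collector = TagCollector()
collector.feed('<select name="choice"><option value="1">One</option></select>')
collector.close()
print(collector.start_tags)
```

Because the subclass never touches tagfind or parse_starttag, a change to the internal regular expressions does not affect it.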





Re: [Python-Dev] sys.implementation

2012-04-26 Thread Eric Snow
On Thu, Apr 26, 2012 at 9:29 PM, Larry Hastings  wrote:
> My one bit of bike-shedding: I don't think it's desirable that this object
> be iterable.  Therefore I suggest you don't use struct sequence.

Good point.  Noted.

-eric
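
For reference, sys.implementation as it eventually shipped in Python 3.3 (PEP 421) is an attribute-only namespace rather than a struct sequence, so it cannot be iterated. A small check, assuming Python 3.3 or later:

```python
import sys

impl = sys.implementation
print(impl.name)     # lowercase identifier of the running implementation
print(impl.version)  # same layout as sys.version_info

# The object exposes attributes only; iterating over it is not supported.
try:
    iter(impl)
except TypeError:
    iterable = False
else:
    iterable = True
print(iterable)
```

Fields are therefore always accessed by name, which leaves room to add more of them later without breaking code that unpacks the object.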