[Python-Dev] mUTF-7 support?

2014-10-09 Thread Jesus Cea
I miss mUTF-7 support (as used to encode IMAP4 mailbox names) in Python,
in the codecs module. As a European whose language has 27 different
letters (instead of the 26 in English), tildes, opening question marks,
etc., I find it very inconvenient.

I know this encoding is used basically only in IMAP4. But IMAP4 is an
important protocol, and all projects related to it need mUTF-7 support
if they care about non-English alphabets. Everybody already has their own
implementation, which is a waste of effort.

We already support quite amusing encodings in
.

What do you think? Could it be considered for Python 3.5?

I volunteer for the job, of course.

PS: Do you think a Python implementation would be good enough? I don't
think this needs to be C-fast.

-- 
Jesús Cea Avión _/_/  _/_/_/_/_/_/
j...@jcea.es - http://www.jcea.es/ _/_/_/_/  _/_/_/_/  _/_/
Twitter: @jcea_/_/_/_/  _/_/_/_/_/
jabber / xmpp:j...@jabber.org  _/_/  _/_/_/_/  _/_/  _/_/
"Things are not so easy"  _/_/  _/_/_/_/  _/_/_/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/_/_/_/  _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Victor Stinner
Hi,

You can develop a codec and plug it into Python 3.4 right now using
codecs.register().
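
As a rough illustration of the codecs.register() mechanism mentioned
above, here is a minimal sketch. The codec name "x_demo_amp" and the
helper functions are hypothetical, and the transformation is only a
placeholder (ASCII text with "&" escaped as "&-"), not a real mUTF-7
implementation.

```python
import codecs

CODEC_NAME = "x_demo_amp"  # hypothetical name, not a codec shipped with Python


def _encode(text, errors="strict"):
    # Placeholder transformation: ASCII only, "&" escaped as "&-".
    data = text.encode("ascii", errors).replace(b"&", b"&-")
    return data, len(text)


def _decode(data, errors="strict"):
    raw = bytes(data)
    text = raw.replace(b"&-", b"&").decode("ascii", errors)
    return text, len(raw)


def _search(name):
    # The codec machinery passes a normalized (lower-case) name.
    if name.replace("-", "_") == CODEC_NAME:
        return codecs.CodecInfo(encode=_encode, decode=_decode, name=CODEC_NAME)
    return None  # let other search functions handle unknown names


codecs.register(_search)

encoded = "Tom & Jerry".encode(CODEC_NAME)
decoded = encoded.decode(CODEC_NAME)
print(encoded, decoded)
```

Once the search function is registered, str.encode() and bytes.decode()
find the codec by name like any built-in one.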

It's difficult to decide if a codec is important enough to be added to Python.

When you say "IMAP4", do you mean any IMAP4 server? Do you have a list
of server vendors known to use the encoding mUTF-7? Is it possible to
ask the server to speak a specific codec like UTF-8? I don't know the
protocol. Interesting article:
http://comments.gmane.org/gmane.mail.imap.general/3416

Python supports UTF-7, but this codec doesn't seem to be widely used. Bugs
were fixed in this codec "recently".

Anyway, open an issue ;-)

How is mUTF-7 different than UTF-7? (Why yet another encoding while
standard UTF encodings exist???)

Requests of new encodings:

"missing vietnamese codec TCVN 5712:1993 in Python" (open)
http://bugs.python.org/issue21081

"add thai encoding aliases to encodings.aliases" (open)
http://bugs.python.org/issue17254

"Add "java modified utf-8" codec" (closed as won't fix 2 years ago)
http://bugs.python.org/issue2857

"Add support for CESU-8 encoding" (rejected 3 years ago)
http://bugs.python.org/issue12742

"Adding new CNS11643, a *huge* charset,support in cjkcodecs"
(closed as won't fix 4 years ago)
http://bugs.python.org/issue2066

"Add KOI8-RU as a known encoding" (rejected 5 years ago)
http://bugs.python.org/issue5214
("This charset wasn't supported by Ukrainian Internet community due to
political reasons; KOI8-U was invented as opposition to KOI8-RU.")

Recently added codec:

"Add support of the cp1125 encoding" (1 year ago)
http://bugs.python.org/issue19668

"Add cp65001 codec" (3 years ago)
http://bugs.python.org/issue13216

Victor



Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Antoine Pitrou
On Fri, 10 Oct 2014 00:47:46 +0200
Jesus Cea  wrote:
> I miss mUTF-7 support (as used to encode IMAP4 mailbox names) in Python,
> in the codecs module. As an european with a language with 27 different
> letters (instead of english 26), tildes, opening question marks, etc., I
> find it very inconvenient.
> 
> This encoding is used basically only in IMAP4, I know. But IMAP4 is an
> important protocol and all projects related to it needs mUTF-7 support
> if they care about non-english alphabets. Everybody has already an
> implementation, waste of effort.

This sounds good to me. Feel free to propose a patch for 3.5.

Regards

Antoine.




Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Jesus Cea
On 10/10/14 01:08, Victor Stinner wrote:
> When you say "IMAP4", do you mean any IMAP4 server? Do you have a list
> of server vendors known to use the encoding mUTF-7?

All of them. IMAP4 protocol **REQUIRES** mUTF-7.

UTF-8 is optional in IMAP4, and even UTF-8 capable servers have to
support clients without that ability.

Check "5.1. Mailbox Naming" in IMAP4 RFC:
.

> Anyway, open an issue ;-)

I would like to gauge interest/resistance to the idea from my fellows first.

> How is mUTF-7 different than UTF-7? (Why yet another encoding while
> standard UTF encodings exist???)

As explained in section "5.1.3. Mailbox International Naming Convention":

"""
The purpose of these modifications is to correct the following
problems with UTF-7:

   1) UTF-7 uses the "+" character for shifting; this conflicts with
      the common use of "+" in mailbox names, in particular USENET
      newsgroup names.

   2) UTF-7's encoding is BASE64 which uses the "/" character; this
      conflicts with the use of "/" as a popular hierarchy delimiter.

   3) UTF-7 prohibits the unencoded usage of "\"; this conflicts with
      the use of "\" as a popular hierarchy delimiter.

   4) UTF-7 prohibits the unencoded usage of "~"; this conflicts with
      the use of "~" in some servers as a home directory indicator.

   5) UTF-7 permits multiple alternate forms to represent the same
      string; in particular, printable US-ASCII characters can be
      represented in encoded form.
"""
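
To make those rules concrete, here is a minimal pure-Python sketch of
the encoding as I read section 5.1.3: "&" is the shift character, "&-"
stands for a literal "&", printable US-ASCII represents itself, and
everything else is UTF-16BE in BASE64 with "," instead of "/" and no
"=" padding. The function names are made up for illustration and the
sketch does not validate malformed input beyond what the standard
library rejects.

```python
import base64


def _shift(chars):
    # UTF-16BE -> BASE64, drop "=" padding, use "," instead of "/".
    b64 = base64.b64encode(chars.encode("utf-16-be")).rstrip(b"=")
    return b"&" + b64.replace(b"/", b",") + b"-"


def encode_mutf7(name):
    out = bytearray()
    pending = []  # run of characters that must be BASE64-encoded
    for ch in name:
        if " " <= ch <= "~":  # printable US-ASCII
            if pending:
                out += _shift("".join(pending))
                pending = []
            out += b"&-" if ch == "&" else ch.encode("ascii")
        else:
            pending.append(ch)
    if pending:
        out += _shift("".join(pending))
    return bytes(out)


def decode_mutf7(data):
    out = []
    i = 0
    while i < len(data):
        if data[i:i + 1] != b"&":
            out.append(data[i:i + 1].decode("ascii"))
            i += 1
            continue
        end = data.index(b"-", i)  # ValueError if the shift is unterminated
        chunk = data[i + 1:end]
        if not chunk:
            out.append("&")  # "&-" is a literal ampersand
        else:
            b64 = chunk.replace(b",", b"/")
            b64 += b"=" * (-len(b64) % 4)  # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


# Mailbox name used as an example in the RFC.
print(encode_mutf7("~peter/mail/台北/日本語"))
# b'~peter/mail/&U,BTFw-/&ZeVnLIqe-'
print(decode_mutf7(b"Entw&APw-rfe"))
# Entwürfe
```

Unlike plain UTF-7, "+", "/", "\" and "~" pass through unchanged, and
each name has exactly one encoded form.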

> Requests of new encodings:

I am volunteering and can even do the mercurial PUSH myself :-p. That is
an advantage over some of those new encoding requests :-pp.

But then yes, I realize that this is a specialized tool (even if IMAP4
is probably the most popular mail access protocol in the world), that we
can't accommodate every use-case, and that there are tons of mUTF-7
libraries out there already.

So, I am asking :).



Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Ethan Furman

On 10/09/2014 03:47 PM, Jesus Cea wrote:
> [...]  mUTF-7 support  [...]
>
> What do you think?. Could be considered for Python 3.5?.


+1

--
~Ethan~


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Victor Stinner
2014-10-10 1:33 GMT+02:00 Jesus Cea :
> The purpose of these modifications is to correct the following
>problems with UTF-7:

If you need performance, I would be interested to see whether it is
possible to reuse the C codec for UTF-7, to share as much code as
possible.

What is the current behaviour of imaplib in Python 3.4 with non-ASCII
characters in mailbox names?

Victor


[Python-Dev] Status of C compilers for Python on Windows

2014-10-09 Thread Victor Stinner
Hi,

Windows is not the primary target of Python developers, probably
because most of them work on Linux. Official Python binaries are
currently built with Microsoft Visual Studio. Even though Python developers
get free licenses thanks to Microsoft, I would prefer to use an open
source compiler if possible, so that *anyone* can build Python
from scratch. I don't like the requirement of having a license to build
Python. The free version (Visual Studio Express) only supports 32-bit
and doesn't support PGO builds (Profile-Guided Optimizations, which are
disabled, if I remember correctly, because of compiler bugs).

I know that it's hard to replace Visual Studio. I don't want to do it
right now, but I would like to discuss that with you.


=== Open Watcom

Jeffrey Armstrong is working on the Python support of OpenWatcom(v2), see:
http://lightningpython.org/
https://bitbucket.org/ArmstrongJ/lightning-python

This compiler was initially written on MS-DOS in 32-bit, but it now
supports Windows and Linux as well. The 64-bit mode is new and
experimental. The Open Watcom "v2" project is actively developed at:

https://github.com/open-watcom/open-watcom-v2/

On Linux, Open Watcom doesn't support dynamic linking. On Windows, it
uses its own C library. I'm not sure that Open Watcom is the best
choice to build Python on Windows.


=== MinGW

Some people have tried to compile Python with MinGW. See for example:
https://bitbucket.org/puqing/python-mingw

We even got some patches:
http://bugs.python.org/issue3871 (rejected)

See also:
https://stackoverflow.com/questions/15365249/build-python-with-mingw-and-gcc

MinGW reuses the Microsoft C library and is based on GCC, which is
very stable, actively developed, supports a lot of architectures, etc.
I guess that it should be possible to reuse third party GCC tools like
the famous GDB debugger?


=== Cygwin

Cygwin was written to compile POSIX applications on Windows, and it
provides a DLL for that. Python doesn't need that: it uses the native
Windows API directly. I don't think that we should use Cygwin.


=== Clang

I have no idea how well Clang is supported on Windows. Which C library
does it use? I found some pages:
http://clang.llvm.org/docs/MSVCCompatibility.html

http://blog.llvm.org/2014/07/clangllvm-on-windows-update.html
This article starts with "It’s time for an update on Clang’s support
for building native Windows programs, compatible with Visual C++!".
Good.

I see binaries for Windows.


=== Other compilers?

Do you know of other C compilers that can be used to build Python?


=== Requirements

A compiler alone is not enough. To develop, we need tools to automate
the compilation, we need a good debugger, and other similar tools.
(Personally, I don't need an IDE. I mostly write code on Linux and
only run Windows to ensure that my code works on Windows before
pushing it.)

IMO 64-bit support is simply required (we currently provide 64-bit
binaries on Windows). Supporting ARM can be interesting for Windows 8
and Windows 9.

Some parts of Python are low-level like ctypes with libffi. The
_decimal module uses libmpdec which is highly optimized. In short, the
goal is to have a full working standard Python library.

It's probably better to reuse the Microsoft C library instead of
having to embed a different C library. What do you think?

What about the Python stable ABI? Would it be broken if we use a
different compiler?

What about third party Python extensions?

What about external dependencies like gzip, bz2, Tk, Tcl, OpenSSL, etc.?

Victor


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread R. David Murray
On Fri, 10 Oct 2014 01:33:58 +0200, Jesus Cea  wrote:
> On 10/10/14 01:08, Victor Stinner wrote:
> > When you say "IMAP4", do you mean any IMAP4 server? Do you have a list
> > of server vendors known to use the encoding mUTF-7?
> 
> All of them. IMAP4 protocol **REQUIRES** mUTF-7.

[...]

> I am volunteering and can even do the mercurial PUSH myself :-p. That is
> an advantage over some of those new encoding requests :-pp.
> 
> But then yes, I realize that this is a specialized tool (even if IMAP4
> is probably the most popular mail access protocol in the world), we can
> accommodate every use-case and there are tons of mUTF-7 libraries out
> there already.
> 
> So, I am asking :).

I see you are already nosy on issue 5305.  I don't think there is any
question that this is a useful and desired feature for python's imaplib.
If it makes sense to implement it as a codec, there is no reason *not*
to do that.

--David


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Jesus Cea
On 10/10/14 02:00, Victor Stinner wrote:
> 2014-10-10 1:33 GMT+02:00 Jesus Cea :
>> The purpose of these modifications is to correct the following
>>problems with UTF-7:
> 
> If you need performances, I would be interested to see if it would be
> possible to reuse the C codec for UTF-7 to share as much code as
> possible.

I don't need performance, and implementations I am studying are already
using UTF-7 as an intermediate step.

Example: 

> What is the current behaviour of imaplib in Python 3.4 with non-ASCII
> characters in mailbox names?

It breaks. Crash & burn.



Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Victor Stinner
2014-10-10 2:34 GMT+02:00 Jesus Cea :
>> What is the current behaviour of imaplib in Python 3.4 with non-ASCII
>> characters in mailbox names?
>
> It breaks. Crash & burn.

Oh ok. So in short, imaplib doesn't work on Python 3: it's a bug and
it must be fixed. I agree that a new codec is a good idea and I will
support it!

Victor


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Jesus Cea
On 10/10/14 02:43, Victor Stinner wrote:
> 2014-10-10 2:34 GMT+02:00 Jesus Cea :
>>> What is the current behaviour of imaplib in Python 3.4 with non-ASCII
>>> characters in mailbox names?
>>
>> It breaks. Crash & burn.
> 
> Oh ok. So in short, imaplib doesn't work on Python 3: it's a bug and
> it must be fixed. I agree that a new codec is good idea and I will
> support it!

Actually, it doesn't work in Python 2 either. It never supported
international mailbox names.

Dare I suggest backporting this to 2.7, since 2.7 is special and
will be supported for a long time? Or maybe this is something like
"Yes, Python 2 is broken, the real deal is Python 3"? :).



Re: [Python-Dev] Status of C compilers for Python on Windows

2014-10-09 Thread Nathaniel Smith
On Fri, Oct 10, 2014 at 1:29 AM, Victor Stinner
 wrote:
> Hi,
>
> Windows is not the primary target of Python developers, probably
> because most of them work on Linux. Official Python binaries are
> currently built by Microsoft Visual Studio. Even if Python developers
> get free licenses thanks for Microsoft, I would prefer to use an open
> source compiler if it would be possible. So *anyone* can build Python
> from scatch. I don't like the requirement of having a license to build
> Python. The free version (Visual Studio Express) only supports 32-bit
> and doesn't support PGO build (Profile-Guided Optimizations, which are
> disabled if I remember correctly because of compiler bugs).
>
> I know that it's hard to replace Visual Studio. I don't want to do it
> right now, but I would like to discuss that with you.
>
>
> === Open Watcom
>
> Jeffrey Armstrong is working on the Python support of OpenWatcom(v2), see:
> http://lightningpython.org/
> https://bitbucket.org/ArmstrongJ/lightning-python
>
> This compiler was initially written on MS-DOS in 32-bit, but it now
> supports Windows and Linux as well. The 64-bit mode is new and
> experimental. The Open Watcom "v2" project is actively developed at:
>
> https://github.com/open-watcom/open-watcom-v2/
>
> On Linux, Open Watcom don't support dynamic linking. On Windows, it
> uses its own C library. I'm not sure that Open Watcom is the best
> choice to build Python on Windows.
>
>
> === MinGW
>
> Some people tried to compile Python. See for example:
> https://bitbucket.org/puqing/python-mingw
>
> We even got some patches:
> http://bugs.python.org/issue3871 (rejected)
>
> See also:
> https://stackoverflow.com/questions/15365249/build-python-with-mingw-and-gcc
>
> MinGW reuses the Microsoft C library and it is based on GCC which is
> very stable, actively developed, supports a lot of archiectures, etc.
> I guess that it should be possible to reuse third party GCC tools like
> the famous GDB debugger?

You may want to get in touch with Carl Kleffner -- he's done a bunch
of work lately on getting a mingw-based toolchain to the point where
it can build numpy and scipy. (This is pretty urgent for us because
(a) numerical work requires a BLAS library and the main competitive
open-source one -- OpenBLAS -- cannot be built by msvc because of asm
syntax issues, (b) msvc's fortran support is even worse than its C99
support.) Getting this working is non-trivial, since by default
mingw-compiled code depends on the GCC runtime libraries, the default
ABI doesn't match msvc, etc. But apparently these issues are all
fixable.

General info:
  https://github.com/numpy/numpy/wiki/Mingw-static-toolchain

The built toolchains etc.:
  https://bitbucket.org/carlkl/mingw-w64-for-python/downloads

Readme:
  https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/readme.txt

The patch to the numpy sources -- this in particular includes the
various distutils hacks needed to enable the crucial ABI-compatibility
switches:
  https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/numpy.patch

(Unfortunately he doesn't seem to have posted the build recipe for the
toolchain itself -- I'm sure he'd be happy to if you asked though.)

AFAICT the end result is a single free compiler toolchain that can
spit out 32- and 64-bit binaries using whichever MSVC runtime you
prefer.
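
As a small aside, when comparing toolchains it can help to check which
compiler and pointer width a given interpreter was built with. The
sketch below uses only the standard library; the exact compiler string
depends on the build.

```python
import platform
import struct

# Compiler identification string recorded when this interpreter was built.
compiler = platform.python_compiler()

# Size of a C pointer in bits: 32 or 64 on common builds.
pointer_bits = struct.calcsize("P") * 8

print("built with: {}; pointer size: {}-bit".format(compiler, pointer_bits))
```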

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Victor Stinner
2014-10-10 2:52 GMT+02:00 Jesus Cea :
> "Yes, Python 2 is broken, the real deal is Python 3"? :).

For Unicode, my favorite answer is "it's time to upgrade! Python 3 has
much better Unicode support" and not to fix the issue in Python 2.7.

I don't want to open the "unicode" can of worms in Python 2. I don't
want to redo all the work I already did on Python 3.

For the specific case of the new codec, I don't know. It will be
easier to decide once the bug is fully fixed in Python 3.5 and we can
see the size of the changeset.

Victor


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Dan Stromberg
On Thu, Oct 9, 2014 at 3:47 PM, Jesus Cea  wrote:
> I miss mUTF-7 support (as used to encode IMAP4 mailbox names) in Python,
> in the codecs module. As an european with a language with 27 different
> letters (instead of english 26), tildes, opening question marks, etc., I
> find it very inconvenient.
>
> This encoding is used basically only in IMAP4, I know. But IMAP4 is an
> important protocol and all projects related to it needs mUTF-7 support
> if they care about non-english alphabets. Everybody has already an
> implementation, waste of effort.

I've been parsing up a huge gmail account with no encoding errors,
using CPython 2.x and CPython 3.x.  I'd be surprised if there are no
foreign characters in any of the thousands of messages there - but
maybe I'm just being very lucky.  I'm not specifying a codec, and I
don't see a way of specifying one offhand.

Does email.header.decode_header help you?
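
For reference, a minimal sketch of what email.header.decode_header
does: it decodes RFC 2047 encoded words in message headers. The sample
header value and the folder name below are made up for illustration.

```python
from email.header import decode_header

# An RFC 2047 encoded word, as found in message headers such as Subject.
parts = decode_header("=?utf-8?b?w6k=?=")
subject = "".join(
    chunk.decode(charset or "ascii") if isinstance(chunk, bytes) else chunk
    for chunk, charset in parts
)
print(subject)

# A string without encoded words comes back unchanged with charset None.
folder_parts = decode_header("Entw&APw-rfe")
print(folder_parts)
```

The second call shows that the "&...-" sequences of a modified UTF-7
name are not touched by this function.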


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Antoine Pitrou
On Thu, 9 Oct 2014 19:12:29 -0700
Dan Stromberg  wrote:
> On Thu, Oct 9, 2014 at 3:47 PM, Jesus Cea  wrote:
> > I miss mUTF-7 support (as used to encode IMAP4 mailbox names) in Python,
> > in the codecs module. As an european with a language with 27 different
> > letters (instead of english 26), tildes, opening question marks, etc., I
> > find it very inconvenient.
> >
> > This encoding is used basically only in IMAP4, I know. But IMAP4 is an
> > important protocol and all projects related to it needs mUTF-7 support
> > if they care about non-english alphabets. Everybody has already an
> > implementation, waste of effort.
> 
> I've been parsing up a huge gmail account with no encoding errors,
> using CPython 2.x and CPython 3.x.  I'd be surprised if there are no
> foreign characters in any of the thousands of messages there - but
> maybe I'm just being very lucky.  I'm not specifying a codec, and I
> don't see a way of specifying one offhand.

AFAIU, this is specifically about mailbox names, not messages.

Regards

Antoine.




Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Chris Angelico
On Fri, Oct 10, 2014 at 11:52 AM, Jesus Cea  wrote:
> Actually, it doesn't work in Python 2 either. It never supported
> international mailbox names.
>
> Should I dare to suggest to port this to 2.7, since 2.7 is special and
> will be supported for a long time?. Or maybe this is something like
> "Yes, Python 2 is broken, the real deal is Python 3"? :).

That's ultimately up to the release manager, but IMO this sounds like
a bug to be fixed, more than a feature being added.

ChrisA


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread R. David Murray
On Fri, 10 Oct 2014 04:28:21 +0200, Antoine Pitrou  wrote:
> On Thu, 9 Oct 2014 19:12:29 -0700
> Dan Stromberg  wrote:
> > On Thu, Oct 9, 2014 at 3:47 PM, Jesus Cea  wrote:
> > > I miss mUTF-7 support (as used to encode IMAP4 mailbox names) in Python,
> > > in the codecs module. As an european with a language with 27 different
> > > letters (instead of english 26), tildes, opening question marks, etc., I
> > > find it very inconvenient.
> > >
> > > This encoding is used basically only in IMAP4, I know. But IMAP4 is an
> > > important protocol and all projects related to it needs mUTF-7 support
> > > if they care about non-english alphabets. Everybody has already an
> > > implementation, waste of effort.
> > 
> > I've been parsing up a huge gmail account with no encoding errors,
> > using CPython 2.x and CPython 3.x.  I'd be surprised if there are no
> > foreign characters in any of the thousands of messages there - but
> > maybe I'm just being very lucky.  I'm not specifying a codec, and I
> > don't see a way of specifying one offhand.
> 
> AFAIU, this is specifically about mailbox names, not messages.

Specifically, it is about what we might better term mailbox
*folders*...that is, not what you would normally think of as the
'mailbox name', which is usually understood to be the thing before the @
in the email address (and can't contain non-ASCII yet...we need RFC 6855
support for that, and I'm not sure *anybody* has that yet).

In this context it is the names you give to folders on the IMAP
server...starting (usually) with INBOX and adding from there.  These
names are used in IMAP commands (ex: the 'select' or 'create' commands),
and IMAP uses mUTF-7 for those.

--David


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Antoine Pitrou
On Fri, 10 Oct 2014 13:36:49 +1100
Chris Angelico  wrote:
> On Fri, Oct 10, 2014 at 11:52 AM, Jesus Cea  wrote:
> > Actually, it doesn't work in Python 2 either. It never supported
> > international mailbox names.
> >
> > Should I dare to suggest to port this to 2.7, since 2.7 is special and
> > will be supported for a long time?. Or maybe this is something like
> > "Yes, Python 2 is broken, the real deal is Python 3"? :).
> 
> That's ultimately up to the release manager, but IMO this sounds like
> a bug to be fixed, more than a feature being added.

Well, it would be a bug if we had claimed to support it.

Regards

Antoine.




Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Glenn Linderman

On 10/9/2014 7:41 PM, R. David Murray wrote:
> Specifically, it is about what we might better term mailbox
> *folders*...that is, not what you would normally think of as the
> 'mailbox name', which is usually understood to be the thing before the @
> in the email address (and can't contain non-ASCII yet...we need RFC 6855
> support for that, and I'm not sure *anybody* has that yet).


There are still lots of idiotic web sites that assume everything in 
front of the @ must be a letter, digit, dot, or hyphen, and even some 
that only permit one dot after the @... even though for 30 years or so, 
the RFCs have permitted a nice variety of other special characters, 
although not all of them.


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Chris Angelico
On Fri, Oct 10, 2014 at 3:05 PM, Glenn Linderman  wrote:
> There are still lots of idiotic web sites that assume everything in front of
> the @ must be a letter, digit, dot, or hyphen, and even some that only
> permit one dot after the @... even though for 30 years or so, the RFCs have
> permitted a nice variety of other special characters, although not all of
> them.

And heaps that require a dot after the @, which is definitely not a requirement.

ChrisA