Re: [Pythonmac-SIG] unicode problem w/pyapp 2.5 vs 2.6

2009-02-21 Thread has

tom wible wrote:

i've recently installed 2.6 on my minimac pvr, and it raised a  
unicode issue:


under 2.5, the filename returned from an applescript.app is plain  
text:

[...]
>>> sfn
u'New York Goes To War_Jan_17_2009__08_00_26-1_AM.m2t'
[...]
but under 2.6:
[...]
 >>> sfn
u'\u4e00\u6500\u7700\u2000\u5900\u6f00\u7200\u6b00\u2000\u4700\u6f00\u6500\u7300\u2000\u5400\u6f00\u2000\u5700\u6100\u7200\u5f00\u4a00\u6100\u6e00\u5f00\u3100\u3700\u5f00\u3200\u3000\u3000\u3900\u5f00\u5f00\u3000\u3800\u5f00\u3000\u3000\u5f00\u3200\u3600\u2d00\u3100\u5f00\u4100\u4d00\u2e00\u6d00\u3200\u7400'
[...]
i had simply copied aem from the 2.5 site-packages to the 2.6's...is  
there

something i missed in doing that? some data is ok (the dates)


There's a known issue in Python 2.6's Unicode APIs; py-appscript  
0.19.0+ contains a workaround for this.


BTW, Python modules/extensions are not officially binary compatible  
across major Python releases, so you should be installing afresh anyway.
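
A side note on the two values quoted above: every code point in the 2.6 output is the expected character with its two bytes swapped (u'\u4e00' where u'\u004e', 'N', was expected), which is what a UTF-16 byte-order mix-up looks like. The sketch below assumes Python 3 and uses only the first eight code points of that output. The name unswap_utf16 is made up for illustration, and this is a diagnostic aid only, not the workaround that py-appscript ships.

```python
def unswap_utf16(garbled):
    """Re-read a string whose UTF-16 code units were stored byte-swapped."""
    # Serialise as big-endian UTF-16, then decode the same bytes as
    # little-endian, which swaps the two bytes of every code unit.
    return garbled.encode('utf-16-be').decode('utf-16-le')

# First eight code points of the value printed under 2.6 in this thread.
garbled = '\u4e00\u6500\u7700\u2000\u5900\u6f00\u7200\u6b00'
print(unswap_utf16(garbled))  # New York
```

Reinstalling the extension for the new interpreter is still the real fix; the round trip only confirms what kind of corruption is being observed.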


HTH

has
--
Control AppleScriptable applications from Python, Ruby and ObjC:
http://appscript.sourceforge.net

___
Pythonmac-SIG maillist  -  Pythonmac-SIG@python.org
http://mail.python.org/mailman/listinfo/pythonmac-sig


Re: [Pythonmac-SIG] unicode problem w/pyapp 2.5 vs 2.6

2009-02-21 Thread Nicholas Riley
On Sat, Feb 21, 2009 at 07:44:12AM -0500, tom wible wrote:
> i had simply copied aem from the 2.5 site-packages to the 2.6's...is there 
> something i missed in doing that? some data is ok (the dates)

Looks like you might have a UCS-4 version of one Python and a UCS-2
version of the other.  Extension modules are not compatible between
the two, although usually you get a link error.
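
One way to check which kind of build each interpreter is: sys.maxunicode reports the largest code point the build can store in one unit. This is a small sketch with a made-up helper name (unicode_build); run it under each Python in turn. On Python 3.3 and later the answer is always the wide value, so the check is only informative on older interpreters such as the 2.5 and 2.6 discussed here.

```python
import sys

def unicode_build():
    """Describe how wide this interpreter's Unicode storage is."""
    # Narrow (UCS-2) builds report 0xFFFF; wide (UCS-4) builds and
    # Python 3.3+ report 0x10FFFF.
    if sys.maxunicode == 0xFFFF:
        return 'narrow (UCS-2)'
    return 'wide (UCS-4)'

print(sys.version.split()[0], unicode_build())
```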

It's probably easiest just to reinstall appscript on 2.6.

-- 
Nicholas Riley  | 


[Pythonmac-SIG] unicode problem w/pyapp 2.5 vs 2.6

2009-02-21 Thread tom wible

i've recently installed 2.6 on my minimac pvr, and it raised a unicode issue:

under 2.5, the filename returned from an applescript.app is plain text:

tomsdvr:/DVR/recordings dvr$ /usr/local/bin/python2.5
Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53)
[GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import aem
>>> playrec  = aem.Application(aem.findapp.byname('playRec'))
>>> recIndx=22
>>> [sfn, title, eptitle, descr, startDT, stopDT] = playrec.event('ascrpsbr', 
{'snam':'getiteminfo', '': [recIndx]}).send()

>>> sfn
u'New York Goes To War_Jan_17_2009__08_00_26-1_AM.m2t'
>>>
>>> startDT
'Saturday, January 17, 2009 8:00:00 AM'

but under 2.6:
tomsdvr:/DVR/recordings dvr$ python
Python 2.6.1 (r261:67515, Dec  6 2008, 16:42:21)
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import aem
>>> playrec  = aem.Application(aem.findapp.byname('playRec'))
>>> [sfn, title, eptitle, descr, startDT, stopDT] = playrec.event('ascrpsbr', 
{'snam':'getiteminfo', '': [recIndx]}).send()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'recIndx' is not defined
>>> recIndx=31
>>> [sfn, title, eptitle, descr, startDT, stopDT] = playrec.event('ascrpsbr', 
{'snam':'getiteminfo', '': [recIndx]}).send()

>>> sfn
u'\u4e00\u6500\u7700\u2000\u5900\u6f00\u7200\u6b00\u2000\u4700\u6f00\u6500\u7300\u2000\u5400\u6f00\u2000\u5700\u6100\u7200\u5f00\u4a00\u6100\u6e00\u5f00\u3100\u3700\u5f00\u3200\u3000\u3000\u3900\u5f00\u5f00\u3000\u3800\u5f00\u3000\u3000\u5f00\u3200\u3600\u2d00\u3100\u5f00\u4100\u4d00\u2e00\u6d00\u3200\u7400'
>>> startDT
'Saturday, January 17, 2009 8:00:00 AM'


i had simply copied aem from the 2.5 site-packages to the 2.6's...is there 
something i missed in doing that? some data is ok (the dates)



Re: [Pythonmac-SIG] Unicode and split

2008-05-23 Thread Jeremy Reichman
Thanks to everyone who replied!

I'll take a further look into the encoding of the file because I'm
interested in that for other reasons. In the output I saw, u"\xe1" (and a
few others I found after sending my note) were prevalent around the splits.

For the moment, though, I've solved my immediate difficulty by splitting
twice. I really only need the space delimited fields that appear after a tab
in each line, and the characters causing problems are always before that. I
split by tab first and then a normal split of that gets me to the fields I
need.
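
The two-step split described above can be sketched roughly as follows. The function name and the sample line are invented for illustration, and the code is written for Python 3, although the same string methods exist in the older versions mentioned in this thread.

```python
def fields_after_tab(line):
    """Return the whitespace-separated fields that follow the first tab."""
    # Split once on the first tab; anything before it (where the
    # troublesome characters are) is discarded.
    parts = line.split('\t', 1)
    if len(parts) < 2:
        return []  # no tab on this line
    return parts[1].split()

print(fields_after_tab('Some \xe1 name\t12  34 56\n'))  # ['12', '34', '56']
```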


-- 
Jeremy




Re: [Pythonmac-SIG] Unicode and split

2008-05-23 Thread Christopher Barker

Jeremy Reichman wrote:

I have some characters in line strings in a file I'm processing that appear
to be Unicode. (When I print them to the shell from my script, they are
Asian characters for files like fonts in the Mac OS X filesystem.)

When I run a.split() on the affected line strings, they split on what I'm
guessing is considered a Unicode whitespace character. Specifically, the
culprit seems to be '\xe1':

$ python -c 'print "\xe1"'
?


actually, u'\xe1' is a lower case accented a: á (if the unicode comes 
through email OK), so I doubt that python is splitting on that.


Also, when you do the above, you're creating a regular string, not a 
unicode object. If you do:


$ python -c 'print u"\xe1"'
á

You may get the right thing, if your terminal is set up right to 
display unicode.


I suspect your problem is that you aren't decoding the input file 
correctly. The whole problem with unicode (and indeed, any non-ascii 
encoding), is that you need to know what encoding your data is, in order 
to use it. If it looks mostly OK when interpreted as ASCII, then it 
MIGHT be utf8, so try reading in your file and decoding it this way:


contents = myfile.read().decode('utf8')

Then do your splitting. If it's not utf8, then you'll need to figure out 
what it is.
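
A minimal sketch of "decode first, then split", assuming Python 3 and bytes already read from the file. The helper name decode_or_explain and the sample bytes are made up. The point is that a wrong encoding guess fails loudly and reports where, instead of producing odd splits later.

```python
def decode_or_explain(raw, encoding='utf-8'):
    """Decode bytes read from a file, failing loudly if the guess is wrong."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that did not fit.
        raise ValueError('not valid %s at byte offset %d'
                         % (encoding, err.start))

text = decode_or_explain(b'caf\xc3\xa9 au lait\tx y')
print(text.split('\t')[1].split())  # ['x', 'y']
```

A lone b'\xe1' is not valid UTF-8, so a file full of those bytes would raise here, which is itself a hint that the data is in some other encoding.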


First, read this:
http://www.joelonsoftware.com/articles/Unicode.html

then take a look at some of the python unicode tutorials, this is only 
one of them:


http://www.reportlab.com/i18n/python_unicode_tutorial.html

there are other good ones.

-Chris



--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]



[Pythonmac-SIG] Unicode and split

2008-05-23 Thread Jeremy Reichman
I have some characters in line strings in a file I'm processing that appear
to be Unicode. (When I print them to the shell from my script, they are
Asian characters for files like fonts in the Mac OS X filesystem.)

When I run a.split() on the affected line strings, they split on what I'm
guessing is considered a Unicode whitespace character. Specifically, the
culprit seems to be '\xe1':

$ python -c 'print "\xe1"'
?

I want to split only on ASCII spaces and tabs, however. Unfortunately, the
line strings from the file may be split on space runs and/or tabs -- and I
have no control over what was originally written to the source files -- so
the defaults for a.split() are otherwise ideal. The split method works on
most lines I'm processing perfectly well.

I'd rather not have to import the 're' module to split on a regular
expression.

Does anyone have any suggestions on how to handle this? I'm in Apple's
Python 2.5.1 in Leopard, and I'd also like to remain compatible with 2.3.x
in Tiger. I'd appreciate advice, thanks!
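
For reference, splitting on runs of ASCII spaces and tabs only, without the 're' module, could look something like this. It is a sketch with a made-up function name, shown on Python 3 text strings; it relies only on replace(), split() with an explicit separator, and a list comprehension.

```python
def split_ascii_blank(line):
    """Split on runs of ASCII spaces and tabs only."""
    # Drop the line ending, turn tabs into spaces, split on single spaces,
    # then discard the empty strings that runs of spaces leave behind.
    pieces = line.rstrip('\r\n').replace('\t', ' ').split(' ')
    return [p for p in pieces if p]

# A no-break space (\xa0) stays inside its field instead of splitting it.
print(split_ascii_blank('a\xa0b \t c\n'))
```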


-- 
Jeremy




Re: [Pythonmac-SIG] Unicode path problems (was: Re: suggestions for an appscript FAQ)

2008-03-19 Thread Henning Hraban Ramm
On 2008-03-18 at 20:13, has wrote:
>> E.g. I had big problems with Finder labels - often got "16393" as
>> label_index for some tries, until finally the right number showed
>> up. Can't reproduce that any more, without changing my installation.
>
> Sounds very odd. Doubt the problem is coming from appscript itself,
> unless its installation has gotten screwed up. You might check that
> any older versions have been completely removed, doing a thorough
> cleanout and fresh reinstall if needed. Also, where a problem is
> encountered, try running the equivalent AppleScript to see if the
> problem occurs there as well.

I guess it has to do with my own buggy path conversion - or perhaps  
it's a problem with a SMB mounted volume.
If the file/folder is visible in Finder, it works better.

Can't trace it at the moment - working long shifts every day to get  
my most urgent magazine projects to the printer.

>> As I posted before, it was a bit hard to find out where to apply
>> which unicode conversion to file paths. (And before you advised me
>> about mactypes I tried to convert Unix to Mac paths myself, which
>> didn't always work.)
>>
>> This way works for me, hopefully it will last:
>>
>> myfile = mactypes.Alias(os.path.abspath(unicodedata.normalize('NFD',
>> unicode(sys.argv[1], 'utf-8'))))
>
> I'll need to look into normalisation issues when I've time, although I
> would've assumed the filesystem APIs would apply any necessary
> normalisation at their end.

Unfortunately it doesn't. Or at least not reliably.
And it's hopeless to try to teach your colleagues and customers to  
use only ASCII file names - some still use slashes in file names and  
leave out extensions because it worked in OS9 (one customer still  
works on OS7, I guess...)

>> (Or could it depend on TerminologyServer running? Will check that
>> tomorrow.)
> TerminologyServer is long, long gone. If you've still got a copy
> kicking around then you definitely need to do some cleaning-up. :)

Ok, thanks, deleted it.
The other old stuff was cleaned out before, but there was no  
replacement for TS, so I left it...

As I mentioned: since I got the path stuff right, everything works  
(as soon as I figure out how some strange applescript reference  
property works).

Now I must catch up on my Python OOP skills - I started with 2.2 and  
didn't embrace a lot of new syntax since 2.3.
Also I must check whether there are appscript shortcuts to some code  
I tinkered.
Otherwise I'd be ashamed to provide my code as an appscript example.

>> Ok, perhaps you could answer generally, why some items are "NOT
>> AVAILABLE (YET)".
>> Or did I overlook that somewhere?
> Who can explain what goes on in application developers' heads? These
> commands were supported in Finder on OS 9, but weren't re-implemented
> when it was ported over to OS X. For whatever reason, the Finder
> developers left the dictionary definitions in instead of doing the
> obvious and sensible thing which would be to hide or remove them. One
> of the few consistencies in scriptable applications, particularly
> Carbon ones, is that they are inconsistent.

Ah, sorry, didn't realize that it comes from the AS dictionary  
itself, I thought those were some items you couldn't wrap for Python.

>> I think the worst problem with appscript is the "strange behaviour"
>> if you're used to either linear or object oriented or really
>> asynchronous code - appscript looks like linear & object oriented,
>> but is "a bit" asynchronous (but not like twisted's Deferreds) and
>> instead of "real" objects you get those dynamic references.
>
> The OO-like syntax can be misleading for newcomers, but dressing up
> query-driven APIs in OO-ish syntax for conciseness and readability
> isn't unique to appscript; e.g. SQLObject. The key to 'getting'
> appscript is realising that Mac application scripting is based on RPC
> +queries, and what that implies.
>
> The problem is a communication one: either folks aren't reading the
> appscript documentation, or they aren't understanding it. I know that
> some folk fall into the first trap (they look at existing code
> samples, and assume that since it looks like OOP, which they already
> know, it must behave like it as well), and I've no doubt that others
> fall into the second (since my writing is less than stellar).
>
> Any suggestions on how to address either of these problems will be
> very welcome.

Don't know. You do explain that behaviour. But I did understand only  
after some experimenting what you mean. (Being no native speaker  
doesn't help ;-)

On one hand it would be nice to get examples how to write really  
asynchronous code with appscript (e.g. with PyDispatcher or Twisted),  
on the other hand you normally need the seemingly linear behaviour of  
appscript - I doubt if it would work to send several events to an  
application in an asynchronous way - perhaps to different apps, but  
then you would better use subprocessing...

Oh, another FAQ:
Does appscript ca

Re: [Pythonmac-SIG] Unicode path problems (was: Re: suggestions for an appscript FAQ)

2008-03-18 Thread has
Henning Hraban Ramm wrote:

>> I struggled a lot with paths containing non-ASCII characters.
>> Hmmm. Is this with the 0.18.1 release? Do you get the same problem
>> with the current appscript trunk?
>
> Sorry - since yesterday it works (with 0.18.1).
> Some problems with appscript seem to appear or disappear  
> unreproducibly. :-(
>
> E.g. I had big problems with Finder labels - often got "16393" as  
> label_index for some tries, until finally the right number showed  
> up. Can't reproduce that any more, without changing my installation.

Sounds very odd. Doubt the problem is coming from appscript itself,  
unless its installation has gotten screwed up. You might check that  
any older versions have been completely removed, doing a thorough  
cleanout and fresh reinstall if needed. Also, where a problem is  
encountered, try running the equivalent AppleScript to see if the  
problem occurs there as well.


> As I posted before, it was a bit hard to find out where to apply  
> which unicode conversion to file paths. (And before you advised me  
> about mactypes I tried to convert Unix to Mac paths myself, which  
> didn't always work.)
>
> This way works for me, hopefully it will last:
>
> myfile = mactypes.Alias(os.path.abspath(unicodedata.normalize('NFD',  
> unicode(sys.argv[1], 'utf-8'))))

I'll need to look into normalisation issues when I've time, although I  
would've assumed the filesystem APIs would apply any necessary  
normalisation at their end.
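
The path preparation in the quoted one-liner can be tried without appscript installed. The sketch below assumes Python 3 and stops short of the mactypes.Alias call; decomposed_abspath and the sample file name are invented for illustration.

```python
import os
import unicodedata

def decomposed_abspath(arg, encoding='utf-8'):
    """Decode a command-line argument if needed; return an NFD absolute path."""
    if isinstance(arg, bytes):
        arg = arg.decode(encoding)
    # NFD splits each accented letter into base letter + combining accent.
    return os.path.abspath(unicodedata.normalize('NFD', arg))

path = decomposed_abspath(b'caf\xc3\xa9.indd')
print(os.path.basename(path) == 'cafe\u0301.indd')  # True
```

Only the argument is normalised here; the current directory that abspath() prepends is left exactly as the OS reports it.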


>>> << 1 >>
>>> Why does this work:
>>> document = InDesign.make(new=k.document)
>>> document.save(to=ComposedFile(path))
>>>
>>> but not this:
>>> document =  
>>> InDesign.make(new=k.document).save(to=ComposedFile(path))
>>
>> Seems to work on CS3. What are you expecting it to do vs. what is it
>> actually doing?
>
> If I chained it, it just wouldn't save.
> Again: can't reproduce the error.

The two above examples create and save the document in exactly the  
same way. Python+appscript isn't like AppleScript, where the  
interpreter inserts magic behaviours between lines. Could be something  
else in your setup is acting up, resulting in hard-to-trace errors  
cropping up elsewhere.


> I guess my path wasn't perfectly converted until today or something  
> like that.
> (Or could it depend on TerminologyServer running? Will check that  
> tomorrow.)

TerminologyServer is long, long gone. If you've still got a copy  
kicking around then you definitely need to do some cleaning-up. :)


>>> << 2 >>
>>> Why is Finder.copy not (yet) implemented? Will it soon?
>>
>> Application-specific problems are outside the scope of appscript's
>> FAQ, but I'm going to add a "I'm having trouble scripting > app>. What do I do now?" topic (suggested on rb-appscript-discuss)
>> pointing users to appropriate forums (AppleScript-users, etc.) as a
>> general cover-all.
>
> Ok, perhaps you could answer generally, why some items are "NOT  
> AVAILABLE (YET)".
> Or did I overlook that somewhere?

Who can explain what goes on in application developers' heads? These  
commands were supported in Finder on OS 9, but weren't re-implemented  
when it was ported over to OS X. For whatever reason, the Finder  
developers left the dictionary definitions in instead of doing the  
obvious and sensible thing which would be to hide or remove them. One  
of the few consistencies in scriptable applications, particularly  
Carbon ones, is that they are inconsistent.


> I think the worst problem with appscript is the "strange behaviour"  
> if you're used to either linear or object oriented or really  
> asynchronous code - appscript looks like linear & object oriented,  
> but is "a bit" asynchronous (but not like twisted's Deferreds) and  
> instead of "real" objects you get those dynamic references.

The OO-like syntax can be misleading for newcomers, but dressing up  
query-driven APIs in OO-ish syntax for conciseness and readability  
isn't unique to appscript; e.g. SQLObject. The key to 'getting'  
appscript is realising that Mac application scripting is based on RPC 
+queries, and what that implies.

The problem is a communication one: either folks aren't reading the  
appscript documentation, or they aren't understanding it. I know that  
some folk fall into the first trap (they look at existing code  
samples, and assume that since it looks like OOP, which they already  
know, it must behave like it as well), and I've no doubt that others  
fall into the second (since my writing is less than stellar).

Any suggestions on how to address either of these problems will be  
very welcome.


> Additionally every application behaves differently and you've to  
> find out what works how (or not) - or how they call something  
> internally what you know with some (translated) name from GUI and  
> manual...

Lousy API documentation is a chronic problem amongst scriptable  
applications. File feature requests with application developers asking  
for improved documentation. The 

Re: [Pythonmac-SIG] Unicode path problems (was: Re: suggestions for an appscript FAQ)

2008-03-15 Thread Henning Hraban Ramm
On 2008-03-15 at 21:37, has wrote:

> Henning Hraban Ramm wrote:
>> I struggled a lot with paths containing non-ASCII characters.
> Hmmm. Is this with the 0.18.1 release? Do you get the same problem
> with the current appscript trunk?

Sorry - since yesterday it works (with 0.18.1).
Some problems with appscript seem to appear or disappear  
unreproducibly. :-(

E.g. I had big problems with Finder labels - often got "16393" as  
label_index for some tries, until finally the right number showed up.  
Can't reproduce that any more, without changing my installation.

As I posted before, it was a bit hard to find out where to apply  
which unicode conversion to file paths. (And before you advised me  
about mactypes I tried to convert Unix to Mac paths myself, which  
didn't always work.)

This way works for me, hopefully it will last:

myfile = mactypes.Alias(os.path.abspath(unicodedata.normalize('NFD',  
unicode(sys.argv[1], 'utf-8'))))


Greetlings from Lake Constance!
Hraban
---
http://www.fiee.net
https://www.cacert.org (I'm an assurer)




Re: [Pythonmac-SIG] Unicode path problems (was: Re: suggestions for an appscript FAQ)

2008-03-15 Thread has
Henning Hraban Ramm wrote:

> I struggled a lot with paths containing non-ASCII characters.

Hmmm. Is this with the 0.18.1 release? Do you get the same problem  
with the current appscript trunk?

Ta,

has
-- 
Control AppleScriptable applications from Python, Ruby and ObjC:
http://appscript.sourceforge.net



Re: [Pythonmac-SIG] Unicode

2007-03-15 Thread Daniel Lord
Good article link, Thanks.

On Mar 14, 2007, at 9:18 PM, Bob Ippolito wrote:

> Here's a very recent, well written and pertinent article:
>
> http://boodebr.org/main/python/all-about-python-and-unicode
>
> -bob


Re: [Pythonmac-SIG] Unicode

2007-03-14 Thread Bob Ippolito
Here's a very recent, well written and pertinent article:

http://boodebr.org/main/python/all-about-python-and-unicode

-bob

On 3/14/07, Dougal Graham <[EMAIL PROTECTED]> wrote:
> Thanks for the quick reply! As I'm sure you can tell, I'm still fairly
> new to Python. Do you know of a tutorial on how to properly manage
> unicode in Python, then?
>
> I ran into trouble when trying to run a command containing unicode
> characters through commands.getoutput()...
>
> On 3/15/07, Bob Ippolito <[EMAIL PROTECTED]> wrote:
> > On 3/14/07, Dougal Graham <[EMAIL PROTECTED]> wrote:
> > > Hi there,
> > >
> > > I am having a problem with figuring out how to set utf-8 as the
> > > default encoding for python. I have found various references to
> > > sitecustomize.py, but I'm not sure where to put that file. I just
> > > recently updated to python 2.5 using the .dmg file from python.org.
> > >
> >
> > You really don't want to do that. The default encoding should always
> > be ASCII, setting it to anything else breaks some invariants.
> >
> > The site-packages folder is one place to put stuff. It lives in
> > /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages
> >
> > However, don't do it in this case. The reason you haven't been able to
> > find a good resource that tells you how to do it is because it's not
> > the correct thing to do.
> >
> > Here's Frederik's thoughts on the setting (and he is certainly an
> > authority on Python's unicode implementation).
> >
> > """
> > sys.setdefaultencoding() was added for experimentation during Unicode
> > development, and should not be used in production code. All sorts of
> > ugliness can happen if you mess around with the conversion rules
> > (especially if you use a variable-width encoding). It's not that hard
> > to write encoding-aware code, really.
> > """
> >
> > -bob
> >
>
>
> --
> Dougal Graham
> Home: (709) 753-2831
> Cell: (709) 351-0587


Re: [Pythonmac-SIG] Unicode

2007-03-14 Thread Dougal Graham
Thanks for the quick reply! As I'm sure you can tell, I'm still fairly
new to Python. Do you know of a tutorial on how to properly manage
unicode in Python, then?

I ran into trouble when trying to run a command containing unicode
characters through commands.getoutput()...

On 3/15/07, Bob Ippolito <[EMAIL PROTECTED]> wrote:
> On 3/14/07, Dougal Graham <[EMAIL PROTECTED]> wrote:
> > Hi there,
> >
> > I am having a problem with figuring out how to set utf-8 as the
> > default encoding for python. I have found various references to
> > sitecustomize.py, but I'm not sure where to put that file. I just
> > recently updated to python 2.5 using the .dmg file from python.org.
> >
>
> You really don't want to do that. The default encoding should always
> be ASCII, setting it to anything else breaks some invariants.
>
> The site-packages folder is one place to put stuff. It lives in
> /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages
>
> However, don't do it in this case. The reason you haven't been able to
> find a good resource that tells you how to do it is because it's not
> the correct thing to do.
>
> Here's Frederik's thoughts on the setting (and he is certainly an
> authority on Python's unicode implementation).
>
> """
> sys.setdefaultencoding() was added for experimentation during Unicode
> development, and should not be used in production code. All sorts of
> ugliness can happen if you mess around with the conversion rules
> (especially if you use a variable-width encoding). It's not that hard
> to write encoding-aware code, really.
> """
>
> -bob
>


-- 
Dougal Graham
Home: (709) 753-2831
Cell: (709) 351-0587


Re: [Pythonmac-SIG] Unicode

2007-03-14 Thread Bob Ippolito
On 3/14/07, Dougal Graham <[EMAIL PROTECTED]> wrote:
> Hi there,
>
> I am having a problem with figuring out how to set utf-8 as the
> default encoding for python. I have found various references to
> sitecustomize.py, but I'm not sure where to put that file. I just
> recently updated to python 2.5 using the .dmg file from python.org.
>

You really don't want to do that. The default encoding should always
be ASCII, setting it to anything else breaks some invariants.

The site-packages folder is one place to put stuff. It lives in
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages

However, don't do it in this case. The reason you haven't been able to
find a good resource that tells you how to do it is because it's not
the correct thing to do.

Here's Frederik's thoughts on the setting (and he is certainly an
authority on Python's unicode implementation).

"""
sys.setdefaultencoding() was added for experimentation during Unicode
development, and should not be used in production code. All sorts of
ugliness can happen if you mess around with the conversion rules
(especially if you use a variable-width encoding). It's not that hard
to write encoding-aware code, really.
"""
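
As a rough illustration of what "encoding-aware code" can mean in practice: decode bytes once where they enter the program, work on text in the middle, and encode once where the result leaves. The two helper names are made up, and the sketch assumes Python 3 and UTF-8 data; no default encoding is changed anywhere.

```python
def to_text(value, encoding='utf-8'):
    """Decode bytes arriving from outside; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

def to_bytes(value, encoding='utf-8'):
    """Encode text just before it leaves the program."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)

name = to_text(b'\xc3\xa1.txt')   # decode once, at the input boundary
print(len(name))                  # 5
print(to_bytes(name.upper()))     # b'\xc3\x81.TXT'
```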

-bob


[Pythonmac-SIG] Unicode

2007-03-14 Thread Dougal Graham
Hi there,

I am having a problem with figuring out how to set utf-8 as the
default encoding for python. I have found various references to
sitecustomize.py, but I'm not sure where to put that file. I just
recently updated to python 2.5 using the .dmg file from python.org.

Any help would be greatly appreciated,

Thanks,

-Dougal

-- 
Dougal Graham
Home: (709) 753-2831
Cell: (709) 351-0587


Re: [Pythonmac-SIG] Unicode Filenames on the Mac

2005-07-15 Thread Piet van Oostrum
> Bob Ippolito <[EMAIL PROTECTED]> (BI) wrote:

>BI> >>> import sys
>BI> >>> sys.getfilesystemencoding()
>BI> 'utf-8'

It is UTF-8, but you must be careful: the filenames are in normalized (or
whatever they call it) UTF-8, meaning that accented letters are split up
into the letter followed by the accent. The filename API does accept the
composed accented letters, but normalizes them, and that is what the
listdir calls return.

>>> fn = u'\u00E1'
>>> f = open(fn,'w')
>>> f.close()

We now have a file with name 'á'

>>> import os
>>> os.listdir (u'.')
[u'a\u0301']

The accent follows the 'a'.
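
A practical consequence is that a name you typed and the name listdir hands back may not compare equal even though they look identical. A small sketch, assuming Python 3 and a made-up helper name, that compares names after bringing both to the same normalization form:

```python
import unicodedata

def same_filename(a, b):
    """Compare two names after normalizing both to the same form (NFC)."""
    return unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)

print('\u00e1' == 'a\u0301')               # False
print(same_filename('\u00e1', 'a\u0301'))  # True
```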
-- 
Piet van Oostrum <[EMAIL PROTECTED]>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: [EMAIL PROTECTED]


Re: [Pythonmac-SIG] Unicode Filenames on the Mac

2005-07-14 Thread Bob Ippolito

On Jul 14, 2005, at 9:17 AM, Nick Matsakis wrote:

>
> On Wed, 13 Jul 2005, Bob Ippolito wrote:
>
>
>> HFS actually uses UTF-16 internally, but the POSIX layer is UTF-8.
>> It will bite you if you expect the code to work on other platforms.
>> Not all platforms use UTF-8 for their filesystem encoding.
>>
>
> I don't care about other platforms, but I assume from your message  
> that
> sending 'unicode' strings to system modules is safe (and a best  
> practice
> too?).

yes

-bob



Re: [Pythonmac-SIG] Unicode Filenames on the Mac

2005-07-14 Thread Nick Matsakis

On Wed, 13 Jul 2005, Bob Ippolito wrote:

> HFS actually uses UTF-16 internally, but the POSIX layer is UTF-8.
> It will bite you if you expect the code to work on other platforms.
> Not all platforms use UTF-8 for their filesystem encoding.

I don't care about other platforms, but I assume from your message that
sending 'unicode' strings to system modules is safe (and a best practice
too?).

Nick


Re: [Pythonmac-SIG] Unicode Filenames on the Mac

2005-07-13 Thread Bob Ippolito

On Jul 13, 2005, at 6:05 PM, Nick Matsakis wrote:

>
> What is the best way to deal with non-ASCII paths when working with  
> the
> python standard library? Specifically, when using functions like  
> open()
> and the os and glob modules, what should be passed in?  What should I
> expect out?

If you pass unicode in, you get unicode out:

 >>> import os
 >>> set(map(type, os.listdir('.')))
set([<type 'str'>])
 >>> set(map(type, os.listdir(u'.')))
set([<type 'unicode'>])

Otherwise you pass and receive byte strings.  The encoding of those  
byte strings is fixed:

 >>> import sys
 >>> sys.getfilesystemencoding()
'utf-8'
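
For anyone trying this on a current interpreter: the same "what goes in comes out" rule holds in Python 3, where the two types are str (text) and bytes. The sketch below is not part of the original session; listing_types is a made-up helper and it uses a scratch directory so that it can run anywhere.

```python
import os
import sys
import tempfile

def listing_types(directory):
    """Return the type names os.listdir() yields for this kind of argument."""
    return sorted(set(type(name).__name__ for name in os.listdir(directory)))

# Scratch directory with one file, so the listing is never empty.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, 'example.txt'), 'w'):
    pass

print(listing_types(tmp))               # ['str']
print(listing_types(os.fsencode(tmp)))  # ['bytes']
print(sys.getfilesystemencoding())
```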

> In experimenting with it, it appears that these libraries accept str
> objects containing UTF-8 encoded bytes and similarly that is what they
> return.  It would seem better to me if they could be made to accept  
> and
> return unicode objects, but I could see that that might cause  
> backwards
> compatibility problems.  Still, is UTF-8 encoded strs really a safe  
> bet?
> Are there circumstances, including non HFS filesystems, where it  
> will bite
> me if I make this assumption?

HFS actually uses UTF-16 internally, but the POSIX layer is UTF-8.   
It will bite you if you expect the code to work on other platforms.   
Not all platforms use UTF-8 for their filesystem encoding.

-bob



[Pythonmac-SIG] Unicode Filenames on the Mac

2005-07-13 Thread Nick Matsakis

What is the best way to deal with non-ASCII paths when working with the
python standard library? Specifically, when using functions like open()
and the os and glob modules, what should be passed in?  What should I
expect out?

In experimenting with it, it appears that these libraries accept str
objects containing UTF-8 encoded bytes and similarly that is what they
return.  It would seem better to me if they could be made to accept and
return unicode objects, but I could see that that might cause backwards
compatibility problems.  Still, is UTF-8 encoded strs really a safe bet?
Are there circumstances, including non HFS filesystems, where it will bite
me if I make this assumption?

Nick