Re: japanese encoding iso-2022-jp in python vs. perl

2007-10-24 Thread Leo Kislov
On Oct 23, 3:37 am, kettle [EMAIL PROTECTED] wrote:
 Hi,
   I am rather new to python, and am currently struggling with some
 encoding issues.  I have some utf-8-encoded text which I need to
 encode as iso-2022-jp before sending it out to the world. I am using
 python's encode functions:
 --
  var = var.encode("iso-2022-jp", "replace")
  print var
 --

  I am using the 'replace' argument because there seem to be a couple
 of utf-8 japanese characters which python can't correctly convert to
 iso-2022-jp.  The output looks like this:
 ↓東京???日比谷線?北千住行

  However if I use Perl's Encode module to re-encode the exact same bit
 of text:
 --
  $var = encode("iso-2022-jp", decode("utf8", $var))
  print $var
 --

  I get proper output (no unsightly question-marks):
 ↓東京メトロ日比谷線・北千住行

 So, what's the deal?  

Good thing my crystal ball is working. I can see clearly that the fourth
character of the input is 'HALFWIDTH KATAKANA LETTER ME' (U+FF92), which
is not present in ISO-2022-JP as defined by RFC 1468, so Python converts
it into a question mark, as you requested. Meanwhile Perl, as usual,
tries to guess what you want and silently converts that character into
'KATAKANA LETTER ME' (U+30E1), which is present in ISO-2022-JP.

 Why can't python properly encode some of these
 characters?

Because explicit is better than implicit. Do you care about
round-tripping? Do you care about the width of characters? What about
the full-width quotation mark (U+FF02)? Python doesn't know the answers
to these questions, so it doesn't do anything with your input. You have
to do it yourself. Assuming you don't care about round-tripping and
width, here is an example demonstrating how to deal with narrow
characters:

from unicodedata import normalize
iso2022_squeezing = dict((i, normalize('NFKC', unichr(i)))
                         for i in range(0xFF61, 0xFFE0))
print repr(u'\uFF92'.translate(iso2022_squeezing))

It prints u'\u30e1'. Feel free to ask questions if something is not
clear.

Note, this is just an example; I *don't* claim it does what you want
for every character in the FF61-FFDF range. You may want to carefully
review the whole Unicode block:
http://www.unicode.org/charts/PDF/UFF00.pdf
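
For readers on Python 3, where unichr and the u'' prefix no longer
exist, the same idea could be sketched roughly as below. This is only
an illustration under assumptions: the helper name to_iso2022jp is made
up here, and the extra NFC pass (which recombines a kana with a
following voiced sound mark) is an addition that is not part of the
snippet above. The caveat about reviewing the FF61-FFDF range still
applies.

```python
from unicodedata import normalize

# Same table as in the post: each code point in FF61-FFDF mapped to
# its NFKC (compatibility) form.
SQUEEZE = {i: normalize('NFKC', chr(i)) for i in range(0xFF61, 0xFFE0)}


def to_iso2022jp(text):
    """Hypothetical helper: widen halfwidth forms, then encode."""
    widened = text.translate(SQUEEZE)
    # Assumption: recompose pairs such as KA + combining voiced mark,
    # which per-character translation leaves decomposed.
    widened = normalize('NFC', widened)
    # Anything still not representable becomes '?', as in the original post.
    return widened.encode('iso-2022-jp', 'replace')


print(ascii('\uff92'.translate(SQUEEZE)))
print(to_iso2022jp('\uff92'))
```

Whether dropping the width distinction is acceptable depends on the
questions asked above, so treat this as a starting point only.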

  -- Leo.

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Portable general timestamp format, not 2038-limited

2007-06-28 Thread Leo Kislov
On Jun 27, 10:51 pm, Paul Rubin http://[EMAIL PROTECTED] wrote:
 The difficulty/impossibility of computing intervals on UTC because of
 leap seconds suggests TAI is a superior timestamp format.

If you care about intervals you'd better keep timestamps in SI seconds
since some zero time point (just like OP wanted). TAI timestamps are
pretty useless IMHO. They need to be converted to decimal/float for
interval calculations and they don't represent any legal time.

  -- Leo



Re: String formatting for complex writing systems

2007-06-27 Thread Leo Kislov
On Jun 27, 12:20 am, Andy [EMAIL PROTECTED] wrote:
 Hi guys,

 I'm writing a piece of software for some Thai friend.  At the end it
 is supposed to print on paper some report with tables of text and
 numbers.  When I test it in English, the columns are aligned nicely,
 but when he tests it with Thai data, the columns are all crooked.

 The problem here is that in the Thai writing system sometimes two or
 more characters together might take one single space, for example งิ
 (u"\u0E07\u0E34").  This is why when I use something like u"%10s"
 % ..., it just doesn't work as expected.

 Is anybody aware of an alternative string format function that can
 deal with this kind of writing properly?

In the general case it's impossible to write such a function for many
Unicode characters without feedback from the rendering library.
Assuming you use a *fixed*-width font for both English and Thai, the
following function will return how many columns your text will use:

from unicodedata import category
def columns(self, s):
    return sum(1 for c in s if category(c) != 'Mn')

  -- Leo


Re: String formatting for complex writing systems

2007-06-27 Thread Leo Kislov
On Jun 27, 3:10 am, Leo Kislov [EMAIL PROTECTED] wrote:
 On Jun 27, 12:20 am, Andy [EMAIL PROTECTED] wrote:

  Hi guys,

  I'm writing a piece of software for some Thai friend.  At the end it
  is supposed to print on paper some report with tables of text and
  numbers.  When I test it in English, the columns are aligned nicely,
  but when he tests it with Thai data, the columns are all crooked.

  The problem here is that in the Thai writing system sometimes two or
  more characters together might take one single space, for example งิ
  (u"\u0E07\u0E34").  This is why when I use something like u"%10s"
  % ..., it just doesn't work as expected.

  Is anybody aware of an alternative string format function that can
  deal with this kind of writing properly?

 In general case it's impossible to write such a function for many
 unicode characters without feedback from rendering library.
 Assuming you use *fixed* font for English and Thai the following
 function will return how many columns your text will use:

 from unicodedata import category
 def columns(self, s):
     return sum(1 for c in s if category(c) != 'Mn')

That should of course be written as def columns(s). Need to learn to
proofread before posting :)
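
With the stray self removed, the function and one possible way of using
it for padding could look like the sketch below (written for Python 3).
The ljust_columns helper is a hypothetical name introduced here for
illustration, and the whole thing still rests on the fixed-width-font
assumption stated earlier.

```python
from unicodedata import category


def columns(s):
    """Count display columns: every character except nonspacing marks."""
    return sum(1 for c in s if category(c) != 'Mn')


def ljust_columns(s, width):
    """Hypothetical helper: pad s with spaces up to `width` columns."""
    return s + ' ' * max(0, width - columns(s))


# NGO NGU + SARA I occupy a single column, so three spaces are added.
print(repr(ljust_columns('\u0e07\u0e34', 4)))
```

A plain '%-4s' would add only two spaces here, because it counts code
points rather than columns.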

  -- Leo


Re: Method much slower than function?

2007-06-13 Thread Leo Kislov
On Jun 13, 5:40 pm, [EMAIL PROTECTED] wrote:
 Hi all,

 I am running Python 2.5 on Feisty Ubuntu. I came across some code that
 is substantially slower when in a method than in a function.

  cProfile.run("bar.readgenome(open('cb_foo'))")

  20004 function calls in 10.214 CPU seconds

  cProfile.run("z=r.readgenome(open('cb_foo'))")

  20004 function calls in 0.041 CPU seconds


I suspect open files are cached, so the second reader picks up where
the first one left off: at the end of the file. The second call doesn't
do any text processing at all.
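
The effect suspected here, a second reader starting at end of file, is
easy to reproduce with any shared file object. The sketch below (Python
3) uses an in-memory io.StringIO as a stand-in for the real file; it
only illustrates the mechanism and is not the original poster's code.

```python
import io


def count_lines(f):
    """Consume the file object and return how many lines were read."""
    return sum(1 for _ in f)


shared = io.StringIO('line 1\nline 2\nline 3\n')

first = count_lines(shared)    # reads everything
second = count_lines(shared)   # position is already at the end
shared.seek(0)
third = count_lines(shared)    # rewinding makes the data visible again

print(first, second, third)
```

A reader that finds nothing to read will of course look much faster in
a profile.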

  -- Leo



Re: How to wrap a Japanese text in Python

2007-06-08 Thread Leo Kislov
On Jun 7, 5:12 am, [EMAIL PROTECTED] wrote:
 Hi All,

 I am trying to wrap a japanese text in Python, by the following code.

 if len(message) > 54:
     message = message.decode("UTF8")
     strlist = textwrap.wrap(message,54)

 After this I am writing it to a CAD software window. While
 displaying in this window, some Japanese characters at the end of the
 line and some at the beginning of the line are not displayed at all.
 Meaning the text wrapping is not happening correctly.

 Can any body please help me out in resolving this problem.

First of all, you should move the message.decode('utf-8') call out of
the if, and you don't need the if anyway, because textwrap won't touch
a line that is shorter than 54 characters:

message = message.decode('utf-8')
strlist = textwrap.wrap(message, 54)
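
The decode-first, wrap-second order can be sketched in Python 3 as
follows. The sample string is an arbitrary run of hiragana chosen for
this illustration, not the text from the original question, and the
width of 4 is only there to keep the output short.

```python
import textwrap

# Stand-in for bytes received from elsewhere (assumed to be UTF-8).
raw = 'あいうえおかきくけこ'.encode('utf-8')

# Decode first, so that wrapping counts characters rather than bytes.
message = raw.decode('utf-8')
lines = textwrap.wrap(message, 4)

for line in lines:
    print(line)
```

Note that textwrap measures length in code points, not in display
columns, so it knows nothing about how wide the target window renders
each character.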

I don't know Japanese but the following example *seems* to work fine
for me:

# -*- coding: utf-8 -*-
sample=u  
 

import textwrap
for line in textwrap.wrap(sample, 6):
    print line

Result:

  
  
  
  
  
  
  
  

Can you post a short example that clearly demonstrates the problem?

  -- Leo




Re: How to wrap a Japanese text in Python

2007-06-08 Thread Leo Kislov
On Jun 8, 2:24 am, Leo Kislov [EMAIL PROTECTED] wrote:
 On Jun 7, 5:12 am, [EMAIL PROTECTED] wrote:

  Hi All,

  I am trying to wrap a japanese text in Python, by the following code.

  if len(message) > 54:
      message = message.decode("UTF8")
      strlist = textwrap.wrap(message,54)

  After this I am wirting it to you a CAD Software window. While
  displaying in this window some Japanese characters at the end of the
  line  some at the begining of the line are not displayed at all.
  Meaning the text wrapping is not happening correctly.

  Can any body please help me out in resolving this problem.

 First of all you should move message.decode('utf-8') call out of if
 and you don't need if anyway because if the line is less than 54
 textwrap won't touch it:

 message = message.decode('utf-8')
 strlist = textwrap.wrap(message, 54)

 I don't know Japanese but the following example *seems* to work fine
 for me:

 # -*- coding: utf-8 -*-
 sample=u  
  

 import textwrap
 for line in textwrap.wrap(sample, 6):
 print line
 
 Result:

Oh, my. IE7 and/or Google groups ate my Japanese text :(  But I hope
you've got the idea: try to work on a small example python program
in a unicode-friendly IDE like for example IDLE.

 Can you post a short example that clearly demonstrates the problem?

This question is still valid.

  -- Leo.



Re: Python memory handling

2007-06-01 Thread Leo Kislov
On May 31, 8:06 am, [EMAIL PROTECTED] wrote:
 Hello,

 I will try later with python 2.5 under linux, but as far as I can see,
 it's the same problem under my windows python 2.5
 After reading this document 
 :http://evanjones.ca/memoryallocator/python-memory.pdf

 I think it's because list or dictionnaries are used by the parser, and
 python use an internal memory pool (not pymalloc) for them...


If I understand the document correctly, you should be able to free the
list and dict caches if you create more than 80 new lists and dicts:

[(list(), dict()) for i in range(88)]

If it doesn't help, that means either 1) the list and dict caches don't
really work like I think, or 2) pymalloc cannot return memory because of
fragmentation, and that is not simple to fix.

  -- Leo



Re: getmtime differs between Py2.5 and Py2.4

2007-05-07 Thread Leo Kislov
On May 7, 4:15 pm, Irmen de Jong [EMAIL PROTECTED] wrote:
 Martin v. Löwis wrote:
  Is this a bug?

  Why don't you read the responses posted earlier? John Machin
  replied (in [EMAIL PROTECTED])
  that you are mistaken: There is NO difference between the outcome
  of os.path.getmtime between Py2.5 and Py2.4. It always did return
  UTC, and always will.

  Regards,
  Martin

 Err.:

 [E:\Projects]dir *.py

   Volume in drive E is Data   Serial number is 2C4F:9C2D
   Directory of  E:\Projects\*.py

 31-03-2007  20:46             511  log.py
 25-11-2006  16:59             390  p64.py
  7-03-2007  23:07             207  sock.py
  3-02-2007  16:15             436  threads.py
          1.544 bytes in 4 files and 0 dirs    16.384 bytes allocated
    287.555.584 bytes free

 [E:\Projects]c:\Python24\python.exe -c "import os; print os.path.getmtime('p64.py')"
 1164470381

 [E:\Projects]c:\Python25\python.exe -c "import os; print os.path.getmtime('p64.py')"
 1164466781.28

 This is python 2.4.4 and Python 2.5.1 on windows XP.
 The reported time clearly differs.

Let me guess: your E drive uses the FAT filesystem?
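
One thing worth noticing in the numbers quoted above is the size of the
gap. The small Python 3 check below just does the subtraction. The
interpretation is an assumption, not something stated in the thread:
FAT records timestamps in local time rather than UTC, so a gap of
exactly one hour would be consistent with a time zone or daylight
saving conversion being applied differently by the two versions.

```python
# Values copied from the two runs quoted above.
py24_mtime = 1164470381
py25_mtime = 1164466781.28

# Ignore the fractional part, which only Python 2.5 reports.
difference = py24_mtime - int(py25_mtime)

print(difference, 'seconds =', difference / 3600, 'hour(s)')
```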

  -- Leo



Re: invoke user's standard mail client

2007-05-07 Thread Leo Kislov
On May 7, 2:00 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
 On May 7, 10:28 am, Gabriel Genellina [EMAIL PROTECTED]
 wrote:



  Get the pywin32 package (Python for Windows extensions) from sourceforge,
  install it, and look into the win32comext\mapi\demos directory.

 Thanks for the hint, Gabriel.
 Wow, that's heavily spiced code! When I invoke mapisend.py I get:

   Traceback (most recent call last):
     File "mapisend1.py", line 85, in <module>
       SendEMAPIMail(SendSubject, SendMessage, SendTo, MAPIProfile=MAPIProfile)
     File "mapisend1.py", line 23, in SendEMAPIMail
       mapi.MAPIInitialize(None)
   pywintypes.com_error: (-2147467259, 'Unspecified error', None, None)

 But what is a MAPI profile?

It's an abstraction of incoming and outgoing mail accounts. In UNIX
terms it's kind of like running a local sendmail that forwards mail to
another server and a fetchmail that fetches mail from external inboxes,
i.e. it's a proxy between you and the outgoing/incoming mail servers.

 I left this variable blank. Do I need MS
 Exchange Server to run this demo?

No, but you need an account on some mail server, and some email program
has to create a MAPI profile to represent that account on your local
computer. As I understand it, creating MAPI profiles is not common
practice among non-Microsoft products; for example, my computer with
Lotus Notes doesn't have any MAPI profiles.

  -- Leo



Re: relative import broken?

2007-05-04 Thread Leo Kislov
On May 3, 10:08 am, Alan Isaac [EMAIL PROTECTED] wrote:
 Alex Martelli [EMAIL PROTECTED] wrote in message

 news:[EMAIL PROTECTED]

  Very simply, PEP 328 explains:
  
  Relative Imports and __name__

  Relative imports use a module's __name__ attribute to determine that
  module's position in the package hierarchy. If the module's name does
  not contain any package information (e.g. it is set to '__main__') then
  relative imports are resolved as if the module were a top level module,
  regardless of where the module is actually located on the file system.
  

 To change my question somewhat, can you give me an example
 where this behavior (when __name__ is '__main__') would
 be useful for a script? (I.e., more useful than importing relative
 to the directory holding the script, as indicated by __file__.)

Do you realize it's a different behaviour and it won't work for some
packages? One possible alternative is to assume an empty parent
package and let "from . import foo" work, but not
"from .. import bar" or any other upper levels. The package author
should also realize that __init__.py will be ignored.

  -- Leo



Re: hp 11.11 64 bit python 2.5 build gets error import site failed

2007-05-03 Thread Leo Kislov
On May 3, 2:54 pm, Martin v. Löwis [EMAIL PROTECTED] wrote:
  import site failed
  OverflowError: signed integer is greater than the maximum.
  - what is the value of ival?
  ival: 4294967295

 I see. This is 0xFFFFFFFF, which would be -1 if it were of type
 int. So perhaps some value got cast incorrectly at some point,
 breaking subsequent computations



  - where does that number come from?

  It is coming from the call to PyInt_AsLong. In that function there is
  a call to:
  PyInt_AS_LONG((PyIntObject*)op)
  which returns the value of ival.

 That was not my question, really. I wanted to know where the object
 whose AsLong value was taken came from. And before you say it's
 in the arg parameter of convertsimple() - sure it is. However, how
 did it get there? It's in an argument tuple - and where came
 that from?

Looking at the call stack the OP posted, -1 is coming in as the fourth
parameter of __import__, I *guess* at the first import in site.py or at
the implicit import of site. I think it would be helpful if the OP also
checked whether this works:
python -S -v -c "print -1, type(-1), id(0), id(-1)"
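
For reference, the reinterpretation discussed in the quoted text, the
value 4294967295 read back as a signed 32-bit int, can be checked with
the struct module. This Python 3 sketch only demonstrates the
arithmetic; it says nothing about where in the HP-UX build the bad cast
actually happens.

```python
import struct

ival = 4294967295  # the value reported by the OP

# Pack as an unsigned 32-bit integer, then read the same four bytes
# back as a signed 32-bit integer.
as_signed = struct.unpack('<i', struct.pack('<I', ival))[0]

print(hex(ival), '->', as_signed)
```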


  -- Leo



Re: My Python annoyances

2007-05-03 Thread Leo Kislov
On May 3, 9:27 pm, Gabriel Genellina [EMAIL PROTECTED] wrote:
 En Thu, 03 May 2007 10:49:26 -0300, Ben Collver [EMAIL PROTECTED]  
 escribió:

  I tried to write portable Python code.  The zlib CRC function returned  
  different results on architectures between 32 bit and 64 bit  
  architectures.  I filed a bug report.  It was closed, without a comment  
  from the person who closed it.  I get the unspoken message: bug reports  
  are not welcome.

 You got a comment from me, that you never disputed nor commented further.  
 I would have changed the status to invalid myself, if I were able to do  
 so.

I think it should have been marked as "won't fix", as it's a wart just
like 1/2 == 0, but since there are many users of the current behaviour
it's impossible to fix it in Python 2.x. Maybe in Python 3.0?
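
In the meantime, a common way to get the same number on every platform
is to mask the result to 32 bits. The sketch below is written for
Python 3, where zlib.crc32 already returns an unsigned value and the
mask is harmless. The helper names are invented for this example, and
it is an assumption on my part that the signed/unsigned difference is
what the bug report was about.

```python
import zlib


def crc32_unsigned(data):
    """Hypothetical helper: CRC-32 as a value in 0 .. 2**32 - 1."""
    return zlib.crc32(data) & 0xFFFFFFFF


def crc32_signed(data):
    """Hypothetical helper: the same checksum as a signed 32-bit value."""
    value = crc32_unsigned(data)
    return value - 0x100000000 if value >= 0x80000000 else value


print(crc32_unsigned(b'hello'), crc32_signed(b'hello'))
```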

  -- Leo



Re: Python's handling of unicode surrogates

2007-04-22 Thread Leo Kislov
On Apr 20, 7:34 pm, Rhamphoryncus [EMAIL PROTECTED] wrote:
 On Apr 20, 6:21 pm, Martin v. Löwis [EMAIL PROTECTED] wrote:
  If you absolutely think support for non-BMP characters is necessary
  in every program, suggesting that Python use UCS-4 by default on
  all systems has a higher chance of finding acceptance (in comparison).

 I wish to write software that supports Unicode.  Like it or not,
 Unicode goes beyond the BMP, so I'd be lying if I said I supported
 Unicode if I only handled the BMP.

Having the ability to iterate over code points doesn't mean you support
Unicode. For example, if you want to determine whether a string is one
word and you iterate over code points and call isalpha, you'll get an
incorrect result in some cases in some languages. (To back up this
claim: this isn't going to work in Russian, at least. Russian uses
U+0301 COMBINING ACUTE ACCENT, which is not part of the alphabet but is
an element of the Russian writing system.)
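
The Russian case is easy to try out. The Python 3 sketch below uses a
word with a stress mark as an example of my own choosing, and is_word
is a hypothetical, deliberately crude stand-in for the kind of
high-level check being asked for; it is not an existing str method.

```python
import unicodedata

# A Russian word written with a stress mark (U+0301) on the first vowel.
word = 'за\u0301мок'


def is_word(s):
    """Hypothetical check: letters, each optionally followed by combining marks."""
    return bool(s) and all(
        c.isalpha() or unicodedata.combining(c) != 0 for c in s
    )


print(word.isalpha())   # the combining accent is not alphabetic
print(is_word(word))
```

A real implementation would need proper grapheme and word segmentation
rather than a per-code-point test like this one.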

IMHO what is really needed is a bunch of high level methods like
.graphemes() - iterate over graphemes
.codepoints() - iterate over codepoints
.isword() - check if the string represents one word
etc...

Then you can actually support all Unicode characters in a UTF-16 build
of Python. Just make all existing unicode methods (except
unicode.__iter__) iterate over code points. Changing __iter__
to iterate over code points would make indexing weird. When the
programmer is *ready* to support Unicode, he/she will explicitly
call .codepoints() or .graphemes(). As they say: "You can lead
a horse to water, but you can't make it drink."

  -- Leo



Re: iterator interface for Queue?

2007-04-08 Thread Leo Kislov
On Apr 7, 11:40 pm, Paul Rubin http://[EMAIL PROTECTED] wrote:
 Is there any reason Queue shouldn't have an iterator interface?
 I.e. instead of

 while True:
     item = work_queue.get()
     if item is quit_sentinel:
         # put sentinel back so other readers can find it
         work_queue.put(quit_sentinel)
         break
     process(item)

It's almost equal to:

for item in iter(work_queue.get, quit_sentinel):
    process(item)

except that it doesn't keep the quit sentinel in the queue. But that's
a personal preference; I usually put as many quit sentinels in a queue
as there are consumers.
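
Put together, the one-sentinel-per-consumer approach could look like the
following Python 3 sketch (the module is called queue there, not
Queue). The function names, the doubling "work" and the number of
consumers are all placeholders chosen for illustration.

```python
import queue
import threading

QUIT = object()  # unique sentinel; only ever equal to itself


def consumer(work_queue, results):
    # iter(callable, sentinel) calls work_queue.get() until it returns QUIT.
    for item in iter(work_queue.get, QUIT):
        results.append(item * 2)  # stand-in for process(item)


def run_consumers(items, n_consumers=3):
    work_queue = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=consumer, args=(work_queue, results))
        for _ in range(n_consumers)
    ]
    for t in threads:
        t.start()
    for item in items:
        work_queue.put(item)
    # One sentinel per consumer, so nobody has to put it back.
    for _ in threads:
        work_queue.put(QUIT)
    for t in threads:
        t.join()
    return sorted(results)


print(run_consumers(range(10)))
```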

  -- Leo



Re: I18n issue with optik

2007-04-01 Thread Leo Kislov
On Apr 1, 8:47 am, Thorsten Kampe [EMAIL PROTECTED] wrote:
 I guess the culprit is this snippet from optparse.py:

 # used by test suite
 def _get_encoding(self, file):
     encoding = getattr(file, "encoding", None)
     if not encoding:
         encoding = sys.getdefaultencoding()
     return encoding

 def print_help(self, file=None):
     """print_help(file : file = stdout)

     Print an extended help message, listing all options and any
     help text provided with them, to 'file' (default stdout).
     """
     if file is None:
         file = sys.stdout
     encoding = self._get_encoding(file)
     file.write(self.format_help().encode(encoding, "replace"))

 So this means: when the encoding of sys.stdout is US-ASCII, Optparse
 sets the encoding of the help text to ASCII, too.

The .encode() method doesn't set an encoding. It encodes unicode text
into bytes according to the specified encoding. That means optparse
needs ascii or unicode (at least) for help text. In other words, you'd
better use unicode throughout your program.
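
What encode(encoding, "replace") does to a help text is easy to see in
isolation. The sketch below is Python 3 and the German sample string is
made up for the purpose; it is not taken from the program under
discussion.

```python
# Text (unicode) in, bytes out; nothing is "set" on the text itself.
help_text = 'Hilfe f\u00fcr Optionen'

as_ascii = help_text.encode('ascii', 'replace')   # u-umlaut becomes '?'
as_utf8 = help_text.encode('utf-8', 'replace')    # nothing needs replacing

print(as_ascii)
print(as_utf8)
```

This is the substitution that happens when the output stream reports
US-ASCII as its encoding.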

 But that's
 nonsense because the Encoding is declared in the Po (localisation)
 file.

For backward compatibility gettext is working with bytes by default,
so the PO file encoding is not even involved. You need to use unicode
gettext.

 How can I set the encoding of sys.stdout to another encoding?

What are you going to set it to? As I understand you're going to
distribute your program to some users. How are you going to find out
the encoding of the terminal of your users?

  -- Leo



Re: shutil.copy Problem

2007-04-01 Thread Leo Kislov
On Mar 28, 7:01 am, David Nicolson [EMAIL PROTECTED] wrote:
 Hi John,

 That was an excellent idea and it was the cause of the problem. Whether
 this is a bug in shutil I'm not sure.

 Here is the traceback, Python 2.4.3 on Windows XP:





  C:\Documents and Settings\Güstav>C:\python243\python Z:\sh.py
  Copying  u'C:\\Documents and Settings\\G\xfcstav\\My Documents\\My Music\\iTunes\\iTunes Music Library.xml' ...
  Traceback (most recent call last):
    File "Z:\sh.py", line 12, in ?
      shutil.copy(xmlfile,"C:iTunes Music Library.xml")

Note, there is no backslash after C:. shutil will try to make an
absolute file name and concatenate it with the current directory name
(C:\Documents and Settings\Güstav), which contains non-ASCII characters.
For backward compatibility the absolute name won't be unicode. On the
other hand, data coming from the registry is unicode. When shutil tries
to compare those two file names, it fails. To avoid the problem you need
to make either both file names unicode or both file names byte strings.

However one thing is still mystery to me. Your source code contains
backslash but your traceback doesn't:

 shutil.copy(xmlfile,"C:\iTunes Music Library.xml")




 The shutil line needed to be changed to this to be successful:

 shutil.copy(xmlfile.encode("windows-1252"),"C:\iTunes Music Library.xml")

It will work only in some European locales. Using the locale module you
can make it work for 99% of the world's users, but it will still fail in
cases like a German locale with Greek characters in file names. Only
using unicode everywhere in your program is a complete solution. Like

shutil.copy(xmlfile, u"C:\iTunes Music Library.xml")

if you use constant or make sure your file name is unicode:

dest = unicode()
shutil.copy(xmlfile, dest)
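
Why a hard-coded windows-1252 only helps in some locales can be shown
without touching the file system. In the Python 3 sketch below,
can_encode is a helper made up for this illustration and the two names
are examples of my own.

```python
def can_encode(name, encoding):
    """Hypothetical helper: True if every character fits the encoding."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


german = 'G\u00fcstav'
greek = '\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac'

print(can_encode(german, 'windows-1252'))  # u-umlaut exists in cp1252
print(can_encode(greek, 'windows-1252'))   # Greek letters do not
print(can_encode(greek, 'utf-8'))
```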


  -- Leo.



Re: Unicode zipping from Python code?

2007-03-26 Thread Leo Kislov
On Mar 26, 12:21 am, durumdara [EMAIL PROTECTED] wrote:
 Hi!

 As I experienced in the year 2006, the Python's zip module is not
 unicode-safe.

I'd rather say unicode file names are not supported. Why? Because the
zip format didn't support unicode file names up to 2006.

 With the hungarian filenames I got wrong result.
 I need to convert iso-8859-2 to cp852 chset to get good result.

So you solved the problem, didn't you?
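
For anyone else who lands here, the conversion described in the quoted
text could be sketched like this in Python 3. The function name and the
sample file name are invented for the example; whether cp852 is what a
given unzip tool expects is exactly the open question in this thread.

```python
def latin2_to_cp852(name_bytes):
    """Hypothetical helper: re-encode an ISO-8859-2 file name as cp852."""
    return name_bytes.decode('iso-8859-2').encode('cp852')


original = '\u00e1rv\u00edzt\u0171r\u0151 t\u00fck\u00f6rf\u00far\u00f3g\u00e9p.txt'
as_latin2 = original.encode('iso-8859-2')
as_cp852 = latin2_to_cp852(as_latin2)

print(as_latin2)
print(as_cp852)
```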

 As I see, this module is a command line tool imported as extension.

 Now I search for something that can handle the characters good, or
 handle the unicode filenames.

You said you've got a good result, so it's not clear what you want.


 Does anyone knows about a python project that can do this?
 Or other tool what I can use for zipping intern. characters?

Zipping is only half of the problem. How are you going to unzip such
files?


  -- Leo



Re: shutil.copy Problem

2007-03-26 Thread Leo Kislov
On Mar 26, 8:10 pm, David Nicolson [EMAIL PROTECTED] wrote:
 Hi,

 I wasn't exactly sure where to send this, I don't know if it is a bug  
 in Python or not. This is rare, but it has occurred a few times and  
 seems to be reproducible for those who experience it.

 Examine this code:
   try:
       shutil.copy("/file.xml","/Volumes/External/file.xml")
   except Exception, err:
       print sys.exc_info()[0]
       print err

 This is the output:
 exceptions.UnicodeDecodeError
 'ascii' codec can't decode byte 0xd6 in position 26: ordinal not in  
 range(128)]

 What could the possible cause of this be?

Show us the traceback; without it I doubt anyone can help.
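
If the except block has to stay, the traceback module can still capture
the full stack instead of just the message. A minimal Python 3 sketch,
with a deliberately failing stand-in function instead of the real
shutil.copy call:

```python
import traceback


def failing():
    # Stand-in for the real call: decoding a non-ASCII byte as ASCII.
    b'\xd6'.decode('ascii')


def describe_failure(func):
    """Run func and return the formatted traceback, or None on success."""
    try:
        func()
    except Exception:
        return traceback.format_exc()
    return None


report = describe_failure(failing)
print(report)
```

The report names the file, line and function where the error was
raised, which is the part missing from the output quoted above.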

 Shouldn't shutil simply be  
 reading and writing the bytes and not character decoding them?

Yes, shutil.copy copies content verbatim.

  -- Leo



Re: Making a non-root daemon process

2007-03-23 Thread Leo Kislov
On Mar 22, 11:19 pm, Ben Finney [EMAIL PROTECTED] wrote:
 Howdy all,

 For making a Python program calve off an independent daemon process of
 itself, I found Carl J. Schroeder's recipe in the ASPN Python Cookbook.
 URL:http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/278731

 This is a thorough approach, and I'm cribbing a simpler process from
 this example. One thing that strikes me is that the algorithm seems to
 depend on running the program as the root user.

 import os

 def become_daemon():
     pid = os.fork()
     if pid == 0:
         # This is the child of the fork

         # Become a process leader of a new process group
         os.setsid()

         # Fork again and exit this parent
         pid = os.fork()
         if pid == 0:
             # This is the child of the second fork -- the running process.
             pass
         else:
             # This is the parent of the second fork
             # Exit to prevent zombie process
             os._exit(0)
     else:
         # This is the parent of the fork
         os._exit(0)

 become_daemon()
 # Continue with the program

 The double-fork seems to be to:
   - Allow the first forked child to start a new process group
   - Allow the second forked child to be orphaned immediately

 The problem I'm having is that 'os.setsid()' fails with 'OSError:
 [Errno 1] Operation not permitted' unless I run the program as the
 root user. This isn't a program that I want necessarily running as
 root.

It works for me. I mean your program above produces no exceptions for
me on Debian 3.1 python2.4

 What does the 'os.setsid()' gain me?

It detaches you from the terminal. It means you won't receive signals
from the terminal for sure, like SIGINT and SIGHUP, but there may be
others.

 How can I get that without being
 the root user?

Maybe you can go over the list of all possible signals from the
terminal and tell the kernel that you want to ignore them. That sounds
similar to detaching from the terminal, but maybe there are some
differences. Still, the fact that os.setsid fails for you is weird.
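
A rough Python 3 sketch of the ignore-the-signals idea is below. The
list of signal names is my own guess at the terminal-related ones and
is not claimed to be complete, and as said above this is not
necessarily equivalent to what os.setsid() does.

```python
import signal

# Assumed list of terminal-generated signals; review for your platform.
TERMINAL_SIGNALS = ('SIGHUP', 'SIGINT', 'SIGQUIT',
                    'SIGTSTP', 'SIGTTIN', 'SIGTTOU')


def ignore_terminal_signals():
    """Ask the kernel to ignore each listed signal that exists here."""
    ignored = []
    for name in TERMINAL_SIGNALS:
        signum = getattr(signal, name, None)  # some names are POSIX-only
        if signum is not None:
            signal.signal(signum, signal.SIG_IGN)
            ignored.append(name)
    return ignored


print(ignore_terminal_signals())
```

signal.signal has to be called from the main thread, so this belongs
near the start of the program.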

  -- Leo.



Re: lock problem

2007-03-18 Thread Leo Kislov
On Mar 16, 3:08 pm, Ritesh Raj Sarraf [EMAIL PROTECTED] wrote:
 Leo Kislov wrote:
  But you miss the fact that there is only one environment per process.

 Maybe there's a confusion.
 The environment variable that I'm setting has nothing to do with
 ldapsearch. I use the environment variable as a filename to which
 ldapsearch can redirect its output. And I do that because the output
 can be huge and useless. Then I do some pattern matching on that file,
 filter my data, and then delete it.

 If you think I am still missing something important, please describe it.

Imagine this timeline:

thread1 os.environ['__kabc_ldap'] = '/tmp/tmp1'
thread1 suspended, thread2 starts to run
thread2 os.environ['__kabc_ldap'] = '/tmp/tmp2'
thread2 launch ldapsearch (output goes to '/tmp/tmp2')
thread2 suspended, thread1 starts to run
thread1 launch ldapsearch (output goes to '/tmp/tmp2', over the output
from ldapsearch launched from thread2)

Seems like that's what is happening to your program.

  -- Leo



Re: lock problem

2007-03-16 Thread Leo Kislov
On Mar 16, 12:40 am, Ritesh Raj Sarraf [EMAIL PROTECTED] wrote:
 Leo Kislov wrote:
  You're changing environmental variable __kabc_ldap that is shared
  between your threads. Environment is not designed for that kind of
  usage, it was designed for settings. Either use an option to set
  output file or just redirect stdout. If the interface of ldapsearch is
  so lame that it requires environmental variable use env to set the
  variable: env __kabc_ldap=/tmp/wrjhdsf ldapsearch ...

 The environment variable is set with temp_file_name which gets the name from
 tempfile.mkstemp(), which is run in every thread. So I don't think the
 environment variable is going to be the same.

But you miss the fact that there is only one environment per process.

  -- Leo



Re: lock problem

2007-03-15 Thread Leo Kislov
On Mar 15, 2:31 pm, Ritesh Raj Sarraf [EMAIL PROTECTED] wrote:

[snip]

 os.environ['__kabc_ldap'] = temp_file_name

[snip]

 Now as per the above code, aa is the first string which will be executed in
 Thread-1. In my query to the ldap server, I am getting a record which matches
 the aa string. I've verified it by putting a breakpoint and checking the
 value.

 The problem is that when I run the program manually, I don't get the data from
 the first thread i.e. of the string aa.

 I'm not sure if there's something wrong in the code mentioned above or is it
 really a lock problem.

 Can somebody please help about where I'm doing any mistake ?

You're changing the environment variable __kabc_ldap, which is shared
between your threads. The environment is not designed for that kind of
usage; it was designed for settings. Either use an option to set the
output file or just redirect stdout. If the interface of ldapsearch is
so lame that it requires an environment variable, use env to set the
variable: env __kabc_ldap=/tmp/wrjhdsf ldapsearch ...
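
From Python, the same effect as the env command can be had by handing
each child process its own copy of the environment instead of changing
os.environ. The sketch below is Python 3 and uses a tiny Python child
in place of ldapsearch, whose real command line is not shown in the
thread; run_with_private_env is a name invented for this example.

```python
import os
import subprocess
import sys


def run_with_private_env(temp_file_name):
    """Start a child that sees __kabc_ldap without touching os.environ."""
    child_env = dict(os.environ, __kabc_ldap=temp_file_name)
    # Stand-in child: it just reports the variable it was given.
    child_code = "import os; print(os.environ['__kabc_ldap'])"
    output = subprocess.check_output([sys.executable, '-c', child_code],
                                     env=child_env)
    return output.decode().strip()


print(run_with_private_env('/tmp/tmp1'))
print(run_with_private_env('/tmp/tmp2'))
```

Because every call builds its own dictionary, two threads can no longer
overwrite each other's setting.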

  -- Leo



Re: INSERT statements not INSERTING when using mysql from python

2006-12-29 Thread Leo Kislov
Ask Ben, he might know, although he's out to lunch.

Ben wrote:

 I'll try it after lunch. Does anyone know whether this might be the
 problem?

 Ben


 Ben wrote:
  I have found the problem, but not the cause.
 
  I tried setting the database up manually before hand, which let me get
  rid of the IF NOT EXISTS lines, and now it works!
 
  But why the *** should it not work anyway? The first time it is run, no
  database or tables, so it creates them. That works. But apparently on
  subsequent runs it decides the tables it created aren't actually there,
  and overwrites them. Gr.
 
 
  Ben
 
 
 
  Ben wrote:
   Well, I've checked the SQL log, and my insert statements are certainly
   being logged. The only option left open is that the table in question
   is being replaced, but I can't see why it should be...
  
  
   Ben wrote:
Nope... that can't be it. I tried running those commands manually and
nothing went wrong.
But then again when I execute the problematic command manually nothing
goes wrong. Its just not executing until the last time, or being
overwritten.
   
   
Ben wrote:
 Each time my script is run, the following is called:

 self.cursor.execute("CREATE DATABASE IF NOT EXISTS "+name)
 self.cursor.execute("USE "+name)
 self.cursor.execute("CREATE TABLE IF NOT EXISTS table_name ( 

 The idea being that stuff is only created the first time the script is
 run, and after that the original tables and database are used. This
 might explain my problem if for some reason the old tables are being
 replaced... can anyone see anything wrong with the above?

 Ben






 Ben wrote:
  One partial explanation might be that for some reason it is recreating
  the table each time the code runs. My code says CREATE TABLE IF NOT
  EXISTS but if for some reason it is creating it anyway and dropping
  the one before, that could explain why there are missing entries.
 
  It wouldn't explain why the NOT EXISTS line is being ignored though...
 
  Ben
 
 
  Ben wrote:
   I initially had it set up so that when I connected to the database I
   started a transaction, then when I disconnected I committed.
  
   I then tried turning autocommit on, but that didn't seem to make any
   difference (although initially I thought it had)
  
   I'll go back and see what I can find...
   Cheers,
   Ben
  
  
   johnf wrote:
Ben wrote:
   
 I don't know whether anyone can help, but I have an odd problem. I have
 a PSP (Spyce) script that makes many calls to populate a database. They
 all work without any problem except for one statement.

 I first connect to the database...

 self.con = MySQLdb.connect(user=username, passwd =password)
 self.cursor = self.con.cursor()
 self.cursor.execute("SET max_error_count=0")

 All the necessary tables are created...

 self.cursor.execute("CREATE DATABASE IF NOT EXISTS "+name)
 self.cursor.execute("USE "+name)

 self.cursor.execute("CREATE TABLE IF NOT EXISTS networks (SM varchar(20),DMC int,DM varchar(50),NOS int,OS varchar(50),NID varchar(20))")

 Then I execute many insert statements in various different loops on
 various tables, all of which are fine, and result in multiple table
 entries. The following one is executed many times also, and seems
 identical to the rest. The print statements output to the browser
 window, and appear repeatedly, so the query must be being called
 repeatedly also:

 print pbSQL query executing/bp
 self.cursor.execute(INSERT INTO networks VALUES ('a',' +i+
 ','c','2','e','f','g'))
 print pbSQL query executed/bp

 I have, for debugging, set i up as a counter variable.

 No errors are given, but the only entry to appear in the 
 final database
 is that from the final execution of the INSERT statement (the 
 last
 value of i)

 I suspect that this is to vague for anyone to be able to 
 help, but if
 anyone has any ideas I'd be really grateful :-)

 It occured to me that if I could access the mysql query log 
 that might
 help, but I was unsure how to enable logging for MysQL with 
 python.

 Cheers,

 Ben
   
Not sure this will help, but where is the commit? I don't use MySQL, but
most SQL engines require a commit.
Johnf
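
The commit question can be illustrated with a minimal sketch. Note the
assumptions: sqlite3 is used here only as a stand-in so the example runs
without a server, the table is trimmed to two columns, and the code is
written for Python 3. MySQLdb follows the same DB-API pattern, but it
uses %s placeholders instead of ?.

```python
import sqlite3

# In-memory database as a stand-in for the MySQL server (assumption).
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS networks (SM varchar(20), DMC int)")

# Pass values as parameters instead of concatenating them into the SQL.
for i in range(3):
    cur.execute("INSERT INTO networks VALUES (?, ?)", ("a", i))

# Without an explicit commit (or autocommit), other connections may never
# see these rows.
con.commit()

rows = cur.execute("SELECT DMC FROM networks ORDER BY DMC").fetchall()
print(rows)
```

If every row is there after the commit, the INSERT loop itself is fine
and the problem is elsewhere.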

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: dealing with special characters in Python and MySQL

2006-12-18 Thread Leo Kislov
ronrsr wrote:
  
  Try putting use_unicode=True in the MySQLdb connect call.

 tried that, and also added charset=utf8 -

 now, I can't do any string operations, I get the error msg:

 descriptor 'lower' requires a 'str' object but received a 'unicode'
   args = (descriptor 'lower' requires a 'str' object but received
 a 'unicode',)


 or similar, on every string operation.

What is a "string operation"? Every time you say "I get an error", please
provide the source code where the error occurs. And by the way, do you
know that for non-ASCII characters you should use the unicode type, not
the str type?

  -- Leo



Re: urllib.unquote and unicode

2006-12-18 Thread Leo Kislov

George Sakkis wrote:
 The following snippet results in different outcome for (at least) the
 last three major releases:

  import urllib
  urllib.unquote(u'%94')

 # Python 2.3.4
 u'%94'

 # Python 2.4.2
 UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 0:
 ordinal not in range(128)

 # Python 2.5
 u'\x94'

 Is the current version the right one or is this function supposed to
 change every other week ?

IMHO, none of the results is right. Either a unicode string should be
rejected by raising ValueError, or it should be encoded with the ascii
encoding, and the result should be the same as
urllib.unquote(u'%94'.encode('ascii')), that is, '\x94'. You can consider
the current behaviour undefined: just as when you pass a random object
into some function, you can get a different outcome in different Python
versions.
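
For comparison, here is a small sketch of how the same ambiguity is
avoided in Python 3 (an assumption on my part that this is relevant to
the reader; the thread itself is about Python 2). There the caller says
explicitly whether bytes or decoded text is wanted:

```python
from urllib.parse import unquote, unquote_to_bytes

# Ask for the raw byte behind the escape: no decoding is attempted.
raw = unquote_to_bytes('%94')

# Ask for text and name the encoding of the escaped bytes explicitly.
text = unquote('%94', encoding='latin-1')

print(repr(raw), repr(text))
```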

  -- Leo



Re: writing serial port data to the gzip file

2006-12-17 Thread Leo Kislov

Petr Jakes wrote:
 I am trying to save data it is comming from the serial port continually
 for some period.
 (expect reading from serial port is 100% not a problem)
 Following is an example of the code I am trying to write. It works, but
 it produce an empty gz file (0kB size) even I am sure I am getting data
 from the serial port. It looks like g.close() does not close the gz
 file.
 I was reading in the doc:

 Calling a GzipFile object's close() method does not close fileobj,
 since you might wish to append more material after the compressed
 data...

 so I am completely lost now...

 thanks for your comments.
 Petr Jakes
  snippet of the code  
 def dataOnSerialPort():
     data=s.readLine()
     if data:
         return data
     else:
         return 0

 while 1:
     g=gzip.GzipFile("/root/foofile.gz","w")
     while dataOnSerialPort():
         g.write(data)
     else: g.close()

Your while loop is discarding the result of dataOnSerialPort, so you're
probably writing an empty string to the file many times. Typically this
kind of loop is implemented using iterators. Check if your s object
(is it from an external library?) already implements an iterator. If it
does, then

for data in s:
    g.write(data)

is all you need. If it doesn't, you can use iter to create an iterator
for you:

for data in iter(s.readLine, ''):
    g.write(data)
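
A self-contained sketch of the iter-with-sentinel loop, under some
assumptions: it is written for Python 3, an io.BytesIO object stands in
for the serial port, and the gzip data goes to an in-memory buffer
instead of /root/foofile.gz.

```python
import gzip
import io

# Stand-in for the serial port object (assumption): readline() returns
# b"" once no more data is available.
source = io.BytesIO(b"line 1\nline 2\nline 3\n")

buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as g:
    # iter(callable, sentinel) calls source.readline() until it returns b"".
    for data in iter(source.readline, b""):
        g.write(data)

# Leaving the with block closed the GzipFile and flushed the compressed
# stream, but left the underlying buffer open, as the documentation says.
restored = gzip.decompress(buffer.getvalue())
print(restored)
```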

  -- Leo



Re: connect from windows to linux using ssh

2006-12-15 Thread Leo Kislov

[EMAIL PROTECTED] wrote:
 Hi Folks,

 How to connect from windows to linux using ssh without username/passwd.

 With this scenario,  i need to write a program on python.

Use ssh library http://cheeseshop.python.org/pypi/paramiko

  -- Leo



Re: Serial port failure

2006-12-15 Thread Leo Kislov

Rob wrote:
 Hi all,

 I am fairly new to python, but not programming and embedded.  I am
 having an issue which I believe is related to the hardware, triggered
 by the software read I am doing in pySerial.  I am sending a short
 message to a group of embedded boxes daisy chained via the serial port.
  When I send a 'global' message, all the connected units should reply
 with their Id and Ack in this format '0 Ack'  To be certain that I
 didn't miss a packet, and hence a unit, I do the procedure three times,
 sending the message and waiting for a timeout before I run through the
 next iteration.  Frequently I get through the first two iterations
 without a problem, but the third hangs up and crashes, requiring me to
 remove the Belkin USB to serial adapter, and then reconnect it.  Here
 is the code:

 import sys, os
 import serial
 import sret
 import time

 from serial.serialutil import SerialException
 
  GetAck Procedure
 
 def GetAck(p):
     response = ""

     try:
         response = p.readline()
     except SerialException:
         print "Timed out"
         return -1
     res = response.split()

     #look for ack in the return message
     reslen = len(response)
     if reslen > 5:
         if res[1] == 'Ack':
             return res[0]
         elif res[1] == 'Nak':
             return 0x7F
         else:
             return -1


  Snip 
 
  GetNumLanes Procedure
 
 def GetNumLanes(Lanes):
     print "Looking for connected units"
     # give a turn command and wait for responses
     msg = ".g t 0 336\n"

     for i in range(3):
         port = OpenPort()
         time.sleep(3)
         print port.isOpen()
         print "Request #%d" % (i+1)
         try:
             port.writelines(msg)
         except OSError:
             print "Serial port failure.  Power cycle units"
             port.close()
             sys.exit(1)

         done = False
         # Run first connection check
         #Loop through getting responses until we get a -1 from GetAck
         while done == False:
             # lane will either be -1 (timeout), 0x7F (Nak),
             # or the lane number that responded with an Ack
             lane = GetAck(port)
             if lane >= '0':

Your GetAck returns either a string or a number, and then you compare it
with a string. If you compare a string with a number, Python currently
returns a result you probably don't expect:

>>> -1 >= '0'
False
>>> 0x7f >= '0'
False

This is a wart and it will be fixed in Python 3.0 (it will raise an
exception). I think you should rewrite GetAck to return a tuple (state,
lane):

def GetAck(p):
    response = ""

    try:
        response = p.readline()
    except SerialException:
        print "Timed out"
        return 'Timeout', 'NoID'
    res = response.split()

    #look for ack in the return message
    reslen = len(response)
    if reslen > 5:
        if res[1] == 'Ack':
            return 'Ack', res[0]
        elif res[1] == 'Nak':
            return 'Nak', 'NoID'  # Does the Nak response contain a lane id?
        else:
            return 'Unknown', 'NoID'
    # response too short to contain an id and a status
    return 'Unknown', 'NoID'

And then instead of

lane = GetAck(port)
if lane >= '0':

use

state, lane = GetAck(port)
if state == 'Ack':

  -- Leo



Re: Serial port failure

2006-12-15 Thread Leo Kislov
Rob wrote:
 try:
     response = p.readline()
 except SerialException:
     print "Timed out"


 try:
     port.writelines(msg)
 except OSError:
     print "Serial port failure.  Power cycle units"
     port.close()
     sys.exit(1)


 Does anyone have any ideas?

It'd be a good idea to print all exceptions; it can help with debugging
the problem (if you don't like it going to the screen of an end user, at
least write it to a log file):

except SerialException, err:
    print err
    print "Timed out"

except OSError, err:
    print err
    print "Serial port failure.  Power cycle units"

and in your OpenPort function too.
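
The "write it to a log file" variant could look roughly like the sketch
below. Everything named here is hypothetical: BrokenPort only simulates
a failing adapter, write_message is an illustrative helper, and the log
goes to an in-memory stream rather than a real file (a
logging.FileHandler would be the usual choice). The code is written for
Python 3.

```python
import io
import logging

log_stream = io.StringIO()            # stand-in for a log file (assumption)
logger = logging.getLogger("serial_demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(log_stream))
logger.propagate = False


class BrokenPort:
    """Hypothetical port object that always fails, for demonstration."""

    def write(self, data):
        raise OSError(5, "Input/output error")


def write_message(port, msg):
    """Send msg; on failure, log the full exception and return False."""
    try:
        port.write(msg)
    except OSError:
        # logger.exception records the message plus the traceback.
        logger.exception("Serial port failure.  Power cycle units")
        return False
    return True


ok = write_message(BrokenPort(), b".g t 0 336\n")
```

The log then contains the errno and the traceback, which is usually the
missing piece when a USB-to-serial adapter misbehaves.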

  -- Leo



Re: Roundtrip SQL data especially datetime

2006-12-15 Thread Leo Kislov
John Nagle wrote:
 Routinely converting MySQL DATETIME objects to Python datetime
 objects isn't really appropriate, because the MySQL objects have a
 year range from 1000 to 9999, while Python only has the UNIX range
 of 1970 to 2038.

You're mistaken. The Python datetime module accepts years from 1 up to
9999:

>>> datetime.MINYEAR
1
>>> datetime.MAXYEAR
9999

  -- Leo



Re: how can i write a hello world in chinese with python

2006-12-13 Thread Leo Kislov

kernel1983 wrote:
 and I tried unicode and utf-8

How did you try unicode? Like this? :

EasyDialogs.Message(u'\u4e2d')

 I tried to both use unicodeutf-8 head just like \xEF\xBB\xBF and not
 to use

 Anyone knows about the setting in the python code file?
 Maybe python doesn't know I'm to use chinese?!

It depends on how EasyDialogs works. And by the way, when you say utf-8
encoded text is not displayed correctly, what do you actually see on
the screen?

  -- Leo


Re: inconvenient unicode conversion of non-string arguments

2006-12-13 Thread Leo Kislov

Holger Joukl wrote:
 Hi there,

 I consider the behaviour of unicode() inconvenient wrt to conversion of
 non-string
 arguments.
 While you can do:

  unicode(17.3)
 u'17.3'

 you cannot do:

  unicode(17.3, 'ISO-8859-1', 'replace')
 Traceback (most recent call last):
   File stdin, line 1, in ?
 TypeError: coercing to Unicode: need string or buffer, float found
 

 This is somehow annoying when you want to convert a mixed-type argument
 list
 to unicode strings, e.g. for a logging system (that's where it bit me) and
 want to make sure that possible raw string arguments are also converted to
 unicode without errors (although by force).
 Especially as this is a performance-critical part in my application so I
 really
 do not like to wrap unicode() into some custom tounicode() function that
 handles
 such cases by distinction of argument types.

 Any reason why unicode() with a non-string argument should not allow the
 encoding and errors arguments?

There is a reason: encoding is a property of bytes; it is not applicable
to other objects.

 Or some good solution to work around my problem?

Do not put undecoded bytes in a mixed-type argument list. A rule of
thumb when working with unicode: decode as soon as possible, encode as
late as possible.
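
A minimal sketch of "decode as soon as possible", written for Python 3
and with a hypothetical helper name (to_text) and a hypothetical input
list. The idea is that raw bytes are decoded once, at the point where
they enter the program, so that everything downstream (such as a logging
call) only ever sees text or objects that convert to text cleanly.

```python
def to_text(value, encoding="iso-8859-1", errors="replace"):
    """Decode bytes at the boundary; convert anything else with str()."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding, errors)
    return str(value)


# Hypothetical mixed-type input, e.g. fields read from different sources.
incoming = [17.3, b"caf\xe9", "already text"]
decoded = [to_text(item) for item in incoming]
print(decoded)
```

The encoding and errors arguments are only consulted for bytes, which is
exactly the case where they mean something.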

  -- Leo



Re: inconvenient unicode conversion of non-string arguments

2006-12-13 Thread Leo Kislov

Holger Joukl wrote:
 [EMAIL PROTECTED] schrieb am 13.12.2006
 11:02:30:

 
  Holger Joukl wrote:
   Hi there,
  
   I consider the behaviour of unicode() inconvenient wrt to conversion of
   non-string
   arguments.
   While you can do:
  
unicode(17.3)
   u'17.3'
  
   you cannot do:
  
unicode(17.3, 'ISO-8859-1', 'replace')
   Traceback (most recent call last):
 File stdin, line 1, in ?
   TypeError: coercing to Unicode: need string or buffer, float found
   
   [...]
   Any reason why unicode() with a non-string argument should not allow
 the
   encoding and errors arguments?
 
  There is reason: encoding is a property of bytes, it is not applicable
  to other objects.

 Ok, but I still don't see why these arguments shouldn't simply be silently
 ignored
 for non-string arguments.

That's a rather bizarre and sloppy approach. Should

unicode(17.3, 'just-having-fun', 'I-do-not-like-errors')
unicode(17.3, 'sdlfkj', 'ewrlkj', 'eoirj', 'sdflkj')

work?


   Or some good solution to work around my problem?
 
  Do not put undecoded bytes in a mixed-type argument list. A rule of
  thumb working with unicode: decode as soon as possible, encode as late
  as possible.

 It's not always that easy when you deal with a tree data structure with the
 tree elements containing different data types and your user may decide to
 output
 root.element.subelement.whateverData.
 I have the problems in a logging mechanism, and it would vanish if
 unicode(non-string, encoding, errors) would work and just ignore the
 obsolete
 arguments.

I don't really see from your example what stops you from putting
unicode instead of bytes into your tree, but I can believe some
libraries can cause some extra work. That's a problem with the
libraries, not with the builtin function unicode(). Would you be happy
if the floating point value 17.3 were stored as 8 bytes in your tree?
After all, that is how 17.3 is actually represented in computer memory.
Same story with unicode: if some library gives you raw bytes, *you* have
to do extra work later.

  -- Leo



Re: How to turn of the monitor by python?

2006-12-12 Thread Leo Kislov
[EMAIL PROTECTED] wrote:
 I want to turn off my monitor from within python, How to do it?
 Thanks!

Do you realize that hardware management and control is OS-dependent?
When asking such questions, always specify the OS.

Assuming you are interested in Windows, you just need to translate the
C API calls described at
http://www.codeproject.com/system/display_states.asp
into Python. You can use ctypes (included in Python 2.5) or the Python
win32 extensions.

  -- Leo



Re: sys.stdin.encoding

2006-12-11 Thread Leo Kislov

[EMAIL PROTECTED] wrote:
 Duncan Booth skrev:

  [EMAIL PROTECTED] wrote:
 
   The following line in my code is failing because sys.stdin.encoding is
   Null.
 
  I'll guess you mean None rather than Null.
 
   This has only started happening since I started working with
   Pydef in Eclipse SDK. Any ideas?
  
   uni=unicode(word,sys.stdin.encoding)
  
  You could give it a fallback value:
 
  uni = unicode(word, sys.stdin.encoding or sys.getdefaultencoding())
 
  or even just:
 
  uni = unicode(word, sys.stdin.encoding or 'ascii')
 
  which should be the same in all reasonable universes (although I did get
  bitten recently when someone had changed the default encoding in a system).


 Thanks for your help. The problem now is that I cant enter the Swedish
 characters åöä etc without getting the following error -

 Enter word Påe
 Traceback (most recent call last):
   File C:\Documents and Settings\workspace\simple\src\main.py, line
 25, in module
 archive.Test()
   File C:\Documents and Settings\workspace\simple\src\verb.py, line
 192, in Test
 uni=unicode(word,sys.stdin.encoding or sys.getdefaultencoding())
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 1:
 ordinal not in range(128)

 The call to sys.getdefaultencoding() returns ascii. Since I can enter
 the characters åöä on the command line in Pydef/Eclipse doesn't that
 mean that the stdin is not ascii? What should I do?

The workaround in your case is:

at the beginning of your program:

import sys
console_encoding = getattr(sys.stdin, 'encoding', None)
if not console_encoding:
    # sys.stdin.encoding is missing or None (as under PyDev)
    import locale
    locale_name, console_encoding = locale.getdefaultlocale()

and later:

uni = unicode(word, console_encoding)

But don't think it's portable: if you use another IDE or OS, it may not
work. It would be better if PyDev implemented sys.stdin.encoding.
  -- Leo



Re: sys.stdin.encoding

2006-12-11 Thread Leo Kislov

Martin v. Löwis wrote:
 [EMAIL PROTECTED] schrieb:
  The following line in my code is failing because sys.stdin.encoding is
  Null. This has only started happening since I started working with
  Pydef in Eclipse SDK. Any ideas?
 
  uni=unicode(word,sys.stdin.encoding)

 That's a problem with pydev, where the standard machinery to determine
 the terminal's encoding fail.

 I have no idea yet how to fix this.

An environment variable, say TERMENCODING? Heck, maybe this will catch
on and will be used by other languages, libraries, terminals, etc. It's
not really a Python-only problem.

  -- Leo


Re: Printing Barcodes from webapp?

2006-12-02 Thread Leo Kislov

Burhan wrote:
 Hello Group:

   I am in the planning stages of an application that will be accessed
 over the web, and one of the ideas is to print a barcode that is
 generated when the user creates a record.  The application is to track
 paperwork/items and uses barcodes to easily identify which paper/item
 belongs to which record.

   Is there an easy way to generate barcodes using Python -- considering
 the application will be printing to a printer at the client's machine?
 I thought of two ways this could be done; one would be to interface
 with the printing options of the browser to ensure that margins,
 headers, footers are setup properly (I have done this before using
 activex and IE, but with mixed results); the other would be to install
 some small application at the client machine that would intercept the
 print jobs and format them properly (taking the printing function away
 from the browser).

   Does anyone have any experience or advice? Any links I could read up
 on to help me find out how to program this?  Another way (easier
 hopefully) to accomplish this?

I think one of the easiest ways is to install acrobat reader and
redirect client browser to a generated pdf file.
http://www.reportlab.org/ has support for generating barcodes (and
more) in pdf documents.

  -- Leo



Re: Problem with imaplib (weird result if mailbox contains a %)

2006-11-29 Thread Leo Kislov
Antoon Pardon wrote:
 On 2006-11-28, Leo Kislov [EMAIL PROTECTED] wrote:
 
  Antoon Pardon wrote:
  This little program gives IMO a strange result.
 
  import imaplib
 
  user = "cpapen"
 
  cyr = imaplib.IMAP4("imap.vub.ac.be")
  cyr.login("cyrus", "cOn-A1r")
  rc, lst = cyr.list('', "user/%s/*" % user)
  for el in lst:
      print "%r" % (el,)
 
  And the result is:
 
  '(\\HasNoChildren) / user/cpapen/Out'
  '(\\HasNoChildren) / user/cpapen/Punten'
  '(\\HasNoChildren) / user/cpapen/Spam'
  '(\\HasNoChildren) / user/cpapen/agoog to be'
  '(\\HasNoChildren) / user/cpapen/artistiek - kunst'
  '(\\HasNoChildren) / user/cpapen/copains et copinnes =x='
  '(\\HasNoChildren) / user/cpapen/cp - writing'
  '(\\HasNoChildren) / user/cpapen/examen'
  '(\\HasNoChildren) / user/cpapen/important info (pass)'
  '(\\HasNoChildren) / user/cpapen/lesmateriaal'
  '(\\HasNoChildren) / user/cpapen/love - flesh for fantasy'
  '(\\HasNoChildren) / user/cpapen/media'
  '(\\HasNoChildren) / user/cpapen/music - beats'
  ('(\\HasNoChildren) / {25}', 'user/cpapen/newsletters %')
  ''
  '(\\HasNoChildren) / user/cpapen/organisatie - structuur'
  '(\\HasNoChildren) / user/cpapen/sociale wetenschappen'
  '(\\HasNoChildren) / user/cpapen/the closest ones to me [x]'
  '(\\HasNoChildren) / user/cpapen/vubrations'
  '(\\HasNoChildren) / user/cpapen/wm2addressbook'
  '(\\HasNoChildren) / user/cpapen/wm2prefs'
  '(\\HasNoChildren) / user/cpapen/wm2signature'
 
 
  What I have a problem with is the 14th and 15th line.
  All other entries are strings but the 14th is a tuple.
  and the 15th is an empty string. As far as I can tell
  every time a % is in the mailbox name I get this kind of
  result.
 
  I'm using python 2.3.3 and the imap sytem is Cyrus.
 
  Can someone explain what is going one?
 
  Is this a bug?
 
  Empty string seems to be a bug. But tuple is by design, read the docs
  and imap rfc. The protocol is convoluted in the first place, and so is
  python interface.

 Are there more docs than at http://www.python.org/doc/. I don't find
 those very helpfull in explaining this.

 I also took a look at rfc 2060 and to be honest I don't find anything
 there to explain this difference. I only took a closer look at section
 7.2.2. So maybe I should look somewehere else but after reading section
 7.2.2. I don't understand why the list method returned a tuple for this
 mailbox instead of the following string:

'(\\HasNoChildren) / user/cpapen/newsletters %'

This is described in section 4.3 (the mailbox name is sent as a
literal). imaplib is too close to the protocol. It should interpret the
response for each command separately. For example, the list method could
return a list of tuples like:

('\\HasNoChildren', '/', 'user/cpapen/newsletters %')

Without this abstraction level in imaplib, you have to build it
yourself.
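
A rough sketch of such an abstraction layer, with loud caveats: the
function name parse_list_response is hypothetical, the regular
expression only covers the simple cases seen in this thread (no escaped
quotes inside names, no nested parentheses in the flags), and it is
written for Python 3, where imaplib returns bytes. It accepts both plain
items and the (header, literal) tuples produced for literals, and skips
empty items.

```python
import re

# flags in parentheses, then a quoted delimiter or NIL, then the rest
_LIST_RE = re.compile(
    rb'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<rest>.*)\Z',
    re.DOTALL)


def parse_list_response(items):
    """Turn raw LIST response items into (flags, delimiter, name) tuples."""
    result = []
    for item in items:
        if not item:
            continue                      # skip empty strings
        if isinstance(item, tuple):
            head, name = item             # name was sent as a literal
        else:
            head, name = item, None
        match = _LIST_RE.match(head)
        if match is None:
            continue                      # not a LIST line we understand
        flags = tuple(match.group('flags').split())
        delim = match.group('delim')
        delim = None if delim == b'NIL' else delim[1:-1]
        if name is None:
            name = match.group('rest')
            if len(name) >= 2 and name.startswith(b'"') and name.endswith(b'"'):
                name = name[1:-1]         # strip surrounding quotes
        result.append((flags, delim, name))
    return result


sample = [
    b'(\\HasNoChildren) "/" "user/cpapen/Out"',
    (b'(\\HasNoChildren) "/" {25}', b'user/cpapen/newsletters %'),
    b'',
    b'(\\HasNoChildren) "/" user/cpapen/media',
]
parsed = parse_list_response(sample)
```

With this in place the caller no longer has to care whether the server
chose a quoted string or a literal for a particular mailbox name.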

 
  If it is, is it fixed in later versions?
 
  Why don't you try to pull imaplib.py from later versions? I don't think
  it changed that much so it should be compatible with python 2.3

 I could take my hands on a 2.4 version and the result was the same.

I was talking only about the empty string response. Is it still there?
Anyway, this issue requires investigation. It could also be a bug in
the server.

  -- Leo



Re: How to increase the speed of this program?

2006-11-28 Thread Leo Kislov

Peter Otten wrote:
 Peter Otten wrote:

  HYRY wrote:
 
  I want to join two mono wave file to a stereo wave file by only using
  the default python module.
  Here is my program, but it is much slower than the C version, so how
  can I increase the speed?
  I think the problem is at line #1, #2, #3.
 
  oarray = array.array(h, [0]*(len(larray)+len(rarray))) #1
 
  ITEMSIZE = 2
  size = ITEMSIZE*(len(larray) + len(rarray))
  oarray = array.array(h)
  oarray.fromstring(\0 * size)
 
  may be a bit faster.

 Confirmed:

 $ python2.5 -m timeit -s'from array import array; N = 10**6' 'a =
 array(h); a.fromstring(\0*(2*N))'
 100 loops, best of 3: 9.68 msec per loop
 $ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array(h,
 [0]*N);'
 10 loops, best of 3: 199 msec per loop

Funny thing is that using a huge temporary string is faster than
multiplying a small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication were as smart as string multiplication,
the array multiplication version would be the fastest.

  -- Leo



Re: How to increase the speed of this program?

2006-11-28 Thread Leo Kislov
HYRY wrote:
 Peter Otten wrote:
  HYRY wrote:
 
   I want to join two mono wave file to a stereo wave file by only using
   the default python module.
   Here is my program, but it is much slower than the C version, so how
   can I increase the speed?
   I think the problem is at line #1, #2, #3.
 
   oarray = array.array(h, [0]*(len(larray)+len(rarray))) #1
 
  ITEMSIZE = 2
  size = ITEMSIZE*(len(larray) + len(rarray))
  oarray = array.array(h)
  oarray.fromstring(\0 * size)
 
  may be a bit faster.
 
  Peter

 Thank you very much, that is just what I want.

Even faster: oarray = larray + rarray

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6; b = array('h', [0])*(N/2); c = b[:]" "a = b + c"
100 loops, best of 3: 5.7 msec per loop
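
Putting the pieces together, one way to build the stereo array could be
the sketch below. It assumes both channels are array('h') objects of
equal length; the function name interleave is just for illustration. The
concatenation is used only to allocate an array of the right size, and
extended slice assignment then puts the samples in place.

```python
from array import array


def interleave(left, right):
    """Return a stereo array: left samples at even indexes, right at odd."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    out = left + right        # allocates the final size in one step
    out[0::2] = left          # L0, L1, L2, ... at even positions
    out[1::2] = right         # R0, R1, R2, ... at odd positions
    return out


stereo = interleave(array('h', [1, 2, 3]), array('h', [10, 20, 30]))
print(stereo)
```

No Python-level loop over the samples is involved, which is where the
original version lost most of its time.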

  -- Leo



Re: Modifying every alternate element of a sequence

2006-11-28 Thread Leo Kislov

[EMAIL PROTECTED] wrote:
 I have a list of numbers and I want to build another list with every
 second element multiplied by -1.

 input = [1,2,3,4,5,6]
 wanted = [1,-2,3,-4,5,-6]

 I can implement it like this:

 input = range(3,12)
 wanted = []
 for (i,v) in enumerate(input):
 if i%2 == 0:
 wanted.append(v)
 else:
 wanted.append(-v)

 But is there any other better way to do this.

Use slices:

input[1::2] = [-item for item in input[1::2]]

If you don't want to do it in-place, just make a copy:

wanted = input[:]
wanted[1::2] = [-item for item in wanted[1::2]]

  -- Leo



Re: os.walk return hex excapes

2006-11-28 Thread Leo Kislov
Alex S wrote:
 Hi,
 os.walk return hex excape sequence inside a files name, and when i try
 to feed it back to os.remove i get

 OSError: [Errno 22] Invalid argument:
 'C:\\Temp\\?p?\xbfS\xbf\xac?G\xaba ACDSee \xbb?a??n a???\xac\xb5\xbfn.exe'

It's not the escape sequences that are the problem but the question
marks, I suspect. Most likely this file name contains characters not in
your locale's language. To access this file name you need to use
unicode: just make sure the first parameter of os.walk is a unicode
string, for example os.walk(u'c:\\temp'). The exact code to make the
first parameter unicode depends on where it is coming from (network,
config file, registry, etc.). Reading a Unicode tutorial is highly
recommended.

  -- Leo



Re: Modifying every alternate element of a sequence

2006-11-28 Thread Leo Kislov
[EMAIL PROTECTED] wrote:
 Wow, I was in fact searching for this syntax in the python tutorial. It
 is missing there.
  Is there a reference page which documents all possible list
 comprehensions.

There are actually only two forms of list comprehensions:
http://docs.python.org/ref/lists.html
[blah for x in expr] and [blah for x in expr if cond]

And here is the reference page for slicing (note, it's not a list
comprehension): http://docs.python.org/ref/slicings.html
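
A short illustration of the two forms next to a slice, using made-up
sample data. As a side note, the grammar also allows chaining further
for and if clauses, which is shown in the last line.

```python
values = [1, 2, 3, 4, 5, 6]

# Form 1: [blah for x in expr]
squares = [v * v for v in values]

# Form 2: [blah for x in expr if cond]
even_squares = [v * v for v in values if v % 2 == 0]

# Slicing is a separate feature: every second element, starting at index 1.
every_second = values[1::2]

# Additional for clauses can be chained.
pairs = [(a, b) for a in (1, 2) for b in "xy"]
```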

  -- Leo



Re: Dynamic/runtime code introspection/compilation

2006-11-28 Thread Leo Kislov

Thomas W wrote:
 Maybe a stupid subject, but this is what I want to do :

 I got some python code stored in a string:

 somecode = 

 from somemodule import ISomeInterface

 class Foo(ISomeInterface):
 param1 = ...
 param2 = 

 

 and I want to compile that code so that I can use the Foo-class and
 check what class it extends, in this case ISomeInterface etc. I've
 tried eval, codeop etc. but it doesn't work. Something like this would
 be nice :

 from somemodule import ISomeInteface

 d = compile(sourcecode)

 myfoo = d.Foo()

 print ISomeInterface in myfoo.__bases__

 Any hints?

Here is a hello world program for plugins:

import sys

somecode = """
class Foo:
    param1 = "Hello, world!"
"""

plugin = type(sys)('unknown_plugin') # Create new empty module
exec somecode in plugin.__dict__

print plugin.Foo.param1
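
The same idea in Python 3 syntax, extended to the base class check from
the question. This is a sketch under assumptions: ISomeInterface is
defined inline as a stand-in for the class from somemodule, and it is
injected into the plugin namespace instead of being imported by the
plugin source.

```python
import types


class ISomeInterface:
    """Stand-in for the interface class from the question (assumption)."""


somecode = '''
class Foo(ISomeInterface):
    param1 = "Hello, world!"
'''

plugin = types.ModuleType('unknown_plugin')   # create a new empty module
plugin.ISomeInterface = ISomeInterface        # make the name visible to the code
exec(somecode, plugin.__dict__)

# __bases__ lives on the class, not on an instance.
print(ISomeInterface in plugin.Foo.__bases__)
print(plugin.Foo.param1)
```

issubclass(plugin.Foo, ISomeInterface) is the more general test, since
it also covers indirect inheritance.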

  -- Leo



Re: Problem with imaplib (weird result if mailbox contains a %)

2006-11-28 Thread Leo Kislov

Antoon Pardon wrote:
 This little program gives IMO a strange result.

 import imaplib

 user = "cpapen"

 cyr = imaplib.IMAP4("imap.vub.ac.be")
 cyr.login("cyrus", "cOn-A1r")
 rc, lst = cyr.list('', "user/%s/*" % user)
 for el in lst:
     print "%r" % (el,)

 And the result is:

 '(\\HasNoChildren) / user/cpapen/Out'
 '(\\HasNoChildren) / user/cpapen/Punten'
 '(\\HasNoChildren) / user/cpapen/Spam'
 '(\\HasNoChildren) / user/cpapen/agoog to be'
 '(\\HasNoChildren) / user/cpapen/artistiek - kunst'
 '(\\HasNoChildren) / user/cpapen/copains et copinnes =x='
 '(\\HasNoChildren) / user/cpapen/cp - writing'
 '(\\HasNoChildren) / user/cpapen/examen'
 '(\\HasNoChildren) / user/cpapen/important info (pass)'
 '(\\HasNoChildren) / user/cpapen/lesmateriaal'
 '(\\HasNoChildren) / user/cpapen/love - flesh for fantasy'
 '(\\HasNoChildren) / user/cpapen/media'
 '(\\HasNoChildren) / user/cpapen/music - beats'
 ('(\\HasNoChildren) / {25}', 'user/cpapen/newsletters %')
 ''
 '(\\HasNoChildren) / user/cpapen/organisatie - structuur'
 '(\\HasNoChildren) / user/cpapen/sociale wetenschappen'
 '(\\HasNoChildren) / user/cpapen/the closest ones to me [x]'
 '(\\HasNoChildren) / user/cpapen/vubrations'
 '(\\HasNoChildren) / user/cpapen/wm2addressbook'
 '(\\HasNoChildren) / user/cpapen/wm2prefs'
 '(\\HasNoChildren) / user/cpapen/wm2signature'


 What I have a problem with is the 14th and 15th line.
 All other entries are strings but the 14th is a tuple.
 and the 15th is an empty string. As far as I can tell
 every time a % is in the mailbox name I get this kind of
 result.

 I'm using python 2.3.3 and the imap sytem is Cyrus.

 Can someone explain what is going one?

 Is this a bug?

The empty string seems to be a bug. But the tuple is by design; read the
docs and the IMAP RFC. The protocol is convoluted in the first place,
and so is the Python interface.

 If it is, is it fixed in later versions?

Why don't you try to pull imaplib.py from later versions? I don't think
it changed that much so it should be compatible with python 2.3

 Whether or not it is a bug, can I rely on the mailbox
 being the last item in the tuple in these cases?

Yes (at least for list command)

  -- Leo



Re: Email headers and non-ASCII characters

2006-11-24 Thread Leo Kislov

Christoph Haas wrote:
 Hello, everyone...

 I'm trying to send an email to people with non-ASCII characters in their
 names. A recpient's address may look like:

 Jörg Nørgens [EMAIL PROTECTED]

 My example code:

 =
 def sendmail(sender, recipient, body, subject):
     message = MIMEText(body)
     message['Subject'] = Header(subject, 'iso-8859-1')
     message['From'] = Header(sender, 'iso-8859-1')
     message['To'] = Header(recipient, 'iso-8859-1')

     s = smtplib.SMTP()
     s.connect()
     s.sendmail(sender, recipient, message.as_string())
     s.close()
 =

 However the Header() method encodes the whole expression in ISO-8859-1:

 =?iso-8859-1?q?=22J=C3=B6rg_N=C3=B8rgens=22_=3Cjoerg=40nowhere=3E?=

 However I had expected something like:

 =?utf-8?q?J=C3=B6rg?= =?utf-8?q?_N=C3=B8rgens?= [EMAIL PROTECTED]

 Of course my mail transfer agent is not happy with the first string
 although I see that Header() is just doing its job. I'm looking for a way
 though to encode just the non-ASCII parts like any mail client does. Does
 anyone have a recipe on how to do that? Or is there a method in
 the email module of the standard library that does what I need? Or
 should I split by regular expression to extract the email address
 beforehand? Or a list comprehension to just look for non-ASCII character
 and Header() them? Sounds dirty.

Why dirty?

from email.Header import Header
from itertools import groupby
h = Header()
addr = u'Jörg Nørgens [EMAIL PROTECTED]'
def is_ascii(char):
    return ord(char) < 128
for ascii, group in groupby(addr, is_ascii):
    h.append(''.join(group), "latin-1")

print h
=
J =?iso-8859-1?q?=F6?= rg N =?iso-8859-1?q?=F8?= rgens
[EMAIL PROTECTED]
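
For readers on Python 3, a possible alternative is sketched below:
email.utils.formataddr takes the display name and the address
separately and encodes only the name. The address used here is a made-up
placeholder, since the real one is not shown in the thread.

```python
from email.header import decode_header
from email.utils import formataddr

name = "Jörg Nørgens"
address = "joerg@example.invalid"      # placeholder address (assumption)

# Only the display name is RFC 2047 encoded; the address stays as it is.
header = formataddr((name, address), charset="utf-8")

# Decode the first part again to check that the name survives the trip.
decoded_name = decode_header(header)[0][0].decode("utf-8")
print(header)
```

This keeps the angle-bracketed address in plain ASCII, which is what the
mail transfer agent expects.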

  -- Leo



Re: A python IDE for teaching that supports cyrillic i/o

2006-11-19 Thread Leo Kislov
Kirill Simonov wrote:
 Hi,

 Could anyone suggest me a simple IDE suitable for teaching Python as a
 first programming language to high school students?  It is necessary
 that it has a good support for input/output in Cyrillic.

 Unfortunately, most IDEs I tried failed miserably in this respect.  My
 test was simple: I've run the code
 name = raw_input("What's your name? ")  # written in Russian
 print "Hello, %s!" % name   # in Russian as well
 both from the shell and as a standalone script. This either caused a
 UnicodeError or just printed invalid characters.

 For the record, I've checked IDLE, PythonWin, Eric, DrPython, SPE, and
 WingIDE.  The only ones that worked are WingIDE and IDLE (under Linux,
 but not under Windows).

IDLE on Windows works fine for your example in the interactive console:

>>> name = raw_input("What's your name? ")
What's your name? Леонид
>>> print name
Леонид
>>> name
u'\u041b\u0435\u043e\u043d\u0438\u0434'

and as a script:

What's your name? Леонид
Hello, Леонид!
<type 'unicode'>


That is IDLE + Python 2.4 on Windows. So I'm not sure what the problem
is. In other messages you seem to be talking about the system console.
Why? It's not part of the IDE.

And another question: are you aware of the fact that the recommended way
to handle non-ASCII characters is to use the unicode type? Most IDEs
should work fine with unicode.

  -- Leo


Re: A python IDE for teaching that supports cyrillic i/o

2006-11-19 Thread Leo Kislov

Kirill Simonov wrote:
 On Sun, Nov 19, 2006 at 03:27:32AM -0800, Leo Kislov wrote:
  IDLE on Windows works fine for your example in interactive console:
 
   >>> name = raw_input("What's your name? ")

 Have you tried to use cyrillic characters in a Python string in
 interactive console? When I do it, I get the "Unsupported characters in
 input" error. For instance,

  >>> print "Привет"  # That's "Hi" in Russian.
 Unsupported characters in input

That works for me in Win XP English, with Russian locale and Russian
language for non-unicode programs. Didn't you say you want to avoid
unicode? If so, you need to set proper locale and language for
non-unicode programs.


  And another question: are you aware of the fact that recommended way to
  handle non-ascii characters is to use unicode type? Most of IDEs should
  work fine with unicode.

 Usually using unicode type gives you much more headache than benefits
 unless you are careful enough to never mix unicode and str objects.

For a professional programmer life is full of headaches like this :)
For high school students it could be troublesome and annoying, I agree.


 Anyway, I just want the interactive console of an IDE to behave like a
 real Python console under a UTF-8 terminal (with sys.stdout.encoding ==
 'utf-8').

Do you realize that utf-8 locale makes len() function and slicing of
byte strings look strange for high school students?

hi = u"Привет".encode("utf-8")
r = u"р".encode("utf-8")
print len(hi)      # prints 12
print hi[1] == r   # prints False
for char in hi:
    print char     # prints garbage
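
The same effect can be sketched in Python 3, where the difference between text and bytes is explicit. This is only an illustration of the length and slicing mismatch, not part of the original test.

```python
text = "Привет"
data = text.encode("utf-8")

# Six characters, but twelve bytes: each Cyrillic letter takes two bytes in UTF-8.
char_count = len(text)
byte_count = len(data)
print(char_count, byte_count)

# Indexing the text gives a letter; indexing the bytes gives half of one.
second_char = text[1]
second_char_bytes = data[2:4]
print(second_char, second_char_bytes.decode("utf-8"))
```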

As I see you have several options:
1. Set Russian locale and Russian language for non-unicode programs on
Windows.
2. Introduce students to unicode.
3. Wait for python 3.0
4. Hack some IDE to make a unicode-friendly environment: unicode
literals by default, type("Привет") == unicode, unicode
stdin/stdout, open() uses utf-8 encoding by default for text files,
etc...

   -- Leo


Re: os.lisdir, gets unicode, returns unicode... USUALLY?!?!?

2006-11-18 Thread Leo Kislov
Martin v. Löwis wrote:
 Leo Kislov schrieb:
  How about returning two lists, first list contains unicode names, the
  second list contains undecodable names:
 
  files, troublesome = os.listdir(separate_errors=True)
 
  and make separate_errors=True by default in python 3.0 ?

 That would be quite an incompatible change, no?

Yeah, that was an idea dump. Actually it is possible to make this idea
mostly backward compatible by making os.listdir() return only unicode
names and os.binlistdir() return only binary directory entries.
Unfortunately the same trick will not work for getcwd.

Another idea is to map all 256 bytes to unicode private code points.
When a file name cannot be fully decoded the undecoded bytes will be
mapped to specially allocated code points. Unfortunately this idea
seems to leak if the program later wants to write such unicode string
to a file. Python will have to throw an exception since we don't know
if it is ok to write broken string to a file. So we are back to square
one, programs need to deal with filesystem garbage :(
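
As a side note for Python 3 readers: later Python versions ship a "surrogateescape" error handler that works much like the idea above, mapping each undecodable byte to a reserved code point. A minimal sketch, with a made-up file name, showing both the round trip and the "leak" when such a string is encoded strictly:

```python
# Hypothetical file name: Latin-1 bytes that are not valid UTF-8.
raw_name = b"caf\xe9.txt"

# The undecodable byte 0xE9 is mapped to the lone surrogate U+DCE9.
name = raw_name.decode("utf-8", "surrogateescape")
print(ascii(name))

# Encoding with the same handler restores the original bytes exactly.
round_trip = name.encode("utf-8", "surrogateescape")

# A strict encode (what writing to a normal text file does) refuses the string.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
print(round_trip == raw_name, strict_ok)
```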

  -- Leo


Re: how to print pdf with python on a inkjet printer.

2006-11-17 Thread Leo Kislov

krishnakant Mane wrote:
 hello all.
 I am developing an ncurses based python application that will require
 to create pdf reports for printing.
 I am not using py--qt or wx python.
 it is a consol based ui application and I need to make a pdf report
 and also send it to a lazer or ink jet printer.
 is it possible to do so with python?
 or is it that I will have to use the wxpython library asuming that
 there is a print dialog which can open up the list of printers?
 if wx python and gui is the only way then it is ok but I will like to
 keep this application on the ncurses side.

Assuming you are on a UNIX-like system, you really need to set up CUPS
http://www.cups.org/ (or maybe your system already provides CUPS).
PDF seems to be the future intermediate format for UNIX printing
http://www.linux.com/article.pl?sid=06/04/18/2114252 and CUPS already
supports printing PDF files: just run "lp your_file.pdf" to print a
file. CUPS has only a command line interface:
http://www.cups.org/documentation.php/options.html
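
From a console application, calling lp could look roughly like the sketch below (written for Python 3). The helper names build_lp_command and print_pdf are hypothetical; -d (destination printer) and -n (number of copies) are standard lp options.

```python
import subprocess

def build_lp_command(pdf_path, printer=None, copies=1):
    """Build the argument list for the CUPS lp command."""
    command = ["lp"]
    if printer is not None:
        command += ["-d", printer]      # send to a specific print queue
    if copies != 1:
        command += ["-n", str(copies)]  # number of copies
    command.append(pdf_path)
    return command

def print_pdf(pdf_path, printer=None, copies=1):
    """Run lp; raises CalledProcessError if lp exits with an error."""
    return subprocess.run(build_lp_command(pdf_path, printer, copies), check=True)

print(build_lp_command("report.pdf", printer="laser1", copies=2))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.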

  -- Leo



Re: how to print pdf with python on a inkjet printer.

2006-11-17 Thread Leo Kislov
Leo Kislov wrote:
 CUPS only have command line interface:
 http://www.cups.org/documentation.php/options.html

My mistake: CUPS actually has an official C API
http://www.cups.org/documentation.php/api-cups.html and unofficial
python bindings http://freshmeat.net/projects/pycups/.



Re: os.lisdir, gets unicode, returns unicode... USUALLY?!?!?

2006-11-17 Thread Leo Kislov

Martin v. Löwis wrote:
 gabor schrieb:
  All this code will typically work just fine with the current behavior,
  so people typically don't see any problem.
 
 
  i am sorry, but it will not work. actually this is exactly what i did,
  and it did not work. it dies in the os.path.join call, where file_name
  is converted into unicode. and python uses 'ascii' as the charset in
  such cases. but, because listdir already failed to decode the file_name
  with the filesystem-encoding, it usually also fails when tried with
  'ascii'.

 Ah, right. So yes, it will typically fail immediately - just as you
 wanted it to do, anyway; the advantage with this failure is that you
 can also find out what specific file name is causing the problem
 (whereas when listdir failed completely, you could not easily find
  out the cause of the failure).

 How would you propose listdir should behave?

How about returning two lists, first list contains unicode names, the
second list contains undecodable names:

files, troublesome = os.listdir(separate_errors=True)

and make separate_errors=True by default in python 3.0 ?

  -- Leo


Re: os.lisdir, gets unicode, returns unicode... USUALLY?!?!?

2006-11-17 Thread Leo Kislov

gabor wrote:
 Martin v. Löwis wrote:
  gabor schrieb:
  i also recommend this approach.
 
  also, raising an exception goes well with the principle of the least
  surprise imho.
 
  Are you saying you wouldn't have been surprised if that had been
  the behavior?


 yes, i would not have been surprised. because it's kind-of expected when
 dealing with input, that malformed input raises an unicode-exception.
 and i would also expect, that if os.listdir completed without raising an
 exception, then the returned data is correct.

The problem is that most programmers just don't want to deal with
filesystem garbage but they won't be happy if the program breaks
either.

  How would you deal with that exception in your code?

 depends on the application. in the one where it happened i would just
 display an error message, and tell the admins to check the
 filesystem-encoding.

 (in other ones, where it's not critical to get the correct name, i would
 probably just convert the text to unicode using the replace behavior)

 what about using flags similar to how unicode() works? strict, ignore,
 replace and maybe keep-as-bytestring.

 like:
 os.listdir(dirname,'strict')

That's actually an interesting idea. The error handling modes could be:
'mix' -- current behaviour, 'ignore' -- drop names that cannot be
decoded, 'separate' -- see my other message. 
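
To make the three modes concrete, here is a rough Python 3 sketch that works on a list of raw byte names. The function decode_names and its mode argument are hypothetical illustrations of the proposal; os.listdir has no such parameter.

```python
def decode_names(raw_names, encoding, mode="mix"):
    """Decode directory entries according to the proposed error modes."""
    decoded, undecodable, mixed = [], [], []
    for raw in raw_names:
        try:
            name = raw.decode(encoding)
        except UnicodeDecodeError:
            undecodable.append(raw)
            mixed.append(raw)       # keep the byte string in place
        else:
            decoded.append(name)
            mixed.append(name)
    if mode == "mix":
        return mixed                 # text and bytes in one list
    if mode == "ignore":
        return decoded               # drop names that cannot be decoded
    if mode == "separate":
        return decoded, undecodable  # two lists
    raise ValueError("unknown mode: %r" % (mode,))

entries = [b"ok.txt", b"caf\xe9.txt"]
print(decode_names(entries, "utf-8", "separate"))
```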

  -- Leo



Re: Problem reading with bz2.BZ2File(). Bug?

2006-11-15 Thread Leo Kislov
Clodoaldo Pinto Neto wrote:
 Fredrik Lundh wrote:
  Clodoaldo Pinto Neto wrote:
 
   The offending file is 5.5 MB. Sorry, i could not reproduce this problem
   with a smaller file.
 
  but surely you can post the repr() of the last two lines?

 This is the output:

 $ python bzp.py
 line number: 588317
 '\x07'
 ''

Confirmed on windows with 2.4 and 2.5:

C:\p\Python24\python.exe bzp.py
line number: 588317
'\x1e'
''

C:\p\Python25\python.exe bzp.py
line number: 588317
'\x1e'
''

Looks like one byte of garbage is appended at the end of the file. Please
file a bug report. As a workaround, "rU" mode seems to work fine for
this file.

  -- Leo



Re: str.title question after '

2006-11-13 Thread Leo Kislov

Antoon Pardon wrote:
 I have a text in ascii. I use the ' for an apostroph. The problem is
 this gives problems with the title method.  I don't want letters
 after a ' to be uppercased. Here are some examples:

    argument        result          expected

    't smidje       'T Smidje       't Smidje
    na'ama          Na'Ama          Na'ama
    al pi tnu'at    Al Pi Tnu'At    Al Pi Tnu'at


 Is there an easy way to get what I want?

import re

def title_words(s):
    # Split on whitespace, keeping separators; uppercase each first character.
    words = re.split(r'(\s+)', s)
    return ''.join(word[0:1].upper() + word[1:] for word in words)
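
A quick Python 3 check of this approach against the three examples from the question, next to what str.title() gives. This is just a sketch to show the difference.

```python
import re

def title_words(s):
    # Keep the whitespace separators so the original spacing survives.
    words = re.split(r'(\s+)', s)
    return ''.join(word[0:1].upper() + word[1:] for word in words)

examples = ["'t smidje", "na'ama", "al pi tnu'at"]
results = [title_words(s) for s in examples]
builtin = [s.title() for s in examples]
for s, mine, theirs in zip(examples, results, builtin):
    print("%-14s %-14s %-14s" % (s, theirs, mine))
```

Note that a word starting with an apostrophe stays lowercase, because only its first character (the apostrophe itself) is passed to upper().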


 Should the current behaviour be considered a bug?

I believe it follows definition of \w from re module.

 I would be inclined to answer yes, but that may be
 because this behaviour would be wrong in Dutch. I'm
 not so sure about english.

The problem is more complicated. First of all, why should title() be
limited to human languages? What about programming languages? Is
"bar.bar.spam" three tokens or one in a foo programming language? There
are some problems with human languages too: how are you going to
process "out-of-the-box" and "italian-american"?

  -- Leo



Re: Character Encodings and display of strings

2006-11-13 Thread Leo Kislov

JKPeck wrote:
 It seemed to me that this sentence

 For many types, this function makes an attempt to return a string that
 would yield an object with the same value when passed to eval().

 might mean that the encoding setting of the source file might influence
 how repr represented the contents of the string.  Nothing to do with
 Unicode.  If a source file could have a declared encoding of, say,
 cp932 via the # coding comment, I thought there was a chance that eval
 would respond to that, too.

Not a chance :) Encoding is a property of an input/output object
(console, web page, plain text file, MS Word file, etc...). All
input/output objects have specific rules determining their encoding;
there is absolutely no connection between the encoding of the source file
and any other input/output object.

repr escapes bytes 128..255 because it doesn't know where you're going
to output its result, so repr uses the safest encoding: ascii.
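
In Python 3 the closest equivalent of this escaping behaviour is the ascii() builtin (repr() itself now keeps printable non-ASCII characters). A small sketch showing that the escaped form is pure ASCII and still evaluates back to the same string:

```python
s = "für"

escaped = ascii(s)             # non-ASCII characters become backslash escapes
print(escaped)

# The escaped form is safe to write to any ASCII-compatible output ...
escaped.encode("ascii")

# ... and evaluating it yields an equal string, whatever the output encoding was.
restored = eval(escaped)
print(restored == s)
```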

  -- Leo



Re: comparing Unicode and string

2006-11-10 Thread Leo Kislov
Neil Cerutti wrote:
 On 2006-11-10, Steve Holden [EMAIL PROTECTED] wrote:
  But I don't insist on my PEP. The example just shows just
  another pitfall with Unicode and why I'll advise to any
  beginner: Never write text constants that contain non-ascii
  chars as simple strings, always make them Unicode strings by
  prepending the u.
 
  That doesn't do any good if you aren't writing them in unicode
  code points, though.
 
  You tell the interpreter what encoding your source code is in.
  It then knows precisely how to decode your string literals into
  Unicode. How do you write things in Unicode code points?

 for = u"f\xfcr"

Unless you're using a unicode-unfriendly editor or console, u"f\xfcr" is
the same as u"für":

>>> u"f\xfcr" == u"für"
True

So there is no need to write unicode strings in hexadecimal
representation of code points.

  -- Leo


Re: Erronous unsupported locale setting ?

2006-11-06 Thread Leo Kislov
robert wrote:
 Why can the default locale not be set by its true name? but only by '' ? :

Probably it is just not implemented. But since locale names are system
specific (for example, windows accepts 'ch' as Chinese in Taiwan, whereas
IANA http://www.iana.org/assignments/language-subtag-registry
considers it Chamorro), setlocale should probably grow an additional
keyword parameter: setlocale(LC_ALL, iana='de-DE')

  -- Leo



Re: Erronous unsupported locale setting ?

2006-11-06 Thread Leo Kislov

robert wrote:
 Leo Kislov wrote:
  robert wrote:
  Why can the default locale not be set by its true name? but only by '' ? :
 
  Probably it is just not implemented. But since locale names are system
  specific (For example windows accepts 'ch' as Chinese in Taiwan, where
  as IANA http://www.iana.org/assignments/language-subtag-registry
  considers it Chamorro) setlocale should probably grow an additional
  keyword parameter: setlocale(LC_ALL, iana='de-DE')

 that'd be another fat database to blow up the python core(s).

 I just wonder why locale.setlocale(locale.LC_ALL,"de_DE") doesn't accept the
 name, which
  locale.getlocale() / getdefaultlocale()
 ('de_DE', 'cp1252')
 already deliver ?

It is documented that those functions return a cross platform RFC 1766
language code. This code sometimes won't be compatible with the OS specific
locale name. A cross platform code can be useful if you want to create your
own locale database, for example cross platform language packs.

Right now we have:

setlocale(category)    -- get (it's not a typo) OS locale name
getlocale(category)    -- get cross platform locale name
setlocale(category,'') -- enable default locale, return OS locale name
getdefaultlocale()     -- get cross platform locale name

I agree it's a very confusing API, especially setlocale acting like a
getter, but that's what we have. Improvement is welcome.
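
The getter behaviour of setlocale can be seen in a short Python 3 sketch. The actual values printed depend on the platform and environment, so none are shown here.

```python
import locale

# Called with only a category, setlocale() changes nothing and
# returns the OS-specific name of the current locale.
os_name = locale.setlocale(locale.LC_CTYPE)

# getlocale() returns a (language code, encoding) pair in the
# cross platform form instead.
portable = locale.getlocale(locale.LC_CTYPE)

print(repr(os_name))
print(repr(portable))
```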

  -- Leo



Re: How to test python extension modules during 'make check' / 'make distcheck'?

2006-11-02 Thread Leo Kislov

Mark Asbach wrote:
 Hi pythonians,

 I'm one of the maintainers of an open source image processing toolkit
 (OpenCV) and responsible for parts of the autotools setup. The package
 mainly consists of four shared libraries but is accompanied by a python
 package containing some pure python code and of course extension modules
 for the four libraries.

 Now during the last month we were preparing a major release which means
 testing, fixing, testing, fixing, ... in the first degree. Typical
 functionality of the shared libraries is verified during 'make check'
 and 'make distcheck' by binaries that are linked against the libraries
 (straight forward) and are listed in the 'TESTS' automake primary.

 Unfortunately, many problems with the python wrappers arose from time to
 time. Currently we have to build and install before we can run any
 python-based test routines. When trying to integrate python module
 testing into the automake setup, there are some problems that I couldn't
 find a solution for:

 a) the extension modules are built in different (other) subdirectories -
 so they are not in the local path where python could find them

As I understand it, it's not python that cannot find them but the dynamic
linker. On ELF UNIX systems you can set LD_LIBRARY_PATH to help the linker
find dependencies; on Windows, use PATH. If you need details, you can
find them in the dynamic linker manuals.

 b) the libraries and extension modules are built with libtool and may
 have rpaths compiled in (is this problematic)?

libtool seems to have some knobs to cope with rpath:
http://sourceware.org/ml/bug-glibc/2000-01/msg00058.html

 c) a different version of our wrappers might be installed on the testing
 machine, somewhere in python/site-packages. How can I make sure that
 python only finds my 'new' local generated modules?

Set PYTHONPATH to the directory where locally generated modules are
located. They will be found before site packages.
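
A test driver could set PYTHONPATH for a child interpreter roughly as in this Python 3 sketch. The module name freshly_built and the temporary directory stand in for the real build output and are purely illustrative.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as build_dir:
    # Stand-in for a locally built module (hypothetical name).
    with open(os.path.join(build_dir, "freshly_built.py"), "w") as f:
        f.write("VALUE = 42\n")

    # Directories listed in PYTHONPATH are searched before site-packages.
    env = dict(os.environ, PYTHONPATH=build_dir)
    completed = subprocess.run(
        [sys.executable, "-c",
         "import freshly_built; print(freshly_built.__file__)"],
        env=env, capture_output=True, text=True, check=True)

    module_path = completed.stdout.strip()
    found_in_build_dir = os.path.realpath(module_path).startswith(
        os.path.realpath(build_dir))

print(found_in_build_dir)
```

In an automake setup the same effect is usually achieved by exporting the variable in the test environment, so the test scripts themselves stay unchanged.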

  -- Leo



Re: Lookuperror : unknown encoding : utf-8

2006-10-30 Thread Leo Kislov

Sachin Punjabi wrote:
 On Oct 30, 1:29 pm, Fredrik Lundh [EMAIL PROTECTED] wrote:
  Sachin Punjabi wrote:
   The OS is Windows XP
  then your installation is seriously broken.  where did you get the
  installation kit?  have you removed stuff from the Lib directory ?
 
  /F

 It was already installed on my PC and I have no clue how it was
 installed or any changes has been done.

Then it's a distribution from your PC manufacturer. They may have omitted
some modules, like the utf-8 codec.

 I am just downloading newer
 version from python.org and will install and check it. I think there
 should be problem with installation itself.

That's the right idea. I'd also recommend leaving the manufacturer's
python distribution alone. Do not remove it, do not upgrade it. Some
programs provided by the manufacturer could stop working. If the
preinstalled python was installed into the c:\python24 directory, choose
some other directory when you install python from python.org.

  -- Leo



Re: Lookuperror : unknown encoding : utf-8

2006-10-30 Thread Leo Kislov

Sachin Punjabi wrote:
 I installed it again but it makes no difference. It still throws me
 error for LookUp Error: unknown encoding : utf-8.

Most likely you're not using the new python, you're still running old
one. 
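
A quick way to check which interpreter is actually running, and whether it has the codec, is sketched below (Python 3 syntax; under Python 2.4 the print calls would be print statements).

```python
import codecs
import sys

# Path and version of the interpreter executing this script.
print(sys.executable)
print(sys.version)

# codecs.lookup() raises LookupError if the codec is missing.
try:
    codec_name = codecs.lookup("utf-8").name
except LookupError:
    codec_name = None
print(codec_name)
```

If sys.executable still points at the preinstalled directory, the new installation is not the one being used.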

  -- Leo



Re: subprocess decoding?

2006-10-29 Thread Leo Kislov

MC wrote:
 Hi!

 On win-XP (french), when I read subprocess (stdout), I must use
 differents decoding (cp1252,cp850,cp437, or no decoding), depending of
 the launch mode of the same Python's script:
   - from command-line
   - from start+run
   - from icon
   - by Python-COM-server
   - etc.

 (.py  .pyw can also contribute)


 How to know, on the fly, the encoding used by subprocess?

You can't. Consider a Windows equivalent of the UNIX cat program. It just
dumps the content of a file to stdout. So the problem of finding out the
encoding of stdout is equal to finding out the encoding of any file. It's
just impossible to do in general. Now, you may be talking about
conventions. AFAIK, since Windows doesn't have a strong command line
culture, it doesn't have such conventions.
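
The point can be illustrated with a small Python 3 sketch: the parent only receives bytes, and the same byte decodes "successfully" under several code pages. The child command here is a made-up stand-in for an arbitrary program.

```python
import subprocess
import sys

# Stand-in child process that writes e-acute encoded as cp850 (one byte, 0x82).
child_code = 'import sys; sys.stdout.buffer.write("\\xe9".encode("cp850"))'
raw = subprocess.run([sys.executable, "-c", child_code],
                     capture_output=True, check=True).stdout

# Nothing in the byte itself says which decoding is the right one.
as_cp850 = raw.decode("cp850")
as_cp1252 = raw.decode("cp1252")
print(repr(raw), ascii(as_cp850), ascii(as_cp1252))
```

So the encoding has to be agreed on out of band, for example by configuring the child or by documenting which code page it uses.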

  -- Leo



Re: Lookuperror : unknown encoding : utf-8

2006-10-29 Thread Leo Kislov

Sachin Punjabi wrote:
 Hi,

 I wanted to read a file encoded in utf-8 and and using the following
 syntax in my source which throws me an error specifying Lookuperror :
 unknown encoding : utf-8. Also I am working on Python version 2.4.1.

 import codecs
 fileObj = codecs.open( "data.txt", "r", "utf-8" )

 Can anyone please guide me how do I get utf-8 activated in my codecs or
 any setting needs to be done for the same before using codecs.

What OS? Where did you get your python distribution? Anyway, I believe
utf-8 codec was in the python.org distribution since the introduction
of unicode (around python 2.0). If you can't use utf-8 codec right out
of the box, something is really wrong with your setup.

  -- Leo



Re: gettext on Windows

2006-10-28 Thread Leo Kislov

[EMAIL PROTECTED] wrote:
 Martin v. Löwis wrote:
  [EMAIL PROTECTED] schrieb:
   Traceback (most recent call last):
     File "panicbutton.py", line 36, in ?
       lan = gettext.GNUTranslations (open (sLang, "rb"))
     File "C:\Python24\lib\gettext.py", line 177, in __init__
       self._parse(fp)
     File "C:\Python24\lib\gettext.py", line 280, in _parse
       raise IOError(0, 'File is corrupt', filename)
   IOError: [Errno 0] File is corrupt: 'locale\\fr_FR.mo'
 
  If it says so, it likely is right. How did you create the file?

 I only get the "File is corrupt" error when I changed

 lan = gettext.GNUTranslations (open (sLang))

This code definitely corrupts .mo files, since on windows files are
opened in text mode by default.
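
For reference, a minimal Python 3 sketch of loading a catalog in binary mode. To stay self-contained it builds an empty .mo file by hand (magic number, revision 0, zero strings); a real catalog would come from msgfmt or poEdit.

```python
import gettext
import os
import struct
import tempfile

# Minimal empty little-endian .mo file: magic, revision, string count,
# two table offsets, hash table size and offset.
empty_mo = struct.pack("<7I", 0x950412DE, 0, 0, 28, 28, 0, 28)

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "empty.mo")
    with open(path, "wb") as f:
        f.write(empty_mo)

    # .mo files are binary data, so they must be opened with "rb".
    with open(path, "rb") as fp:
        catalog = gettext.GNUTranslations(fp)

# With no translations available, the message comes back unchanged.
result = catalog.gettext("Hello")
print(result)
```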

 to

 lan = gettext.GNUTranslations (open (sLang, "rb"))

 Without the "rb" in the open () I get a "struct.error : unpack str size
 does not match format" error (see original post).

struct.error usually means input data doesn't correspond to expected
format.

 The .mo files were created using poEdit (www.poedit.org), and I get the
 same error with various translations, all created by different people.

Try msgunfmt
http://www.gnu.org/software/gettext/manual/html_node/gettext_128.html#SEC128
to see if it can convert your files back to text.

  -- Leo



Re: my first software

2006-10-27 Thread Leo Kislov

[EMAIL PROTECTED] wrote:
 I am a beginner of programming and started to learn Python a week ago.
 last 3 days, i write this little tool for Renju.if you have any advice
 on my code,please tell me

 s = ''

 for i in range (0,len(done) - 1):
     s = s +str(done[i][0]) + str(done[i][1]) + '\n'
 s = s + str(done[len(done) - 1][0]) + str(done[len(done) - 1][1])



This is easier to do with a generator expression and the join method:

s = '\n'.join(str(item[0]) + str(item[1]) for item in done)
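
A quick sketch comparing the two versions on a made-up list of moves (the contents of done are hypothetical):

```python
# Hypothetical move list: (column, row) pairs.
done = [(7, 7), (8, 8), (7, 9)]

# Original loop: newline after every item except the last.
loop_result = ''
for i in range(0, len(done) - 1):
    loop_result = loop_result + str(done[i][0]) + str(done[i][1]) + '\n'
loop_result = loop_result + str(done[len(done) - 1][0]) + str(done[len(done) - 1][1])

# join() puts the separator only between items, so no special case is needed.
join_result = '\n'.join(str(item[0]) + str(item[1]) for item in done)

print(join_result == loop_result)
```

The join version also handles an empty list, where the loop version would fail on done[-1].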


 for i in range (0, len(s)):
     x = s[i][0]
     ...
     if i%2 == 0:
         ...

There is a builtin function enumerate for this case; IMHO it's slightly
easier to read:

for i, item in enumerate(s):
    x = item[0]
    ...
    if not i%2:
        ...

 if len(done) != 0 and beensaved == 0 and askyesno(...):
 saveasfile()

It's a personal matter, but usually python programmers treat values in
boolean context directly, without comparison:

if done and not beensaved and askyesno(...):

The rules are documented here: http://docs.python.org/lib/truth.html .
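
A tiny sketch of how those truth rules apply to the condition above. The helper name needs_save_prompt is made up for illustration and the askyesno call is left out.

```python
def needs_save_prompt(done, beensaved):
    # An empty list is false and a non-empty list is true; 0 is false.
    return bool(done) and not beensaved

print(needs_save_prompt([], 0))
print(needs_save_prompt([(7, 7)], 0))
print(needs_save_prompt([(7, 7)], 1))
```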

  -- Leo



Re: subprocess cwd keyword.

2006-10-27 Thread Leo Kislov

Ivan Vinogradov wrote:
 Dear All,

 I would greatly appreciate a nudge in the right direction concerning
 the use of cwd argument in the call function from subprocess module.

 The setup is as follows:

 driver.py - python script
 core/ - directory
   main- fortran executable in the core directory


 driver script generates some input files in the core directory. Main
 should do its thing and dump the output files back into core.
 The problem is, I can't figure out how to do this properly.

 call("core/main") works but uses .. of core for input/output.

 call("core/main",cwd="core") and call("main",cwd="core") both result in
[snip exception]

Usually the current directory is not in the PATH on UNIX. Try
call("./main",cwd="core")
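
A self-contained Python 3 sketch of that layout, assuming a POSIX system. A small shell script stands in for the fortran executable, and the output file name is made up.

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as project:
    core = os.path.join(project, "core")
    os.mkdir(core)

    # Stand-in for the real executable: writes its working directory to a file.
    main = os.path.join(core, "main")
    with open(main, "w") as f:
        f.write("#!/bin/sh\npwd > output.txt\n")
    os.chmod(main, 0o755)

    # The child first changes into cwd, then "./main" is resolved there.
    exit_code = subprocess.call(["./main"], cwd=core)
    output_written = os.path.exists(os.path.join(core, "output.txt"))

print(exit_code, output_written)
```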

  -- Leo



Re: How to identify generator/iterator objects?

2006-10-25 Thread Leo Kislov

Kenneth McDonald wrote:
 I'm trying to write a 'flatten' generator which, when give a
 generator/iterator that can yield iterators, generators, and other data
 types, will 'flatten' everything so that it in turns yields stuff by
 simply yielding the instances of other types, and recursively yields the
 stuff yielded by the gen/iter objects.

 To do this, I need to determine (as far as I can see), what are
 generator and iterator objects. Unfortunately:

   >>> iter("abc")
 <iterator object at 0x61d90>
   >>> def f(x):
 ...     for s in x: yield s
 ...
   >>> f
 <function f at 0x58230>
   >>> f.__class__
 <type 'function'>

 So while I can identify iterators, I can't identify generators by class.

But f is not a generator, it's a function returning a generator:

>>> def f():
...     print "Hello"
...     yield 1
...
>>> iter(f)
Traceback (most recent call last):
  File "<input>", line 1, in ?
TypeError: iteration over non-sequence
>>> iter(f())
<generator object at 0x016C7238>
>>> type(f())
<type 'generator'>
>>>

Notice that calling f has no side effect: "Hello" is not printed until
the generator is actually iterated.
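
The same distinction in a Python 3 sketch, using the standard inspect and types modules. The calls list is only there to observe when the function body runs.

```python
import inspect
import types

calls = []

def f():
    calls.append("body ran")   # recorded only when iteration starts
    yield 1

gen = f()                      # creates the generator, runs no body code

is_generator_function = inspect.isgeneratorfunction(f)
is_generator_object = isinstance(gen, types.GeneratorType)
ran_before_iteration = list(calls)

first = next(gen)              # now the body runs up to the first yield
ran_after_iteration = list(calls)

print(is_generator_function, is_generator_object)
print(ran_before_iteration, ran_after_iteration)
```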

  -- Leo



Re: How to identify generator/iterator objects?

2006-10-25 Thread Leo Kislov

Michael Spencer wrote:
 Kenneth McDonald wrote:
  I'm trying to write a 'flatten' generator which, when give a
  generator/iterator that can yield iterators, generators, and other data
  types, will 'flatten' everything so that it in turns yields stuff by
  simply yielding the instances of other types, and recursively yields the
  stuff yielded by the gen/iter objects.
 
  To do this, I need to determine (as far as I can see), what are
  generator and iterator objects. Unfortunately:
 
    >>> iter("abc")
  <iterator object at 0x61d90>
    >>> def f(x):
  ...     for s in x: yield s
  ...
    >>> f
  <function f at 0x58230>
    >>> f.__class__
  <type 'function'>
 
  So while I can identify iterators, I can't identify generators by class.
 
  Is there a way to do this? Or perhaps another (better) way to achieve
  this flattening effect? itertools doesn't seem to have anything that
  will do it.
 
  Thanks,
  Ken
 I *think* the only way to tell if a function is a generator without calling it
 is to inspect the compilation flags of its code object:

   >>> from compiler.consts import CO_GENERATOR
   >>> def is_generator(f):
   ...     return f.func_code.co_flags & CO_GENERATOR != 0
   ...
   >>> def f1(): yield 1
   ...
   >>> def f2(): return 1
   ...
   >>> is_generator(f1)
   True
   >>> is_generator(f2)
   False
   >>>

It should be noted that this check is completely irrelevant for the
purpose of writing a flatten generator. Given

def inc(n):
    yield n+1

the following conditions should be true:

list(flatten([inc,inc])) == [inc,inc]
list(flatten([inc(3),inc(4)])) == [4,5]
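
One possible flatten that satisfies both conditions, sketched in Python 3. It assumes that only iterator objects (which includes generators) should be expanded; functions, strings and lists nested inside are yielded as they are.

```python
from collections.abc import Iterator

def flatten(iterable):
    """Yield items, recursively expanding any iterator objects found."""
    for item in iterable:
        if isinstance(item, Iterator):
            # Generators and other iterators are expanded in place.
            yield from flatten(item)
        else:
            # Everything else, including generator functions, passes through.
            yield item

def inc(n):
    yield n + 1

functions_kept = list(flatten([inc, inc])) == [inc, inc]
generators_expanded = list(flatten([inc(3), inc(4)]))
nested = list(flatten(iter([1, iter([2, iter([3])]), 4])))
print(functions_kept, generators_expanded, nested)
```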

  -- Leo



Re: encoding of sys.argv ?

2006-10-23 Thread Leo Kislov

Jiba wrote:
 Hi all,

 I am desperately searching for the encoding of sys.argv.

 I use a Linux box, with French UTF-8 locales and an UTF-8 filesystem. 
 sys.getdefaultencoding() is ascii and sys.getfilesystemencoding() is 
 utf-8. However, sys.argv is neither in ASCII (since I can pass French 
 accentuated character), nor in UTF-8. It seems to be encoded in latin-1, 
 but why ?

Your system is misconfigured; complain to your distribution. On UNIX,
sys.getfilesystemencoding(), sys.stdin.encoding, sys.stdout.encoding,
locale.getpreferredencoding() and the encoding of the characters you type
should all be the same.



Re: encoding of sys.argv ?

2006-10-23 Thread Leo Kislov

Marc 'BlackJack' Rintsch wrote:
 In [EMAIL PROTECTED], Jiba wrote:

  I am desperately searching for the encoding of sys.argv.
 
  I use a Linux box, with French UTF-8 locales and an UTF-8 filesystem.
  sys.getdefaultencoding() is ascii and sys.getfilesystemencoding() is
  utf-8. However, sys.argv is neither in ASCII (since I can pass French
  accentuated character), nor in UTF-8. It seems to be encoded in
  latin-1, but why ?

 There is no way to determine the encoding.  The application that starts
 another and sets the arguments can use any encoding it likes and there's
 no standard way to find out which it was.

There is a standard way: the nl_langinfo function
http://www.opengroup.org/onlinepubs/009695399/functions/nl_langinfo.html
The code in pythonrun.c properly uses it to find out the encoding.
Whether Linux or *BSD distributions conform to the standard is another
question.
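
From Python, the same query is available through the locale module on UNIX-like systems. A small Python 3 sketch; the printed value depends on the environment, so none is shown.

```python
import locale

# Adopt the user's locale settings for character handling, if possible.
try:
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    pass  # unsupported locale name in the environment; keep the current one

# Ask the C library which character encoding the locale uses.
codeset = locale.nl_langinfo(locale.CODESET)
print(codeset)
```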

  -- Leo.



Re: Flexible Collating (feedback please)

2006-10-20 Thread Leo Kislov
Ron Adam wrote:
 Leo Kislov wrote:
  Ron Adam wrote:
 
  locale.setlocale(locale.LC_ALL, '')  # use current locale settings
 
  It's not current locale settings, it's user's locale settings.
  Application can actually use something else and you will overwrite
  that. You can also affect (unexpectedly to the application)
  time.strftime() and C extensions. So you should move this call into the
  _test() function and put explanation into the documentation that
  application should call locale.setlocale

 I'll experiment with this a bit, I was under the impression that local.strxfrm
 needed the locale set for it to work correctly.

Actually locale.strxfrm and all other functions in locale module work
as designed: they work in C locale before the first call to
locale.setlocale. This is by design, call to locale.setlocale should be
done by an application, not by a 3rd party module like your collation
module.

 Maybe it would be better to have two (or more) versions?  A string, unicode,
 and locale version or maybe add an option to __init__ to choose the behavior?

I don't think there should be two separate versions. Unicode support is
only a matter of code like this:

# in the constructor
self.encoding = locale.getpreferredencoding()

# method of the class
def strxfrm(self, s):
    if type(s) is unicode:
        return locale.strxfrm(s.encode(self.encoding, 'replace'))
    return locale.strxfrm(s)

and then call self.strxfrm instead of locale.strxfrm. Similar code is
needed for locale.atof.
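
In Python 3 locale.strxfrm accepts text strings directly, so the wrapper is not needed there. A minimal sketch of locale-aware sorting, with the setlocale call made by the application (here fixed to the "C" locale so the result is predictable; the word list is made up):

```python
import locale

# The application, not a library module, chooses the collation locale.
locale.setlocale(locale.LC_COLLATE, "C")

words = ["pear", "Apple", "apple", "Zebra"]

# strxfrm() turns each string into a key that sorts according to LC_COLLATE.
ordered = sorted(words, key=locale.strxfrm)
print(ordered)
```

With a language-specific locale instead of "C", the same key function would typically place "apple" next to "Apple".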

 This was the reason for using locale.strxfrm. It should let it work with
 unicode strings from what I could figure out from the documents.

 Am I missing something?

strxfrm works only with byte strings encoded in the system encoding.

  -- Leo



Re: comparing Unicode and string

2006-10-20 Thread Leo Kislov

[EMAIL PROTECTED] wrote:
 Thanks, John and Neil, for your explanations.

 Still I find it rather difficult to explain to a Python beginner why
 this error occurs.

 Suggestion: shouldn't an error raise already when I try to assign s2? A
 normal string should never be allowed to contain characters that are
 not codable using the system encoding. This test could be made at
 compile time and would render Python more didadic.

This is impossible because of backward compatibility, your suggestion
will break a lot of existing programs. The change is planned to happen
in python 3.0 where it's ok to break backward compatibility if needed.

  -- Leo.



Re: right curly quote and unicode

2006-10-20 Thread Leo Kislov
On 10/19/06, TiNo [EMAIL PROTECTED] wrote:
 Now I know where the problem lies. The character in the actual file path is
 u+00B4 (Acute accent) and in the Itunes library it is u+2019 (a right curly
 quote). Somehow Itunes manages to make these two the same...?

 As it is the only file that gave me trouble, I changed the accent in the
 file to an apostrophe and re-imported it in Itunes. But I would like to hear
 if there is a solution for this problem?

I remember I once imported into itunes a russian mp3 that violated the
tagging standard by encoding the song name in windows-1251, and itunes
converted the name, without even asking me, into standard-compliant
utf-8. So there is some magic going on. In your case u+00B4 is a
compatibility character from the unicode.org point of view, and they
discourage usage of such characters. Perhaps itunes is eager to make
the u+00B4 character history as soon as possible. Googling for "itunes
replaces acute with quote" reveals that char u+00B4 is not alone. Read
the first hit. I'm afraid you will have to reverse engineer what
itunes is doing to some characters.
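
The two characters can be inspected with the unicodedata module. This Python 3 sketch also shows that standard compatibility normalization does not turn u+00B4 into u+2019, so whatever itunes does is its own mapping.

```python
import unicodedata

acute = "\u00b4"
curly = "\u2019"

acute_name = unicodedata.name(acute)
curly_name = unicodedata.name(curly)
print(acute_name)
print(curly_name)

# NFKC replaces u+00B4 by its compatibility decomposition:
# a space followed by COMBINING ACUTE ACCENT (u+0301).
normalized = unicodedata.normalize("NFKC", acute)
print(ascii(normalized), normalized == curly)
```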

  -- Leo.


Re: Flexible Collating (feedback please)

2006-10-19 Thread Leo Kislov
Ron Adam wrote:

 locale.setlocale(locale.LC_ALL, '')  # use current locale settings

It's not current locale settings, it's user's locale settings.
Application can actually use something else and you will overwrite
that. You can also affect (unexpectedly to the application)
time.strftime() and C extensions. So you should move this call into the
_test() function and put explanation into the documentation that
application should call locale.setlocale


  self.numrex = re.compile(r'([\d\.]*|\D*)', re.LOCALE)

[snip]

  if NUMERICAL in self.flags:
  slist = self.numrex.split(s)
  for i, x in enumerate(slist):
  try:
  slist[i] = float(x)
  except:
  slist[i] = locale.strxfrm(x)

I think you should call locale.atof instead of float, since you call
re.compile with re.LOCALE.

Everything else looks fine. The biggest missing piece is support for
unicode strings.

  -- Leo.



Re: Type discrepancy using struct.unpack

2006-10-19 Thread Leo Kislov

Pieter Rautenbach wrote:
 Hallo,

 I have a 64 bit server with CentOS 4.3 installed, running Python.

 [EMAIL PROTECTED] pymsnt-0.11.2]$ uname -a
 Linux lutetium.mxit.co.za 2.6.9-34.ELsmp #1 SMP Thu Mar 9 06:23:23 GMT
 2006 x86_64 x86_64 x86_64 GNU/Linux

 Consider the following two snippets, issuing a struct.unpack(...) using
 Python 2.3.4 and Python 2.5 respectively.

 [EMAIL PROTECTED] pymsnt-0.11.2]$ python
 Python 2.5 (r25:51908, Oct 17 2006, 10:34:59)
 [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] on linux2
 Type help, copyright, credits or license for more information.
  import struct
  print type(struct.unpack(L, )[0])
 type 'int'
 

 [EMAIL PROTECTED] pymsnt-0.11.2]$ /usr/bin/python2.3
 Python 2.3.4 (#1, Feb 17 2005, 21:01:10)
 [GCC 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)] on linux2
 Type help, copyright, credits or license for more information.
  import struct
  print type(struct.unpack(L, )[0])
 type 'long'
 

 I would expect type 'long' in both cases. Why is this not so?

http://mail.python.org/pipermail/python-dev/2006-May/065199.html

  -- Leo.



Re: right curly quote and unicode

2006-10-18 Thread Leo Kislov
On 10/17/06, TiNo [EMAIL PROTECTED] wrote:
 Hi all,

 I am trying to compare my Itunes Library xml to the actual files on my
 computer.
 As the xml file is in UTF-8 encoding, I decided to do the comparison of the
 filenames in that encoding.
 It all works, except with one file. It is named 'The Chemical
 Brothers-Elektrobank-04 - Don't Stop the Rock (Electronic Battle Weapon
 Version).mp3'. It goes wrong with the apostrophe in Don't. That is actually
 not an apostrophe, but ASCII char 180: ´

It's actually Unicode char #180, not ASCII. ASCII characters are in
0..127 range.

 In the Itunes library it is encoded as: Don%E2%80%99t

Looks like a utf-8 encoded string, then encoded like an url.

 I do some conversions with both the library path names and the folder
 path names. Here is the code:
 (in the comments I display how the Don't part looks. I got this using print
 repr(filename))
 -
 #Once I have the filenames from the library I clean them using the following
 code (as filenames are in the format '
 file://localhost/m:/music/track%20name.mp3')

 filename = urlparse.urlparse(filename)[2][1:]  # u'Don%E2%80%99t' ; side
 question: does anybody know a way to do this in a more fashionable way?
 filename = urllib.unquote (filename) # u'Don\xe2\x80\x99t'

This doesn't work for me in python 2.4, unquote expects str type, not
unicode. So it should be:

filename = urllib.unquote(filename.encode('ascii')).decode('utf-8')
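
For comparison, here is the same step as a Python 3 sketch, where urllib.parse.unquote undoes the percent-encoding and decodes the bytes as UTF-8 in one call. The URL below is a made-up example in the format described above, not a path from the actual library.

```python
from urllib.parse import unquote, urlparse

# Hypothetical library entry in the file://localhost/... format.
url = "file://localhost/m:/music/Don%E2%80%99t%20Stop.mp3"

path = urlparse(url).path[1:]  # drop the "/" in front of the drive letter
filename = unquote(path)       # percent-decode, then decode as UTF-8
print(repr(filename))
```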


 filename = os.path.normpath(filename) # u'Don\xe2\x80\x99t'

 I get the files in my music folder with the os.walk method and then
 I do:

 filename = os.path.normpath(os.path.join (root,name))  # 'Don\x92t'
 filename = unicode(filename,'latin1') # u'Don\x92t'
 filename = filename.encode('utf-8') # 'Don\xc2\x92t'
 filename = unicode(filename,'latin1') # u'Don\xc2\x92t'

This looks like calling random methods with random parameters :)
Python is able to return you unicode file names right away, you just
need to pass input parameters as unicode strings:

 os.listdir(u"/")
[u'alarm', u'ARCSOFT' ...]

So in your case you need to make sure the start directory parameter
for walk function is unicode.
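
The rule that the type of the start directory decides the type of the returned names can be checked with a small experiment. This is a Python 3 sketch, where the two types are str and bytes rather than unicode and str; walk_names and the file name track.mp3 are made up for illustration.

```python
import os
import tempfile

def walk_names(top):
    """Collect every file name that os.walk() yields under top."""
    return [name for _root, _dirs, files in os.walk(top) for name in files]

with tempfile.TemporaryDirectory() as top:
    open(os.path.join(top, "track.mp3"), "w").close()
    text_names = walk_names(top)               # str in, str out
    byte_names = walk_names(os.fsencode(top))  # bytes in, bytes out

print(text_names, byte_names)
```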


Re: characters in python

2006-10-18 Thread Leo Kislov


On Oct 18, 11:50 am, Stens [EMAIL PROTECTED] wrote:
 Stens wrote:
  Can python handle this characters: c,c,ž,d,š?

  If it can, how?
 I want to change some characters in text (in the file) to the
 characters at this address:

 http://rapidshare.de/files/37244252/Untitled-1_copy.png.html

You need to use unicode; see any Python unicode tutorial, for example
this one http://www.amk.ca/python/howto/unicode or any other you can
find with Google.

Your script can look like this:

# -*- coding: PUT-HERE-ENCODING-OF-THIS-SCRIPT-FILE -*-
import codecs
# the quoted strings are placeholders: put your own names and encodings
outfile = codecs.open("your output file", "w",
                      "encoding of the output file")
for line in codecs.open("your input file", "r",
                        "encoding of the input file"):
    outfile.write(line.replace(u'd',u'd'))
outfile.close()
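
The same idea as a Python 3 sketch, with both encodings passed explicitly and the replacements kept in a table. The function name recode and the table REPLACEMENTS are made up for illustration, and the single d to d-with-stroke pair is only a stand-in for whatever mapping is really wanted.

```python
import io

# Hypothetical mapping; fill in the pairs you actually need.
# U+0111 is LATIN SMALL LETTER D WITH STROKE.
REPLACEMENTS = str.maketrans({"d": "\u0111"})

def recode(src_path, src_encoding, dst_path, dst_encoding, table=REPLACEMENTS):
    """Copy src to dst line by line, translating characters on the way."""
    with io.open(src_path, "r", encoding=src_encoding) as src, \
         io.open(dst_path, "w", encoding=dst_encoding) as dst:
        for line in src:
            dst.write(line.translate(table))
```

A call could look like recode("in.txt", "utf-8", "out.txt", "iso-8859-2"); the output encoding has to be one that contains the target characters, otherwise the write raises UnicodeEncodeError.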



Re: characters in python

2006-10-18 Thread Leo Kislov

Leo Kislov wrote using google groups beta:
 On Oct 18, 11:50 am, Stens [EMAIL PROTECTED] wrote:
  Stens wrote:
   Can python handle this characters: c,c,ž,d,š?

[snip]

 outfile.write(line.replace(u'd',u'd'))

I hope you'll do better than the Google engineers who mess up Croatian
characters in the new Google Groups. Of course the last 'd' should be
Latin d with stroke. I really typed it, but Google swallowed the stroke.



Re: codecs.EncodedFile

2006-10-18 Thread Leo Kislov

Neil Cerutti wrote:

 It turns out to be troublesome for my case because the
 EncodedFile object translates calls to readline into calls to
 read.

 I believe it ought to raise a NotImplemented exception when
 readline is called.

 As it is it silently causes interactive applications to
 apparently hang forever, and breaks the line-buffering
 expectation of non-interactive applications.

Does it work if stdin is a pipe? If it works, then raising
NotImplementedError doesn't make sense.

 If raising the exception is too much to ask, then at least it
 should be documented better.

Improving documentation is always a good idea. Meanwhile see my
solution for making the readline method work:
http://groups.google.com/group/comp.lang.python/msg/f1267dc612314657



Re: How to send E-mail without an external SMTP server ?

2006-10-16 Thread Leo Kislov


On Oct 15, 10:25 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
 Hi,

 I just want to send a very simple email from within python.

 I think the standard module of smtpd in python can do this, but I
 haven't found documents about how to use it after googleing. Are there
 any examples of using smtpd ? I'm not an expert,so I need some examples
 to learn how to use it.

smtpd is for relaying mail, not for sending. What you need is a DNS
toolkit (search the Cheese Shop) to map a domain name to its list of
incoming mail servers (the MX records), and then use the stdlib smtplib
to try to submit the message to them.
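
A rough Python 3 sketch of that procedure is shown here. The DNS part is deliberately left as a parameter: mx_lookup is a hypothetical callable that returns (preference, host) pairs for a domain, which you would implement with whatever DNS toolkit you pick (the third-party dnspython package is one option). The name send_direct is made up as well; only smtplib.SMTP, sendmail() and quit() are stdlib.

```python
import smtplib

def send_direct(from_addr, to_addr, message, mx_lookup,
                smtp_factory=smtplib.SMTP):
    """Try each incoming mail server of the recipient's domain in turn.

    mx_lookup is a hypothetical callable: domain -> [(preference, host), ...].
    Returns the host that accepted the message.
    """
    domain = to_addr.rsplit("@", 1)[1]
    last_error = None
    for _preference, host in sorted(mx_lookup(domain)):  # lowest number first
        try:
            server = smtp_factory(host, 25)
            try:
                server.sendmail(from_addr, [to_addr], message)
            finally:
                server.quit()
            return host
        except (OSError, smtplib.SMTPException) as exc:
            last_error = exc  # remember it and try the next server
    raise RuntimeError("no mail server accepted the message: %r" % (last_error,))
```

The smtp_factory parameter exists only so the function can be exercised without a network; in normal use the default smtplib.SMTP is what connects to port 25.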

 Or maybe there is a better way to to this?

This won't work if you're behind a strict corporate firewall or if ISP
is blocking port 25 outgoing connections. In those cases you _have_ to
use an external mail server.



Re: How to send E-mail without an external SMTP server ?

2006-10-16 Thread Leo Kislov


On Oct 16, 12:31 am, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
 Rob Wolfe wrote:
  [EMAIL PROTECTED] wrote:

  Hi,

  I just want to send a very simple email from within python.

  I think the standard module of smtpd in python can do this, but I
  haven't found documents about how to use it after googleing. Are there
  any examples of using smtpd ? I'm not an expert,so I need some examples
  to learn how to use it.

  See standard documentation:

 http://docs.python.org/lib/SMTP-example.html

  HTH,
  Rob
 I have read the example, copied the code and saved it as send.py, then I
 run it. Here is the output:
 $ python send.py
 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Enter message, end with ^D (Unix) or ^Z (Windows):
 just a test from localhost
 Message length is 82
 send: 'ehlo [202.127.19.74]\r\n'
 reply: '250-WebMail\r\n'
 reply: '250 AUTH plain\r\n'
 reply: retcode (250); Msg: WebMail
 AUTH plain
 send: 'mail FROM:[EMAIL PROTECTED]\r\n'
 reply: '502 negative vibes\r\n'
 reply: retcode (502); Msg: negative vibes
 send: 'rset\r\n'
 reply: '502 negative vibes\r\n'
 reply: retcode (502); Msg: negative vibes
 Traceback (most recent call last):
  File send.py, line 26, in ?
server.sendmail(fromaddr, toaddrs, msg)
  File /usr/lib/python2.4/smtplib.py, line 680, in sendmail
raise SMTPSenderRefused(code, resp, from_addr)
 smtplib.SMTPSenderRefused: (502, 'negative vibes', '[EMAIL PROTECTED]')

 Do I have to setup a smtp server on my localhost ?

You need to use the login method, see
http://docs.python.org/lib/SMTP-objects.html. And by the way, the
subject of your message is very confusing: you are posting a log where
you're sending email using an external server.



Re: How to send E-mail without an external SMTP server ?

2006-10-16 Thread Leo Kislov


On Oct 16, 2:04 am, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
 It's not safe if I have to use login method explicitly by which I have
 to put my username and password in the script. I have also tried the
 Unix command 'mail', but without success, either. I could use 'mail' to
 send an E-mail to the user on the server, but I couldn't send an E-mail
 to an external E-mail server. I realized that it may because the port 25
 outgoing connections are blocked, so I gave up. I will have to login
 periodically to check the status of the jobs:-(

Using a username and password is safe as long as you trust the system
admin; you just need to make your script readable only by you. Or even
better, put the username and password in a separate file. There is also
a way to avoid keeping them in plain sight: you just need to get the
auth token. Start an SMTP session, call set_debuglevel(True), use the
login method and look for the token:

send: 'AUTH PLAIN HERE IS THE TOKEN\r\n'
reply: '235 2.7.0 Accepted\r\n'
reply: retcode (235); Msg: 2.7.0 Accepted

Then put the token in a file readable only by you, and from now on
instead of the login() method use docmd('AUTH PLAIN', YOUR TOKEN FROM
FILE). Keep in mind that an AUTH PLAIN token is only the base64
encoding of your username and password, so it hides them from a casual
glance, but a thief who gets the file can still decode the password.
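
To see what such a token actually contains, here is a short Python 3 sketch. Both function names are made up for illustration; the layout (empty authorization id, then user, then password, separated by NUL bytes and base64-encoded) is the PLAIN mechanism of RFC 4616.

```python
import base64

def auth_plain_token(user, password):
    """Build the argument of 'AUTH PLAIN' (empty authorization id)."""
    raw = ("\0%s\0%s" % (user, password)).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")

def decode_auth_plain_token(token):
    """Recover (user, password), as anyone holding the token could."""
    _authzid, user, password = base64.b64decode(token).decode("utf-8").split("\0")
    return user, password
```

Because decoding is this easy, the token file deserves the same permissions and care as a file holding the password itself.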



Re: Alphabetical sorts

2006-10-16 Thread Leo Kislov


On Oct 16, 2:39 pm, Tuomas [EMAIL PROTECTED] wrote:
 My application needs to handle different language sorts. Do you know a
 way to apply strxfrm dynamically i.e. without setting the locale?

Collation is almost always locale dependent, so you have to set the
locale. One day I needed collation that worked on Windows and Linux.
It's not that polished and not that tested, but it worked for me:

import locale, os, codecs

current_encoding = 'ascii'
current_locale = ''

def get_collate_encoding(s):
    '''Grab character encoding from locale name'''
    split_name = s.split('.')
    if len(split_name) != 2:
        return 'ascii'
    encoding = split_name[1]
    if os.name == 'nt':
        # Windows reports a bare code page number, e.g. '1250'
        encoding = 'cp' + encoding
    try:
        codecs.lookup(encoding)
        return encoding
    except LookupError:
        return 'ascii'

def setup_locale(locale_name):
    '''Switch to new collation locale or do nothing if locale
       is the same'''
    global current_locale, current_encoding
    if current_locale == locale_name:
        return
    current_encoding = get_collate_encoding(
        locale.setlocale(locale.LC_COLLATE, locale_name))
    current_locale = locale_name

def collate_key(s):
    '''Return collation weight of a string'''
    return locale.strxfrm(s.encode(current_encoding, 'ignore'))

def collate(lst, locale_name):
    '''Sort a list of unicode strings according to locale rules.
       Locale is specified as 2 letter code'''
    setup_locale(locale_name)
    return sorted(lst, key = collate_key)


words = u'c ch f'.split()
print ' '.join(collate(words, 'en'))
print ' '.join(collate(words, 'cz'))

Prints:

c ch f
c f ch
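
If the worry is that setting the locale leaks into the rest of the application, one option is to switch LC_COLLATE only for the duration of the sort and put the old value back afterwards. This is a Python 3 sketch under that assumption (strxfrm accepts str there, so no manual encoding is needed); which locale names exist, for example "cs_CZ.UTF-8", depends on the platform, and setlocale raises locale.Error for names that are not installed.

```python
import locale

def collate(words, locale_name):
    """Sort strings under locale_name, then restore the previous LC_COLLATE."""
    previous = locale.setlocale(locale.LC_COLLATE)  # query only, no change
    locale.setlocale(locale.LC_COLLATE, locale_name)
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)
```

Note that the locale is process-wide, so this is still not safe if other threads sort or format at the same time.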



Re: Need a Regular expression to remove a char for Unicode text

2006-10-13 Thread Leo Kislov


On Oct 13, 4:44 am, [EMAIL PROTECTED] wrote:
 శ్రీనివాస wrote:
  Hai friends,
  Can anyone tell me how I can remove a character from a unicode text.
  కల్‌హార is a Telugu word in Unicode. Here i want to
  remove '' but not replace with a zero width char. And one more thing,
  if any whitespaces are there before and after '' char, the text should
  be kept as it is. Please tell me how can i workout this with regular
  expressions.

  Thanks and regards
  Srinivasa Raju Datla

 Don't know anything about Telugu, but is this the approach you want?

  x=u'\xfe\xff  \xfe\xff \xfe\xff\xfe\xff'
  noampre = re.compile('(?!\s)(?!\s)', re.UNICODE).sub
  noampre('', x)

He wants to replace  with zero width joiner so the last call should be
noampre(u"\u200D", x)


Re: Need a Regular expression to remove a char for Unicode text

2006-10-13 Thread Leo Kislov
On Oct 13, 4:55 am, Leo Kislov [EMAIL PROTECTED] wrote:
 On Oct 13, 4:44 am, [EMAIL PROTECTED] wrote:

  శ్రీనివాస wrote:
   Hai friends,
   Can anyone tell me how I can remove a character from a unicode text.
   కల్‌హార is a Telugu word in Unicode. Here i want to
   remove '' but not replace with a zero width char. And one more thing,
   if any whitespaces are there before and after '' char, the text should
   be kept as it is. Please tell me how can i workout this with regular
   expressions.

   Thanks and regards
   Srinivasa Raju Datla

  Don't know anything about Telugu, but is this the approach you want?

   x=u'\xfe\xff  \xfe\xff \xfe\xff\xfe\xff'
   noampre = re.compile('(?!\s)(?!\s)', re.UNICODE).sub
   noampre('', x)

 He wants to replace  with zero width joiner so the last call should be
 noampre(u"\u200D", x)

Pardon my poor reading comprehension, the OP doesn't want a zero width
joiner. Though I'm confused why he mentioned it at all.
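
Since the character itself was lost in the messages above, the following Python 3 sketch has to guess: it assumes the character to remove is ZERO WIDTH NON-JOINER (U+200C), which is what appears inside the Telugu word quoted in the question. The names ZWNJ and strip_zwnj are made up for illustration. The pattern removes the character only when neither neighbour is whitespace, and leaves it alone otherwise.

```python
import re

ZWNJ = "\u200c"  # assumption: the character the original poster wants removed

# Match ZWNJ only when it has no whitespace directly before or after it.
_zwnj_inside_word = re.compile(r"(?<!\s)%s(?!\s)" % ZWNJ)

def strip_zwnj(text):
    """Delete ZWNJ inside words; keep it where whitespace touches it."""
    return _zwnj_inside_word.sub("", text)
```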


Re: does raw_input() return unicode?

2006-10-10 Thread Leo Kislov

Theerasak Photha wrote:
 On 10/10/06, Martin v. Löwis [EMAIL PROTECTED] wrote:
  Theerasak Photha schrieb:
   At the moment, it only returns unicode objects when invoked
   in the IDLE shell, and only if the character entered cannot
   be represented in the locale's charset.
  
   Why only IDLE? Does urwid or another console UI toolkit avoid this 
   somehow?
 
  I admit I don't know what urwid is; from a shallow description I find
  (a console user interface library) I can't see the connection to
  raw_input(). How would raw_input() ever use urwid?

 The other way around: would urwid use raw_input() or other Python
 input functions anywhere?

 And what causes Unicode input to work in IDLE alone?

Applications other than Python are actually free to implement a unicode
stdin. Python cannot do it because of backward compatibility. You can
argue that the Python interactive console could do it too, but think
about it this way: the Python interactive console deliberately behaves
like a running Python program would.



Re: does raw_input() return unicode?

2006-10-10 Thread Leo Kislov

Duncan Booth wrote:
 Stuart McGraw [EMAIL PROTECTED] wrote:

  So, does raw_input() ever return unicode objects and if
  so, under what conditions?
 
 It returns unicode if reading from sys.stdin returns unicode.

 Unfortunately, I can't tell you how to make sys.stdin return unicode for
 use with raw_input. I tried what I thought should work and as you can see
 it messed up the buffering on stdin. Does anyone else know how to wrap
 sys.stdin so it returns unicode but is still unbuffered?

Considering that all consoles are ascii based, the following should
work where python was able to determine terminal encoding:

import sys

class ustdio(object):
    '''Wrap a byte stream so that readline() returns unicode'''
    def __init__(self, stream):
        self.stream = stream
        self.encoding = stream.encoding
    def readline(self):
        return self.stream.readline().decode(self.encoding)

sys.stdin = ustdio(sys.stdin)

answer = raw_input()
print type(answer)
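
The same wrapper idea can be tried without a terminal by pointing it at an in-memory byte stream. This is a Python 3 sketch: DecodingReader and the sample bytes are made up for illustration, and in Python 3 input() already returns str, so the wrapper is only useful there for raw byte streams.

```python
import io

class DecodingReader(object):
    """Wrap a byte stream so that readline() returns decoded text."""
    def __init__(self, stream, encoding):
        self.stream = stream
        self.encoding = encoding
    def readline(self):
        # Read one line of bytes and decode just that line.
        return self.stream.readline().decode(self.encoding)

raw = io.BytesIO("L\u00f6wis\nnext\n".encode("utf-8"))  # stands in for stdin
reader = DecodingReader(raw, "utf-8")
first = reader.readline()
print(type(first), repr(first))
```

Decoding line by line keeps the line-at-a-time behaviour that interactive use depends on, because nothing is read beyond the newline.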



Re: People's names (was Re: sqlite3 error)

2006-10-10 Thread Leo Kislov
John J. Lee wrote:
 Steve Holden [EMAIL PROTECTED] writes:
 [...]
   There would also need to be a flag field to indicate the canonical
   ordering
   for writing out the full name: e.g. family-name-first, given-names-first.
   Do we need something else for the Vietnamese case?
 
  You'd think some standards body would have worked on this, wouldn't
  you. I couldn't think of a Google search string that would lead to
  such information, though. Maybe other, more determined, readers can do
  better.

 I suppose very few projects actually deal with more than a handful of
 languages or cultures, but it does surprise me how hard it is to find
 out about this kind of thing -- especially given that open source
 projects often end up with all kinds of weird and wonderful localised
 versions.

 On a project that involved 9 localisations, just trying to find
 information on the web about standard collation of diacritics
 (accented characters) in English, German, and Scandinavian languages
 was more difficult than I'd expected.

As far as I understand, unicode.org has become the central(?) source of
locale information: http://unicode.org/cldr/
Did you use it?



Re: How to find number of characters in a unicode string?

2006-10-10 Thread Leo Kislov

Lawrence D'Oliveiro wrote:
 In message [EMAIL PROTECTED], Marc 'BlackJack'
 Rintsch wrote:

  In [EMAIL PROTECTED],
  Preben Randhol wrote:
 
  Is there a way to calculate in characters
  and not in bytes to represent the characters.
 
  Decode the byte string and use `len()` on the unicode string.

 Hmmm, for some reason

 len(u"C\u0327")

 returns 2.

If Python ever provides this functionality it would be, I guess,
u"C\u0327".width() == 1. But it's not clear when unicode.org will
provide recommended fixed font character width information for *all*
characters. I recently stumbled upon the Tamil language, where for
example u'\u0b95\u0bcd', u'\u0b95\u0bbe', u'\u0b95\u0bca',
u'\u0b95\u0bcc' look like they have widths of 1, 2, 3 and 4 columns.
To add insult to injury, these 4 symbols are all considered *single*
letter symbols :) If your email reader is able to show them, here they
are in all their glory:
க், கா, கொ, கௌ.
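
Short of real width information, a crude way to count "letters" rather than code points is to skip everything Unicode classifies as a combining mark. This is a Python 3 sketch, and count_base_characters is a made-up name; it approximates user-perceived characters, says nothing about display columns, and will be wrong for scripts where one letter is built from several non-mark code points.

```python
import unicodedata

def count_base_characters(text):
    """Count code points that are not combining marks (categories Mn, Mc, Me)."""
    return sum(1 for ch in text
               if not unicodedata.category(ch).startswith("M"))

for sample in ("C\u0327", "\u0b95\u0bcd", "\u0b95\u0bbe",
               "\u0b95\u0bca", "\u0b95\u0bcc"):
    print(len(sample), count_base_characters(sample))
```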
