ANN: eGenix mxODBC Zope/Plone Database Adapter 2.1.2

2013-05-07 Thread eGenix Team: M.-A. Lemburg


ANNOUNCEMENT

  mxODBC Zope/Plone Database Adapter

Version 2.1.2

 for Zope and the Plone CMS

Available for Plone 4.0, 4.1 and 4.2,
Zope 2.12 and 2.13, on
Windows, Linux, Mac OS X, FreeBSD and other platforms

This announcement is also available on our web-site for online reading:
http://www.egenix.com/company/news/eGenix-mxODBC-Zope-DA-2.1.2-GA.html



INTRODUCTION

The eGenix mxODBC Zope DA allows you to easily connect your Zope or
Plone CMS installation to just about any database backend on the
market today, giving you the reliability of the commercially supported
eGenix product mxODBC and the flexibility of the ODBC standard as
middle-tier architecture.

The mxODBC Zope Database Adapter is highly portable, just like Zope
itself and provides a high performance interface to all your ODBC data
sources, using a single well-supported interface on Windows, Linux,
Mac OS X, FreeBSD and other platforms.

This makes it ideal for deployment in ZEO Clusters and Zope hosting
environments where stability and high performance are a top priority,
establishing an excellent basis and scalable solution for your Plone
CMS.

Product page:

http://www.egenix.com/products/zope/mxODBCZopeDA/



NEWS

We are pleased to announce the new version 2.1.2 of our mxODBC
Zope/Plone Database Adapter product.

Compatibility Enhancements
--------------------------

 * Added a work-around for a regression in Python 2.7.4 that results
   in a segfault when exiting Zope/Plone after loading the mxODBC Zope
   DA.

   The regression will be fixed in Python 2.7.5, but we don't want to
   expose our users to segfaults, so we added a work-around.

   See http://bugs.python.org/issue17703 for the bug ticket.

 * Upgraded the underlying mxODBC library to version 3.2.3. Please see
   the mxODBC 3.2.3 release announcement for additional details:

   http://www.egenix.com/company/news/eGenix-mxODBC-3.2.3-GA.html


Driver Compatibility
--------------------

 * Please also see the mxODBC Zope DA 2.1.1 announcement for an
   important new feature which allows you to dramatically increase the
   fetch performance when working with MS SQL Server and IBM DB2
   databases.

   http://www.egenix.com/company/news/eGenix-mxODBC-Zope-DA-2.1.1-GA.html


For the full set of changes please check the change log:

http://www.egenix.com/products/zope/mxODBCZopeDA/changelog.html



FEATURES

Version 2.1.0 of our mxODBC Zope/Plone Database Adapter product was
released on 2012-09-18. Please see the announcement for highlights of
the 2.1 release:

http://www.egenix.com/company/news/eGenix-mxODBC-Zope-DA-2.1.0-GA.html

For the full set of features mxODBC Zope DA has to offer, please see:

http://www.egenix.com/products/zope/mxODBCZopeDA/#Features



UPGRADING

Users are encouraged to upgrade to this latest mxODBC Zope/Plone DA
release to benefit from the new features and updated ODBC driver
support.

We have taken special care not to introduce backwards incompatible
changes, making the upgrade experience as smooth as possible.

As always, patch level upgrades (e.g. from 2.1.0 to 2.1.2) are free of
charge. The licenses you have purchased for 2.1 will continue to work
with this new release.

For major and minor upgrade purchases, we will give out 20% discount
coupons going from mxODBC Zope DA 1.x to 2.1 and 50% coupons for
upgrades from mxODBC 2.x to 2.1. After upgrade, use of the original
license from which you upgraded is no longer permitted.

Please contact the eGenix.com Sales Team with your existing license
serials for details on obtaining an upgrade discount coupon.

If you want to try the new release before purchase, you can request
30-day evaluation licenses by visiting our web-site or writing to
sa...@egenix.com, stating your name (or the name of the company) and
the number of evaluation licenses that you need.

___

SUPPORT

Commercial support for this product is available from eGenix.com.
Please see

http://www.egenix.com/services/support/

for details about our support offerings.



MORE INFORMATION

For more information on the mxODBC Zope Database Adapter, licensing
and download instructions, please visit our web-site:

http://www.egenix.com/products/zope/mxODBCZopeDA/

You can buy mxODBC Zope DA licenses online from the eGenix.com shop at:

http://shop.egenix.com/

About Python (http://www.python.org/):

Python is an object-oriented Open S

Re: First python program, syntax error in while loop

2013-05-07 Thread Chris Angelico
On Tue, May 7, 2013 at 4:10 PM, Mark Lawrence  wrote:
> On 07/05/2013 01:17, alex23 wrote:
>>
>> On May 6, 10:37 pm, Mark Lawrence  wrote:
>>>
>>> One of these days I'll work out why some people insist on using
>>> superfluous parentheses in Python code.  Could it be that they enjoy
>>> exercising their fingers by reaching for the shift key in conjunction
>>> with the 9 or 0 key?
>>
>>
>> One of these days I'll work out why some programmers consider typing
>> to be "effort".
>>
>
> I think it's very important to consider this aspect.  E.g. any one using
> dynamically typed languages has to put in more effort as they should be
> typing up their test code, whereas people using statically typed languages
> can simply head down to the pub once their code has compiled.

And those who are porting other people's code to a new platform have
weeks of work just to get ./configure to run, but after that it's
smooth...

Everything is variable.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Why sfml does not play the file inside a function in this python code?

2013-05-07 Thread cheirasacan
from tkinter import *
import sfml


window = Tk()
window.minsize( 640, 480 )


def sonido():
    file = sfml.Music.from_file('poco.ogg')
    file.play()


test = Button ( window, text = 'Sound test', command=sonido )
test.place ( x = 10, y = 60)

window.mainloop()




Using Windows 7, Python 3.3 and the sfml 1.3.0 library. The file is played if I
put the code outside the function. What am I doing wrong? Thanks.


Re: Why sfml does not play the file inside a function in this python code?

2013-05-07 Thread MRAB

On 07/05/2013 10:27, cheirasa...@gmail.com wrote:

from tkinter import *
import sfml


window = Tk()
window.minsize( 640, 480 )


def sonido():
 file = sfml.Music.from_file('poco.ogg')
 file.play()


test = Button ( window, text = 'Sound test', command=sonido )
test.place ( x = 10, y = 60)

window.mainloop()




Using Windows 7, Python 3.3, sfml 1.3.0 library, the file it is played if i put 
it out of the function. ¿ what am i doing wrong ? Thanks.


Perhaps what's happening is that sonido starts playing it and then
returns, meaning that there's no longer a reference to it ('file' is
local to the function), so it's collected by the garbage collector.

If that's the case, try keeping a reference to it, perhaps by making
'file' global (in a simple program like this one, using global should
be OK).
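
The hypothesis above can be illustrated without sfml. The following is only a sketch: FakeMusic is a hypothetical stand-in class (it is not part of sfml), and the weak references exist purely to observe whether the object is still alive. It shows the general Python behaviour being described, not what pySFML does internally.

```python
import gc
import weakref


class FakeMusic(object):
    """Hypothetical stand-in for a sound object; not the sfml API."""

    def __init__(self, path):
        self.path = path
        self.playing = False

    def play(self):
        self.playing = True


def play_local(path):
    # 'music' is local: once the function returns, nothing refers to it.
    music = FakeMusic(path)
    music.play()
    return weakref.ref(music)   # weak reference, does not keep it alive


_current_music = None           # module-level reference keeps the object alive


def play_global(path):
    global _current_music
    _current_music = FakeMusic(path)
    _current_music.play()
    return weakref.ref(_current_music)


lost = play_local('poco.ogg')
kept = play_global('poco.ogg')
gc.collect()                    # make collection explicit on any implementation

print('local object still alive: ', lost() is not None)
print('global object still alive:', kept() is not None)
```

If the real sound object stops playing when it is destroyed, the first case would explain the silence.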



unexpected syntax errors

2013-05-07 Thread Robin Becker

A user is getting this error



New issue 8: bad raise syntax
https://bitbucket.org/rptlab/reportlab/issue/8/bad-raise-syntax



  File "/usr/lib/python2.7/site-packages/svg2rlg.py", line 16, in <module>
    from reportlab.graphics import renderPDF
  File "/usr/lib64/python2.7/site-packages/reportlab/graphics/renderPDF.py", line 168
    raise ValueError, 'bad value for textAnchor '+str(text_anchor)
^
SyntaxError: invalid syntax




However, I believe that this older syntax is allowed in Python 2.7. We've had
other issues like this raised from various distros, which are apparently making
changes to 2.7 that change the external behaviour, e.g. spelling corrections to
attribute names. Could this be one of those?
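
One way to narrow this down is to check which interpreter is actually parsing the file. The sketch below is an illustration only (the helper name parses is made up here): it compiles both spellings of the raise statement without executing them and reports whether the running interpreter accepts each. The comma form is valid Python 2 syntax and a syntax error in Python 3, so a rejection would suggest the module is being run by a Python 3 interpreter despite the 2.7 paths.

```python
import sys

# The statement from the traceback, and the call form accepted by 2.x and 3.x.
OLD_FORM = "raise ValueError, 'bad value for textAnchor ' + str(text_anchor)"
NEW_FORM = "raise ValueError('bad value for textAnchor ' + str(text_anchor))"


def parses(source):
    """Return True if this interpreter can compile the statement."""
    try:
        compile(source, '<check>', 'exec')   # compile only, nothing is run
    except SyntaxError:
        return False
    return True


print('running Python %d.%d' % sys.version_info[:2])
print('comma form accepted:', parses(OLD_FORM))
print('call form accepted: ', parses(NEW_FORM))
```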

--
Robin Becker



use python to split a video file into a set of parts

2013-05-07 Thread iMath
I use the following Python code to split an FLV video file into a set of parts.
When it finishes, only the first part can be played; the other parts are
corrupted. I wonder why, and is there a correct way to split video files?

import sys, os
kilobytes = 1024
megabytes = kilobytes * 1000
chunksize = int(1.4 * megabytes)   # default: roughly a floppy

print(chunksize , type(chunksize ))

def split(fromfile, todir, chunksize=chunksize):
    if not os.path.exists(todir):                  # caller handles errors
        os.mkdir(todir)                            # make dir, read/write parts
    else:
        for fname in os.listdir(todir):            # delete any existing files
            os.remove(os.path.join(todir, fname))
    partnum = 0
    input = open(fromfile, 'rb')                   # use binary mode on Windows
    while True:                                    # eof=empty string from read
        chunk = input.read(chunksize)              # get next part <= chunksize
        if not chunk: break
        partnum += 1
        filename = os.path.join(todir, ('part{}.flv'.format(partnum)))
        fileobj  = open(filename, 'wb')
        fileobj.write(chunk)
        fileobj.close()                            # or simply open().write()
    input.close()
    assert partnum <= 9999                         # join sort fails if 5 digits
    return partnum

if __name__ == '__main__':

    fromfile = input('File to be split: ')   # input if clicked
    todir    = input('Directory to store part files:')
    print('Splitting', fromfile, 'to', todir, 'by', chunksize)
    parts = split(fromfile, todir, chunksize)
    print('Split finished:', parts, 'parts are in', todir)


Re: use python to split a video file into a set of parts

2013-05-07 Thread Chris Angelico
On Tue, May 7, 2013 at 9:15 PM, iMath  wrote:
> I use the following python code to split a FLV video file into a set of parts 
> ,when finished ,only the first part video can be played ,the other parts are 
> corrupted.I wonder why and Is there some correct ways to split video files

Most complex files of this nature have headers. You're chunking it in
pure bytes, so chances are you're disrupting that. The only thing you
can reliably do with your chunks is recombine them into the original
file.

> import sys, os
> kilobytes = 1024
> megabytes = kilobytes * 1000
> chunksize = int(1.4 * megabytes)   # default: roughly a floppy

Hrm. Firstly, this is a very small chunksize for today's files. You
hard-fail any file more than about 13GB, and for anything over a gig,
you're looking at a thousand files or more. Secondly, why are you
working with 1024 at the first level and 1000 at the second? You're
still a smidge short of the 1440KB that was described as 1.44MB, and
you have the same error of unit. Stick to binary kay OR decimal kay,
don't mix and match!
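
The unit arithmetic behind these remarks can be checked directly. This is only a worked sketch: integer arithmetic is used to avoid float rounding, and the part limit of 9999 is an assumption (the largest four-digit part number implied by the "fails if 5 digits" comment).

```python
KIB = 1024

# 1.4 * 1024 * 1000, the mixed binary/decimal value used in the posted code
mixed_chunk = 14 * KIB * 1000 // 10
# The floppy sold as "1.44MB" holds 1440 KiB
floppy_bytes = 1440 * KIB
# Consistent alternatives: binary only, or decimal only
binary_chunk = 14 * KIB * KIB // 10
decimal_chunk = 14 * 1000 * 1000 // 10

MAX_PARTS = 9999                       # assumption, see the note above
max_file_bytes = MAX_PARTS * mixed_chunk

print('mixed chunk  :', mixed_chunk)
print('floppy       :', floppy_bytes)
print('binary chunk :', binary_chunk)
print('decimal chunk:', decimal_chunk)
print('largest file : %.2f GiB' % (max_file_bytes / float(KIB ** 3)))
```

Under that assumption the largest file that passes the assertion is a little over 13 GiB, which matches the "about 13GB" figure.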

> print(chunksize , type(chunksize ))

Since you passed chunksize through the int() constructor, you can be
fairly confident it'll be an int :)

> def split(fromfile, todir, chunksize=chunksize):
>     if not os.path.exists(todir):                  # caller handles errors
>         os.mkdir(todir)                            # make dir, read/write parts
>     else:
>         for fname in os.listdir(todir):            # delete any existing files
>             os.remove(os.path.join(todir, fname))

Tip: Use os.makedirs() in case some of its parents need to be made. And
if you wrap it in try/except rather than probing first, you eliminate a
race condition. (By the way, it's pretty dangerous to just delete
files from someone else's directory. I would recommend aborting with
an error if you absolutely must work with an empty directory.)
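
A minimal sketch of that advice follows. The function name prepare_output_dir and the choice of ValueError are illustrative assumptions, not part of the posted code; os.makedirs is the standard-library call that also creates missing parent directories.

```python
import errno
import os


def prepare_output_dir(todir):
    """Create todir (with parents) if needed; refuse to use a non-empty one."""
    try:
        os.makedirs(todir)                 # no exists() probe, so no race
    except OSError as exc:
        # Only "already exists as a directory" is acceptable here.
        if exc.errno != errno.EEXIST or not os.path.isdir(todir):
            raise
    if os.listdir(todir):
        # Abort instead of deleting somebody else's files.
        raise ValueError('refusing to use non-empty directory: %r' % (todir,))
    return todir
```

A caller would run this once before writing any part files and let the exception propagate if the directory is already in use.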

>     input = open(fromfile, 'rb')                   # use binary mode on Windows

As a general rule I prefer to avoid shadowing builtins, but it's not
strictly a problem.

>         filename = os.path.join(todir, ('part{}.flv'.format(partnum)))
>     assert partnum <= 9999                         # join sort fails if 5 digits
>     return partnum

Why the assertion? Since this is all you do with the partnum, why does
it matter how long the number is? Without seeing the join sort I can't
know why that would fail; but there must surely be a solution to this.

> fromfile = input('File to be split: ')   # input if clicked

"clicked"? I'm guessing this is a translation problem, but I've no
idea what you mean by it.

What you have seems to be a reasonably viable (not that I tested it or
anything) file-level split. You should be able to re-join the parts
quite easily. But the subsequent parts are highly unlikely to play.
Even if you were working in a format that had no headers and could
resynchronize, chances are a 1.4MB file won't have enough to play
anything. Consider: A 1280x720 image contains 921,600 pixels;
uncompressed, this would take 2-4 bytes per pixel, depending on color
depth. To get a single playable frame, you would need an i-frame (ie
not a difference frame) to start and end within a single 1.4MB unit;
it would need to compress 50-75% just to fit, and that's assuming
optimal placement. With random placement, you would need to be getting
87% compression on your index frames, and then you'd still get just
one frame inside your chunk. That's not likely to be very playable.

But hey. You can stitch 'em back together again :)

ChrisA


formatted output

2013-05-07 Thread Sudheer Joseph
Dear members,
I need to print a few arrays in tabular form. For example, the array IL
below has 25 elements; is there an easy way to print it as a 5x5
comma-separated table in Python?

IL=[]
for i in np.arange(1,bno+1):
   IL.append(i)
print(IL)
%
in fortran I could do it as below
%
integer matrix(5,5)
   in=0
  do, k=1,5
  do, l=1,5
   in=in+1
  matrix(k,l)=in
  enddo
  enddo
  m=5
  n=5
  do, i=1,m
  write(*,"(5i5)") ( matrix(i,j), j=1,n )
  enddo
  end
 


Re: use python to split a video file into a set of parts

2013-05-07 Thread Dave Angel

On 05/07/2013 07:15 AM, iMath wrote:

I use the following python code to split a FLV video file into a set of parts 
,when finished ,only the first part video can be played ,the other parts are 
corrupted.I wonder why and Is there some correct ways to split video files



There are two parts to answering the question.  First, did it accurately 
chunk the file into separate pieces?  That should be trivial to test: 
simply concatenate them back together (e.g. using copy /b) and make sure 
you get exactly the original (using md5sum, for example).  I think you will.
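
The same check can be done from Python. This is a sketch under assumptions: the helper names file_md5 and join_parts are made up for illustration, and the parts are assumed to be named part1.flv, part2.flv, ... as in the posted code. The parts are sorted by their numeric suffix so that part10 comes after part9.

```python
import hashlib
import os
import re

PART_NAME = re.compile(r'part(\d+)\.flv$')


def file_md5(path, blocksize=1 << 20):
    """Return the hex MD5 digest of a file, read in blocks."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        for block in iter(lambda: handle.read(blocksize), b''):
            digest.update(block)
    return digest.hexdigest()


def join_parts(partsdir, outfile):
    """Concatenate partN.flv files from partsdir, in numeric order."""
    names = [n for n in os.listdir(partsdir) if PART_NAME.match(n)]
    names.sort(key=lambda n: int(PART_NAME.match(n).group(1)))
    with open(outfile, 'wb') as out:
        for name in names:
            with open(os.path.join(partsdir, name), 'rb') as part:
                out.write(part.read())
    return len(names)
```

After joining, comparing file_md5() of the rebuilt file with file_md5() of the original answers the first question.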


And second, why the arbitrary pieces don't play in some unspecified 
video player.  That one's more interesting, but hasn't anything to do 
with Python.  I'm curious why you would expect that it would play.  It 
won't have any of the header information, and the compressed data will 
be missing its context information.  To split apart a binary file into 
useful pieces requires a lot of knowledge about the file format.




--
DaveA


Re: formatted output

2013-05-07 Thread Roy Smith
In article ,
 Sudheer Joseph  wrote:

> Dear members,
> I need to print few arrays in a tabular form for example below 
> array IL has 25 elements, is there an easy way to print this as 
> 5x5 comma separated table? in python
> 
> IL=[]
> for i in np.arange(1,bno+1):
>IL.append(i)
> print(IL)
> %
> in fortran I could do it as below
> %
> integer matrix(5,5)
>in=0
>   do, k=1,5
>   do, l=1,5
>in=in+1
>   matrix(k,l)=in
>   enddo
>   enddo
>   m=5
>   n=5
>   do, i=1,m
>   write(*,"(5i5)") ( matrix(i,j), j=1,n )
>   enddo
>   end
>  

Excellent.  My kind of programming language!  See 
http://www.python.org/doc/humor/#bad-habits.

Anyway, that translates, more or less, as follows.

Note that I'm modeling the Fortran 2-dimensional array as a dictionary 
keyed by (k, l) tuples.  That's easy and convenient, but conceptually a 
poor fit and not terribly efficient.  If efficiency is an issue (i.e. 
much larger values of (k, l)), you probably want to be looking at numpy.

Also, "in" is a keyword in python, so I changed that to "value".  
There's probably cleaner ways to do this. I did a pretty literal 
transliteration.


matrix = {}
value = 0
for k in range(1, 6):
    for l in range(1, 6):
        value += 1
        matrix[(k, l)] = value

for i in range(1, 6):
    print ",".join("%5d" % matrix[(i, j)] for j in range(1, 6))

This prints:

    1,    2,    3,    4,    5
    6,    7,    8,    9,   10
   11,   12,   13,   14,   15
   16,   17,   18,   19,   20
   21,   22,   23,   24,   25
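
Since the original question started from a flat list of 25 elements, the same table can also be produced without the dictionary. This is one possible sketch, not the only way; the helper name format_table and its width parameter are illustrative.

```python
def format_table(values, ncols, width=5):
    """Format a flat sequence as comma-separated rows of ncols integers."""
    rows = [values[i:i + ncols] for i in range(0, len(values), ncols)]
    return "\n".join(
        ",".join("%*d" % (width, v) for v in row)   # right-align each number
        for row in rows
    )


IL = list(range(1, 26))
print(format_table(IL, 5))
```

Slicing the list in steps of ncols replaces the nested loops, and "%*d" takes the field width as an argument, so the column width is not hard-coded.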


Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread jmfauth
On 6 mai, 09:49, Fábio Santos  wrote:
> On 6 May 2013 08:34, "Chris Angelico"  wrote:
>
> > Well you see, it was 70 bytes back in the Python 2 days (I'll defer to
> > Steven for data points earlier than that), but with Python 3, there
> > were two versions: one was 140 bytes representing 70 characters, the
> > other 280 bytes representing 70 characters. In Python 3.3, they were
> > merged, and a trivial amount of overhead added, so now it's 80 bytes
> > representing 70 characters. But you have an absolute guarantee that
> > it's correct now.
>
> > Of course, the entire code can be represented as a single int now. You
> > used to have to use a long.
>
> > ChrisA
> > --
>
> Thanks. You have made my day.
>
> I may rise the average pay of a Python programmer in Portugal. I have asked
> for a raise back in December, and was told that it wouldn't happen before
> this year. I have done well. I think I deserve better pay than a
> supermarket employee now. I am sure that my efforts were appreciated and I
> will be rewarded. I am being sarcastic.
>
> The above paragraph wouldn't be true if I programmed in perl, c++ or lisp.


-


1) The memory gain for many of us (usually non ascii users)
just become irrelevant.

>>> sys.getsizeof('maçã')
41
>>> sys.getsizeof('abcd')
29

2) More critical, Py 3.3, just becomes non unicode compliant,
(eg European languages or "ascii" typographers !)

>>> import timeit
>>> timeit.timeit("'abcd'*1000 + 'a'")
2.186670111428325
>>> timeit.timeit("'abcd'*1000 + '€'")
2.9951699820528432
>>> timeit.timeit("'abcd'*1000 + 'œ'")
3.0036780444886233
>>> timeit.timeit("'abcd'*1000 + 'ẞ'")
3.004992278824048
>>> timeit.timeit("'maçã'*1000 + 'œ'")
3.231025618708202
>>> timeit.timeit("'maçã'*1000 + '€'")
3.215894398100758
>>> timeit.timeit("'maçã'*1000 + 'œ'")
3.224407974255655
>>> timeit.timeit("'maçã'*1000 + '’'")
3.2206342273566406
>>> timeit.timeit("'abcd'*1000 + '’'")
2.991440344906

3) Python is "proud" to cover the whole unicode range,
unfortunately it "breaks" the BMP range.
Small GvR example (ascii) from the bug list,
but with non ascii characters.

# Py 3.2, all chars

>>> timeit.repeat("a = 'hundred'; 'x' in a")
[0.09087790617297742, 0.07456871885972305, 0.07449940353376405]
>>> timeit.repeat("a = 'maçãé€ẞ'; 'x' in a")
[0.10088136800095526, 0.07488497003487282, 0.07497594640028638]


# Py 3.3 ascii and non ascii chars
>>> timeit.repeat("a = 'hundred'; 'x' in a")
[0.11426985953005442, 0.10040049292649655, 0.09920834808588097]
>>> timeit.repeat("a = 'maçãé€ẞ'; 'é' in a")
[0.2345595188256766, 0.21637172864154763, 0.2179096624382737]


There are plenty of good reasons to use Python. There are
also plenty of good reasons to not use (or now to drop)
Python and to realize that if you wish to process text
seriously, you are better served by using "corporate
products" or tools using Unicode properly.

jmf
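
For readers following the size figures in this thread: Python 3.3 introduced the flexible string representation (PEP 393), which stores each string with 1, 2 or 4 bytes per character depending on the widest character it contains. The sketch below (the helper name bytes_per_char is illustrative) estimates the per-character cost by growing a string of one repeated character. It measures CPython's storage only and says nothing about the timings quoted above; on CPython 3.3 or later it is expected to report 1, 1, 2 and 4.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character for strings made only of ch."""
    # The fixed header cancels out when two sizes of the same kind are compared.
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n


samples = [
    ('ASCII', 'a'),
    ('Latin-1', '\xe7'),         # c with cedilla
    ('BMP', '\u20ac'),           # euro sign
    ('non-BMP', '\U0001F40D'),   # a character outside the BMP
]

for label, ch in samples:
    print('%-8s %d byte(s) per character' % (label, bytes_per_char(ch)))
```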




Re: formatted output

2013-05-07 Thread Peter Otten
Roy Smith wrote:

> In article ,
>  Sudheer Joseph  wrote:
> 
>> Dear members,
>> I need to print few arrays in a tabular form for example
>> below array IL has 25 elements, is there an easy way to print
>> this as 5x5 comma separated table? in python
>> 
>> IL=[]
>> for i in np.arange(1,bno+1):
>>IL.append(i)
>> print(IL)
>> %
>> in fortran I could do it as below
>> %
>> integer matrix(5,5)
>>in=0
>>   do, k=1,5
>>   do, l=1,5
>>in=in+1
>>   matrix(k,l)=in
>>   enddo
>>   enddo
>>   m=5
>>   n=5
>>   do, i=1,m
>>   write(*,"(5i5)") ( matrix(i,j), j=1,n )
>>   enddo
>>   end
>>  
> 
> Excellent.  My kind of programming language!  See
> http://www.python.org/doc/humor/#bad-habits.
> 
> Anyway, that translates, more or less, as follows.
> 
> Note that I'm modeling the Fortran 2-dimensional array as a dictionary
> keyed by (k, l) tuples.  That's easy an convenient, but conceptually a
> poor fit and not terribly efficient.  If efficiency is an issue (i.e.
> much larger values of (k, l), you probably want to be looking at numpy.
> 
> Also, "in" is a keyword in python, so I changed that to "value".
> There's probably cleaner ways to do this. I did a pretty literal
> transliteration.
> 
> 
> matrix = {}
> value = 0
> for k in range(1, 6):
>     for l in range(1, 6):
>         value += 1
>         matrix[(k, l)] = value
> 
> for i in range(1, 6):
>     print ",".join("%5d" % matrix[(i, j)] for j in range(1, 6))
> 
> This prints:
> 
>     1,    2,    3,    4,    5
>     6,    7,    8,    9,   10
>    11,   12,   13,   14,   15
>    16,   17,   18,   19,   20
>    21,   22,   23,   24,   25

Or, as the OP may be on the road to numpy anyway:

>>> import numpy
>>> a = numpy.arange(1, 26).reshape(5, 5)
>>> a
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])
>>> import sys
>>> numpy.savetxt(sys.stdout, a, delimiter=", ", fmt="%5d")
    1,     2,     3,     4,     5
    6,     7,     8,     9,    10
   11,    12,    13,    14,    15
   16,    17,    18,    19,    20
   21,    22,    23,    24,    25




Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Chris Angelico
On Tue, May 7, 2013 at 11:22 PM, jmfauth  wrote:
> There are plenty of good reasons to use Python. There are
> also plenty of good reasons to not use (or now to drop)
> Python and to realize that if you wish to process text
> seriously, you are better served by using "corporate
> products" or tools using Unicode properly.

There are plenty of good reasons to use Python. One of them is the
laughs you can get any time jmf posts here. There are also plenty of
good reasons to drop Python. One of them is because corporate products
like Microsoft Visual Studio are inherently better specifically
because they cost you money, and there's no way that something you
paid nothing for can ever be as good as that. Plus, you get to write
code that works on only one platform, and that's really good. Finally,
moving off Python would mean you don't feel obliged to respond to jmf,
which will increase your productivity measurably.

ChrisA


Re: First python program, syntax error in while loop

2013-05-07 Thread Chris Angelico
On Tue, May 7, 2013 at 10:44 PM, Ombongi Moraa Fe
 wrote:
> My first language was Pascal. It was at a time in 2005 when computers were
> finally becoming popular in Africa and our year was the first time a girls
> school from our Province did a computer coursework for National Exams. (That
> was such an achievement *sigh*)
>
> "The teacher said ... Good Programming Practice ... Use parentheses to
> format code.. or I will deduct a point from your work when I feel like it."
>
> Cant seem to let go of the behavior. I use parentheses in all languages.

Pretty much all such blanket advice is flawed. I cannot at present
think of any example of a "Good Programming Practice" suggestion that
doesn't have its exceptions and caveats. The only ones that don't are
the ones that get codified into language/syntax rules, and even there,
most of them have their detractors. Python's indentation rule is a
prime example; most people follow the advice to always indent blocks
of code, Python makes it mandatory, and some people hate Python for
it. (And yes, there have been times when I've deliberately misindented
a block of C code, because it made more logical sense that way. I can
quote examples if you like.)

The only principle that you should follow is: Think about what you're
doing. Everything else is an elaboration on that. [1]

ChrisA
[1] Matthew 22:37-40 :)


Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Fábio Santos
>
>
> -
>
>
> 1) The memory gain for many of us (usually non ascii users)
> just become irrelevant.
>
> >>> sys.getsizeof('maçã')
> 41
> >>> sys.getsizeof('abcd')
> 29
>
> 2) More critical, Py 3.3, just becomes non unicode compliant,
> (eg European languages or "ascii" typographers !)
>
> >>> import timeit
> >>> timeit.timeit("'abcd'*1000 + 'a'")
> 2.186670111428325
> >>> timeit.timeit("'abcd'*1000 + '€'")
> 2.9951699820528432
> >>> timeit.timeit("'abcd'*1000 + 'œ'")
> 3.0036780444886233
> >>> timeit.timeit("'abcd'*1000 + 'ẞ'")
> 3.004992278824048
> >>> timeit.timeit("'maçã'*1000 + 'œ'")
> 3.231025618708202
> >>> timeit.timeit("'maçã'*1000 + '€'")
> 3.215894398100758
> >>> timeit.timeit("'maçã'*1000 + 'œ'")
> 3.224407974255655
> >>> timeit.timeit("'maçã'*1000 + '’'")
> 3.2206342273566406
> >>> timeit.timeit("'abcd'*1000 + '’'")
> 2.991440344906
>
> 3) Python is "pround" to cover the whole unicode range,
> unfortunately it "breaks" the BMP range.
> Small GvR exemple (ascii) from the the bug list,
> but with non ascii characters.
>
> # Py 3.2, all chars
>
> >>> timeit.repeat("a = 'hundred'; 'x' in a")
> [0.09087790617297742, 0.07456871885972305, 0.07449940353376405]
> >>> timeit.repeat("a = 'maçãé€ẞ'; 'x' in a")
> [0.10088136800095526, 0.07488497003487282, 0.07497594640028638]
>
>
> # Py 3.3 ascii and non ascii chars
> >>> timeit.repeat("a = 'hundred'; 'x' in a")
> [0.11426985953005442, 0.10040049292649655, 0.09920834808588097]
> >>> timeit.repeat("a = 'maçãé€ẞ'; 'é' in a")
> [0.2345595188256766, 0.21637172864154763, 0.2179096624382737]
>
>
> There are plenty of good reasons to use Python. There are
> also plenty of good reasons to not use (or now to drop)
> Python and to realize that if you wish to process text
> seriously, you are better served by using "corporate
> products" or tools using Unicode properly.
>
> jmf

This is so off-topic that, after reading this, I feel I have just returned
from the Moon.

OTOH, it would seem like you know the Portuguese word for apple, so I also
feel home.

I am so confused.


Re: Red Black Tree implementation?

2013-05-07 Thread duncan smith

On 07/05/13 02:20, Dan Stromberg wrote:



[snip]


I'm starting to think Red Black Trees are pretty complex.




A while ago I looked at a few different types of self-balancing binary 
tree. Most look much easier to implement.


BTW, the licence might be MIT - I just copied it from someone else's code.

Duncan


Re: distributing a binary package

2013-05-07 Thread Eric Frederich
I see where I can specify a module that distutils will try to compile.
I already have the .so files compiled.

I'm sure it's simple, I just can't find it or don't know what to look for.

On Mon, May 6, 2013 at 9:13 PM, Miki Tebeka  wrote:
>
>> Basically, I'd like to know how to create a proper setup.py script
> http://docs.python.org/2/distutils/setupscript.html


Re: Why sfml does not play the file inside a function in this python code?

2013-05-07 Thread cheirasacan
El martes, 7 de mayo de 2013 12:53:25 UTC+2, MRAB  escribió:
> On 07/05/2013 10:27, cheirasa...@gmail.com wrote:
> 
> > from tkinter import *
> 
> > import sfml
> 
> >
> 
> >
> 
> > window = Tk()
> 
> > window.minsize( 640, 480 )
> 
> >
> 
> >
> 
> > def sonido():
> 
> >  file = sfml.Music.from_file('poco.ogg')
> 
> >  file.play()
> 
> >
> 
> >
> 
> > test = Button ( window, text = 'Sound test', command=sonido )
> 
> > test.place ( x = 10, y = 60)
> 
> >
> 
> > window.mainloop()
> 
> >
> 
> >
> 
> >
> 
> >
> 
> > Using Windows 7, Python 3.3, sfml 1.3.0 library, the file it is played if i 
> > put it out of the function. ¿ what am i doing wrong ? Thanks.
> 
> >
> 
> Perhaps what's happening is that sonido starts playing it and then
> 
> returns, meaning that there's no longer a reference to it ('file' is
> 
> local to the function), so it's collected by the garbage collector.
> 
> 
> 
> If that's the case, try keeping a reference to it, perhaps by making
> 
> 'file' global (in a simple program like this one, using global should
> 
> be OK).

Thanks. Using a global in 'sonido' fixes the problem, so the garbage collector
must be the cause. But this code is part of a larger project. What can I do to
fix it without using globals? I will use more functions like this, and I would
like to keep learning Python as well as good programming methodology.
Thanks.


Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Steve Simmons
"Fábio Santos"  wrote:

>>
>>
>> -
>>
>>
>> 1) The memory gain for many of us (usually non ascii users)
>> just become irrelevant.
>>
>> >>> sys.getsizeof('maçã')
>> 41
>> >>> sys.getsizeof('abcd')
>> 29
>>
>> 2) More critical, Py 3.3, just becomes non unicode compliant,
>> (eg European languages or "ascii" typographers !)
>>
>> >>> import timeit
>> >>> timeit.timeit("'abcd'*1000 + 'a'")
>> 2.186670111428325
>> >>> timeit.timeit("'abcd'*1000 + '€'")
>> 2.9951699820528432
>> >>> timeit.timeit("'abcd'*1000 + 'œ'")
>> 3.0036780444886233
>> >>> timeit.timeit("'abcd'*1000 + 'ẞ'")
>> 3.004992278824048
>> >>> timeit.timeit("'maçã'*1000 + 'œ'")
>> 3.231025618708202
>> >>> timeit.timeit("'maçã'*1000 + '€'")
>> 3.215894398100758
>> >>> timeit.timeit("'maçã'*1000 + 'œ'")
>> 3.224407974255655
>> >>> timeit.timeit("'maçã'*1000 + '’'")
>> 3.2206342273566406
>> >>> timeit.timeit("'abcd'*1000 + '’'")
>> 2.991440344906
>>
>> 3) Python is "pround" to cover the whole unicode range,
>> unfortunately it "breaks" the BMP range.
>> Small GvR exemple (ascii) from the the bug list,
>> but with non ascii characters.
>>
>> # Py 3.2, all chars
>>
>> >>> timeit.repeat("a = 'hundred'; 'x' in a")
>> [0.09087790617297742, 0.07456871885972305, 0.07449940353376405]
>> >>> timeit.repeat("a = 'maçãé€ẞ'; 'x' in a")
>> [0.10088136800095526, 0.07488497003487282, 0.07497594640028638]
>>
>>
>> # Py 3.3 ascii and non ascii chars
>> >>> timeit.repeat("a = 'hundred'; 'x' in a")
>> [0.11426985953005442, 0.10040049292649655, 0.09920834808588097]
>> >>> timeit.repeat("a = 'maçãé€ẞ'; 'é' in a")
>> [0.2345595188256766, 0.21637172864154763, 0.2179096624382737]
>>
>>
>> There are plenty of good reasons to use Python. There are
>> also plenty of good reasons to not use (or now to drop)
>> Python and to realize that if you wish to process text
>> seriously, you are better served by using "corporate
>> products" or tools using Unicode properly.
>>
>> jmf
>
>This is so off-topic that, after reading this, I feel I have just
>returned
>from the Moon.
>
>OTOH, it would seem like you know the Portuguese word for apple, so I
>also
>feel home.
>
>I am so confused.
>
>
>
>

Good to see jmf finally comparing apples with apples :-) 

Sent from a Galaxy far far away


Re: Why sfml does not play the file inside a function in this python code?

2013-05-07 Thread MRAB

On 07/05/2013 14:56, cheirasa...@gmail.com wrote:

El martes, 7 de mayo de 2013 12:53:25 UTC+2, MRAB  escribió:

On 07/05/2013 10:27, cheirasa...@gmail.com wrote:
> from tkinter import *
> import sfml
>
> window = Tk()
> window.minsize( 640, 480 )
>
> def sonido():
>  file = sfml.Music.from_file('poco.ogg')
>  file.play()
>
> test = Button ( window, text = 'Sound test', command=sonido )
> test.place ( x = 10, y = 60)
>
> window.mainloop()
>
> Using Windows 7, Python 3.3, sfml 1.3.0 library, the file it is played if i 
put it out of the function. ¿ what am i doing wrong ? Thanks.
>

Perhaps what's happening is that sonido starts playing it and then
returns, meaning that there's no longer a reference to it ('file' is
local to the function), so it's collected by the garbage collector.

If that's the case, try keeping a reference to it, perhaps by making
'file' global (in a simple program like this one, using global should
be OK).


Thanks. A global use of 'sonido' fix the problem. The garbage collector must be 
the point. But this code is part of a longer project. What can i do to fix it 
without the use of globals? I will use more functions like this, and i would 
like to keep learning python as well good programming methodology.
Thanks.


Presumably the details of the window are (or will be) hidden away in a
class, so you could make 'file' an attribute of an instance.
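
A minimal sketch of the instance-attribute idea, under stated assumptions: SoundPlayer and FakeMusic are hypothetical names, and FakeMusic is only a stand-in for sfml.Music so that the example runs without the sfml library. The weak reference is there purely to show whether the object is still alive.

```python
import weakref


class FakeMusic:
    """Hypothetical stand-in for sfml.Music, so the sketch runs without sfml."""

    def __init__(self, filename):
        self.filename = filename
        self.playing = False

    def play(self):
        self.playing = True


class SoundPlayer:
    """Owns the music object so that it outlives the callback that starts it."""

    def __init__(self, loader=FakeMusic):
        self._loader = loader   # in the real program: sfml.Music.from_file
        self.music = None       # the reference is kept on the instance

    def sonido(self):
        # Storing the object on self keeps it alive after this method returns.
        self.music = self._loader('poco.ogg')
        self.music.play()


def play_with_local_only():
    """Mimics the original callback: the only strong reference is a local."""
    music = FakeMusic('poco.ogg')
    music.play()
    return weakref.ref(music)   # a weak reference does not keep it alive


player = SoundPlayer()
player.sonido()
print(player.music.playing)     # the object is still reachable via the instance
```

With tkinter, the bound method would then be passed as the callback, e.g. command=player.sonido, instead of a module-level function.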

Also, please read this:

http://wiki.python.org/moin/GoogleGroupsPython

because gmail insists on adding extra linebreaks, which can be somewhat
annoying.


Re: Why sfml does not play the file inside a function in this python code?

2013-05-07 Thread Chris Angelico
On Wed, May 8, 2013 at 12:57 AM, MRAB  wrote:
> Also, please read this:
>
> http://wiki.python.org/moin/GoogleGroupsPython
>
> because gmail insists on adding extra linebreaks, which can be somewhat
> annoying.

Accuracy correction: It's nothing to do with gmail, which is what I
use (via python-list subscription). It's just Google Groups.

ChrisA


dist-packages or site-packages in Python 3.2 ?

2013-05-07 Thread Vincent Vande Vyvre

Hi,

I have one machine with python3.2 (on Ubuntu 12.04). In the folder
/usr/lib/python3.2 there is a subfolder dist-packages, maybe created at
install time or when I installed the pyexiv2 binding; I don't know.


Now I've set up a new machine, again with Ubuntu 12.04 and therefore
python3.2.

This new install doesn't have any *-packages subfolder in /usr/lib/python3.2.

Today I installed PyQt5. The make and make install ran without
error, but when I try:


>>> from PyQt5 import QtGui
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named PyQt5

I checked /usr/lib/python3.2 and saw that the PyQt5 install has
created a subfolder site-packages.
Is this dist-packages/site-packages naming critical for Python? (My
intuition says yes!)


So, I've tried with:
>>> sys.path.append('/usr/lib/python3.2/site_packages')
and I also created a file __init__.py in /site-packages, but that has not
solved the problem.
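
One detail worth checking in the attempt above: the path passed to sys.path.append is spelled site_packages (underscore), while the directory the install created is reported as site-packages (hyphen). Appending a non-existent directory to sys.path fails silently. The helper below is a hypothetical sketch (add_to_path_if_present is not a standard function) that only appends a directory that really exists. Also note that site-packages is a plain directory on the search path, not a package, so an __init__.py inside it should not be needed.

```python
import os
import sys


def add_to_path_if_present(directory, search_path=None):
    """Append directory to the module search path only if it exists.

    Returns True when the directory exists (and is now on the path),
    False when it does not, which often points to a typo in the name.
    """
    if search_path is None:
        search_path = sys.path
    if not os.path.isdir(directory):
        return False
    if directory not in search_path:
        search_path.append(directory)
    return True


for candidate in ("/usr/lib/python3.2/site_packages",    # underscore, as typed above
                  "/usr/lib/python3.2/site-packages"):   # hyphen, as created by the install
    print(candidate, add_to_path_if_present(candidate))
```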


Thanks for your advice.
--
Vincent V.V.
Oqapy  . Qarte 
 . PaQager 



Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Terry Jan Reedy
On 5/7/2013 9:22 AM, jmfauth rode forth on his dead hobbyhorse to hijack 
yet another thread:



> # Py 3.3 ascii and non ascii chars
> >>> timeit.repeat("a = 'hundred'; 'x' in a")
> [0.11426985953005442, 0.10040049292649655, 0.09920834808588097]
> >>> timeit.repeat("a = 'maçãé€ẞ'; 'é' in a")
> [0.2345595188256766, 0.21637172864154763, 0.2179096624382737]


Python 3.3 is a language. Languages do not have timings.
CPython 3.3.0 is an implementation compiled and run under a particular 
OS and hardware. With respect to Unicode timings, especially for 
find/replace, it is obsolete. On my Win7 machine with fresh debug builds 
from the current repository, I see these times.


Python 3.3.1+ (default, May  7 2013, 14:03:12) [MSC v.1600 32 bit (Int
>>> from timeit import repeat
>>> repeat("a = 'hundred'; 'x' in a")
[0.19007337649622968, 0.190116721780754, 0.1900149679567562]
>>> repeat("a = 'maçaé??'; 'é' in a")
[0.20568874581187716, 0.20568782357178053, 0.20577051776710914]

Python 3.4.0a0 (default:32067784f198, May  7 2013, 13:59:10) [MSC v.1600
>>> from timeit import repeat
>>> repeat("a = 'hundred'; 'x' in a")
[0.1708080882915779, 0.17062978853956826, 0.1706740560642051]
>>> repeat("a = 'maçaé??'; 'é' in a")
[0.17612111348809734, 0.17562925210324565, 0.17549245315558437]

Note 1: debug builds are slower than install builds, especially for 
microbenchmarks with trivial statements. My installed 3.3.1 on a 
different machine has timings of about .1 for the ascii test. It is 
slower for the non-ascii test because the latest improvements were made 
after 3.3.1 was released.


Note 2: 3.4 has additional improvements that speed up everything, so 
that the 3.4 non-ascii time is faster than even the 3.3 ascii time.


Terry




Re: python backup script

2013-05-07 Thread Enrico 'Henryx' Bianchi
John Gordon wrote:


> Looks like you need a comma after 'stdout=filename'.

Sigh, yesterday was a terrible day (yes, it lacks a comma)...
Anyway, when possible, it is recommended to use a database driver to
communicate with databases, because subprocess (or os.*open*) is more
expensive by comparison (Python needs to spawn an external process to execute
the command)
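
A small sketch of the driver approach, under an assumption: the database used by the backup script is not shown in this message, so the standard-library sqlite3 module stands in for it here. The table name items and the helper dump_database are hypothetical. The dump is produced in-process, without spawning an external command.

```python
import sqlite3


def dump_database(connection):
    """Return SQL text that recreates the database, produced in-process."""
    return "\n".join(connection.iterdump())


# Build a tiny in-memory database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.execute("INSERT INTO items VALUES (?)", ("example",))
conn.commit()

backup_sql = dump_database(conn)
print(backup_sql)
conn.close()
```

Other databases have their own drivers and their own backup facilities, so the equivalent call will differ.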

Enrico


Making safe file names

2013-05-07 Thread Andrew Berg
Currently, I keep Last.fm artist data caches to avoid unnecessary API calls
and have been naming the files using the artist name. However, artist names
can have characters that are not allowed in file names for most file systems
(e.g., C/A/T has forward slashes). Are there any recommended strategies for
naming such files while avoiding conflicts (I wouldn't want to run into
problems for an artist named C-A-T or CAT, for example)? I'd like to make
the files easily identifiable, and there really are no limits on what
characters can be in an artist name.
-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


Re: Why sfml does not play the file inside a function in this python code?

2013-05-07 Thread cheirasacan
On Tuesday, 7 May 2013 16:57:59 UTC+2, MRAB wrote:
> On 07/05/2013 14:56, cheirasa...@gmail.com wrote:
> 
> > On Tuesday, 7 May 2013 12:53:25 UTC+2, MRAB wrote:
> 
> >> On 07/05/2013 10:27, cheirasa...@gmail.com wrote:
> 
> >> > from tkinter import *
> 
> >> > import sfml
> 
> >> >
> 
> >> > window = Tk()
> 
> >> > window.minsize( 640, 480 )
> 
> >> >
> 
> >> > def sonido():
> 
> >> >  file = sfml.Music.from_file('poco.ogg')
> 
> >> >  file.play()
> 
> >> >
> 
> >> > test = Button ( window, text = 'Sound test', command=sonido )
> 
> >> > test.place ( x = 10, y = 60)
> 
> >> >
> 
> >> > window.mainloop()
> 
> >> >
> 
> >> > Using Windows 7, Python 3.3, sfml 1.3.0 library, the file it is played 
> >> > if i put it out of the function. what am i doing wrong ? Thanks.
> 
> >> >
> 
> >>
> 
> >> Perhaps what's happening is that sonido starts playing it and then
> 
> >> returns, meaning that there's no longer a reference to it ('file' is
> 
> >> local to the function), so it's collected by the garbage collector.
> 
> >>
> 
> >> If that's the case, try keeping a reference to it, perhaps by making
> 
> >> 'file' global (in a simple program like this one, using global should
> 
> >> be OK).
> 
> >
> 
> > Thanks. A global use of 'sonido' fix the problem. The garbage collector 
> > must be the point. But this code is part of a longer project. What can i do 
> > to fix it without the use of globals? I will use more functions like this, 
> > and i would like to keep learning python as well good programming 
> > methodology.
> 
> > Thanks.
> 
> >
> 
> Presumably the details of the window are (or will be) hidden away in a
> 
> class, so you could make 'file' an attribute of an instance.
> 
> 
> 
> Also, please read this:
> 
> 
> 
> http://wiki.python.org/moin/GoogleGroupsPython
> 
> 
> 
> because gmail insists on adding extra linebreaks, which can be somewhat
> 
> annoying.

 The reply is very useful. I will keep learning.
 Thanks for all.


Get filename using filefialog.askfilename

2013-05-07 Thread cheirasacan
Well. It's driving me crazy. So simple

I use:

file = filedialog.askopenfile ( mode... )

to open a file with an open dialog box, OK. Made it.

How do I get the name of the opened file?

I do:

print(file)

the output is: <..name="file.doc"...mode=..encoding..  >

How can I get the second member of 'file'?

I have tried print(file[1]) and print(file(1)), but neither works.

And I am unable to find a detailed reference to this object on the Internet.

http://fossies.org/dox/Python-3.3.1/filedialog_8py_source.html#l00393 is
as far as I could get!



Thanks



Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Martijn Lievaart
On Sun, 05 May 2013 17:07:41 -0400, Roy Smith wrote:

> There *are* programming languages worse than PHP.  Have you ever tried
> britescript?

Have you tried MUMPS? :-)

M4




Re: Get filename using filefialog.askfilename

2013-05-07 Thread John Gordon
In  
cheirasa...@gmail.com writes:

> print(file)

> the output is: <..name="file.doc"...mode=..encoding..  >

> How can i get the second member of 'file'?

If you're using the interpreter, you can type this command:

>>> help(file)

And it will display documentation for using objects of that type.
You can also use this command:

>>> dir(file)

And it will display all the members and methods that the object provides.

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"



Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread 88888 Dihedral
On Tuesday, 7 May 2013 at 9:32:55 PM UTC+8, Chris Angelico wrote:
> On Tue, May 7, 2013 at 11:22 PM, jmfauth  wrote:
> 
> > There are plenty of good reasons to use Python. There are
> 
> > also plenty of good reasons to not use (or now to drop)
> 
> > Python and to realize that if you wish to process text
> 
> > seriously, you are better served by using "corporate
> 
> > products" or tools using Unicode properly.
> 
> 
> 
> There are plenty of good reasons to use Python. One of them is the
> 
> laughs you can get any time jmf posts here. There are also plenty of
> 
> good reasons to drop Python. One of them is because corporate products
> 
> like Microsoft Visual Studio are inherently better specifically
> 
> because they cost you money, and there's no way that something you
> 
> paid nothing for can ever be as good as that. Plus, you get to write
> 
People used MS products because most bosses did not want to pay
the prices of workstations, minis, or mainframes and
the salaries of system administrators in the 1990s.


> code that works on only one platform, and that's really good. Finally,
> 
> moving off Python would mean you don't feel obliged to respond to jmf,
> 
> which will increase your productivity measurably.
> 
> 
> 
> ChrisA

The price of a software package or a platform is not
the only way to judge a programming language.


Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Walter Hurry
On Tue, 07 May 2013 23:32:55 +1000, Chris Angelico wrote:

> On Tue, May 7, 2013 at 11:22 PM, jmfauth  wrote:
>> There are plenty of good reasons to use Python. There are also plenty
>> of good reasons to not use (or now to drop) Python and to realize that
>> if you wish to process text seriously, you are better served by using
>> "corporate products" or tools using Unicode properly.
> 
> There are plenty of good reasons to use Python. One of them is the
> laughs you can get any time jmf posts here. There are also plenty of
> good reasons to drop Python. One of them is because corporate products
> like Microsoft Visual Studio are inherently better specifically because
> they cost you money, and there's no way that something you paid nothing
> for can ever be as good as that. Plus, you get to write code that works
> on only one platform, and that's really good. Finally,
> moving off Python would mean you don't feel obliged to respond to jmf,
> which will increase your productivity measurably.

TMML. Thanks!



Re: Get filename using filefialog.askfilename

2013-05-07 Thread Terry Jan Reedy

On 5/7/2013 4:27 PM, cheirasa...@gmail.com wrote:


> file = filedialog.askopenfile ( mode... )


askopenfile is a convenience function that creates an Open dialog 
object, shows it, gets the name returned by the dialog, opens the file 
with that name, and returns an appropriate normal file object



> to open a file with an open dialog box, OK. Made it.
>
> How do I get the name of the opened file?


file.name, (at least in 3.3), which in your example below is "file.doc"


> print(file)
>
> the output is: <..name="file.doc"...mode=..encoding..  >


This is the standard string representation of a file object. It is 
created from the various attributes of the file instance, including 
file.name.
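
A minimal sketch of reading the name attribute. It uses the builtin open() on a temporary file rather than the dialog, so that it runs without a GUI; the variable names opened_name and short_name are illustrative only.

```python
import os
import tempfile

# Create a small file so the example has something to open.
fd, path = tempfile.mkstemp(suffix=".doc")
os.close(fd)

with open(path, "r") as file:
    print(repr(file))                         # the <... name=... mode=... encoding=...> form
    opened_name = file.name                   # the name the file was opened with
    short_name = os.path.basename(file.name)  # just the last path component

print(opened_name)
print(short_name)
os.remove(path)
```

The file object returned by askopenfile should expose the same attribute, since it is an ordinary file object.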



> How can I get the second member of 'file'?


Strings do not have fields. The second 'member', would be the second 
character, file[1], which is not what you want.



> And I am unable to find a detailed reference to this object on the Internet.


Use the Fine Manual. The entry for builtin open() function, which you 
should read to understand the 'open' part of askopenfile, directs you to 
the Glossary entry 'file object' which says "There are actually three 
categories of file objects: raw binary files, buffered binary files and 
text files. Their interfaces are defined in the io module. The canonical 
way to create a file object is by using the open() function." The kind 
of file object you get is determined by the mode ('b' present or not), 
buffer arg, and maybe something else. You can look in the io chapter or 
use dir() and help() as John G. suggested.


Python programmers should really learn to use dir(), help(), and the 
manuals, including the index and module index.


--
Terry Jan Reedy




Re: Making safe file names

2013-05-07 Thread Terry Jan Reedy

On 5/7/2013 3:58 PM, Andrew Berg wrote:

> Currently, I keep Last.fm artist data caches to avoid unnecessary API calls
> and have been naming the files using the artist name. However, artist names
> can have characters that are not allowed in file names for most file systems
> (e.g., C/A/T has forward slashes). Are there any recommended strategies for
> naming such files while avoiding conflicts (I wouldn't want to run into
> problems for an artist named C-A-T or CAT, for example)? I'd like to make
> the files easily identifiable, and there really are no limits on what
> characters can be in an artist name.


Sounds like you want something like the html escape or urlencode 
functions, which serve the same purpose of encoding special chars. 
Rather than invent a new transformation, you could use the same scheme 
used for html entities. (Sorry, I forget the details.) It is possible 
that one of the functions would work for you as is, or with little 
modification.
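
For illustration, here is what the URL-style escaping looks like with urllib.parse.quote from the Python 3 standard library. The wrapper names url_escape and url_unescape are hypothetical. Note that safe='' is passed because quote() leaves '/' unescaped by default, and that non-ASCII characters are replaced by their percent-encoded UTF-8 bytes, which may or may not be acceptable for readable file names.

```python
from urllib.parse import quote, unquote


def url_escape(name):
    """Percent-encode a name, including '/' (not escaped by default)."""
    return quote(name, safe='')


def url_unescape(stored):
    """Reverse url_escape."""
    return unquote(stored)


for artist in ("C/A/T", "C-A-T", "CAT", "é"):
    escaped = url_escape(artist)
    print(artist, "->", escaped, "->", url_unescape(escaped))
```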


Terry





Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread William Ray Wing
On May 7, 2013, at 4:31 PM, Martijn Lievaart  wrote:

> On Sun, 05 May 2013 17:07:41 -0400, Roy Smith wrote:
> 
>> There *are* programming languages worse than PHP.  Have you ever tried
>> britescript?
> 
> Have you tried MUMPS? :-)
> 
> M4
> 

Which one?  The original MUMPS (Massachusetts General Hospital Utility 
Multi-Programming System), or the one that came out of Europe in the '90s and 
stole the MUMPS name (MUltifrontal Massively Parallel sparse direct Solver)?

I used both the original and its baby brother FOCAL which, considering that 
they ran on DEC PDP-7 (MUMPS) and DEC PDP-8 (FOCAL) systems with as little as 
4k words of 18-bit memory on the PDP-7 and 12-bit memory on the PDP-8, is pretty 
spectacular.  They should be forgiven for a somewhat constrained list of key 
words and operations. 

-Bill


Re: Making safe file names

2013-05-07 Thread Fábio Santos
I suggest Base64. b64encode
(http://docs.python.org/2/library/base64.html#base64.b64encode) and
b64decode take an argument which allows you to eliminate the pesky "/"
character. It's reversible and simple.
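
A short sketch of that suggestion, using the altchars argument of base64.b64encode and base64.b64decode to swap '+' and '/' for '-' and '_'. The helper names b64_name and b64_name_decode are hypothetical, and encoding the artist name as UTF-8 first is an assumption. One caveat to keep in mind: Base64 is case-sensitive, so names that differ only in letter case could clash on a case-insensitive file system.

```python
import base64

ALTCHARS = b"-_"   # replaces '+' and '/' in the Base64 alphabet


def b64_name(artist):
    """Encode an artist name as Base64 text that contains no '/'."""
    raw = artist.encode("utf-8")
    return base64.b64encode(raw, altchars=ALTCHARS).decode("ascii")


def b64_name_decode(filename):
    """Recover the artist name from a name produced by b64_name."""
    return base64.b64decode(filename, altchars=ALTCHARS).decode("utf-8")


for artist in ("C/A/T", "C-A-T", "CAT", "???"):
    stored = b64_name(artist)
    print(artist, "->", stored, "->", b64_name_decode(stored))
```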

More suggestions: how about a hash? Or just use IDs from the database?

On Tue, May 7, 2013 at 8:58 PM, Andrew Berg  wrote:
> Currently, I keep Last.fm artist data caches to avoid unnecessary API calls 
> and have been naming the files using the artist name. However,
> artist names can have characters that are not allowed in file names for most 
> file systems (e.g., C/A/T has forward slashes). Are there any
> recommended strategies for naming such files while avoiding conflicts (I 
> wouldn't want to run into problems for an artist named C-A-T or
> CAT, for example)? I'd like to make the files easily identifiable, and there 
> really are no limits on what characters can be in an artist name.
> --
> CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1



--
Fábio Santos


Re: Making safe file names

2013-05-07 Thread MRAB

On 07/05/2013 20:58, Andrew Berg wrote:

> Currently, I keep Last.fm artist data caches to avoid unnecessary API calls
> and have been naming the files using the artist name. However, artist names
> can have characters that are not allowed in file names for most file systems
> (e.g., C/A/T has forward slashes). Are there any recommended strategies for
> naming such files while avoiding conflicts (I wouldn't want to run into
> problems for an artist named C-A-T or CAT, for example)? I'd like to make
> the files easily identifiable, and there really are no limits on what
> characters can be in an artist name.


Conflicts won't occur if:

1. All of the characters of the artist's name are mapped to an encoding.

2. Different characters map to different encodings.

3. No encoding is a prefix of another encoding.

In practice, you'll be mapping most characters to themselves.
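
The second and third conditions above can be checked mechanically for a candidate table. The sketch below is only an illustration: is_unambiguous is a hypothetical helper, and the sample tables are made up. It assumes the table covers every character that will be encoded (the first condition).

```python
def is_unambiguous(mapping):
    """Check that distinct characters get distinct codes and that no code
    is a prefix of another code (conditions 2 and 3)."""
    codes = list(mapping.values())
    if any(code == "" for code in codes):
        return False                      # an empty code is a prefix of everything
    if len(set(codes)) != len(codes):
        return False                      # two characters share one code
    for a in codes:
        for b in codes:
            if a != b and b.startswith(a):
                return False              # a is a proper prefix of b
    return True


good = {"/": "%2F", "%": "%25", "C": "C", "A": "A", "T": "T"}
bad_prefix = {"/": "%2F", "%": "%"}       # '%' is a prefix of '%2F'
bad_shared = {"/": "-", "-": "-"}         # '/' and '-' collide

for table in (good, bad_prefix, bad_shared):
    print(is_unambiguous(table))
```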



Re: Making safe file names

2013-05-07 Thread Dan Stromberg
On 5/7/13, Andrew Berg  wrote:
> Currently, I keep Last.fm artist data caches to avoid unnecessary API calls
> and have been naming the files using the artist name. However,
> artist names can have characters that are not allowed in file names for most
> file systems (e.g., C/A/T has forward slashes). Are there any
> recommended strategies for naming such files while avoiding conflicts (I
> wouldn't want to run into problems for an artist named C-A-T or
> CAT, for example)? I'd like to make the files easily identifiable, and there
> really are no limits on what characters can be in an artist name.

You might consider:
http://stromberg.dnsalias.org/svn/backshift/trunk/escape_mod.py
http://stromberg.dnsalias.org/svn/backshift/trunk/test-escape_mod

It doubles the length of the string, but it produces safe, easily
readable escaped strings - which tends to make debugging easier.

It requires a couple of other modules (easily obtained from the same
SVN repo) though.


multiple versions of python

2013-05-07 Thread sokovic . anamarija
Hi,

what is the generally recommended structure when this type of problem comes
into play:
multiple versions of Python (both main versions and sub-versions), e.g.,
2.7:
   2.7.1
   2.7.3
3:
   3.3
   3.3.1
different versions of gcc
different compilation strategies (vanilla and non-vanilla)
different modules (numpy, scipy), together with different versions of all the
rest.

any help is appreciated

Ana


Re: Making safe file names

2013-05-07 Thread Jens Thoms Toerring
Andrew Berg  wrote:
> Currently, I keep Last.fm artist data caches to avoid unnecessary API calls
> and have been naming the files using the artist name. However, artist names
> can have characters that are not allowed in file names for most file systems
> (e.g., C/A/T has forward slashes). Are there any recommended strategies for
> naming such files while avoiding conflicts (I wouldn't want to run into
> problems for an artist named C-A-T or CAT, for example)? I'd like to make
> the files easily identifiable, and there really are no limits on what
> characters can be in an artist name. --

It's not clear what context you need this for. You
could e.g. replace all characters not allowed by the file
system by their hexadecimal (ASCII) values, preceded by a
'%' (so '/' would be changed to '%2F', and also encode a '%'
itself in a name by '%25'). Then you have a well-defined
two-way mapping ("isomorphic" if I remember my math-learning
days correctly) between the original name and the
way you store it. E.g.

  "C/A/T"  would become  "C%2FA%2FT"

and

  "C%2FA/T"  would become  "C%252FA%2FT"

You can translate back and forth between them with not too
much effort.
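
A minimal sketch of that scheme, with some assumptions made explicit: the function names escape and unescape are hypothetical, the default set of forbidden characters contains only '/', and two hex digits are used per escaped character, which is enough as long as every forbidden character is ASCII. All other characters, including non-ASCII ones, are left untouched.

```python
def escape(name, forbidden="/", esc="%"):
    """Replace forbidden characters (and the escape character itself)
    by esc followed by two hex digits; keep everything else as is."""
    out = []
    for ch in name:
        if ch == esc or ch in forbidden:
            out.append("%s%02X" % (esc, ord(ch)))   # assumes ord(ch) < 256
        else:
            out.append(ch)
    return "".join(out)


def unescape(stored, esc="%"):
    """Reverse escape(): turn esc + two hex digits back into one character."""
    out = []
    i = 0
    while i < len(stored):
        if stored[i] == esc:
            out.append(chr(int(stored[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(stored[i])
            i += 1
    return "".join(out)


for artist in ("C/A/T", "C%2FA/T", "C-A-T", "CAT"):
    stored = escape(artist)
    print(artist, "->", stored, "->", unescape(stored))
```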

Of course, that assumes that '%' is a character allowed by
your file system - otherwise pick some other one, any one
will do in principle. It's a bit harder for a human to
interpret but rather likely not that much of a problem. You
probably will have seen that kind of scheme used in URLs.
The concept is rather old and called 'escape character',
i.e. have one character that assumes some special meaning
and is itself also escaped.

If, on the other hand, those names are never to be translated back
to the original name another strategy would be to use the SHA1
hash value of the artist's name. Since clashes between SHA1 hash
values are rather hard to produce it's a rather safe method of
converting something (i.e. the artist's name) to a number. The
drawback, of course, is that you can't translate back from the
hash value to the original name (if that would be simple the
whole thing wouldn't work;-)
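
The hash variant is a one-liner with hashlib from the standard library. In this sketch the helper name hashed_name is hypothetical, and encoding the name as UTF-8 before hashing is an assumption (hashlib works on bytes, so some encoding has to be chosen and then kept fixed).

```python
import hashlib


def hashed_name(artist):
    """Return the SHA1 hex digest of the artist name (40 hex characters)."""
    return hashlib.sha1(artist.encode("utf-8")).hexdigest()


for artist in ("C/A/T", "C-A-T", "CAT"):
    print(artist, "->", hashed_name(artist))
```

The result contains only the characters 0-9 and a-f, so it should be acceptable as a file name on common file systems, at the cost of not being readable.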

   Regards, Jens
-- 
  \   Jens Thoms Toerring  ___  j...@toerring.de
   \__  http://toerring.de


Re: Making safe file names

2013-05-07 Thread Chris Angelico
On Wed, May 8, 2013 at 8:18 AM, Fábio Santos  wrote:
> I suggest Base64. b64encode
> (http://docs.python.org/2/library/base64.html#base64.b64encode) and
> b64decode take an argument which allows you to eliminate the pesky "/"
> character. It's reversible and simple.

But it doesn't look anything like the original.

I'd be inclined to go for something like quoted-printable or
URL-encoding; special characters become much longer, but ordinary
characters (mostly) stay as themselves.

ChrisA


Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 17:18, Fábio Santos wrote:
> I suggest Base64. b64encode
> (http://docs.python.org/2/library/base64.html#base64.b64encode) and
> b64decode take an argument which allows you to eliminate the pesky "/"
> character. It's reversible and simple.
> 
> More suggestions: how about a hash? Or just use IDs from the database?
None of these would work because I would have no idea which file stores data 
for which artist without writing code to figure it out. If I
were to end up writing a bug that messed up a few of my cache files and noticed 
it with a specific artist (e.g., doing a "now playing" and
seeing the wrong tags), I would either have to manually match up the hash or 
base64 encoding in order to delete just that file so that it
gets regenerated or nuke and regenerate my entire cache.

-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 17:01, Terry Jan Reedy wrote:
> Sounds like you want something like the html escape or urlencode 
> functions, which serve the same purpose of encoding special chars. 
> Rather than invent a new transformation, you could use the same scheme 
> used for html entities. (Sorry, I forget the details.) It is possible 
> that one of the functions would work for you as is, or with little 
> modification.
This has the problem of mangling non-ASCII characters (and artist names with 
non-ASCII characters are not rare). I most definitely want to
keep as many characters untouched as possible so that the files are easy to 
identify by looking at the file name. Ideally, only characters
that file systems don't like would be transformed.

-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


Re: multiple versions of python

2013-05-07 Thread Roy Smith
In article <72f93710-9812-441e-8d3d-f221d5698...@googlegroups.com>,
 sokovic.anamar...@gmail.com wrote:

> Hi,
> 
> what is the generally recommended structure when we have into play this type 
> of problem:
> multiple versions of python (both in the sense of main versions and sub 
> versions, e.g., 
> 2.7 :
>2.7.1
>2.7.3
> 3:
>  3.3
>3.3.1
> Different versions of gcc
> different compilation strategies (-vanilla and non-vanilla)
> different modules (numpy,scipy) together with the different versions of all 
> the rest.
> 
> any help is appreciated
> 
> Ana

Virtualenv is your friend.


Re: Making safe file names

2013-05-07 Thread Dave Angel

On 05/07/2013 03:58 PM, Andrew Berg wrote:

> Currently, I keep Last.fm artist data caches to avoid unnecessary API calls
> and have been naming the files using the artist name. However, artist names
> can have characters that are not allowed in file names for most file systems
> (e.g., C/A/T has forward slashes). Are there any recommended strategies for
> naming such files while avoiding conflicts (I wouldn't want to run into
> problems for an artist named C-A-T or CAT, for example)? I'd like to make
> the files easily identifiable, and there really are no limits on what
> characters can be in an artist name.



So what you need first is a list of allowable characters for all your 
target OS versions.  And don't forget that the allowable characters may 
vary depending on the particular file system(s) mounted on a given OS.


You also need to decide how to handle Unicode characters, since they're 
different for different OS.  In Windows on NTFS, filenames are in 
Unicode, while on Unix, filenames are bytes.  So on one of those, you 
will be encoding/decoding if your code is to be mostly portable.


Don't forget that ls and rm may not use the same encoding you're using.
So you may not consider it adequate just to make the names legal; you
may also want them to be easily typeable in the shell.


--
DaveA


Re: Making safe file names

2013-05-07 Thread Roy Smith
In article ,
 Dave Angel  wrote:

> On 05/07/2013 03:58 PM, Andrew Berg wrote:
> > Currently, I keep Last.fm artist data caches to avoid unnecessary API calls 
> > and have been naming the files using the artist name. However,
> > artist names can have characters that are not allowed in file names for 
> > most file systems (e.g., C/A/T has forward slashes). Are there any
> > recommended strategies for naming such files while avoiding conflicts (I 
> > wouldn't want to run into problems for an artist named C-A-T or
> > CAT, for example)? I'd like to make the files easily identifiable, and 
> > there really are no limits on what characters can be in an artist name.
> >
> 
> So what you need first is a list of allowable characters for all your 
> target OS versions.  And don't forget that the allowable characters may 
> vary depending on the particular file system(s) mounted on a given OS.
> 
> You also need to decide how to handle Unicode characters, since they're 
> different for different OS.  In Windows on NTFS, filenames are in 
> Unicode, while on Unix, filenames are bytes.  So on one of those, you 
> will be encoding/decoding if your code is to be mostly portable.
> 
> Don't forget that ls and rm may not use the same encoding you're using. 
>   So you may not consider it adequate to make the names legal, but you 
> may also want they easily typeable in the shell.

One possible tool that may help you here is unidecode 
(https://pypi.python.org/pypi/Unidecode).  It doesn't solve your whole 
problem, but it does help get unicode text into a form which is both 
7-bit clean and human readable.


Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 17:37, Jens Thoms Toerring wrote:
> You
> could e.g. replace all characters not allowed by the file
> system by their hexadecimal (ASCII) values, preceded by a
> '%' (so '/' would be changed to '%2F', and also encode a '%'
> itself in a name by '%25'). Then you have a well-defined
> two-way mapping ("isomorphic" if I remember my math-learning
> days correctly) between the original name and the
> way you store it. E.g.
> 
>   "C/A/T"  would become  "C%2FA%2FT"
> 
> and
> 
>   "C%2FA/T"  would become  "C%252FA%2FT"
> 
> You can translate back and forth between them with not too
> much effort.
> 
> Of course, that assumes that '%' is a character allowed by
> your file system - otherwise pick some other one, any one
> will do in principle. It's a bit harder for a human to
> interpret but rather likely not that much of a problem.
Yes, something like this is what I am trying to achieve. Judging by the 
responses I've gotten so far, I think I'll have to roll my own
transformation scheme since URL encoding and the like transform Unicode 
characters. I can memorize that 植松伸夫 is a Japanese composer who
is well-known for his works in the Final Fantasy series of video games. Trying 
to match up the URL-encoded version to an artist would be
almost impossible when I have several other artist names that have no ASCII 
characters.

-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Neil Hodgson

jmfauth:


> 2) More critical, Py 3.3, just becomes non unicode compliant,
> (eg European languages or "ascii" typographers !)
> ...


   This is not demonstrating non-compliance. It is comparing 
performance, not compliance.


   Please show an example where Python 3.3 is not compliant with Unicode.

   Neil


Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 19:14, Dave Angel wrote:
> You also need to decide how to handle Unicode characters, since they're 
> different for different OS.  In Windows on NTFS, filenames are in 
> Unicode, while on Unix, filenames are bytes.  So on one of those, you 
> will be encoding/decoding if your code is to be mostly portable.
Characters outside whatever sys.getfilesystemencoding() returns won't be 
allowed. If the user's locale settings don't support Unicode, my
program will be far from the only one to have issues with it. Any problem 
reports that arise from a user moving between legacy encodings
will generally be ignored. I haven't yet decided how I will handle artist names 
with characters outside UTF-8, but inside UTF-16/32 (UTF-16
is just fine on Windows/NTFS, but on Unix(-ish) systems, many use UTF-8 in 
their locale settings).
> Don't forget that ls and rm may not use the same encoding you're using. 
> So you may not consider it adequate to make the names legal, but you 
> may also want they easily typeable in the shell.
I don't understand. I have no intention of changing Unicode characters.


This is not a Unicode issue since (modern) file systems will happily accept it. 
The issue is that certain characters (which are ASCII) are
not allowed on some file systems:
 \ / : * ? " < > | @ and the NUL character
The first 9 are not allowed on NTFS, the @ is not allowed on ext3cow, and NUL 
and / are not allowed on pretty much any file system. Locale
settings and encodings aside, these 11 characters will need to be escaped.
-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


Re: distributing a binary package

2013-05-07 Thread Miki Tebeka
> I already have the .so files compiled.
http://docs.python.org/2/distutils/setupscript.html#installing-package-data ?
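
A minimal setup script along the lines of that section, assuming a hypothetical package `mypkg` whose prebuilt `.so` files sit inside the package directory. setuptools is used here as an assumption; `distutils.core.setup` on older Pythons takes the same `package_data` keyword.

```python
# Hypothetical layout:
#   setup.py
#   mypkg/__init__.py
#   mypkg/_speedups.so      (already compiled)
SETUP_KWARGS = dict(
    name='mypkg',
    version='0.1',
    packages=['mypkg'],
    # Ship every .so found inside the mypkg directory as package data.
    package_data={'mypkg': ['*.so']},
)

def run_setup():
    """Call this at module level in a real setup.py."""
    from setuptools import setup
    setup(**SETUP_KWARGS)
```

This only copies the binaries alongside the Python modules; it does nothing to check that they match the target platform or Python version.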


Re: Making safe file names

2013-05-07 Thread Dave Angel

On 05/07/2013 08:51 PM, Andrew Berg wrote:

On 2013.05.07 19:14, Dave Angel wrote:

You also need to decide how to handle Unicode characters, since they're
different for different OS.  In Windows on NTFS, filenames are in
Unicode, while on Unix, filenames are bytes.  So on one of those, you
will be encoding/decoding if your code is to be mostly portable.

Characters outside whatever sys.getfilesystemencoding() returns won't be 
allowed. If the user's locale settings don't support Unicode, my
program will be far from the only one to have issues with it. Any problem 
reports that arise from a user moving between legacy encodings
will generally be ignored. I haven't yet decided how I will handle artist names 
with characters outside UTF-8,


There aren't any characters "outside UTF-8".  And a character is not "in 
UTF-8" in the first place; it can be encoded by UTF-8.


 but inside UTF-16/32 (UTF-16

Nor outside UTF-16 or 32.


is just fine on Windows/NTFS, but on Unix(-ish) systems, many use UTF-8 in 
their locale settings).

Don't forget that ls and rm may not use the same encoding you're using.
So you may not consider it adequate to make the names legal, but you
may also want them easily typeable in the shell.

I don't understand. I have no intention of changing Unicode characters.


So you're comfortable typing arbitrary characters?  What about all the 
characters that look identical in your font? What about viewing 
0x07 in the terminal window?  Or 0x04?





This is not a Unicode issue since (modern) file systems will happily accept it. 
The issue is that certain characters (which are ASCII) are
not allowed on some file systems:
  \ / : * ? " < > | @ and the NUL character
The first 9 are not allowed on NTFS, the @ is not allowed on ext3cow, and NUL 
and / are not allowed on pretty much any file system. Locale
settings and encodings aside, these 11 characters will need to be escaped.



As soon as you have a small, finite list of invalid characters, writing 
an escape system is pretty easy.
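
A minimal sketch of such an escape system, assuming a URL-style scheme ('%' followed by two hex digits). The names `UNSAFE`, `escape_name` and `unescape_name` are made up for illustration; the character list is the one quoted above.

```python
# The 11 characters from the list above: \ / : * ? " < > | @ and NUL.
UNSAFE = set('\\/:*?"<>|@\0')
ESCAPE = '%'

def escape_name(name):
    """Replace unsafe characters (and the escape character) with %XX."""
    out = []
    for ch in name:
        if ch in UNSAFE or ch == ESCAPE:
            out.append('%{:02X}'.format(ord(ch)))
        else:
            out.append(ch)
    return ''.join(out)

def unescape_name(escaped):
    """Invert escape_name()."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] == ESCAPE:
            # The two characters after '%' are the hex code of the original.
            out.append(chr(int(escaped[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(escaped[i])
            i += 1
    return ''.join(out)
```

For example, escape_name('AC/DC') gives 'AC%2FDC'. Escaping '%' itself is what keeps the mapping reversible; everything outside the unsafe set, including non-ASCII characters, passes through unchanged.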



--
DaveA


Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Benjamin Kaplan
On May 7, 2013 5:42 PM, "Neil Hodgson"  wrote:
>
> jmfauth:
>
>> 2) More critical, Py 3.3, just becomes non unicode compliant,
>> (eg European languages or "ascii" typographers !)
>> ...
>
>
>This is not demonstrating non-compliance. It is comparing performance,
>not compliance.
>
>Please show an example where Python 3.3 is not compliant with Unicode.
>
>Neil

It's violating page 1+1j of the Unicode spec, where it says precisely how
long each operation is allowed to take. Only wise people can see that page.


Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Dave Angel

On 05/07/2013 09:11 PM, Benjamin Kaplan wrote:

On May 7, 2013 5:42 PM, "Neil Hodgson"  wrote:


jmfauth:


2) More critical, Py 3.3, just becomes non unicode compliant,
(eg European languages or "ascii" typographers !)
...



This is not demonstrating non-compliance. It is comparing performance,
not compliance.


Please show an example where Python 3.3 is not compliant with Unicode.

Neil


It's violating page 1+1j of the Unicode spec, where it says precisely how
long each operation is allowed to take. Only wise people can see that page.




Of course!   It's a complex page.


--
DaveA


Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Mark Lawrence

On 08/05/2013 01:34, Neil Hodgson wrote:

jmfauth:


2) More critical, Py 3.3, just becomes non unicode compliant,
(eg European languages or "ascii" typographers !)
...


This is not demonstrating non-compliance. It is comparing
performance, not compliance.

Please show an example where Python 3.3 is not compliant with Unicode.

Neil


Surely nobody expects an answer, although I suppose there is always a 
first time for everything.  Once again stealing from Tommy Docherty, 
jmfauth is to Python what King Herod was to baby sitting.


--
If you're using GoogleCrap™ please read this 
http://wiki.python.org/moin/GoogleGroupsPython.


Mark Lawrence



Re: Making safe file names

2013-05-07 Thread Neil Hodgson

Andrew Berg:


This is not a Unicode issue since (modern) file systems will happily accept it. 
The issue is that certain characters (which are ASCII) are
not allowed on some file systems:
  \ / : * ? " < > | @ and the NUL character
The first 9 are not allowed on NTFS, the @ is not allowed on ext3cow, and NUL 
and / are not allowed on pretty much any file system. Locale
settings and encodings aside, these 11 characters will need to be escaped.


   There's also the Windows device name hole. There may be trouble with 
artists named 'COM4', 'CLOCK$', 'Con', or similar.


http://support.microsoft.com/kb/74496
http://en.wikipedia.org/wiki/Nul_%28band%29

   Neil


Re: multiple versions of python

2013-05-07 Thread Colin J. Williams

On 07/05/2013 6:26 PM, sokovic.anamar...@gmail.com wrote:

Hi,

What is the generally recommended structure when dealing with this type of
problem:
 * multiple versions of Python (both major versions and sub-versions), e.g.
   2.7: 2.7.1, 2.7.3
   3: 3.3, 3.3.1
 * different versions of gcc
 * different compilation strategies (vanilla and non-vanilla)
 * different modules (numpy, scipy), together with different versions of
   all the rest?

Any help is appreciated.

Ana


Do you really need more than 2.7.3 and 3.3.1?

Typically, these go to C:\Python27 and C:\Python33 with windows.

Colin W.


Re: multiple versions of python

2013-05-07 Thread Roy Smith
In article ,
 "Colin J. Williams"  wrote:

> Do you really need more than 2.7.3 and 3.3.1?

It's often useful to have older versions around, so you can test your 
code against them.  Lots of projects try to stay compatible with older 
releases.


Re: Making safe file names

2013-05-07 Thread Dave Angel

On 05/07/2013 09:28 PM, Neil Hodgson wrote:

Andrew Berg:


This is not a Unicode issue since (modern) file systems will happily
accept it. The issue is that certain characters (which are ASCII) are
not allowed on some file systems:
  \ / : * ? " < > | @ and the NUL character
The first 9 are not allowed on NTFS, the @ is not allowed on ext3cow,
and NUL and / are not allowed on pretty much any file system. Locale
settings and encodings aside, these 11 characters will need to be
escaped.


There's also the Windows device name hole. There may be trouble with
artists named 'COM4', 'CLOCK$', 'Con', or similar.



In MS-DOS 2, there was a switch that would tell the OS to ignore such 
names unless they were prefixed by \DEV.  But like the switchar switch, 
it was largely ignored by the ignorant, and probably doesn't exist in 
current versions of M$OS.



http://support.microsoft.com/kb/74496
http://en.wikipedia.org/wiki/Nul_%28band%29

Neil


While we're looking for trouble, there's also case insensitivity. 
Unclear if the user cares, but tom and TOM are the same file in most 
configurations of NT.


--
DaveA


Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 20:28, Neil Hodgson wrote:
> http://support.microsoft.com/kb/74496
> http://en.wikipedia.org/wiki/Nul_%28band%29
I can indeed confirm that at least 'nul' cannot be used as a filename. However, 
I add an extension to the file names to identify them as caches.

-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 20:45, Dave Angel wrote:
> While we're looking for trouble, there's also case insensitivity. 
> Unclear if the user cares, but tom and TOM are the same file in most 
> configurations of NT.
Artist names on Last.fm cannot differ only in case. This does remind me to make 
sure to update the case of the artist name as necessary,
though. For example, if Sam becomes SAM again (I have seen Last.fm change the 
case for artist names), I need to make sure that I don't end
up with two file names differing only in case.
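
A minimal sketch of a check for names that differ only in case. `case_collisions` is a made-up helper, and str.casefold() (new in Python 3.3) is only an approximation of whatever case folding a given file system applies.

```python
from collections import defaultdict

def case_collisions(names):
    """Group names that become identical once case is folded."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Only groups with more than one member are actual collisions.
    return [group for group in groups.values() if len(group) > 1]
```

Running it over the cache directory listing plus the incoming name shows whether an existing file has to be renamed instead of a second one being created.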

-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 20:13, Dave Angel wrote:
> So you're comfortable typing arbitrary characters?  What about all the 
> characters that look identical in your font?
Identification is more important than typing. I can copy and paste into a 
terminal if necessary. I don't foresee typing out one of the
filenames being anything more than a rare occurrence, but I will occasionally 
just read the list.
> What about viewing 
> 0x07 in the terminal window?  Or 0x04?
I don't think Last.fm will even send those characters. In any case, control 
characters in artist names are rare enough that it's not worth
the trouble to write the code to avoid the problems associated with them.
> As soon as you have a small, finite list of invalid characters, writing 
> an escape system is pretty easy.
Probably. I was just hoping there was an existing system that would work, but 
as I said in a different reply, it would seem I need to roll
my own.

-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


Re: Making safe file names

2013-05-07 Thread Roy Smith
In article ,
 Dave Angel  wrote:

> While we're looking for trouble, there's also case insensitivity. 
> Unclear if the user cares, but tom and TOM are the same file in most 
> configurations of NT.

OSX, too.


Re: multiple versions of python

2013-05-07 Thread Mark Lawrence

On 08/05/2013 02:35, Colin J. Williams wrote:

On 07/05/2013 6:26 PM, sokovic.anamar...@gmail.com wrote:

Hi,

What is the generally recommended structure when dealing with this type of
problem:
 * multiple versions of Python (both major versions and sub-versions), e.g.
   2.7: 2.7.1, 2.7.3
   3: 3.3, 3.3.1
 * different versions of gcc
 * different compilation strategies (vanilla and non-vanilla)
 * different modules (numpy, scipy), together with different versions of
   all the rest?

Any help is appreciated.

Ana


Do you really need more than 2.7.3 and 3.3.1?

Typically, these go to C:\Python27 and C:\Python33 with windows.

Colin W.


In which case you'll normally be doing a binary installation.  If you're 
compiling, it's more likely to be with VC++, not gcc.


--
If you're using GoogleCrap™ please read this 
http://wiki.python.org/moin/GoogleGroupsPython.


Mark Lawrence



Re: Why do Perl programmers make more money than Python programmers

2013-05-07 Thread Steven D'Aprano
On Tue, 07 May 2013 15:17:52 +0100, Steve Simmons wrote:

> Good to see jmf finally comparing apples with apples :-)

*groans*

Truly the terrible pun that the terrible hijacking deserves.



-- 
Steven


Re: Making safe file names

2013-05-07 Thread Steven D'Aprano
On Tue, 07 May 2013 19:51:24 -0500, Andrew Berg wrote:

> On 2013.05.07 19:14, Dave Angel wrote:
>> You also need to decide how to handle Unicode characters, since they're
>> different for different OS.  In Windows on NTFS, filenames are in
>> Unicode, while on Unix, filenames are bytes.  So on one of those, you
>> will be encoding/decoding if your code is to be mostly portable.
>
> Characters outside whatever sys.getfilesystemencoding() returns won't be
> allowed. If the user's locale settings don't support Unicode, my program
> will be far from the only one to have issues with it. Any problem
> reports that arise from a user moving between legacy encodings will
> generally be ignored. I haven't yet decided how I will handle artist
> names with characters outside UTF-8, but inside UTF-16/32 (UTF-16 is
> just fine on Windows/NTFS, but on Unix(-ish) systems, many use UTF-8 in
> their locale settings).

There aren't any characters outside of UTF-8 :-) UTF-8 covers the entire 
Unicode range, unlike other encodings like Latin-1 or ASCII.

Well, that is to say, there may be characters that are not (yet) handled 
at all by Unicode, but there are no known legacy encodings that support 
such characters.

To a first approximation, Unicode covers the entire set of characters in 
human use, and for those which it does not, there is always the private 
use area. So for example, if you wish to record the Artist Formerly Known 
As "The Artist Formerly Known As Prince" as Love Symbol, you could pick 
an arbitrary private use code point, declare that for your application 
that code point means Love Symbol, and use that code point as the artist 
name. You could even come up with a custom font that includes a rendition 
of that character glyph.
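
A minimal sketch of the private use idea. The choice of U+E000 and the name `LOVE_SYMBOL` are arbitrary assumptions; any code point whose general category is 'Co' would do.

```python
import unicodedata

# Hypothetical application-level convention: U+E000 means "Love Symbol".
LOVE_SYMBOL = '\ue000'

def is_private_use(ch):
    """True if ch is a private use code point (general category 'Co')."""
    return unicodedata.category(ch) == 'Co'
```

The code point encodes to UTF-8 like any other (here as the three bytes EE 80 80), so it can be stored and used in names; only the meaning is private to the application.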

However, there are byte combinations which are not valid UTF-8, which is 
a different story. If you're receiving bytes from (say) a file name, they 
may not necessarily make up a valid UTF-8 string. But this is not an 
issue if you are receiving data from something guaranteed to be valid 
UTF-8.
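
A minimal sketch of the difference, using made-up sample bytes. The first helper simply reports invalid input; the second part uses the standard 'surrogateescape' error handler (PEP 383) as one possible way to carry undecodable bytes through a str and back.

```python
def decode_utf8_or_none(raw):
    """Return the decoded text, or None if raw is not valid UTF-8."""
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        return None

valid = 'Bj\u00f6rk'.encode('utf-8')   # b'Bj\xc3\xb6rk'
invalid = b'Bj\xf6rk'                  # Latin-1 bytes; 0xF6 never occurs in valid UTF-8

# Lossless fallback: each undecodable byte becomes a lone surrogate,
# and encoding with the same handler restores the original bytes.
smuggled = invalid.decode('utf-8', 'surrogateescape')
restored = smuggled.encode('utf-8', 'surrogateescape')
```

Here decode_utf8_or_none(valid) returns the text and decode_utf8_or_none(invalid) returns None, while `restored` equals `invalid` byte for byte.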


>> Don't forget that ls and rm may not use the same encoding you're using.
>> So you may not consider it adequate to make the names legal, but you
>> may also want them easily typeable in the shell.
>
> I don't understand. I have no intention of changing Unicode characters.

Of course you do. You even talk below about Unicode characters like * 
and ? not being allowed on NTFS systems.

Perhaps you are thinking that there are a bunch of characters over here 
called "plain text ASCII characters", and a *different* bunch of 
characters with funny accents and stuff called "Unicode characters". If 
so, then you are labouring under a misapprehension, and you should start 
off by reading this:

http://www.joelonsoftware.com/articles/Unicode.html


then come back with any questions.


> This is not a Unicode issue since (modern) file systems will happily
> accept it. The issue is that certain characters (which are ASCII) are
> not allowed on some file systems:
>  \ / : * ? " < > | @ and the NUL character

These are all Unicode characters too. Unicode is a subset of ASCII, so 
anything which is ASCII is also Unicode.


> The first 9 are not allowed on NTFS, the @ is not allowed on ext3cow,
> and NUL and / are not allowed on pretty much any file system. Locale
> settings and encodings aside, these 11 characters will need to be
> escaped.

If you have an artist with control characters in their name, like newline 
or carriage return or NUL, I think it is fair to just drop the control 
characters and then give the artist a thorough thrashing with a halibut.

Does your mapping really need to be guaranteed reversible? If you have an 
artist called "JoeBlow", and another artist called "Joe\0Blow", and a 
third called "Joe\nBlow", does it *really* matter if your application 
conflates them?
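
To make the trade-off concrete, a minimal sketch of the lossy approach, assuming "control character" means Unicode general category 'Cc'. `drop_control` is a made-up helper.

```python
import unicodedata

def drop_control(name):
    """Remove control characters (general category 'Cc') from name."""
    return ''.join(ch for ch in name if unicodedata.category(ch) != 'Cc')

names = ['JoeBlow', 'Joe\0Blow', 'Joe\nBlow']
cleaned = {drop_control(n) for n in names}
```

All three names collapse to the single string 'JoeBlow', so the mapping cannot be reversed.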


-- 
Steven


Re: Making safe file names

2013-05-07 Thread Dave Angel

On 05/07/2013 10:06 PM, Andrew Berg wrote:

On 2013.05.07 20:28, Neil Hodgson wrote:

http://support.microsoft.com/kb/74496
http://en.wikipedia.org/wiki/Nul_%28band%29

I can indeed confirm that at least 'nul' cannot be used as a filename. However, 
I add an extension to the file names to identify them as caches.



Won't help.  NUL.txt is just as reserved as NUL is.  Extensions are 
ignored in this particular piece of historical nonsense.
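
A minimal sketch of a check that takes this into account. The name list is an assumption pieced together from the commonly documented Windows device names plus CLOCK$ from earlier in the thread, and is not guaranteed to be complete; `is_reserved` and `make_safe` are made-up helpers.

```python
# Assumed list of reserved device names; may not be exhaustive.
RESERVED = {'CON', 'PRN', 'AUX', 'NUL', 'CLOCK$'}
RESERVED.update('COM{}'.format(n) for n in range(1, 10))
RESERVED.update('LPT{}'.format(n) for n in range(1, 10))

def is_reserved(filename):
    """True if the part before the first dot is a reserved device name."""
    stem = filename.split('.', 1)[0]
    return stem.upper() in RESERVED

def make_safe(filename, prefix='_'):
    """Prefix reserved names so they no longer match a device name."""
    return prefix + filename if is_reserved(filename) else filename
```

With this, 'NUL.txt' and 'nul' are both flagged, while 'Nullify.cache' is left alone.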



--
DaveA


Re: Making safe file names

2013-05-07 Thread Dave Angel

On 05/07/2013 11:40 PM, Steven D'Aprano wrote:


   

These are all Unicode characters too. Unicode is a subset of ASCII, so
anything which is ASCII is also Unicode.




Typo.  You meant  Unicode is a superset of ASCII.


--
DaveA


Re: Making safe file names

2013-05-07 Thread Steven D'Aprano
On Wed, 08 May 2013 00:13:20 -0400, Dave Angel wrote:

> On 05/07/2013 11:40 PM, Steven D'Aprano wrote:
>>
>>
>>
>> These are all Unicode characters too. Unicode is a subset of ASCII, so
>> anything which is ASCII is also Unicode.
>>
>>
>>
> Typo.  You meant  Unicode is a superset of ASCII.

Damn. Yes, you're right. I was thinking superset, but my fingers typed 
subset.

Thanks for the correction.


-- 
Steven


Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 22:40, Steven D'Aprano wrote:
> There aren't any characters outside of UTF-8 :-) UTF-8 covers the entire 
> Unicode range, unlike other encodings like Latin-1 or ASCII.
You are correct. I'm not sure what I was thinking.

>> I don't understand. I have no intention of changing Unicode characters.
> 
> Of course you do. You even talk below about Unicode characters like * 
> and ? not being allowed on NTFS systems.
I worded that incorrectly. What I meant, of course, is that I intend to 
preserve as many characters as possible and have no need to stay
within ASCII.

> If you have an artist with control characters in their name, like newline 
> or carriage return or NUL, I think it is fair to just drop the control 
> characters and then give the artist a thorough thrashing with a halibut.
While the thrashing with a halibut may be warranted (though I personally would 
use a rubber chicken), conflicts are problematic.

> Does your mapping really need to be guaranteed reversible? If you have an 
> artist called "JoeBlow", and another artist called "Joe\0Blow", and a 
> third called "Joe\nBlow", does it *really* matter if your application 
> conflates them?
Yes and yes. Some artists like to be real cute with their names and make witch 
house artist names look tame in comparison, and some may
choose to use names similar to some very popular artists. I've also seen people 
scrobble fake artists with names that look like real artist
names (using things like a non-breaking space instead of a regular space) with 
different artist pictures in order to confuse and troll
people. If I could remember the user profiles with this, I'd link them. Last.fm 
is a silly place.
As I said before though, I don't think control characters are even allowed in 
artist names (likely for technical reasons).
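
A minimal sketch of how such look-alike names could at least be flagged, assuming NFKC normalization plus case folding as the comparison key. `lookalike_key` is a made-up helper and the two names are invented examples. NFKC only catches compatibility variants such as the non-breaking space; it does nothing about, say, Cyrillic letters that resemble Latin ones.

```python
import unicodedata

def lookalike_key(name):
    """Key under which compatibility variants of a name compare equal."""
    return unicodedata.normalize('NFKC', name).casefold()

real = 'Joe Blow'          # regular space
fake = 'Joe\u00a0Blow'     # NO-BREAK SPACE instead of a space
```

The two strings are different names (and must stay different files), but lookalike_key() maps them to the same key, so the pair can be detected.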
-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1