Re: [Tutor] A regular expression problem

2010-12-01 Thread Steven D'Aprano

Josep M. Fontana wrote:
[...]

> I guess this is because the character encoding was not specified but
> accented characters in the languages I'm dealing with should be
> treated as a-z or A-Z, shouldn't they?


No. a-z means a-z. If you want the localized set of alphanumeric 
characters, you need \w.


Likewise 0-9 means 0-9. If you want localized digits, you need \d.


> I mean, how do you deal with languages that are not English with
> regular expressions? I would assume that as long as you set the right
> encoding, Python will be able to determine which subset of specific
> sequences of bytes count as a-z or A-Z.


Encodings have nothing to do with this issue.

Literal characters a, b, ..., z etc. always have ONE meaning: they 
represent themselves (although possibly in a case-insensitive fashion). 
E means E, not È, É, Ê or Ë.


Localization tells the regex how to interpret special patterns like \d 
and \w. This has nothing to do with encodings -- by the time the regex 
sees the string, it is already dealing with characters. Localization is 
about what characters are in categories (is 5 a digit or a letter? how 
about ٣ ?).
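
The difference between a literal range and a character category can be checked directly. This is a minimal sketch assuming Python 3, where str patterns are Unicode-aware by default (in the Python 2 of this thread, a flag such as re.UNICODE or re.LOCALE would be needed). The sample text is made up for illustration.

```python
import re

# Sample text (an assumption for this sketch): "café", a space,
# then ARABIC-INDIC DIGIT THREE.
text = "caf\u00e9 \u0663"

# A literal range matches only the characters listed in it.
ascii_letters = re.findall(r"[a-zA-Z]+", text)
ascii_digits = re.findall(r"[0-9]", text)

# \w and \d are categories, so accented letters and non-ASCII digits count.
word_chars = re.findall(r"\w+", text)
digits = re.findall(r"\d", text)

print(ascii_letters, ascii_digits, word_chars, digits)
```

The range [a-zA-Z] stops at the accented letter, while \w carries on through it.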


Encoding is used to translate between bytes on disk and characters. For 
example, the character Ë could be stored on disk as the hex bytes:


\xcb  # one byte
\xc3\x8b  # two bytes
\xff\xfe\xcb\x00  # four bytes

and more, depending on the encoding used.
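
The three byte sequences above can be reproduced as follows. The message does not name the encodings, so Latin-1, UTF-8 and UTF-16 (little-endian, with a byte order mark) are assumptions that happen to match the bytes shown.

```python
import codecs

ch = "\u00cb"  # the character Ë

latin1 = ch.encode("latin-1")
utf8 = ch.encode("utf-8")
# The BOM is written explicitly so the result does not depend on the
# byte order of the machine running the code.
utf16 = codecs.BOM_UTF16_LE + ch.encode("utf-16-le")

for name, raw in [("latin-1", latin1), ("utf-8", utf8), ("utf-16", utf16)]:
    print(name, raw, len(raw))
```

Decoding any of these with the matching codec gives back the same single character, which is all the regex ever sees.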


--
Steven
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] How to handle exceptions raised inside a function?

2010-12-01 Thread Steven D'Aprano

Patty wrote:
> This is very interesting to me - the below excerpt is something I was
> trying to do for one of my programs and gave up on it:
>
>> A fifth approach, common in some other languages, is to return some
>> arbitrary value, and set an error flag. The caller then has to write
>> code like this:
>>
>> result = function(arguments)
>> if not last_result_error:
>>     # no error occurred
>>     print("result is", result)
>>
>> If you do this, I will *personally* track you down and beat you to
>> death with a rather large fish.
>>
>> *wink*
>
> I think I was trying to do something like this at the end of a function
> I wrote -
>
> return 2  or return my_special_integer_mvar


That syntax won't work. However, the basic idea is (moderately) sound: 
your function has a special value that means something funny happened. 
Python very occasionally uses this:


>>> "hello world".find("z")  # not found
-1

which you then use like this:

result = string.find(target)
if result == -1:  # special value
    print("not found")
else:
    print("found at position %d" % result)

In general, this idiom is mildly disparaged in Python circles, but not 
forbidden. Exceptions are usually considered better.
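
For comparison, here is a rough sketch of the exception-based style, using str.index, which raises ValueError where str.find returns -1. The helper name report is made up for this illustration.

```python
def report(string, target):
    """Describe where target occurs in string (hypothetical helper)."""
    try:
        position = string.index(target)  # raises ValueError when absent
    except ValueError:
        return "not found"
    return "found at position %d" % position

print(report("hello world", "o"))
print(report("hello world", "z"))
```

With an exception there is no special value that could be mistaken for a real position.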


However, what I'm talking about is quite different. Here's how I might 
write the string find method using this (horrible) implementation:



# Global status flag.
find_succeeded = 0

def find(string, target):
    global find_succeeded
    if target in string:
        find_succeeded = 1
        return string.find(target)
    else:
        find_succeeded = 0
        # I never know what number to return...
        return 42  # that'll do...

(In low-level languages like C, the number returned on failure (where I 
choose 42) is often whatever value happens to be in some piece of memory 
-- essentially a random number.)



result = find("hello world", "z")
if find_succeeded == 1:
    print("found at position %d" % result)
else:
    print("not found")


This is even more inconvenient and difficult than earlier. Consider what 
happens if you want to do two or more searches. Because they all report 
their status via the one global variable, you must inspect the global 
after each call, before making the next call, or the status will be 
lost. In Python you can do this:


results = [s.find(target) for s in list_of_strings]

and then, later, inspect each individual result to see if it was -1 or 
not. But with the global status idiom, you can't do that, because the 
status flag is lost once you call find() again.
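
The lost status can be seen by running both styles side by side. This sketch reuses the find() from above in slightly condensed form; the sample strings are made up.

```python
find_succeeded = 0  # global status flag, as in the example above

def find(string, target):
    global find_succeeded
    if target in string:
        find_succeeded = 1
        return string.find(target)
    find_succeeded = 0
    return 42  # arbitrary value on failure

list_of_strings = ["jazz", "hello"]  # sample data for this sketch

# Global-flag style: only the status of the last call survives.
flagged = [find(s, "z") for s in list_of_strings]
print(flagged, find_succeeded)

# Sentinel style: each result carries its own status.
sentinel = [s.find("z") for s in list_of_strings]
print(sentinel)
```

After the first list is built there is no way to tell that the 2 was a real position and the 42 was not.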




--
Steven



Re: [Tutor] How to handle exceptions raised inside a function?

2010-12-01 Thread patty
I think I understand, I will have to reread this a couple times!  But I do
consider myself a C programmer so that probably explains why I was trying
to write code that way.  And you are right 'you must inspect the global
 after each call, before making the next call'.  I *was* setting up the
function to check on and muck with this mvar before the end where I
'return' my own thing.

So for Python programming you are advising us to use one of the other 
four approaches to error handling and not try to do something like:

call a custom function; something is returned to an mvar while processing
each statement, which is then examined to see whether one of three
'bad' or 'good' things is in the mvar. I was trying to avoid return values of
[0 1 -1 2 -2] because off the top of my head these are meaningful in
various languages, so I would come up with 3 way-out-there numbers of my own,
such as your '42' example.  Then try to 'return' this from the function back to
the calling function, possibly main().  Then examine the return value integer
and have 'if' statements doing something depending on it...

So does this mean I am thinking through this like a C programmer?

And I don't think I would be checking a variable at all using the other
four ways, I would just let an error happen and let the return value be
whatever it is and let the exception come up (or my custom exception
handler) and handle it, instead I was trying to get right in the middle of
it and force things.  Also I was trying to use the return value for
purposes other than error.  I think what I was trying to do would really
be like a Case Statement, if that were supported in Python.


 return my_special_integer_mvar

So the above will not work?  If you were to try this, do you have to
return digits?  You can't return an mvar (and hope it doesn't change on
you while going back to calling program)?
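
For reference, a small sketch of the two ideas in this message. Returning a variable hands back the object the name refers to, so it cannot change on the way to the caller, and a custom exception can replace hand-picked error numbers. All names here are hypothetical.

```python
class BadInputError(Exception):
    """Hypothetical custom exception for this illustration."""

def classify(n):
    if n < 0:
        # Signal the problem directly instead of returning a magic number.
        raise BadInputError("negative input: %r" % n)
    my_special_integer = n * 2  # a local variable
    return my_special_integer   # returns the object the name refers to

value = classify(21)
print(value)

try:
    classify(-1)
except BadInputError as err:
    print("handled:", err)
```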

Thanks for confirming my understanding or confusion as the case may be!!

Patty




Re: [Tutor] How to handle exceptions raised inside a function?

2010-12-01 Thread Richard D. Moores
On Tue, Nov 30, 2010 at 13:23, Steven D'Aprano st...@pearwood.info wrote:
 Richard D. Moores wrote:

 Please take a look at 2 functions I just wrote to calculate the
 harmonic and geometric means of lists of positive numbers:
 http://tutoree7.pastebin.com/VhUnZcma.

 Both Hlist and Glist must contain only positive numbers, so I really
 need to test for this inside each function. But is there a good way to
 do this? What should the functions return should a non-positive number
 be detected? Is there a conventional Pythonic way to do this?


 (2) If you don't trust that a sensible exception will be raised, then do
 your own error checking, and raise an exception.

I'll go with this one because I do want both Hlist and Glist to
contain only positive real numbers. So I'll go with Jerry Hill's
suggestion for both H and G. See the two revised functions at
http://tutoree7.pastebin.com/VfYLpFQq.

 For what it's worth, I have a module of statistics functions (shameless
 plug: http://pypi.python.org/pypi/stats and
 http://code.google.com/p/pycalcstats -- feedback and bug reports welcome)

An impressive collection. Thanks for sharing!

 that includes the harmonic and geometric mean. My harmonic mean looks like
 this:

 def harmonic_mean(data):
    try:
        m = mean(1.0/x for x in data)
    except ZeroDivisionError:
        return 0.0
    if m == 0.0:
        return math.copysign(float('inf'), m)
    return 1/m

math.copysign! Didn't know about that one. But mean? It's not a
built-in function..

Dick

 Notice that if the data includes one or more zeroes, the harmonic mean
 itself will be zero: limit as x->0 of 1/x -> infinity, and 1/infinity -> 0.
 If the sum of reciprocals itself cancels to zero, I return the infinity with
 the appropriate sign. The only exceptions that could occur are:

 * mean will raise ValueError if the data is empty;
 * if an argument is non-numeric, TypeError will occur when I take the
 reciprocal of it.
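
The cases listed above can be tried out with a stand-in for mean. The mean below is only an assumption made so the example runs; it is not the stats module's implementation.

```python
import math

def mean(data):
    """Stand-in for the stats module's mean (assumed, not its real code)."""
    values = list(data)  # materialize, so generators are accepted
    if not values:
        raise ValueError("mean of empty data")
    return sum(values) / len(values)

def harmonic_mean(data):
    try:
        m = mean(1.0 / x for x in data)
    except ZeroDivisionError:
        return 0.0  # a zero in the data forces the result to zero
    if m == 0.0:
        # Reciprocals cancelled out: return infinity with the sign of m.
        return math.copysign(float("inf"), m)
    return 1 / m

print(harmonic_mean([1, 2, 4]))
print(harmonic_mean([0, 1, 2]))
```

An empty list raises ValueError from mean, and a non-numeric item raises TypeError when its reciprocal is taken.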


Re: [Tutor] How to handle exceptions raised inside a function?

2010-12-01 Thread Steven D'Aprano

Richard D. Moores wrote:
[...]

>> def harmonic_mean(data):
>>     try:
>>         m = mean(1.0/x for x in data)
>>     except ZeroDivisionError:
>>         return 0.0
>>     if m == 0.0:
>>         return math.copysign(float('inf'), m)
>>     return 1/m
>
> math.copysign! Didn't know about that one. But mean? It's not a
> built-in function..


No, it's not, it's a function from my stats module.

The naive version is simple:

def mean(data):
    return sum(data)/len(data)


My version is a bit more complicated than that, in order to minimize 
round-off error and support iterators.
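
One way to get both properties, as a sketch only: materialize the data so iterators work, and sum with math.fsum, which tracks the lost low-order bits. This is an illustration under those assumptions, not the code in the stats module.

```python
import math

def mean(data):
    """Mean that accepts iterators and limits round-off (a sketch)."""
    values = list(data)  # an iterator has no len(), so collect it first
    if not values:
        raise ValueError("mean of empty data")
    return math.fsum(values) / len(values)

print(mean(iter([0.1] * 10)))
```

The naive version fails on an iterator because len() is not defined for it.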




--
Steven


[Tutor] permutations?

2010-12-01 Thread Alex Hall
Hi all,
I am wondering if there is a python package that will find
permutations? For example, if I have (1, 2, 3), the possibilities I
want are:
12
13
23
123
132
231

Order does not matter; 21 is the same as 12, but no numbers can
repeat. If no package exists, does someone have a hint as to how to
get a function to do this? The one I have right now will not find 132
or 231, nor will it find 13. TIA.

-- 
Have a great day,
Alex (msg sent from GMail website)
mehg...@gmail.com; http://www.facebook.com/mehgcap


Re: [Tutor] permutations?

2010-12-01 Thread Hugo Arts
On Wed, Dec 1, 2010 at 11:45 PM, Alex Hall mehg...@gmail.com wrote:
 Hi all,
 I am wondering if there is a python package that will find
 permutations? For example, if I have (1, 2, 3), the possibilities I
 want are:
 12
 13
 23
 123
 132
 231

 Order does not matter; 21 is the same as 12, but no numbers can
 repeat. If no package exists, does someone have a hint as to how to
 get a function to do this? The one I have right now will not find 132
 or 231, nor will it find 13. TIA.


Does order matter or not? You say (2, 1) == (1, 2) but you do list (1,
2, 3) and (1, 3, 2) as separate, so order does matter there. Be
consistent.

You probably want to look at the itertools.permutations and
itertools.combinations functions.
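
A short illustration of the difference between the two functions, using the (1, 2, 3) example from the question:

```python
from itertools import combinations, permutations

items = (1, 2, 3)

# permutations: order matters, so (1, 2) and (2, 1) both appear.
ordered_pairs = list(permutations(items, 2))

# combinations: order is ignored, so each pair appears once.
unordered_pairs = list(combinations(items, 2))

# With no length given, permutations yields the full-length orderings.
full_orderings = list(permutations(items))

print(ordered_pairs)
print(unordered_pairs)
print(full_orderings)
```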

Hugo


Re: [Tutor] permutations?

2010-12-01 Thread bob gailer

On 12/1/2010 5:45 PM, Alex Hall wrote:

Hi all,
I am wondering if there is a python package that will find
permutations? For example, if I have (1, 2, 3), the possibilities I
want are:
12
13
23
123
132
231

Order does not matter; 21 is the same as 12, but no numbers can
repeat. If no package exists, does someone have a hint as to how to
get a function to do this? The one I have right now will not find 132
or 231, nor will it find 13. TIA.


According to Wikipedia, there are six permutations of the set {1,2,3},
namely [1,2,3], [1,3,2], [2,1,3], [2,3,1], [3,1,2], and [3,2,1].


Above you show some combinations and a subset of the permutations.

What rules did you apply to come up with your result?

--
Bob Gailer
919-636-4239
Chapel Hill NC



Re: [Tutor] How to handle exceptions raised inside a function?

2010-12-01 Thread Richard D. Moores
On Tue, Nov 30, 2010 at 13:23, Steven D'Aprano st...@pearwood.info wrote:

 For what it's worth, I have a module of statistics functions (shameless
 plug: http://pypi.python.org/pypi/stats and
 http://code.google.com/p/pycalcstats -- feedback and bug reports welcome)

Your readme says:
Installation


stats requires Python 3.1. To install, do the usual:

python3 setup.py install

I'm afraid I've never fully understood instructions like that. I have
Python 3.1. I now have your stats-0.1.1a. where do I put it to do the
above? It's now in 3.1's site-packages. Do I CD to stats-0.1.1a and
run that command? Or?

Could I have put stats-0.1.1a anywhere, CD-ed to that anywhere, and
then run the command?

I hesitate to experiment because I don't want your files sprayed all
over my hard disk.

Thanks,

Dick


Re: [Tutor] How to handle exceptions raised inside a function?

2010-12-01 Thread James Mills
On Thu, Dec 2, 2010 at 10:27 AM, Richard D. Moores rdmoo...@gmail.com wrote:
 Could I have put stats-0.1.1a anywhere, CD-ed to that anywhere, and
 then run the command?

Yes.

"python setup.py install" essentially instructs
distutils (or setuptools or distribute, whichever is being used)
to install the package into your site-packages directory, or into
another location on $PYTHONPATH if it has been configured that way.

NB: Copying the package's directory (the directory that contains the
__init__.py) into site-packages, however, has the same effect.

cheers
James


-- 
-- James Mills
--
-- Problems are solved by method


Re: [Tutor] How to handle exceptions raised inside a function?

2010-12-01 Thread Richard D. Moores
On Wed, Dec 1, 2010 at 16:41, James Mills prolo...@shortcircuit.net.au wrote:
 On Thu, Dec 2, 2010 at 10:27 AM, Richard D. Moores rdmoo...@gmail.com wrote:
 Could I have put stats-0.1.1a anywhere, CD-ed to that anywhere, and
 then run the command?

 Yes.

Thanks, James. Did that.

Thought I'd publicize Steven's stuff a bit. I pasted the help on
module stats at http://tutoree7.pastebin.com/bhcjhRuV

Dick


Re: [Tutor] permutations?

2010-12-01 Thread Alex Hall
Thanks to everyone for the itertools hint; that sounds like it will work.

Sorry I was not clearer:
1. Order matters; I meant to say that direction does not. That is, 123
is not the same as 213, but 123 is the same as 321 since the second
example is simply a reversal.
2. I am looking for all permutations and subpermutations (if that is a
word) of 1-n where the list must have at least 2, but no more than n,
unique numbers (so 1 is not valid, nor is 1231 since it repeats 1 and
is too long for n=3).
I hope that makes sense. However, hopefully itertools will do it; if I
run into problems I will respond to this email to keep it in the same
thread. Thanks again! Oh, to the person who asked, I have 2.6 and 2.7
installed, with the default being 2.6.
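
The two rules above could be sketched with itertools like this. The function name paths_without_reversals is made up, and keeping the smaller of each sequence and its reversal is just one way of picking a representative.

```python
from itertools import permutations

def paths_without_reversals(items):
    """Permutations of length 2..n, keeping one of each reversed pair."""
    kept = []
    for size in range(2, len(items) + 1):
        for p in permutations(items, size):
            # p and its reversal are the same path; keep the smaller tuple.
            if p < p[::-1]:
                kept.append(p)
    return kept

result = paths_without_reversals((1, 2, 3))
print(result)
```

Because the numbers are unique, no sequence equals its own reversal, so exactly one of each pair passes the comparison.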





-- 
Have a great day,
Alex (msg sent from GMail website)
mehg...@gmail.com; http://www.facebook.com/mehgcap


[Tutor] data structures

2010-12-01 Thread Dana

Hello,

I'm using Python to extract words from plain text files.  I have a list 
of words.  Now I would like to convert that list to a dictionary of 
features where the key is the word and the value the number of 
occurrences in a group of files based on the filename (different files 
correspond to different categories).  What is the best way to represent 
this data?  When I finish I expect to have about 70 unique dictionaries 
with values I plan to use in frequency distributions, etc.  Should I use 
globally defined dictionaries?


Dana


Re: [Tutor] permutations?

2010-12-01 Thread Alex Hall
Alright, I have it working. Now the problem is that it does not throw
out reversals. I tried to do this myself with a couple loops, but I
get index errors. My total list of permutations is called l.

for i in range(0, len(l)):
 r=l[i]; r.reverse()
 for j in range(0, len(l)):
  print l[j], r, i, j

  if r==l[j]: l.remove(r)


  if r==l[j]: l.remove(r)
IndexError: list index out of range

When it has found two repeats (oddly, they are the same repeat - 2,1)
and removed them, there are ten of the twelve items left. Once j hits
10, it errors out, obviously, since it is looking for l[10] which no
longer exists. However, should the loop not re-evaluate len(l) each
time, so this should not be a problem? Why is it still looking for an
index that is no longer in the range (0, len(l)), and how can I fix
this? Also, why would it find several items twice? For example, it
picks up (2,1), then picks it up again.
Is there a better way of doing this that would avoid me having to
write the function?
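
Two details of the loop above are worth noting. range(0, len(l)) is evaluated once, when the loop starts, so it does not shrink as items are removed. And r = l[i] is a second name for the same list, so r.reverse() reverses l[i] in place and r == l[i] is then always true. A sketch that avoids both by building a new list instead of removing from the old one (drop_reversals is a made-up name):

```python
def drop_reversals(perms):
    """Return perms without any item that reverses an earlier item."""
    seen = set()
    kept = []
    for p in perms:
        key = tuple(p)  # tuples are hashable; slicing makes a copy
        if key[::-1] in seen:
            continue    # the reversal of this one was already kept
        seen.add(key)
        kept.append(p)
    return kept

l = [[1, 2], [2, 1], [1, 3], [3, 1], [1, 2, 3], [3, 2, 1]]
print(drop_reversals(l))
```

The original list is left untouched, so there is no index to run off the end of.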




-- 
Have a great day,
Alex (msg sent from GMail website)
mehg...@gmail.com; http://www.facebook.com/mehgcap


Re: [Tutor] data structures

2010-12-01 Thread Chris Fuller

It sounds to me like all you need is for the values to be another dictionary, 
keyed on filename?  Don't use globals if you can help it.  You should be able 
to structure your program such that they aren't necessary.

Something like this?

features = {
    'feature1': {'file1': 12, 'file2': 0, 'file3': 6},
    'feature2': {'file1': 4, 'file2': 17, 'file3': 0},
}
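
A structure like this could be filled in as follows. The word lists are made-up sample data standing in for words already extracted from each file.

```python
from collections import defaultdict

# Hypothetical input: words already extracted from each file.
words_by_file = {
    "file1": ["spam", "eggs", "spam"],
    "file2": ["eggs"],
}

# Outer key: the word. Inner key: the filename. Value: the count.
features = defaultdict(dict)
for filename, words in words_by_file.items():
    for word in words:
        features[word][filename] = features[word].get(filename, 0) + 1

print(dict(features))
```

A word that never occurs in a file simply has no entry for it, so .get(filename, 0) is the safe way to read a count.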


Cheers



Re: [Tutor] data structures

2010-12-01 Thread Knacktus

On 02.12.2010 02:51, Dana wrote:

> Hello,
>
> I'm using Python to extract words from plain text files. I have a list
> of words. Now I would like to convert that list to a dictionary of
> features where the key is the word and the value the number of
> occurrences in a group of files based on the filename (different files
> correspond to different categories). What is the best way to represent
> this data? When I finish I expect to have about 70 unique dictionaries
> with values I plan to use in frequency distributions, etc. Should I use
> globally defined dictionaries?

Depends on what else you want to do with the group of files. If you're
expecting some operations on the group's data you should create a class
to be able to add some more methods to the data. I would probably go
with a class.


class FileGroup(object):

    def __init__(self, filenames):
        self.filenames = filenames
        self.word_to_occurrences = {}
        self._populate_word_to_occurrences()

    def _populate_word_to_occurrences(self):
        for filename in self.filenames:
            with open(filename) as fi:
                pass  # do the processing

Now you could add other meaningful data and methods to a group of files.

But I also think dictionaries can be fine if you really only need the
dicts. You could write a function to create those.


def create_word_to_occurrences(filenames):
    word_to_occurrences = {}
    for filename in filenames:
        with open(filename) as fi:
            pass  # do the processing
    return word_to_occurrences

But as I said, if in doubt I would go for the class.
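
As a rough idea of what the processing step might look like, here is the function version filled in with collections.Counter. Splitting on whitespace and lower-casing are assumptions about how words should be extracted, and the temporary file exists only to make the example runnable.

```python
import os
import tempfile
from collections import Counter

def create_word_to_occurrences(filenames):
    word_to_occurrences = Counter()
    for filename in filenames:
        with open(filename) as fi:
            for line in fi:
                # Assumed tokenization: lower-case, split on whitespace.
                word_to_occurrences.update(line.lower().split())
    return word_to_occurrences

# Demo with a throwaway file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")
    with open(path, "w") as out:
        out.write("spam eggs\nSpam\n")
    counts = create_word_to_occurrences([path])

print(counts)
```

The same loop body would fit inside _populate_word_to_occurrences in the class version.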


