[Tutor] Evaluating program running time?

2011-04-08 Thread Cory Teshera-Sterne
Hi all,

I have a small(ish) Python program, and I need to be able to log the running
time. This isn't something I've ever really encountered, and I've been led
to believe it can be a little hairy. Are there any Python-specific
approaches to this? I found the timeit module, but that doesn't seem to be
quite what I'm looking for.

Thanks for any insight,
Cory

-- 
Cory Teshera-Sterne
Mount Holyoke College, 2010
www.linkedin.com/in/corytesherasterne
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Evaluating program running time?

2011-04-08 Thread bob gailer

On 4/8/2011 2:29 PM, Cory Teshera-Sterne wrote:

> Hi all,
>
> I have a small(ish) Python program, and I need to be able to log the
> running time. This isn't something I've ever really encountered, and
> I've been led to believe it can be a little hairy. Are there any
> Python-specific approaches to this? I found the timeit module, but
> that doesn't seem to be quite what I'm looking for.


I like to use the time module:

import time
start = time.time()
# ... rest of program ...
print(time.time() - start)

I believe that gives the best precision on *nix.
On Windows, use time.clock() instead.
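
For logging rather than printing, the same start/stop idea could be wrapped in a small helper. This is only a sketch: it assumes Python 3.3 or later, where time.perf_counter() gives a high-resolution clock on both *nix and Windows (time.clock() was removed in Python 3.8). The names run_timed and main are hypothetical stand-ins, not part of the original program.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("timing")


def run_timed(func, *args, **kwargs):
    """Call func, log how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    log.info("%s finished in %.3f s", func.__name__, elapsed)
    return result, elapsed


def main():
    # Hypothetical stand-in for the real program.
    return sum(range(100000))


if __name__ == "__main__":
    run_timed(main)
```

The elapsed time goes through the logging module, so it can be sent to a file by changing the logging configuration instead of the timing code.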

--
Bob Gailer
919-636-4239
Chapel Hill NC



Re: [Tutor] Evaluating program running time?

2011-04-08 Thread Prasad, Ramit
Odd, my previous email seems to have gotten lost

Cory,
See: http://stackoverflow.com/questions/156330/get-timer-ticks-in-python
It is basically what Bob mentions, but with a few more details / alternatives.


Ramit



Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423




Re: [Tutor] Evaluating program running time?

2011-04-08 Thread Prasad, Ramit
Cory,
See: http://stackoverflow.com/questions/156330/get-timer-ticks-in-python

Ramit



Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423



Re: [Tutor] Evaluating program running time?

2011-04-08 Thread Lie Ryan
On 04/09/11 04:29, Cory Teshera-Sterne wrote:
> Hi all,
>
> I have a small(ish) Python program, and I need to be able to log the
> running time. This isn't something I've ever really encountered, and
> I've been led to believe it can be a little hairy. Are there any
> Python-specific approaches to this? I found the timeit module, but
> that doesn't seem to be quite what I'm looking for.

import cProfile
...
def main():
    ...

cProfile.run('main()')
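
A slightly fuller sketch of the same idea, in case only the most expensive calls are of interest. It uses the standard cProfile and pstats modules; the body of main() here is a made-up workload, and the choice to show five entries sorted by cumulative time is just an assumption for illustration.

```python
import cProfile
import io
import pstats


def main():
    # Hypothetical workload standing in for the real program.
    return sorted(str(n) for n in range(50000))


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Collect the report in a string so it can be printed or written to a log.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # show at most five entries
report = buffer.getvalue()
print(report)
```

Note that profiling adds overhead, so the times it reports are better for comparing functions against each other than for logging the total running time.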



Re: [Tutor] Evaluating a string expression

2009-11-06 Thread Alan Gauld

Modulok modu...@gmail.com wrote


> I would like to know how would I evaluate a string expression in python.
> For example, if i say:
>
> a = "3*2"
>
> I want to do something to evaluate the variable 'a' to give me 6. How
> can I do this?
>
> [/snip]
>
> The eval() function can do this:
>
>   eval("3*2")
>
> WARNING: Long winded security rant below...


And these are valid warnings, which raises the question: what are the
alternatives?


If your string forms a well defined pattern you can parse the string into
its components - an arithmetic calculation in the example and execute it 
that way.

There are Python modules/tools available to help create such parsers and if
you are dealing with well defined input that is probably the safest 
approach.
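
As one possible sketch of the parsing approach, the standard ast module can turn the string into a syntax tree, and a small walker can then accept only numbers and basic arithmetic. The function name safe_arith and the choice of allowed operators are assumptions made for this example; it also assumes Python 3.8 or later, where numeric literals appear as ast.Constant nodes.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_arith(expression):
    """Evaluate +, -, *, / on numeric literals without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Names, calls, attribute access and everything else end up here.
        raise ValueError("unsupported expression: %r" % expression)

    return walk(ast.parse(expression, mode="eval"))


print(safe_arith("3*2"))
```

Because function calls and names are never evaluated, a string such as "__import__('os')" is rejected with ValueError instead of being run.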


Use eval() only if you know that the input cannot be malicious (or
accidentally bad) code.

HTH,

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/ 





Re: [Tutor] Evaluating a string expression

2009-11-05 Thread Modulok
[snip]
> I would like to know how would I evaluate a string expression in python.
> For example, if i say:
> a = "3*2"
> I want to do something to evaluate the variable 'a' to give me 6. How
> can I do this?
[/snip]

The eval() function can do this:

   eval("3*2")

WARNING: Long winded security rant below...

Be *very* careful what strings you pass to eval(). It is executing
code! If you're doing this in a controlled environment it's not a
problem. If this is part of a bigger program which is going to be used
by other people, perhaps even online, this is a potentially *huge*
security risk. You will either have to very carefully parse the user's
input to control what they can and cannot do, or better, strictly
control what the kernel permits the process to do. This includes what
hardware resources (memory/processor time) the process is allowed.
This way, even if (when) the process is hijacked, the damage will be
very limited.

Such a feat is accomplished by having the program execute as a user
who has very limited permissions. This is something best (only?) done
on UNIX/Linux/BSD flavored systems. This could be done via a setuid
binary, or a *carefully written* root process which immediately
demotes its privilege level upon execution/spawning of children. (Such
a model is employed by programs like apache's httpd server, where one
process is root owned and does nothing but spawn lesser privileged
processes to handle untrusted data.) If this is something you're
interested in, the os module features functions like 'setuid()',
'setgid()', and notably 'chroot()'. For further security yet, you
might look into isolating a process from the rest of the system, as is
the case with FreeBSD's jails.
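
A rough sketch of the privilege-demotion step, assuming a UNIX-like system and a process started as root. The function name drop_privileges and the numeric uid/gid arguments are hypothetical; the os calls themselves (getuid, setgroups, setgid, setuid) are standard on UNIX platforms. This is an illustration only, not a complete sandbox.

```python
import os


def drop_privileges(uid, gid):
    """Permanently switch the current process to an unprivileged user.

    Returns False without changing anything when the process is not root.
    """
    if os.getuid() != 0:
        return False  # not root: there are no privileges to drop
    os.setgroups([])  # clear supplementary groups while still root
    os.setgid(gid)    # the group must change before the user does
    os.setuid(uid)    # after this call root cannot be regained
    return True


# Example use in a root-owned parent, before touching untrusted input
# (the ids below are placeholders, not values from this thread):
#     drop_privileges(uid=65534, gid=65534)
```

The order matters: once setuid() has succeeded the process no longer has permission to change its groups.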

These are really big topics and in the end, it really depends on what
'untrusted source' constitutes, and your operating environment.
Writing bulletproof code in regards to security is challenging. It is
a very specialized topic worthy of further study. But in most
situations executing code from an untrusted source is a *really* bad
idea, even with precautions such as those outlined in the example URL
provided by one of the other responses.
(http://effbot.org/zone/librarybook-core-eval.htm)

Sorry for all the lecture. I'll shut up now. :p
-Modulok-


[Tutor] Evaluating Swahili Part of Speech Tagging. How can I write a Python script for that?

2009-03-24 Thread عماد نوفل
Evaluating Swahili Part of Speech Tagging. How can I write a Python script
for that?
# The information provided herein about Swahili may not be accurate
# it is just intended to illustrate the problem

Hi Tutors,
I would appreciate it if you gave me ideas about how to tackle this problem.


Assigning POS tags to words is a major step in many linguistic analyses.
POS tags give the grammatical category of words, for example:

The Determiner
man Noun
who RelativePronoun
came Verb
to Preposition
us AccusativePluralPronoun
is CopulaPresent
an Determiner
engineer Noun

What we usually do is train a Part-of-Speech Tagger, and then test it on an
already tagged (gold standard) test set. After running the tagger, we get
something like this:

The Determiner Determiner
man Noun PresentVerb
who RelativePronoun RelativePronoun
came Verb Verb
to Preposition Preposition
us AccusativePluralPronoun AccusativePluralPronoun
is CopulaPresent CopulaPresent
an Determiner Determiner
engineer Noun Noun

As can be seen above, the POS tagger assigned the wrong part of speech
to the word "man". With the lines aligned like this, tagger accuracy is
easy to calculate: 8 out of 9 tags are correct (88.9%).
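
For the aligned case, the calculation could look like the sketch below. It assumes each row holds the word, the gold-standard tag and the tag assigned by the tagger, in that order; the variable names are made up for the example.

```python
# (word, gold_tag, predicted_tag) rows from the example above
rows = [
    ("The", "Determiner", "Determiner"),
    ("man", "Noun", "PresentVerb"),
    ("who", "RelativePronoun", "RelativePronoun"),
    ("came", "Verb", "Verb"),
    ("to", "Preposition", "Preposition"),
    ("us", "AccusativePluralPronoun", "AccusativePluralPronoun"),
    ("is", "CopulaPresent", "CopulaPresent"),
    ("an", "Determiner", "Determiner"),
    ("engineer", "Noun", "Noun"),
]

# A row counts as correct when the two tags are identical.
correct = sum(1 for _, gold, predicted in rows if gold == predicted)
accuracy = correct / len(rows)
print("%d of %d correct (%.1f%%)" % (correct, len(rows), 100 * accuracy))
```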

Swahili is a morphologically complex language. The same sentence above is
usually written as:

theman whocametous isanengineer

This means that we should run a word segmenter before running the POS
tagger. The word segmenter of course makes mistakes which will affect the
accuracy of the POS tagger.
We get an output like the following where the second word (sic) is
ill-segmented:

# Segmenter + POS Tagger output file
the Determiner
whocame Noun
to Preposition
us AccusativePluralPronoun
is CopulaPresent
an Determiner
engineer Noun

Now, how can I measure the accuracy of this output file against the gold
standard file below given that the line alignment is lost every time the
segmenter makes a mistake, which happens at the rate of 15 per 1000 words:

# Gold Standard File
The Determiner
man Noun
who RelativePronoun
to Preposition
us AccusativePluralPronoun
is CopulaPresent
an Determiner
engineer Noun

Please note that the output file is usually in the range of 100,000 words

-- 
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.محمد
الغزالي
No victim has ever been more repressed and alienated than the truth

Emad Soliman Nawfal
Indiana University, Bloomington



[Tutor] Evaluating Swahili Part of Speech Tagging. How can I write a Python script for that?

2009-03-24 Thread عماد نوفل
Evaluating Swahili Part of Speech Tagging. How can I write a Python script
for that?
# The information provided herein about Swahili may not be accurate
# it is just intended to illustrate the problem
# The first message had an error. Sorry for that

Hi Tutors,
I would appreciate it if you gave me ideas about how to tackle this problem.

Assigning POS tags to words is a major step in many linguistic analyses.
POS tags give the grammatical category of words, for example:

The Determiner
man Noun
who RelativePronoun
came Verb
to Preposition
us AccusativePluralPronoun
is CopulaPresent
an Determiner
engineer Noun

What we usually do is train a Part-of-Speech Tagger, and then test it on an
already tagged (gold standard) test set. After running the tagger, we get
something like this:

The Determiner Determiner
man Noun PresentVerb
who RelativePronoun RelativePronoun
came Verb Verb
to Preposition Preposition
us AccusativePluralPronoun AccusativePluralPronoun
is CopulaPresent CopulaPresent
an Determiner Determiner
engineer Noun Noun

As can be seen above, the POS tagger assigned the wrong part of speech
to the word "man". With the lines aligned like this, tagger accuracy is
easy to calculate: 8 out of 9 tags are correct (88.9%).

Swahili is a morphologically complex language. The same sentence above is
usually written as:

theman whocametous isanengineer

This means that we should run a word segmenter before running the POS
tagger. The word segmenter of course makes mistakes which will affect the
accuracy of the POS tagger.
We get an output like the following where the second word (sic) is
ill-segmented:

# Segmenter + POS Tagger output file
the Determiner
man Noun
whocame Noun
to Preposition
us AccusativePluralPronoun
is CopulaPresent
an Determiner
engineer Noun

Now, how can I measure the accuracy of this output file against the gold
standard file below given that the line alignment is lost every time the
segmenter makes a mistake, which happens at the rate of 15 per 1000 words:

# Gold Standard File
The Determiner
man Noun
who RelativePronoun
to Preposition
us AccusativePluralPronoun
is CopulaPresent
an Determiner
engineer Noun

Please note that the output file is usually in the range of 100,000 words

-- 
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.محمد
الغزالي
No victim has ever been more repressed and alienated than the truth

Emad Soliman Nawfal
Indiana University, Bloomington



Re: [Tutor] Evaluating Swahili Part of Speech Tagging. How can I write a Python script for that?

2009-03-24 Thread Eduardo Vieira
2009/3/24 Emad Nawfal (عماد نوفل) emadnaw...@gmail.com:
> Evaluating Swahili Part of Speech Tagging. How can I write a Python script
> for that?
> # The information provided herein about Swahili may not be accurate
> # it is just intended to illustrate the problem

Hello, Mr. Emad! Have you checked NLTK (the Natural Language Toolkit,
http://www.nltk.org), a Python package for linguistics applications?
Maybe they have something already implemented. I really liked their
tutorials about Python and using Python for linguistics. Very good
explanations.


Re: [Tutor] Evaluating Swahili Part of Speech Tagging. How can I write a Python script for that?

2009-03-24 Thread عماد نوفل
2009/3/24 Emad Nawfal (عماد نوفل) emadnaw...@gmail.com



 2009/3/24 Eduardo Vieira eduardo.su...@gmail.com

 2009/3/24 Emad Nawfal (عماد نوفل) emadnaw...@gmail.com:
  Evaluating Swahili Part of Speech Tagging. How can I write a Python script
  for that?
  # The information provided herein about Swahili may not be accurate
  # it is just intended to illustrate the problem
 
 Hello, Mr. Emad! Have you checked the NLTK (Natural Language Toolkit -
 http://www.nltk.org ) a Python package for Linguistics applications?
 Maybe they have something already implemented. I actually liked a lot
 their tutorials about python and using pythons for Linguistics. Very
 good explanations.



 I have checked the NLTK, and it does not seem to have something like this.
 Thanks for the suggestion though

 --
 لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.محمد
 الغزالي
 No victim has ever been more repressed and alienated than the truth

 Emad Soliman Nawfal
 Indiana University, Bloomington
 



Thanks James,
I'm using the TnT POS Tagger, and I treat it as a black box, otherwise I
have to write my own, which is a huge task.
The Segmenter I use is home-grown, and it is supposedly the best available.
I used to evaluate on whole words, and this was easy. After the segmentation
and tagging, I combined the various segments of each word, and this
eliminated the discrepancy in alignment. For example, I would have an output
like this:

the+man Det+Noun the+man Det+Noun
who+came+to+us tag whocame+to+us wrongTag
It is easy to do it this way if you use a WORD_END_DELIMITER, but this is
very tedious, and you have to recalculate the segment accuracy.
I'm looking for something smarter than this.

-- 
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.محمد
الغزالي
No victim has ever been more repressed and alienated than the truth

Emad Soliman Nawfal
Indiana University, Bloomington



Re: [Tutor] Evaluating Swahili Part of Speech Tagging. How can I write a Python script for that?

2009-03-24 Thread Alan Gauld

Hi,
That was an interesting post, but I'm not sure what you want help with.
Is it the word splitting?
Is it writing the POS tagger?
Is it comparing the POS tagger to the standard?
Or all of these?

Alan G.



Re: [Tutor] Evaluating Swahili Part of Speech Tagging. How can I write a Python script for that?

2009-03-24 Thread عماد نوفل
2009/3/24 Alan Gauld alan.ga...@btinternet.com

> Hi,
> That was an interesting post, but I'm not sure what you want help with.
> Is it the word splitting?
> Is it writing the POS tagger?
> Is it comparing the POS tagger to the standard?
> Or all of these?
>
> Alan G.


Hi Alan,
Comparing the POS tagger output to the standard is what I want. I can do it
if I combine the segments into words and the segment tags into complex tags,
which is possible.
But I'm wondering whether this can be done using just the segments.



-- 
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.محمد
الغزالي
No victim has ever been more repressed and alienated than the truth

Emad Soliman Nawfal
Indiana University, Bloomington



Re: [Tutor] Evaluating Swahili Part of Speech Tagging. How can I write a Python script for that?

2009-03-24 Thread عماد نوفل
Hi Alan,
Comparing the POS tagger output to the standard is what I want. I can do it
if I combine the segments into words and the segment tags into complex tags,
which is possible.
But I'm wondering whether this can be done using just the segments and their
respective simple tags.


-- 
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.محمد
الغزالي
No victim has ever been more repressed and alienated than the truth

Emad Soliman Nawfal
Indiana University, Bloomington



Re: [Tutor] Evaluating Swahili Part of Speech Tagging

2009-03-24 Thread Carnell, James E
Ok I think I understand now (maybe?)


#=== Current Version ==

# Segmenter + POS Tagger output file      # Gold Standard File
the Determiner                    =       The Determiner
whocame Noun                      !=      man Noun
to Preposition                    !=      who RelativePronoun
us AccusativePluralPronoun        !=      to Preposition
is CopulaPresent                  !=      us AccusativePluralPronoun
an Determiner                     !=      is CopulaPresent
engineer Noun                     !=      an Determiner
                                  !=      engineer Noun

correct             1
numErrorSegments    1
ErrorSegmentLength  7

#=== Corrected Version ==

# Segmenter + POS Tagger output file      # Gold Standard File
the Determiner                    =       The Determiner
whocame Noun                      !=      man Noun
                                  !=      who RelativePronoun
to Preposition                    =       to Preposition
us AccusativePluralPronoun        =       us AccusativePluralPronoun
is CopulaPresent                  =       is CopulaPresent
an Determiner                     =       an Determiner
engineer Noun                     =       engineer Noun

correct             6
numErrorSegments    1
ErrorSegmentLength  2
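
One way the re-alignment in the corrected version might be sketched is with difflib.SequenceMatcher from the standard library, which finds the runs of segments that the two files share and so re-synchronises after each segmentation error. This is an assumption about how to approach it, not something proposed in the thread; the function name aligned_accuracy and the case-insensitive comparison of segments are choices made for the example.

```python
import difflib

gold = [
    ("The", "Determiner"), ("man", "Noun"), ("who", "RelativePronoun"),
    ("to", "Preposition"), ("us", "AccusativePluralPronoun"),
    ("is", "CopulaPresent"), ("an", "Determiner"), ("engineer", "Noun"),
]
output = [
    ("the", "Determiner"), ("whocame", "Noun"), ("to", "Preposition"),
    ("us", "AccusativePluralPronoun"), ("is", "CopulaPresent"),
    ("an", "Determiner"), ("engineer", "Noun"),
]


def aligned_accuracy(gold, output):
    """Return (correct_tags, error_segments) after aligning on the words."""
    gold_words = [word.lower() for word, _ in gold]
    out_words = [word.lower() for word, _ in output]
    # autojunk=False keeps frequent words from being ignored in long files.
    matcher = difflib.SequenceMatcher(None, gold_words, out_words,
                                      autojunk=False)
    correct = 0
    error_segments = 0
    for op, g1, g2, o1, o2 in matcher.get_opcodes():
        if op == "equal":
            # Same segments on both sides: compare the tags pairwise.
            for g, o in zip(range(g1, g2), range(o1, o2)):
                if gold[g][1] == output[o][1]:
                    correct += 1
        else:
            # A replace/insert/delete block is one mis-segmented stretch.
            error_segments += 1
    return correct, error_segments


print(aligned_accuracy(gold, output))
```

On a file of 100,000 words the matching may be slow, so it might be worth aligning sentence by sentence if sentence boundaries are available.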


Re: [Tutor] evaluating AND

2007-09-14 Thread Terry Carroll
On Fri, 14 Sep 2007, Rikard Bosnjakovic wrote:

> For me, "if x" would be enough. If you think it's a bad thing when x
> is of the wrong data, then you really should check that it contains
> *correct* data as well.

That's an okay approach, but it's also non-Pythonic; more of the
look-before-you-leap approach rather than ask-forgiveness-not-permission.

 Using the two functions of yours, setting x to an integer:
 
 >>> x = 2
 >>> print test01(x)
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/usr/tmp/python-3716vZq", line 3, in test01
 TypeError: unsubscriptable object
 >>> print test02(x)
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/usr/tmp/python-3716vZq", line 8, in test02
 TypeError: unsubscriptable object

which is exactly what I would want it to do: raise an exception on bad 
data.

 Rewriting your test01-function into this:
 
 def test01(x):
   if (type(x) == type((0,)) and
       (x is not None) and
       (len(x) == 2)):
     if x[0]>0:
       return x[1]/x[0]
 
 and testing again:
 
 >>> x = 2
 >>> print test01(x)
 None

This silently produces an incorrect result, which is a Bad Thing; and it 
took a lot more code to do it, too.
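
The two styles being contrasted can be put side by side. This is a minimal sketch in Python 3 (the thread itself uses Python 2 print statements); the names ratio_lbyl and ratio_eafp are made up for illustration. The second function does not catch anything: it simply uses the data and lets errors propagate, which is the behaviour argued for above.

```python
def ratio_lbyl(x):
    # Look before you leap: check everything up front, otherwise return None.
    if isinstance(x, tuple) and len(x) == 2 and x[0] > 0:
        return x[1] / x[0]
    return None


def ratio_eafp(x):
    # Only None means "no data"; anything else is used as-is and may raise.
    if x is not None and x[0] > 0:
        return x[1] / x[0]
    return None


print(ratio_lbyl((2.0, 5.0)), ratio_eafp((2.0, 5.0)))  # 2.5 2.5
print(ratio_lbyl(2))  # None: the bad input is silently accepted

try:
    ratio_eafp(2)
except TypeError as exc:
    print("TypeError:", exc)  # the bad input is reported
```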




[Tutor] evaluating AND

2007-09-13 Thread Orest Kozyar
Given a variable x that can either be None or a tuple of two floats [i.e.
(0.32, 4.2)], which syntax is considered most appropriate under Python
coding standards?

if x and x[0] > 0:
    pass

=OR=

if x:
    if x[0] > 0:
        pass


In the first, I'm obviously making the assumption that if the first
condition evaluates to false, then the second condition won't be evaluated.
But, is this a good/valid assumption to make?  Is there a more appropriate
design pattern in Python?

Thanks!
Orest



Re: [Tutor] evaluating AND

2007-09-13 Thread Eric Brunson

The first is how I would code it.  Python guarantees that compound 
boolean expressions are evaluated from left to right and that the 
"and" operator short-circuits: once the left operand is false, the 
rest of the line cannot change the falseness of the entire expression, 
so it is never evaluated.
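
A small sketch of that guarantee, assuming Python 3. The helper record is hypothetical and exists only to log which operands actually get evaluated.

```python
calls = []


def record(name, value):
    """Hypothetical helper: log that an operand was evaluated, then return it."""
    calls.append(name)
    return value


# The left operand is falsy, so the right operand is never evaluated.
result = record("left", None) and record("right", True)
print(result, calls)  # None ['left']


def first_is_positive(x):
    # Safe for x = None: x[0] is only reached when x is truthy.
    return bool(x and x[0] > 0)


print(first_is_positive(None), first_is_positive((0.32, 4.2)))  # False True
```

Note that "and" returns the operand that decided the result, not necessarily True or False, which is why the sketch wraps the expression in bool().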

Orest Kozyar wrote:
 Given a variable x that can either be None or a tuple of two floats [i.e.
 (0.32, 4.2)], which syntax is considered most appropriate under Python
 coding standards?

 if x and x[0] > 0:
     pass

 =OR=

 if x:
     if x[0] > 0:
         pass


 In the first, I'm obviously making the assumption that if the first
 condition evaluates to false, then the second condition won't be evaluated.
 But, is this a good/valid assumption to make?  Is there a more appropriate
 design pattern in Python?

 Thanks!
 Orest



Re: [Tutor] evaluating AND

2007-09-13 Thread Terry Carroll
On Thu, 13 Sep 2007, Orest Kozyar wrote:

 Given a variable x that can either be None or a tuple of two floats [i.e.
 (0.32, 4.2)], which syntax is considered most appropriate under Python
 coding standards?
 
 if x and x[0] > 0:
     pass
 
 =OR=
 
 if x:
     if x[0] > 0:
         pass

I would like either one if, instead of "if x", you used "if x is not None"; 
that seems a lot easier to me to read.  It's a bit jarring to see the same 
variable used in one expression as both a boolean and a list/tuple.

Besides, suppose somehow x got set to zero.  It would pass without error, 
something you wouldn't want to have happen.  Even if you've set things up 
so that it couldn't happen, it's not obvious from looking at this code 
that it couldn't happen.

If you really want to test for x being non-None, test for x being 
non-None.
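
To make the difference concrete, here is a short sketch (Python 3 assumed) that runs both tests over a few values x might end up holding. The list of candidate values is chosen only for illustration.

```python
candidates = [None, 0, (), (0.32, 4.2)]

for x in candidates:
    # "if x" lumps all falsy values together; "x is not None" singles out None.
    print(repr(x), bool(x), x is not None)
```

Only None fails the "is not None" test, whereas 0 and the empty tuple both fail "if x" and would be skipped without any error.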



Re: [Tutor] evaluating AND

2007-09-13 Thread Kent Johnson
Orest Kozyar wrote:
 Given a variable x that can either be None or a tuple of two floats [i.e.
 (0.32, 4.2)], which syntax is considered most appropriate under Python
 coding standards?
 
 if x and x[0] > 0:
     pass
 
 =OR=
 
 if x:
     if x[0] > 0:
         pass

The first is fine.

 In the first, I'm obviously making the assumption that if the first
 condition evaluates to false, then the second condition won't be evaluated.
 But, is this a good/valid assumption to make?

Yes, this is guaranteed by the language.
http://docs.python.org/ref/Booleans.html

Kent


Re: [Tutor] evaluating AND

2007-09-13 Thread Adam Bark
On 13/09/2007, Terry Carroll [EMAIL PROTECTED] wrote:

 On Thu, 13 Sep 2007, Orest Kozyar wrote:

  Given a variable x that can either be None or a tuple of two floats [i.e.
  (0.32, 4.2)], which syntax is considered most appropriate under Python
  coding standards?
 
  if x and x[0] > 0:
      pass
 
  =OR=
 
  if x:
      if x[0] > 0:
          pass

 I would like either one if, instead of "if x", you used "if x is not None";
 that seems a lot easier to me to read.  It's a bit jarring to see the same
 variable used in one expression as both a boolean and a list/tuple.

 Besides, suppose somehow x got set to zero.  It would pass without error,
 something you wouldn't want to have happen.  Even if you've set things up
 so that it couldn't happen, it's not obvious from looking at this code
 that it couldn't happen.

 If you really want to test for x being non-None, test for x being
 non-None.


The problem is what if it's an empty list or tuple? It would pass but have
no value, whereas "if x" would work fine.


Re: [Tutor] evaluating AND

2007-09-13 Thread Terry Carroll
On Thu, 13 Sep 2007, Adam Bark wrote:

 The problem is what if it's an empty list or tuple? It would pass but have
 no value, whereas "if x" would work fine.

Exactly.  The poster stated that x is supposed to be either None or a 
tuple of two floats.

Just to put a bit of meat on the example, let's create a function whose 
job is to return x[1]/x[0], but only if x[0] > 0.  Otherwise, it just 
falls off the end, i.e., returns None.

Here are two versions, one using "if x is not None"; the other using just 
"if x":

>>> def test01(x):
...  if x is not None:
...   if x[0]>0:
...    return x[1]/x[0]
...
>>> def test02(x):
...  if x:
...   if x[0]>0:
...    return x[1]/x[0]
...


When x is None, both work:

>>> x = None
>>> print test01(x)
None
>>> print test02(x)
None

When x is, in fact, a tuple of two floats, both work:

>>> x = (2.0, 5.0)
>>> print test01(x)
2.5
>>> print test02(x)
2.5

Now... if x is an empty tuple:

>>> x = tuple()
>>> print test01(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in test01
IndexError: tuple index out of range
>>> print test02(x)
None


The first one, which checks "if x is not None", fails with an IndexError.  
This is a good thing.  

The second one, which just checks "if x" and is satisfied with any false
value, including an empty tuple, does not raise the error condition, even
though the data is bad.  This is a bad thing.





Re: [Tutor] evaluating AND

2007-09-13 Thread Rikard Bosnjakovic
On 14/09/2007, Terry Carroll [EMAIL PROTECTED] wrote:

 The second one, which just checks "if x" and is satisfied with any false
 value, including an empty tuple, does not raise the error condition, even
 though the data is bad.  This is a bad thing.

For me, "if x" would be enough. If you think it's a bad thing when x
is of the wrong data, then you really should check that it contains
*correct* data as well.

Using the two functions of yours, setting x to an integer:

>>> x = 2
>>> print test01(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/tmp/python-3716vZq", line 3, in test01
TypeError: unsubscriptable object
>>> print test02(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/tmp/python-3716vZq", line 8, in test02
TypeError: unsubscriptable object


Rewriting your test01-function into this:

def test01(x):
  if (type(x) == type((0,)) and
      (x is not None) and
      (len(x) == 2)):
    if x[0]>0:
      return x[1]/x[0]

and testing again:

>>> x = 2
>>> print test01(x)
None


My point is that if you think it's bad for a function to receive
incorrect data you should do exhaustive checks for the input to make
sure it is of correct type, size and whatever the requirements might
be, not just check if it's a tuple or not.

-- 
- Rikard - http://bos.hack.org/cv/


Re: [Tutor] evaluating AND

2007-09-13 Thread John Fouhy
On 14/09/2007, Rikard Bosnjakovic [EMAIL PROTECTED] wrote:
 On 14/09/2007, Terry Carroll [EMAIL PROTECTED] wrote:
  The second one, which just checks "if x" and is satisfied with any false
  value, including an empty tuple, does not raise the error condition, even
  though the data is bad.  This is a bad thing.
 My point is that if you think it's bad for a function to receive
 incorrect data you should do exhaustive checks for the input to make
 sure it is of correct type, size and whatever the requirements might
 be, not just check if it's a tuple or not.

I think Terry's viewpoint is that a function should raise an exception
if it receives bad data.  You don't need to do exhaustive
isinstance()/type() tests; just use the data as if it is what you're
expecting and let errors happen if the data is wrong.

What's important is that the function should not return valid output
when given invalid input.  That's what could happen with Terry's
hypothetical test02() if you give it an empty list.

OTOH, if you give either function an integer, both will raise
exceptions, which is fine.

You could change the function like this, though, to ensure the list
length is correct:

>>> def test01(x):
...  if x is not None:
...   assert len(x)==2
...   if x[0]>0:
...    return x[1]/x[0]
...
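
One caveat with the assert version: assert statements are skipped when Python runs with the -O option, so the length check disappears in optimised runs. A variant that raises explicitly might look like the sketch below (Python 3 assumed; the name ratio and the error message are made up for illustration).

```python
def ratio(x):
    """Return x[1] / x[0] when x[0] > 0; return None when x is None."""
    if x is None:
        return None
    if len(x) != 2:
        # Unlike assert, this check still runs under "python -O".
        raise ValueError("expected a pair, got %d items" % len(x))
    if x[0] > 0:
        return x[1] / x[0]
    return None


print(ratio(None))        # None
print(ratio((2.0, 5.0)))  # 2.5
```

An empty tuple now raises ValueError, and an integer still raises TypeError from len(), so bad input is never turned into a valid-looking result.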

-- 
John.