Re: [Tutor] changing char list to int list isn't working

2013-05-04 Thread Alan Gauld

On 04/05/13 05:13, Jim Mooney wrote:

I'm turning an integer into a string so I can make a list of separate
chars, then turn those chars back into individual ints,


You don't actually need to convert to chars; you could
use divmod to do it directly on the numbers:

>>> digits = []
>>> root = 455
>>> while root > 0:
...     root, n = divmod(root, 10)
...     digits.insert(0, n)
...
>>> digits
[4, 5, 5]

But I suspect the str() method is slightly faster...
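
For comparison, here is a minimal sketch of both approaches side by side. The function names digits_divmod and digits_str are illustrative and not from the thread, and the sketch makes no timing claim; something like timeit would be needed to check which one is actually faster.

```python
def digits_divmod(number):
    """Split a non-negative integer into its digits using arithmetic only."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        number, n = divmod(number, 10)  # n is the last decimal digit
        digits.insert(0, n)             # prepend to keep the original order
    return digits


def digits_str(number):
    """Split a non-negative integer into its digits via its string form."""
    return [int(char) for char in str(number)]


print(digits_divmod(455))
print(digits_str(455))
```

Both functions assume a non-negative integer; a negative number would need its sign handled separately.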

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] creating a corpus from a csv file

2013-05-04 Thread Peter Otten
Treder, Robert wrote:

 I'm very new to python and am trying to figure out how to make a corpus
 from a text file. I have a csv file (actually pipe '|' delimited) where
 each row corresponds to a different text document. Each row contains a
 communication note. Other columns correspond to categories of types of
 communications. I am able to read the csv file and print the notes column
 as follows:
  
 import csv
 with open('notes.txt', 'rb') as infile:
     reader = csv.reader(infile, delimiter = '|')
     i = 0
     for row in reader:
         if i <= 25: print row[8]
         i = i+1
 
 I would like to convert this to a categorized corpus with some of the
 other columns corresponding to the categories. All of the columns are text
 (i.e., strings). I have looked for documentation on how to use csv.reader
 with PlaintextCorpusReader but have been unsuccessful in finding an
 example similar to what I want to do. Can someone please help?

This mailing list is for learning Python. For problems with a specific 
library you should use the general python list 

http://mail.python.org/mailman/listinfo/python-list

or a forum dedicated to that library

http://groups.google.com/group/nltk-users

If you ask on a general forum you should give some context -- the name of 
the library would be the bare minimum.

The following comes with no warranties as I'm not an nltk user:

import csv
from nltk.corpus.reader.plaintext import CategorizedPlaintextCorpusReader
from itertools import islice, chain

LIMIT_SIZE = 25 # set to None if not debugging

def pairs(filename):
    """Generate (filename, list_of_categories) pairs from a csv file
    """
    with open(filename, "rb") as infile:
        rows = islice(csv.reader(infile, delimiter="|"), LIMIT_SIZE)
        for row in rows:
            # assume that columns 10 and above contain categories
            yield row[8], row[9:]

if __name__ == "__main__":
    import random
    FILENAME = "notes.txt"

    # assume that every filename occurs only once in the file
    file_to_categories = dict(pairs(FILENAME))

    files = list(file_to_categories)

    all_categories = set(
        chain.from_iterable(file_to_categories.itervalues()))

    reader = CategorizedPlaintextCorpusReader(".", files,
                                              cat_map=file_to_categories)

    # print words for a random category
    category = random.choice(list(all_categories))
    print "words for category {}:".format(category)
    print sorted(set(reader.words(categories=category)))
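
The row-limiting part of the code above can be tried on its own, without nltk. The following is only a sketch in Python 3 syntax: the SAMPLE data is made up, and pairs() here takes an open file object and a limit argument, which are assumptions made to keep the example self-contained.

```python
import csv
import io
from itertools import islice

# Made-up sample data: eight filler columns, the note, then category columns.
SAMPLE = (
    "a|b|c|d|e|f|g|h|first note|billing|phone\n"
    "a|b|c|d|e|f|g|h|second note|support\n"
    "a|b|c|d|e|f|g|h|third note|billing\n"
)


def pairs(infile, limit=None):
    """Yield (note, list_of_categories) pairs from a pipe-delimited file."""
    # islice stops after `limit` rows; limit=None means "read everything".
    rows = islice(csv.reader(infile, delimiter="|"), limit)
    for row in rows:
        yield row[8], row[9:]


result = list(pairs(io.StringIO(SAMPLE), limit=2))
print(result)
```

Using islice avoids the manual counter from the original question while still stopping after a fixed number of rows.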





[Tutor] Python internals

2013-05-04 Thread kartik sundarajan
Hi,

I am trying to learn how Python stores variables in memory. For ex:

my_var = 'test'

def func():
    pass

when I type dir() I get

['__builtins__', '__doc__', '__name__', '__package__', 'func', 'help',
'my_var']

are these variables stored in a dict and on calling dir() all the keys are
returned?
Or is it stored in a list or a heap?

Can anyone suggest a document I can read to help me understand
how the Python internals work?

Cheers
Kartik


Re: [Tutor] Python internals

2013-05-04 Thread Steven D'Aprano

On 04/05/13 23:04, kartik sundarajan wrote:

Hi,

I am trying to learn how Python stores variables in memory. For ex:

my_var = 'test'

def func():
    pass

when I type dir() I get

['__builtins__', '__doc__', '__name__', '__package__', 'func', 'help',
'my_var']

are these variables stored in a dict and on calling dir() all the keys are
returned?
Or is it stored in a list or a heap?


Python objects are dynamically allocated in the heap.

Python variables are not variables in the C or Pascal sense, they are name 
bindings. When you do this:

my_var = 'test'


Python does the following:

- create a string object 'test'

- create a string object, 'my_var'

- use 'my_var' as a key in the current namespace, with value 'test'.
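
The name binding described above can be observed directly, because a module-level namespace in CPython is an ordinary dict. This is a minimal sketch in Python 3 syntax; the dict called namespace is illustrative and stands in for "the current namespace".

```python
namespace = {}
# Run the assignment with `namespace` acting as its global namespace.
exec("my_var = 'test'", namespace)

print(namespace["my_var"])      # the name is a key, the string object is the value
print("my_var" in namespace)
```

exec also adds a '__builtins__' entry to the dict, which is one of the dunder names that show up in the dir() listing.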



Creating a function is a little more complicated, but the simplified version 
goes like this:

- create a string object 'func'

- compile the body of the function into a code object

- create a new function object named 'func' from the code object

- use 'func' as a key in the current namespace, with the function object as the 
value.
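
As a rough illustration of those steps (a sketch in Python 3 syntax, not a description of what the interpreter does internally), the def statement can be compiled and executed by hand and the resulting objects inspected. The names source, code and namespace are illustrative.

```python
import types

source = "def func():\n    pass\n"
code = compile(source, "<sketch>", "exec")    # compile the def statement itself
namespace = {}
exec(code, namespace)                         # running it creates and binds the function

func = namespace["func"]                      # 'func' is now a key in the namespace
print(type(func) is types.FunctionType)
print(type(func.__code__) is types.CodeType)  # the compiled function body
print(func.__name__)
```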



When you call dir(), by default it looks at the current namespace. The dunder 
names shown (Double leading and trailing UNDERscore) have special meaning to Python; the 
others are objects you have added.


The documentation for dir says:


py> help(dir)

Help on built-in function dir in module __builtin__:

dir(...)
    dir([object]) -> list of strings

    If called without an argument, return the names in the current scope.
    Else, return an alphabetized list of names comprising (some of) the attributes
    of the given object, and of attributes reachable from it.
    If the object supplies a method named __dir__, it will be used; otherwise
    the default dir() logic is used and returns:
      for a module object: the module's attributes.
      for a class object:  its attributes, and recursively the attributes
        of its bases.
      for any other object: its attributes, its class's attributes, and
        recursively the attributes of its class's base classes.
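
Both behaviours from the help text can be checked with a short sketch (Python 3 syntax). The namespace dict and the Custom class are made up for illustration.

```python
namespace = {}
# Inside exec, dir() with no argument lists the names bound in `namespace`.
exec("my_var = 'test'\ndef func():\n    pass\nnames = dir()", namespace)
print(namespace["names"])


class Custom(object):
    def __dir__(self):
        # dir() uses this method when it is defined, then sorts the result.
        return ["b", "a"]


print(dir(Custom()))
```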




Can anyone suggest a document I can read to help me understand
how the Python internals work?


The Python docs are a good place to start.

http://docs.python.org/3/index.html


Especially:

http://docs.python.org/3/reference/datamodel.html

http://docs.python.org/3/reference/executionmodel.html



--
Steven


Re: [Tutor] changing char list to int list isn't working

2013-05-04 Thread Mitya Sirenef

On 05/04/2013 12:13 AM, Jim Mooney wrote:

for num in listOfNumChars:
    num = int(num)


It seems like people learning Python run into this very often.

I think the reason is that in most simple cases, it's easier and more
intuitive to think that the name IS the object:

x = 1
y = 2
print x + y

Even though I know it's not a precise description, when I see this code,
I think of it as "x is 1, y is 2, print x plus y". And you do get the
expected result, which reinforces this intuition.

Of course, a more precise way to think is:

 name 'x' is assigned to object with value=1
 name 'y' is assigned to object with value=2
 sum values that currently have assigned names of 'x' and 'y'

Therefore, what you are really doing is:

for each object in listOfNumChars:
    assign name 'num' to object (this is done automatically by the loop)
    assign name 'num' to int(value that has currently assigned name 'num')
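
The effect can be seen in a small sketch (Python 3 syntax; the variable names are illustrative). Rebinding the loop name leaves the list untouched, so one possible fix is to build a new list instead.

```python
list_of_num_chars = list(str(455))

for num in list_of_num_chars:
    num = int(num)        # rebinds the name 'num' only; the list is not modified

print(list_of_num_chars)  # still a list of one-character strings

# Build a new list of ints instead of rebinding the loop variable.
list_of_ints = [int(char) for char in list_of_num_chars]
print(list_of_ints)
```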


 -m



--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

Oaths are the fossils of piety.  George Santayana



Re: [Tutor] urllib2 and tests

2013-05-04 Thread Steven D'Aprano

On 05/05/13 13:27, RJ Ewing wrote:

When I run the following test.py, I get the following error:

[...]

If I run the fetch_file function outside of the test, it works fine. Any
ideas?


The code you are actually running, and the code you say you are running below, are 
different. Your error message refers to a file test_filefetcher.py, not the Test.py you 
show us. As given, Test.py cannot possibly work, since it doesn't define 
filefetcher. I can only guess that this is meant to be the module you are 
trying to test, but since you don't show us what is in that module, I can only guess what 
it contains.


More comments below:



ERROR: test_fetch_file (__main__.TestFileFetcher)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_filefetcher.py", line 12, in test_fetch_file
    fetched_file = filefetcher.fetch_file(URL)


What's filefetcher? I'm guessing it's the module you are testing, which is 
consistent with the next line showing the file name filefetcher.py:



  File "/Users/rjewing/Documents/Work/filefetcher.py", line 7, in fetch_file
    return urllib2.urlopen(url).read()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 392, in open
    protocol = req.get_type()
AttributeError: 'TestFileFetcher' object has no attribute 'get_type'



Somehow, your test suite, the TestFileFetcher object, is being passed down into 
the urllib2 library. I can only guess that somehow url is not an actual URL. I 
suggest you add a line:

print(url, type(url))

just before the failing line, and see what it prints.



--

Test.py:


This cannot be the actual test suite you are running, since it cannot run as 
shown. It doesn't import unittest or the module to be tested.



class TestFileFetcher(unittest.TestCase):

    def test_fetch_file(URL):
        phrase = 'position = support-intern'

        fetched_file = filefetcher.fetch_file(URL)


And here's your error! Just as I thought, URL is not what you think it is, it 
is the TestFileFetcher instance.
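
A tiny sketch shows the mechanism without unittest or urllib2 (Python 3 syntax; the Demo class is made up for illustration). Whatever the first parameter of a method is called, it receives the instance.

```python
class Demo(object):
    def show(URL):          # deliberately misnamed first parameter
        return URL          # whatever Python passed in as the first argument


demo = Demo()
received = demo.show()      # no explicit argument is supplied
print(received is demo)     # the "URL" parameter is really the instance
```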

Unittest cases do not take arguments. Since they are methods, they are always defined 
with a single argument, conventionally called self, representing the instance 
that the method is called on. So normally you would define a method like this:

def test_fetch_file(self, url):

which then takes a single *implicit* argument self, provided by Python, plus a second 
*explicit* argument, url. But because this is a test method, the unittest framework 
does not expect to pass an argument to the method, so you have to write it like this:

def test_fetch_file(self):

and get the url some other way.

One common way would be to define an attribute on the test, and store the URL 
in that:

class TestFileFetcher(unittest.TestCase):
    URL = "some_url_goes_here"  # FIX THIS

    def test_fetch_file(self):
        phrase = 'position = support-intern'
        fetched_file = filefetcher.fetch_file(self.URL)
        ...
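
To make the pattern runnable without the real filefetcher module or any network access, here is a sketch in Python 3 syntax. The fetch_file function below is a hypothetical stand-in for filefetcher.fetch_file, and the URL is a placeholder.

```python
import io
import unittest


def fetch_file(url):
    # Hypothetical stand-in for filefetcher.fetch_file; no network access.
    return "contents of " + url


class TestFileFetcher(unittest.TestCase):
    URL = "http://example.com/file.txt"  # placeholder, stored on the class

    def test_fetch_file(self):
        # The URL is read from the instance rather than passed as an argument.
        fetched_file = fetch_file(self.URL)
        self.assertEqual(fetched_file, "contents of " + self.URL)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFileFetcher)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, result.wasSuccessful())
```

The test is loaded and run explicitly here only so the outcome can be inspected; in a real test file, unittest.main() would normally do that.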




        unittest.assertIsNone(fetched_file,
                              'The file was not fetched correctly')


This part of the test seems to be wrong to me. It says:

compare the value of fetched_file to None; if it is None, the test passes; if it is 
some other value, the test fails with error message 'The file was not fetched 
correctly'

But then you immediately go on to use fetched_file:


        text = filefetcher.add_phrase(fetched_file)


but if the above assertIsNone test passed, then fetched_file is None so this is 
equivalent to:

text = filefetcher.add_phrase(None)


which surely isn't right?



        unittest.assertNotIn(phrase, text, 'The phrase is not in the file')


This test also appears backwards. You're testing:

check whether phrase is NOT in text; if it is NOT in, then the test passes; 
otherwise, if it IS in, then fail with an error message 'The phrase is not in the 
file'

which is clearly wrong. The message should be:

'The phrase is in the file'


since your test is checking that it isn't in.
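
If the intent was the opposite, that is, to check that something was fetched and that the phrase does end up in the text, the assertions would be assertIsNotNone and assertIn, and the original messages would then match. That intent is an assumption. The sketch below (Python 3 syntax) uses made-up strings in place of the real fetch_file and add_phrase results, which the thread does not show. Note that the assert methods are called on self; the unittest module itself has no assertIsNone function.

```python
import io
import unittest


class TestAssertDirection(unittest.TestCase):
    def test_fetched_and_phrase_present(self):
        phrase = 'position = support-intern'
        fetched_file = 'name = example\n'        # made-up stand-in for fetched text
        text = fetched_file + phrase + '\n'      # made-up stand-in for add_phrase()

        # Fails with the message if fetched_file IS None.
        self.assertIsNotNone(fetched_file, 'The file was not fetched correctly')
        # Fails with the message if phrase is NOT in text.
        self.assertIn(phrase, text, 'The phrase is not in the file')


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAssertDirection)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```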



--
Steven


Re: [Tutor] urllib2 and tests

2013-05-04 Thread RJ Ewing
Thank you, I figured out what the problem was. I was passing URL into
the test_fetch_file function instead of self. URL was a global. I did get
the asserts mixed up; they were the opposite of what I wanted. Sorry I
didn't include the whole test.py file for reference.

Thanks again

