On 03/05/13 21:48, Treder, Robert wrote:

> I'm very new to python and am trying to figure out how to
> make a corpus from a text file.

Hi, I for one have no idea what a corpus is or looks like
so you will need to help us out a little before we can help you.

> I have a csv file (actually pipe '|' delimited) where each
> row corresponds to a different text document.

> Each row contains a communication note.
> Other columns correspond to categories of types of communications.

> I am able to read the csv file and print the notes column as follows:

> import csv
> with open('notes.txt', 'rb') as infile:
>      reader = csv.reader(infile, delimiter = '|')
>      i = 0
>      for row in reader:
>      if i <= 25: print row[8]
>      i = i+1

You don't need to manually manage 'i'.

You could let enumerate() do the counting instead:

with open('notes.txt', 'rb') as infile:
     reader = csv.reader(infile, delimiter = '|')
     for count, row in enumerate(reader):
         if count <= 25: print row[8]  # I assume indented?
         else: break                   # save time if it's a big file
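
Another way to stop early, if you prefer not to count at all, is
itertools.islice. This is only a sketch, written in Python 3 syntax
rather than the Python 2 used above. The function name first_notes,
the default column index 8 and the limit of 26 rows are taken from or
modelled on the snippet above, and the sample data is made up:

```python
import csv
import io
from itertools import islice

def first_notes(lines, column=8, limit=26, delimiter='|'):
    """Return one column from at most `limit` rows of delimited text."""
    reader = csv.reader(lines, delimiter=delimiter)
    # islice stops pulling rows from the reader once `limit` is reached,
    # so the rest of a big file is never read.
    return [row[column] for row in islice(reader, limit)]

# In-memory stand-in for notes.txt (made-up data, 9 columns per row).
sample = io.StringIO("a|b|c|d|e|f|g|h|first note\n"
                     "a|b|c|d|e|f|g|h|second note\n")
print(first_notes(sample))
```

With a real file in Python 3 you would pass open('notes.txt', newline='')
instead of the StringIO object.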

> I would like to convert this to a categorized corpus with
> some of the other columns corresponding to the categories.

You might be able to use a dictionary but for now
I'm still not clear what you mean. Can you show us
some sample input and output data?
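
If "categorized" just means notes grouped under a category label, a
dictionary of lists might be enough. Here is a rough sketch in Python 3
syntax. I am guessing that the category lives in column 0 and the note
in column 8; the function name group_notes and the sample data are made
up for illustration:

```python
import csv
import io
from collections import defaultdict

def group_notes(lines, note_col=8, category_col=0, delimiter='|'):
    """Map each category value to the list of notes filed under it."""
    reader = csv.reader(lines, delimiter=delimiter)
    notes_by_category = defaultdict(list)
    for row in reader:
        # category_col=0 is an assumption; adjust to the real layout.
        notes_by_category[row[category_col]].append(row[note_col])
    return dict(notes_by_category)

# In-memory stand-in for notes.txt (made-up data, 9 columns per row).
sample = io.StringIO("email|b|c|d|e|f|g|h|Asked about invoice\n"
                     "phone|b|c|d|e|f|g|h|Left a voicemail\n"
                     "email|b|c|d|e|f|g|h|Sent follow-up\n")
for category, notes in group_notes(sample).items():
    print(category, notes)
```

Whether that counts as a corpus for your purposes depends on what the
tool that consumes it expects, which is why sample data would help.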

> documentation on how to use csv.reader with PlaintextCorpusReader

Never heard of the latter. Is it an external module?

HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor