On 02.12.2010 02:51, Dana wrote:
Hello,

I'm using Python to extract words from plain text files. I have a list
of words. Now I would like to convert that list to a dictionary of
features where the key is the word and the value the number of
occurrences in a group of files based on the filename (different files
correspond to different categories). What is the best way to represent
this data? When I finish I expect to have about 70 unique dictionaries
with values I plan to use in frequency distributions, etc. Should I use
globally defined dictionaries?
That depends on what else you want to do with the group of files. If you expect to perform operations on a group's data, create a class so that you can add methods alongside the data. I would probably go with a class.

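class FileGroup(object):

    def __init__(self, filenames):
        self.filenames = filenames
        self.word_to_occurrences = {}
        self._populate_word_to_occurrences()

    def _populate_word_to_occurrences(self):
        for filename in self.filenames:
            with open(filename) as fi:
                # Count each whitespace-separated word; replace this
                # with your own processing as needed.
                for word in fi.read().split():
                    count = self.word_to_occurrences.get(word, 0)
                    self.word_to_occurrences[word] = count + 1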

Now you could add other meaningful data and methods to a group of files.
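
As an illustration of what such an extra method might look like, here is a minimal sketch. The class name WordCounts and the methods add_text, total and relative_frequency are made up for this example, and it assumes that words are whitespace-separated and that case does not matter. It takes text directly instead of filenames so that it can be tried without any files on disk.

```python
from collections import Counter


class WordCounts(object):
    """Word counts for one group of texts (hypothetical helper)."""

    def __init__(self):
        self.word_to_occurrences = Counter()

    def add_text(self, text):
        # Assumption: words are whitespace-separated and case-insensitive.
        self.word_to_occurrences.update(text.lower().split())

    def total(self):
        # Total number of word tokens seen so far.
        return sum(self.word_to_occurrences.values())

    def relative_frequency(self, word):
        # Share of all tokens that are `word`; 0.0 for an empty group.
        total = self.total()
        if total == 0:
            return 0.0
        return self.word_to_occurrences[word.lower()] / float(total)


group = WordCounts()
group.add_text("the cat saw the dog")
print(group.word_to_occurrences["the"])   # 2
print(group.relative_frequency("the"))    # 0.4
```

The point is only that the counts and the operations on them live in one place; which methods are worth adding depends on what you do with the frequency distributions later.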

Plain dictionaries can also be fine if you really only need the dicts. In that case you could write a function that creates them.

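def create_word_to_occurrences(filenames):
    word_to_occurrences = {}
    for filename in filenames:
        with open(filename) as fi:
            # Count each whitespace-separated word; replace this
            # with your own processing as needed.
            for word in fi.read().split():
                word_to_occurrences[word] = word_to_occurrences.get(word, 0) + 1
    return word_to_occurrences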

But as I said, if in doubt I would go for the class.
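
On the question of globally defined dictionaries: one possible alternative to about 70 separate global dicts is a single dictionary keyed by category, with one word counter per category. The sketch below is only a suggestion. The function names category_of and count_words_by_category are made up, and it assumes the category is the part of the file name before the first underscore (for example "sports_01.txt" belongs to "sports"); adjust category_of to however your files are actually named.

```python
import os
from collections import Counter, defaultdict


def category_of(filename):
    # Assumption: the category is the part of the base name before the
    # first underscore, e.g. "sports_01.txt" -> "sports".
    return os.path.basename(filename).split("_")[0]


def count_words_by_category(filenames):
    # One Counter per category, created the first time it is needed.
    counts = defaultdict(Counter)
    for filename in filenames:
        category = category_of(filename)
        with open(filename) as fi:
            for line in fi:
                # Assumption: words are whitespace-separated, case ignored.
                counts[category].update(line.lower().split())
    return counts
```

With this, counts["sports"]["goal"] would give the number of occurrences of "goal" across all files in the "sports" category, and the whole structure can be passed around as one object instead of living in module-level globals.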


Dana
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
