On May 4, 2014 8:31 PM, "Jake Blank" <jakenbl...@gmail.com> wrote:
>
> Hi,
> So I'm doing a problem on the Alice_in_wonderland.txt where I have to write a 
> program that reads a piece of text from a file specified by the user, counts 
> the number of occurrences of each word, and writes a sorted list of words and 
> their counts to an output file. The list of words should be sorted based on 
> the counts, so that the most popular words appear at the top. Words with the 
> same counts should be sorted alphabetically.
>
> My code right now is
>
> word_count = {}
> file = open ('alice_in_wonderland.txt', 'r')
> full_text = file.read().replace('--',' ')
> full_text_words = full_text.split()
>
> for words in full_text_words:
>         stripped_words = words.strip(".,!?'`\"- ();:")
>         try:
>             word_count[stripped_words] += 1
>         except KeyError:
>             word_count[stripped_words] = 1
>
> ordered_keys = word_count.keys()
> sorted(ordered_keys)
> print ("All the words and their frequency in", 'alice in wonderland')
> for k in ordered_keys:
>     print (k, word_count[k])
>
> The Output here is just all of the words in the document NOT SORTED by amount 
> of occurrence.
> I need help sorting this output of words in the Alice_in_wonderland.txt, as 
> well as help asking the user for the input information about the files.
>
> If anyone could give me some guidance you will really be helping me out.
>
> Please and Thank you

Hi Jake,

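You are sorting the dictionary keys by the keys themselves, whereas
what you want is the keys sorted by their associated values. Note also
that sorted() returns a new list rather than sorting in place, so its
result has to be assigned to a name before you loop over it.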

Look at the key parameter in
https://docs.python.org/3.4/library/functions.html#sorted.

To get you started, here is an example in the vicinity:

>>> data = ['abiab', 'cdocd', 'efaef', 'ghbgh']
>>> sorted(data)
['abiab', 'cdocd', 'efaef', 'ghbgh']
>>> sorted(data, key=lambda x:x[2])
['efaef', 'ghbgh', 'abiab', 'cdocd']
>>> def get_third(x): return x[2]
...
>>> sorted(data, key=get_third)
['efaef', 'ghbgh', 'abiab', 'cdocd']
>>>

In case the lambda version is confusing, it is simply a way of doing
the get_third version without having to create a function outside of
the context of the sorted expression.
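
The same key idea carries over to a dictionary. What follows is only a
minimal sketch under stated assumptions: the word_count contents are
made-up toy data (not counts from the Alice text), and the names
by_count and by_count_then_word are hypothetical. A key function can
also return a tuple, which is one way to sort on the count first and
the word second.

```python
# Toy data for illustration only; these counts are invented.
word_count = {'rabbit': 3, 'alice': 7, 'hatter': 3, 'queen': 5}

# Iterating over a dict yields its keys, so sorted() receives the words.
# The key function looks up each word's count, giving lowest count first.
by_count = sorted(word_count, key=lambda word: word_count[word])

# A tuple key compares element by element: the negated count puts the
# highest counts first, and the word itself breaks ties alphabetically.
by_count_then_word = sorted(
    word_count, key=lambda word: (-word_count[word], word))

for word in by_count_then_word:
    print(word, word_count[word])
```

Negating the count only works because counts are numbers; for other
kinds of values, sorted() also accepts reverse=True.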

If that sorts you, great. If not, please do ask a follow-up. (I was
trying not to do it for you, but also not to frustrate you by giving
too little of a push.)

Best,

Brian vdB
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
