On 31/05/24 14:26, HenHanna via Python-list wrote:
On 5/30/2024 2:18 PM, dn wrote:
On 31/05/24 08:03, HenHanna via Python-list wrote:

Given a text file of a novel (JoyceUlysses.txt) ...

could someone give me a pretty fast (and simple) Python program that'd give me a list of all words occurring exactly once?

               -- Also, a list of words occurring once, twice or 3 times



re: hyphenated words        (you can treat them any way you like)

        but ideally, I'd treat  [editor-in-chief]
                                [go-ahead]  [pen-knife]
                                [know-how]  [far-fetched] ...
        as one unit.



Split into words - defined as you will.
Use Counter.
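
A minimal sketch of that suggestion, not a definitive answer. It assumes a "word" is a run of ASCII letters, optionally joined by internal hyphens or apostrophes, so that editor-in-chief and can't each count as one unit. The names WORD_RE, count_words and words_occurring are made up for illustration.

```python
import re
from collections import Counter

# Assumption: a "word" is letters, optionally joined by internal
# hyphens or apostrophes (editor-in-chief, can't). Other definitions
# are just as defensible and will give different results.
WORD_RE = re.compile(r"[a-z]+(?:['-][a-z]+)*")

def count_words(text):
    """Return a Counter of the lowercased words found in text."""
    return Counter(WORD_RE.findall(text.lower()))

def words_occurring(counts, lo=1, hi=1):
    """Return the sorted words whose count lies between lo and hi inclusive."""
    return sorted(w for w, n in counts.items() if lo <= n <= hi)

sample = "The go-ahead came. The editor-in-chief can't go. Go ahead, go!"
counts = count_words(sample)
print(words_occurring(counts))        # words occurring exactly once
print(words_occurring(counts, 1, 3))  # words occurring once, twice or 3 times
```

For the novel itself, the sample string would be replaced by the contents of JoyceUlysses.txt, read with whatever encoding the file actually uses. Note that this pattern does not match typographic (curly) apostrophes or accented letters, which a real text may well contain.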

Show some (of your) code and we'll be happy to critique...


hard to decide what to do with hyphens
                and apostrophes
              (I'd,  he's,  can't, haven't,  A's  and  B's)


2-step-Process

           1. make a file listing all words (one word per line)

           2.  then do the counting, using
                               from collections import Counter
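
The two steps could be sketched roughly as follows. This is only an illustration: the word pattern is an assumed definition (letters with internal hyphens or apostrophes), and the function names and the words.txt file name are hypothetical.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path

# Assumed word definition; change it and the word list changes too.
WORD_RE = re.compile(r"[a-z]+(?:['-][a-z]+)*")

def write_word_list(text, out_path):
    """Step 1: write every word of text to out_path, one word per line."""
    words = WORD_RE.findall(text.lower())
    Path(out_path).write_text("\n".join(words) + "\n", encoding="utf-8")

def count_word_list(path):
    """Step 2: count the non-empty lines of a one-word-per-line file."""
    with open(path, encoding="utf-8") as fh:
        return Counter(line.strip() for line in fh if line.strip())

# A temporary directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as tmp:
    word_file = Path(tmp) / "words.txt"  # hypothetical file name
    write_word_list("A's and B's; he's here, he's gone.", word_file)
    counts = count_word_list(word_file)

print(counts.most_common())
```

The intermediate file makes it easy to eyeball what the splitter decided before any counting happens, which is where most of the surprises are.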


Apologies for lateness - only just able to come back to this.

This issue is not Python, and is not solved by code!

If you/your teacher can't define a "word", the code, any code, will almost certainly be wrong!
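
To illustrate (a sketch only, with two made-up definitions, neither of which is "the" correct one): the same sentence yields different once-only lists depending on how a word is defined.

```python
import re
from collections import Counter

# Two hypothetical definitions of a "word".
LETTERS_ONLY = re.compile(r"[a-z]+")                 # can't -> can, t
WITH_JOINERS = re.compile(r"[a-z]+(?:['-][a-z]+)*")  # can't stays whole

def once_only(text, pattern):
    """Return the sorted words that occur exactly once under pattern."""
    counts = Counter(pattern.findall(text.lower()))
    return sorted(w for w, n in counts.items() if n == 1)

text = "I can't go. The go-ahead: can he go?"
print(once_only(text, LETTERS_ONLY))  # ['ahead', 'he', 'i', 't', 'the']
print(once_only(text, WITH_JOINERS))  # ['can', "can't", 'go-ahead', 'he', 'i', 'the']
```

Both runs are "correct" for their own definition, and no test can say which list the requester actually wanted.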


One of the interesting aspects of our work is that we can write all manner of tests to try to ensure that the code is correct: unit tests, integration tests, system tests, acceptance tests, eye-tests, ...

However, there is no such thing as a test (or proof) that a statement of requirements is complete or correct!
(nor for any of the other earlier stages of the full project life-cycle)

As coders we need to learn to require clear specifications and not attempt to read-between-the-lines, use our initiative, or otherwise 'not bother the ...'. When there is ambiguity, we should go back to the user/client/boss and seek clarification. They are the domain/subject-matter experts...

I'm reminded of a cartoon, possibly from some IBM source, first seen in black-and-white but here in living-color: https://www.monolithic.org/blogs/presidents-sphere/what-the-customer-really-wants

That has been the sad history of programming and development projects, wherein we are blamed for every shortcoming because no one else understands the nuances of the work.

If we don't insist on clarity, are we our own worst enemy?


--
Regards,
=dn
--
https://mail.python.org/mailman/listinfo/python-list
