On 2024-06-04, Edward Teach via Python-list <python-list@python.org> wrote:
> On Mon, 03 Jun 2024 14:58:26 -0400 (EDT)
> Grant Edwards <grant.b.edwa...@gmail.com> wrote:
>
>> On 2024-06-03, Edward Teach via Python-list <python-list@python.org>
>> wrote:
>> 
>> > The Gutenberg Project publishes "plain text".  That's another
>> > problem, because "plain text" means UTF-8....and that means
>> > unicode...and that means running some sort of unicode-to-ascii
>> > conversion in order to get something like "words".  A couple of
>> > hours....a couple of hundred lines of C....problem solved!  
>> 
>> I'm curious.  Why does it need to be converted from Unicode to ASCII?
>> 
>> When you read it into Python, it gets converted right back to
>> Unicode...

> Well.....when using the file linux.words as a useful master list of
> "words".....linux.words is strict ASCII........

I guess I missed the part of the problem description where it said to
use linux.words to decide what a word is. :)
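
For what it's worth, if the goal really is to compare against an ASCII
list such as linux.words, the folding can be sketched in a few lines of
Python. This is only a rough sketch, not the original poster's C code:
the names to_ascii_words and unknown_words are made up for illustration,
and the regular expression's idea of a "word" is an assumption.

```python
import re
import unicodedata


def to_ascii_words(text):
    """Fold text to ASCII and split it into lowercase words (illustrative)."""
    # NFKD splits accented letters into base letter + combining mark,
    # e.g. "é" becomes "e" followed by U+0301.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop every character that has no ASCII encoding (the combining
    # marks, plus anything else outside ASCII).
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Assumed definition of a word: runs of letters, optionally joined
    # by ASCII apostrophes ("don't").
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", ascii_text.lower())


def unknown_words(text, known):
    """Return the folded words from text that are not in the set known."""
    return [w for w in to_ascii_words(text) if w not in known]
```

Here known would be a set built from the word list, for example by
reading linux.words line by line and lowercasing each entry. Characters
with no decomposition (typographic quotes, for instance) are simply
dropped, which may or may not be what you want.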

--
Grant


