Re: Python library to break text into words

2018-05-31 Thread Abdur-Rahmaan Janhangeer
Dietmar's answer is the best: piggybacking on search engines' algorithms. Probably, instead of a dictionary of English words, we'd need a dictionary of titles, making the search much more efficient.

Regards,
Abdur-Rahmaan Janhangeer
https://github.com/Abdur-rahmaanJ

No need to re-invent the wheel:

Re: Python library to break text into words

2018-05-31 Thread Abdur-Rahmaan Janhangeer
1-> search in dict, identify all words
example: meaningsofoffers
identified words: me an mean in meaning meanings so of of offer offers

2-> next, filter duplicates (i.e. "of" above) into a new list, as the original list serves as chronological reference

3-> next, choose the words whose lengths mak
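
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration: the names `TOY_DICT`, `find_words` and `drop_duplicates` are made up here, and the tiny word set stands in for a real dictionary. Step 3 is cut off in this message, so it is not attempted.

```python
# Hypothetical toy dictionary; a real run would load a proper word list.
TOY_DICT = {"me", "an", "mean", "in", "meaning", "meanings",
            "so", "of", "offer", "offers"}


def find_words(text, dictionary):
    """Step 1: collect every substring of `text` that is a dictionary word.

    Substrings are visited by end position, shortest first, so the result
    follows the left-to-right order of the input string.
    """
    found = []
    for end in range(1, len(text) + 1):
        for start in range(end - 1, -1, -1):
            piece = text[start:end]
            if piece in dictionary:
                found.append(piece)
    return found


def drop_duplicates(words):
    """Step 2: remove repeats into a new list, keeping first-seen order."""
    return list(dict.fromkeys(words))


all_words = find_words("meaningsofoffers", TOY_DICT)
unique_words = drop_duplicates(all_words)
print(all_words)
print(unique_words)
```

The original list is left untouched, so it can still serve as the chronological reference mentioned in step 2.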

Re: Python library to break text into words

2018-05-31 Thread beliavsky--- via Python-list
On Thursday, May 31, 2018 at 5:31:48 PM UTC-4, Dietmar Schwertberger wrote:
> On 5/31/2018 10:26 PM, beliavsky--- via Python-list wrote:
> > Is there a Python library that uses intelligent guesses to break sequences
> > of characters into words? The general strategy would be to break strings
> >

Re: Python library to break text into words

2018-05-31 Thread Chris Angelico
On Fri, Jun 1, 2018 at 7:09 AM, Dietmar Schwertberger wrote:
> On 5/31/2018 10:26 PM, beliavsky--- via Python-list wrote:
>>
>> Is there a Python library that uses intelligent guesses to break sequences
>> of characters into words? The general strategy would be to break strings
>> into the longest

Re: Python library to break text into words

2018-05-31 Thread Dietmar Schwertberger
On 5/31/2018 10:26 PM, beliavsky--- via Python-list wrote: Is there a Python library that uses intelligent guesses to break sequences of characters into words? The general strategy would be to break strings into the longest words possible. The library would need to "know" a sizable subset of wo
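
One way to read "break strings into the longest words possible" is to prefer the split that uses the fewest dictionary words. A minimal sketch of that reading is below. It is not an existing library: `TOY_VOCABULARY`, `segment` and `underscore_name` are hypothetical names, the word set is a stand-in for a real word list, and the run-together input names are assumed, since the original file names are not shown here.

```python
import os

# Hypothetical toy vocabulary; a real run would need a sizable word list.
TOY_VOCABULARY = {"a", "an", "new", "chaos", "making", "science",
                  "atomic", "accidents", "devils", "chaplain"}


def segment(text, vocabulary):
    """Split `text` into vocabulary words, using as few words as possible.

    Returns a list of words, or None if no complete split exists.
    """
    # best[i] is the shortest known word list covering text[:i], or None.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            if best[start] is None:
                continue
            piece = text[start:end]
            if piece in vocabulary:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(text)]


def underscore_name(filename, vocabulary):
    """Join the words of the file stem with underscores, keeping the extension."""
    stem, ext = os.path.splitext(filename)
    words = segment(stem.lower(), vocabulary)
    if words is None:
        return filename  # leave names that cannot be split untouched
    return "_".join(words) + ext


print(underscore_name("chaosmakinganewscience.pdf", TOY_VOCABULARY))
```

A purely greedy longest-match would take "an" after "making" in this example and then get stuck on "ewscience", which is why the sketch keeps track of every reachable split point instead.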

Re: Python library to break text into words

2018-05-31 Thread Chris Angelico
On Fri, Jun 1, 2018 at 6:26 AM, beliavsky--- via Python-list wrote:
> I bought some e-books in a Humble Bundle. The file names are shown below. I
> would like to hyphenate words within the file names, so that the first three
> titles are
>
> a_devils_chaplain.pdf
> atomic_accidents.pdf
> chaos_m

Python library to break text into words

2018-05-31 Thread beliavsky--- via Python-list
I bought some e-books in a Humble Bundle. The file names are shown below. I would like to hyphenate words within the file names, so that the first three titles are

a_devils_chaplain.pdf
atomic_accidents.pdf
chaos_making_a_new_science.pdf

Is there a Python library that uses intelligent guesses t