On 6/1/2024 4:04 AM, Peter J. Holzer via Python-list wrote:
On 2024-05-30 19:26:37 -0700, HenHanna via Python-list wrote:
hard to decide what to do with hyphens
                and apostrophes
              (I'd,  he's,  can't, haven't,  A's  and  B's)

Especially since the same character is used as both an apostrophe and a
closing quotation mark. And while that's pretty unambiguous between two
characters, it isn't at the end of a word:

     This is Alex’ house.
     This type of building is called an ‘Alex’ house.
     The sentence ‘We are meeting at Alex’ house’ contains an apostrophe.

(using proper Unicode quotation marks. It gets worse if you stick to
ASCII.)
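
A minimal sketch of the "unambiguous between two characters" case, assuming a regex-based tokenizer is acceptable. The names WORD and words are hypothetical, not from the thread. An apostrophe (ASCII ' or U+2019) is kept only when it has a letter on both sides; a word-final one is dropped, because it cannot be told apart from a closing quotation mark.

```python
import re

# One or more letters, optionally followed by groups of
# (apostrophe + letters). [^\W\d_] matches any Unicode letter.
WORD = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")

def words(text):
    """Return the words in text, keeping only word-internal apostrophes."""
    return WORD.findall(text)
```

With this rule "can't" and "haven’t" survive intact, while "Alex’ house" and "‘Alex’ house" both yield "Alex" and "house": the ambiguity is sidestepped by losing the trailing apostrophe, not resolved.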

Personally I like to use U+0027 APOSTROPHE as an apostrophe and U+2018
LEFT SINGLE QUOTATION MARK and U+2019 RIGHT SINGLE QUOTATION MARK as
single quotation marks[1], but despite the suggestive names, this is not
the common typographical convention, so your texts are unlikely to make
this distinction.

         hp

[1] Which I use rarely, anyway.

My usual approach is to replace punctuation by spaces and then to discard anything remaining that is only one character long (or sometimes two, depending on what I'm working on). Yes, OK, I will miss words like "I". Usually I don't care about them. Make exceptions to the policy if you like.
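
The approach above could be sketched roughly as follows. This is only an illustration: the function name rough_words and the min_len parameter are made up here, and adding the curly quotation marks to the punctuation set is an assumption, since string.punctuation covers ASCII only.

```python
import string

# Characters treated as punctuation: ASCII punctuation plus curly quotes
# (the curly quotes are an assumption, not part of string.punctuation).
PUNCT = string.punctuation + "\u2018\u2019\u201c\u201d"

# Translation table that maps every punctuation character to a space.
TABLE = str.maketrans({ch: " " for ch in PUNCT})

def rough_words(text, min_len=2):
    """Replace punctuation with spaces, then drop tokens shorter than min_len."""
    return [w for w in text.translate(TABLE).split() if len(w) >= min_len]
```

With the default min_len=2, "I'd" and "can't" leave behind only "can" (the fragments "I", "d" and "t" are discarded), and a hyphenated word such as "well-known" becomes two words. Pass min_len=3 to also discard two-character leftovers.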

--
https://mail.python.org/mailman/listinfo/python-list
