Hello All

 

I have a little puzzle to disturb your Sunday lunch, maybe. I have been
scraping text data from web pages, which often comes with redundant space
before or after. I routinely use 'trim' on the final string output, but I
have found cases where there are still redundant spaces. Inspecting the
results, I find that the characters are non-break spaces (codepoint 160,
Unicode U+00A0). Looking at the code, String>>#trim depends on
Character>>#isSeparator, which does not answer true for a non-break space. I
can use trimBoth: [:char| char asInteger = 160] to remove the redundant
spaces if I know where to expect them, so it is not a major problem. But the
question remains: should the non-break space be included in the list of
separators in Character>>#isSeparator?
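
For illustration, here is a minimal sketch of the workaround, widened so that it
also copes with ordinary whitespace mixed in with the non-break spaces. The
sample string and the variable names are made up for the example; the only
selectors used are String>>#trimBoth: and Character>>#isSeparator, and the
sketch assumes, as described above, that isSeparator answers false for
codepoint 160.

```smalltalk
| nbsp scraped trimmed |
"Codepoint 160 (U+00A0) is the non-break space."
nbsp := Character value: 160.

"Hypothetical scraped value: the text is padded with both
 non-break spaces and ordinary spaces."
scraped := (String with: nbsp), ' some text ', (String with: nbsp).

"Trim any character that is a normal separator OR a non-break space.
 Testing only for 160 would stop at the first ordinary space."
trimmed := scraped trimBoth: [:char |
	char isSeparator or: [char asInteger = 160]].

trimmed
```

Evaluated in a playground, this should answer 'some text'. If the same
predicate is needed in several places, it could be kept in one helper method
rather than changing Character>>#isSeparator itself.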

 

Peter Kenny

 
