On 2015-06-03 2:55 PM, Robert Voliva wrote:
We're finding that, when working with the em dash character, the LEFT and
LENGTH functions don't work well together. The query below tries to strip
the last character from a string containing an em dash:
mysql> select LEFT('031492349−0002,', LENGTH('031492349−0002,') - 1),
LENGTH('031492349−0002,'), LENGTH('031492349-0002,');
+--------------------------------------------------------+---------------------------+---------------------------+
| LEFT('031492349−0002,', LENGTH('031492349−0002,') - 1) | LENGTH('031492349−0002,') | LENGTH('031492349-0002,') |
+--------------------------------------------------------+---------------------------+---------------------------+
| 031492349−0002,                                        |                        17 |                        15 |
+--------------------------------------------------------+---------------------------+---------------------------+
1 row in set (0.06 sec)
Is this a bug? If it's a "feature", what could we do instead to get around
this issue?
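The last of the four '031...' strings in your query differs from the
others at the dash. In the first three strings the dash is a multibyte
character whose encoding starts with the byte E2, whereas the dash in the
last string is the ASCII hyphen, hex 2D.
Since the earlier dashes are 3-byte characters, LENGTH(), which counts
bytes (it is a synonym for OCTET_LENGTH()), returns 17 instead of 15.
LEFT() counts characters, so asking it for 17 - 1 = 16 characters of a
15-character string returns the whole string, comma included.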
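As for a workaround, one option is to measure the string in characters
rather than bytes, so the count agrees with what LEFT() uses. The query
below is only a sketch: it assumes the connection character set is utf8 or
utf8mb4 (so the dash arrives as one 3-byte character), and the variable
name @s and the column aliases are made up for illustration.

```sql
-- Sketch only: assumes a utf8/utf8mb4 connection character set.
-- @s and the column aliases are illustrative names, not from the original post.
SET @s = '031492349−0002,';

SELECT LENGTH(@s)                    AS byte_len,  -- counts bytes
       CHAR_LENGTH(@s)               AS char_len,  -- counts characters
       LEFT(@s, CHAR_LENGTH(@s) - 1) AS trimmed;   -- drops the last character
```

If the goal is specifically to drop a trailing comma, TRIM(TRAILING ',' FROM @s)
may also be worth a look, though it removes every trailing comma rather than
exactly one character.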
PB
-----
Thanks,
Robert Voliva
--
MySQL General Mailing List