Re: Issue with em dash character
On 2015-06-03 2:55 PM, Robert Voliva wrote:

> We're finding that, when working with the em dash character, the LEFT and
> LENGTH functions don't work well together. This query shows trying to strip
> off the last character from a string containing an em dash:
>
> mysql> select LEFT('031492349−0002,', LENGTH('031492349−0002,') - 1), LENGTH('031492349−0002,'), LENGTH('031492349-0002,');
> +--------------------------------------------------------+---------------------------+---------------------------+
> | LEFT('031492349−0002,', LENGTH('031492349−0002,') - 1) | LENGTH('031492349−0002,') | LENGTH('031492349-0002,') |
> +--------------------------------------------------------+---------------------------+---------------------------+
> | 031492349−0002,                                        |                        17 |                        15 |
> +--------------------------------------------------------+---------------------------+---------------------------+
> 1 row in set (0.06 sec)
>
> Is this a bug? If it's a "feature", what could we do instead to get around
> this issue?

The last of the four '031...' strings in your query differs from the others at the dash. In the first three strings, the dash is a multibyte character whose encoding begins with the byte E2, whereas the dash in the last string is the ASCII hyphen, 2D. Because the multibyte dash occupies 3 bytes, LENGTH(), which like OCTET_LENGTH() counts bytes, returns 17 instead of 15. LEFT() counts characters, so asking for the leftmost 16 characters of a 15-character string returns the whole string, trailing comma included.

PB

> Thanks,
> Robert Voliva

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql
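The byte-level difference described in the reply above can be inspected directly with HEX() and OCTET_LENGTH(). This is only a sketch: it assumes the connection character set is utf8 (or utf8mb4), so that the literal reaches the server as the multibyte dash, and the column aliases are made up for illustration.

```sql
-- Assumption: the client connection uses utf8 or utf8mb4, so the pasted
-- dash is sent as a multibyte sequence rather than being converted.
-- The aliases below are illustrative names, not part of the original query.
SELECT HEX('031492349−0002,')          AS bytes_with_multibyte_dash,
       HEX('031492349-0002,')          AS bytes_with_ascii_hyphen,
       OCTET_LENGTH('031492349−0002,') AS octets_with_multibyte_dash,
       OCTET_LENGTH('031492349-0002,') AS octets_with_ascii_hyphen;
```

In the hex output, the ASCII hyphen appears as the single byte 2D, while the multibyte dash appears as a sequence beginning with E2, which accounts for the two extra bytes.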
Re: Issue with em dash character
LENGTH() measures bytes; CHAR_LENGTH() measures characters. LENGTH() is of little use for anything other than raw bytes.

On Wed, Jun 3, 2015 at 10:29 PM, Robert Voliva wrote:

> information_schema.columns reports a character_set_name of 'utf8' and a
> collation_name of 'utf8_general_ci'
>
> On Wed, Jun 3, 2015 at 3:14 PM, Emil Oppeln-Bronikowski wrote:
>
>>> Is this a bug? If it's a "feature", what could we do instead to get
>>> around this issue?
>>
>> Is your column set to unicode?
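Following the LENGTH()/CHAR_LENGTH() distinction in the reply above, one way to strip the last character is to pass LEFT() a character count instead of a byte count. A minimal sketch, again assuming a utf8 connection; the table `t` and column `code` in the second statement are hypothetical names used only for illustration.

```sql
-- CHAR_LENGTH() counts characters, which is the unit LEFT() works in,
-- so this should drop only the trailing comma.
SELECT LEFT('031492349−0002,', CHAR_LENGTH('031492349−0002,') - 1) AS without_last_char;

-- The same idea applied to a column.
-- `t` and `code` are hypothetical names, not taken from the thread.
SELECT LEFT(code, CHAR_LENGTH(code) - 1) AS without_last_char
FROM t;
```

Keeping both functions in the same unit (characters) avoids the mismatch regardless of how many bytes each character takes in the column's character set.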
Re: Issue with em dash character
information_schema.columns reports a character_set_name of 'utf8' and a collation_name of 'utf8_general_ci'

On Wed, Jun 3, 2015 at 3:14 PM, Emil Oppeln-Bronikowski wrote:

>> Is this a bug? If it's a "feature", what could we do instead to get
>> around this issue?
>
> Is your column set to unicode?
Re: Issue with em dash character
> Is this a bug? If it's a "feature", what could we do instead to get around
> this issue?

Is your column set to unicode?
Issue with em dash character
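We're finding that, when working with the em dash character, the LEFT and LENGTH functions don't work well together. This query shows trying to strip off the last character from a string containing an em dash:

mysql> select LEFT('031492349−0002,', LENGTH('031492349−0002,') - 1), LENGTH('031492349−0002,'), LENGTH('031492349-0002,');
+--------------------------------------------------------+---------------------------+---------------------------+
| LEFT('031492349−0002,', LENGTH('031492349−0002,') - 1) | LENGTH('031492349−0002,') | LENGTH('031492349-0002,') |
+--------------------------------------------------------+---------------------------+---------------------------+
| 031492349−0002,                                        |                        17 |                        15 |
+--------------------------------------------------------+---------------------------+---------------------------+
1 row in set (0.06 sec)

Is this a bug? If it's a "feature", what could we do instead to get around this issue?

Thanks,
Robert Voliva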