Why is mySQL not respecting foreign characters as different

2013-09-26 Thread Daevid Vincent
How come MySQL is not differentiating between these characters?

SELECT text_id, us, de, es, fr  
  FROM texts
  WHERE us = fr;

Results in matching here. Notice the difference in the scene vs scène

text_id us  es  de fr  
--  --  --  -  
all_page_scene  scene   escena  Filmszene  scène  


I wold expect this NOT to match.

Do I have to add something to my query to tell MySQL to respect other
character sets as different?

CREATE TABLE `texts` (
  `text_id` varchar(50) CHARACTER SET latin1 COLLATE latin1_general_ci NOT
NULL DEFAULT '',
  `us` text,
  `es` text,
  `de` text,
  `fr` text,
  PRIMARY KEY (`text_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

FieldType Collation  NullKey Default  Extra
Privileges   Comment  
---  ---  -  --  --  ---  --
---  -
text_id  varchar(50)  latin1_general_ci  NO  PRI
select,insert,update,references   
us   text utf8_general_ciYES (NULL)
select,insert,update,references   
es   text utf8_general_ciYES (NULL)
select,insert,update,references   
de   text utf8_general_ciYES (NULL)
select,insert,update,references   
fr   text utf8_general_ciYES (NULL)
select,insert,update,references   


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Why is mySQL not respecting foreign characters as different

2013-09-26 Thread Claudio Nanni
Hi,

 I wold expect this NOT to match.

This should be because the fields you are comparing are utf8_general_ci,
this collation groups characters in 'classes' so that all variants of what
are considered to belong to the same character type, are put in that class.
Equality comparison is done comparing classes not single characters.
For this reason 'e' and '' are considered equal (same class--same type of
character)

I didn't find anything better but here you can have an idea:
http://collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html


Do I have to add something to my query to tell MySQL to respect other character
sets as different?

If you want to distinguish between characters in the same utf8 class you
have to use some workaround,
from the top of my head, you might use binary() to get the utf8 code of the
string which is different for each character.

mysql SELECT text_id, us, de, es, fr   FROM texts  WHERE us = es;
+-+---+---++---+
| text_id | us| de| es | fr|
+-+---+---++---+
| 1   | scene | Filmszene | escena  | scène |
+-+---+---++---+
1 row in set (0.00 sec)

mysql SELECT text_id, us, de, es, fr   FROM texts  WHERE binary(us) =
binary(es);
Empty set (0.00 sec)

There are probably other ways.


Cheers

Claudio



2013/9/26 Daevid Vincent dae...@daevid.com

 How come MySQL is not differentiating between these characters?

 SELECT text_id, us, de, es, fr
   FROM texts
   WHERE us = fr;

 Results in matching here. Notice the difference in the scene vs scène

 text_id us  es  de fr
 --  --  --  -  
 all_page_scene  scene   escena  Filmszene  scène


 I wold expect this NOT to match.

 Do I have to add something to my query to tell MySQL to respect other
 character sets as different?

 CREATE TABLE `texts` (
   `text_id` varchar(50) CHARACTER SET latin1 COLLATE latin1_general_ci NOT
 NULL DEFAULT '',
   `us` text,
   `es` text,
   `de` text,
   `fr` text,
   PRIMARY KEY (`text_id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8

 FieldType Collation  NullKey Default  Extra
 Privileges   Comment
 ---  ---  -  --  --  ---  --
 ---  -
 text_id  varchar(50)  latin1_general_ci  NO  PRI
 select,insert,update,references
 us   text utf8_general_ciYES (NULL)
 select,insert,update,references
 es   text utf8_general_ciYES (NULL)
 select,insert,update,references
 de   text utf8_general_ciYES (NULL)
 select,insert,update,references
 fr   text utf8_general_ciYES (NULL)
 select,insert,update,references


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/mysql




-- 
Claudio


Re: Why is mySQL not respecting foreign characters as different

2013-09-26 Thread Anders Karlsson
That is because of the collation. It's the collations that determines 
character equality. I can't tell what the collation is in your case for 
the columns us, es, de, es and fr. Also, that you match character sets 
in different columns is usually not a good idea, unless you have a good 
reason for it (and there are exception to this rule).


Try specifying the utf8_bin collation instead and that will work. For a 
more complete explanation of all this, read my blog on this subject: 
http://karlssonondatabases.blogspot.nl/2012/11/character-sets-collations-utf-8-and-all.html


/Karlsson
Daevid Vincent skrev 2013-09-26 23:44:

How come MySQL is not differentiating between these characters?

SELECT text_id, us, de, es, fr
   FROM texts
   WHERE us = fr;

Results in matching here. Notice the difference in the scene vs scène

text_id us  es  de fr
--  --  --  -  
all_page_scene  scene   escena  Filmszene  scène


I wold expect this NOT to match.

Do I have to add something to my query to tell MySQL to respect other
character sets as different?

CREATE TABLE `texts` (
   `text_id` varchar(50) CHARACTER SET latin1 COLLATE latin1_general_ci NOT
NULL DEFAULT '',
   `us` text,
   `es` text,
   `de` text,
   `fr` text,
   PRIMARY KEY (`text_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

FieldType Collation  NullKey Default  Extra
Privileges   Comment
---  ---  -  --  --  ---  --
---  -
text_id  varchar(50)  latin1_general_ci  NO  PRI
select,insert,update,references
us   text utf8_general_ciYES (NULL)
select,insert,update,references
es   text utf8_general_ciYES (NULL)
select,insert,update,references
de   text utf8_general_ciYES (NULL)
select,insert,update,references
fr   text utf8_general_ciYES (NULL)
select,insert,update,references





--

Anders Karlsson, Senior Sales Engineer
SkySQL | t: +46 708-608-121 | Skype: drdatabase


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql