>Description:
I have a table with columns containing keywords in Cyrillic. The table contains 2666 rows, and kw_1 is the first keyword column. In the queries below, xx stands for two Cyrillic letters (the Cyrillic equivalent of the Latin 'vo').

  select * from imgs where kw_1 like '%xx%'   ---- 396 rows, none of them containing xx
  select * from imgs where kw_1 like '%xx_'   ---- 272 rows, none containing xx, and all one and the same word
  select * from imgs where kw_1 like '_xx%'   ---- 15 rows, none containing xx
  select * from imgs where kw_1 like '_xx_'   ---- 0 rows (this is correct)
  select * from imgs where kw_1 like 'xx'     ---- 0 rows (this is correct)
  select * from imgs where kw_1 like '%xx'    ---- 52 rows of correct data

There is no problem with searches of three or more letters, but I found something more interesting. If the search pattern uses doubled Latin vowels (aa, ee, ii, oo, uu) instead of the Cyrillic xx:

  select * from imgs where kw_1 like '%aa%'   ---- 115 rows
  select * from imgs where kw_1 like '%ee%'   ---- 288 rows
  select * from imgs where kw_1 like '%ii%'   ---- 413 rows
  select * from imgs where kw_1 like '%oo%'   ---- 277 rows
  select * from imgs where kw_1 like '%uu%'   ---- 6 rows (one and the same word)

I'd like to remind you that all of the keywords are written in Cyrillic and there is no Latin letter in them! MySQL returns 0 rows for doubled English consonants, e.g. 'ww', 'rr', 'ss' and so on. The same is true for a combination of a vowel and a consonant, e.g. '%qa%' or '%aq%'.
If we try a combination of two different vowels, the bug shows up as well. Here are four examples:

  select * from imgs where kw_1 like '%ae%'   ---- 361 rows
  select * from imgs where kw_1 like '%ea%'   ---- 514 rows
  select * from imgs where kw_1 like '%ei%'   ---- 317 rows
  select * from imgs where kw_1 like '%ie%'   ---- 208 rows

I tried searches with three vowels:

  select * from imgs where kw_1 like '%ieo%'  ---- 1 row
  select * from imgs where kw_1 like '%oeo%'  ---- 167 rows
  select * from imgs where kw_1 like '%oea%'  ---- 34 rows
  select * from imgs where kw_1 like '%oii%'  ---- 0 rows
  select * from imgs where kw_1 like '%oai%'  ---- 0 rows
  select * from imgs where kw_1 like '%eai%'  ---- 306 rows
  select * from imgs where kw_1 like '%iae%'  ---- 3 rows

A search with four vowels:

  select * from imgs where kw_1 like '%eaio%' ---- 25 rows

I think these examples are enough.

The same bugs appear on the other keyword columns in the same table. The columns are declared as varchar(20) with NULL as the default, although there are no NULL values, only empty strings where needed. The table is MyISAM. In my database I have another MyISAM table which contains all of the keywords from table 'imgs', and I found the same bugs there. I suggest creating a table like mine.

>How-To-Repeat:
  create table dumi (kw_id int unsigned auto_increment primary key, duma varchar(20) not null, index slovar(duma));

At http://212.91.166.133/fotoged.php there is a menu where you can view all the keywords from this table, each with a number showing how many times it is repeated. From the generated HTML source you can copy and paste the keywords. The steps are described above.

>Fix:
I don't know, but I think the problem is somewhere in the ASCII support. Something else that could be important: if I use an ORDER BY clause on a column which contains Cyrillic letters, the sorting does not work properly; MySQL shows a strange alphabetical order!

>Submitter-Id:
>Originator: root
>Organization: GED Ltd. Bulgaria
>MySQL support: none
>Synopsis:
>Severity:
>Priority:
>Category: mysql
>Class:
>Release: mysql-3.23.41 (Source distribution)
>Environment: <machine, os, target, libraries (multiple lines)>
System: Linux inter 2.4.7-10 #1 Thu Sep 6 17:27:27 EDT 2001 i686 unknown
Architecture: i686
Some paths: /usr/bin/perl /usr/bin/make /usr/bin/gmake /usr/bin/gcc /usr/bin/cc
GCC: Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98)
Compilation info: CC='gcc' CFLAGS='-O2 -march=i386 -mcpu=i686 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE' CXX='c++' CXXFLAGS='-O2 -march=i386 -mcpu=i686 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE' LDFLAGS=''
LIBC:
lrwxrwxrwx 1 root root 13 Apr 11 13:27 /lib/libc.so.6 -> libc-2.2.4.so
-rwxr-xr-x 1 root root 1282588 Sep 4 2001 /lib/libc-2.2.4.so
-rw-r--r-- 1 root root 27304836 Sep 4 2001 /usr/lib/libc.a
-rw-r--r-- 1 root root 178 Sep 4 2001 /usr/lib/libc.so
Configure command: ./configure i386-redhat-linux --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info --without-debug --without-readline --enable-shared --with-extra-charsets=complex --with-bench --localstatedir=/var/lib/mysql --with-unix-socket-path=/var/lib/mysql/mysql.sock --with-mysqld-user=mysql --with-extra-charsets=all --disable-assember --with-berkeley-db --enable-large-files=yes --enable-largefile=yes --with-thread-safe-client --enable-assembler
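
One possible way to narrow down the guess under >Fix: is to run the same pattern once as a normal comparison and once as a byte-for-byte comparison, and to look at which character set the server is running with. The following is only a sketch: it assumes the dumi table from >How-To-Repeat has already been filled with the Cyrillic keywords, and it relies on the BINARY operator and on SHOW VARIABLES as described in the MySQL 3.23 manual. No results are shown because none were recorded for this report.

```sql
-- Sketch only: assumes dumi (see >How-To-Repeat) already holds the keywords.

-- Normal comparison: the pattern is matched using the server's
-- character-set rules (case folding and sort order).
SELECT duma FROM dumi WHERE duma LIKE '%aa%';

-- Byte-for-byte comparison: BINARY turns the pattern into a binary
-- string, so only rows that really contain the bytes 'aa' can match.
SELECT duma FROM dumi WHERE duma LIKE BINARY '%aa%';

-- Show the character-set settings the server was started with.
SHOW VARIABLES LIKE 'character_set%';
```

If the BINARY query returns no rows while the plain query still does, that would suggest the extra matches come from the character set's comparison rules rather than from the stored data.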