On 05/11/2021 08:54, Alex Peshkoff via Firebird-devel wrote:
>
> Before changing / fixing something we should first of all decide what
> result do we really need.
>

Please also take into account that the customer's primary problem was the unacceptable performance degradation when they switched to UTF8, demonstrated with these test cases: https://github.com/FirebirdSQL/firebird/issues/6915.

As for greater-than and the related comparison operators, there is no excuse: we must fix them.

As for STARTING WITH, besides the problems I mentioned earlier, making CH not START WITH C will make it a slower operation. An index lookup will surely become faster, but non-indexed comparison will be slower.

Also note that, if their test cases were based on real usage, they already had a problem before. They did not include a test with ANSI_CZ LIKE 'C%'. Here it is, with times on my machine:

SELECT ANSI_CZ FROM TEST1M WHERE ANSI_CZ LIKE 'C%' ORDER BY ANSI_CZ;
v4: ~3s
master: 2.7s
---
SELECT UNICODE_CS_CZ FROM TEST1M WHERE UNICODE_CS_CZ LIKE 'C%' ORDER BY UNICODE_CS_CZ;
v4: 3.5s
master: 3.1s
---
SELECT UNICODE_CI_CZ FROM TEST1M WHERE UNICODE_CI_CZ LIKE 'C%' ORDER BY UNICODE_CI_CZ;
v4: 3.7s
master: 3.4s
---

So in many cases master with Unicode has about the same performance as v4 with ANSI - and with greater lengths it has even improved after #7038.

There is also the case of the UNICODE_CS_CZ vs ANSI_CZ test with LIKE 'Z%', which became slower even though Z is not an actual Czech contraction. This must be caused by normalization data that ICU reports as contractions. It is a problem that needs more investigation and must be fixed too.

So I emphasize my opinion that the main performance problem is not the C/CH thing. That was already present, and my test case demonstrates it. If the customer did not see this, then the test case does not reflect their usage pattern. But the test case did demonstrate a large degradation with the letter Z. While it may also not reflect a common usage pattern with a lot of data, it is surely a problem we must fix.

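To make the C/CH point concrete, here is a minimal Python sketch of why a 'C%' prefix can need more than one index range when CH is a contraction. Everything in it is an assumption for illustration only: the toy alphabet, the weights, and the helper names (sort_key, range_scan, starting_with_c) are hypothetical and are not Firebird's or ICU's actual data or code. It only handles the letters A-Z.

```python
from bisect import bisect_left

# Toy Czech-like alphabet (assumption): "CH" is one collation element
# that sorts after "H". Real ICU collation data is far more complex.
ALPHABET = ["A", "B", "C", "D", "E", "F", "G", "H", "CH", "I", "J", "K",
            "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W",
            "X", "Y", "Z"]
WEIGHT = {element: i for i, element in enumerate(ALPHABET)}


def sort_key(text):
    """Map a string to a tuple of weights, treating "CH" as one element."""
    text = text.upper()
    key, i = [], 0
    while i < len(text):
        if text.startswith("CH", i):
            key.append(WEIGHT["CH"])
            i += 2
        else:
            key.append(WEIGHT[text[i]])
            i += 1
    return tuple(key)


def range_scan(index, first_weight):
    """Return the values whose key begins with the given weight."""
    keys = [k for k, _ in index]
    lo = bisect_left(keys, (first_weight,))
    hi = bisect_left(keys, (first_weight + 1,))
    return [value for _, value in index[lo:hi]]


def starting_with_c(index):
    """Emulate LIKE 'C%' with two range scans: plain C keys, then CH keys."""
    return range_scan(index, WEIGHT["C"]) + range_scan(index, WEIGHT["CH"])


words = ["CENA", "HORA", "CHATA", "CUKR", "IDEA", "DUM", "CHLEB"]
# The "index" is just the words sorted by their collation key.
index = sorted((sort_key(w), w) for w in words)
```

In this toy index the CH words sit between HORA and IDEA, away from the other C words, so a single contiguous range cannot cover everything that begins with the character C.
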
So instead of introducing lots of inconsistencies (and slowdowns in some operations), the way to go is:

- Fix the compare operators case
- Fix the Z% case
- Implement multiple index lookups for the C% case

I'm not saying that any of these is very easy. Probably only the compare operators case is.


Adriano


Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel