[RFC] Use ICU in wine ?

2011-10-13 Thread Rafał Mużyło
On Mon Oct 10 12:48:27 CDT 2011, André Hentschel wrote a few things:

Well, in the bug you've mentioned (#5163) there was a link to that file
in mono, but while it seems nice, just looking at mono code makes my
eyes bleed and the prospect of rewriting it in perl fills me
with terror. There's also the question of how close mono got to
Windows in this case.

Still, that would at most fix the default table; the tailoring needs
to be implemented anyway.





[RFC] Use ICU in wine ?

2011-10-11 Thread Rafał Mużyło
On Mon, Oct 10, 2011 at 07:47:32PM +0200, Shachar Shemesh wrote:
> ICU is impossible to dynamically link with, and its size is quite huge
> if statically linked.

Huh ?
/usr/lib/libicudata.so.48  /usr/lib/libiculx.so.48
/usr/lib/libicutest.so.48  /usr/lib/libicui18n.so.48
/usr/lib/libicutu.so.48    /usr/lib/libicuio.so.48
/usr/lib/libicuuc.so.48    /usr/lib/libicule.so.48

Care to elaborate ?

Yes, it has an annoying habit of ABI breaks, but so does e.g. poppler.





[RFC] Use ICU in wine ?

2011-10-10 Thread Rafał Mużyło
Right now, wine claims to use DUCET data for linguistic sorting, but
according to section 1.9.2 of that document (as of version 6.0.0), it uses
it in the wrong way. The result is bugs such as #10767 and #9583.

A possible way around it would be to use ICU to get language-specific
tailoring and to apply some wine-specific handling for the parts addressed
by #10767. In a way, wine already *indirectly* depends on icu on many
distros (a potential libxml2 dep) - there have been quite a few patches
addressing build failures caused by this in the recent past.

I've cooked up a patch that unfortunately isn't working properly yet;
it's more of a draft of where to go, if such a road were to be taken.

The main issue with this patch isn't that it doesn't ignore the proper
list of symbols, as that would be easy to fix, if not for one problem:
the part that I put inside the 'if (0)' block.

For some reason (threading, perhaps ?), if I pass a non-zero length string
to ucol_openRules, wineboot hangs. Oddly, if the string length is 0, the
hang doesn't happen.
I can't really tell what is going wrong, as almost the same code works in
a native linux testcase.

Also, while in this patch UCOL_ALTERNATE_HANDLING would have been used
for NORM_IGNORESYMBOLS, it would be much better to manipulate some of
the settings by a custom rule and VariableTop value, but to do that,
first such rule would have to be passed and that odd hang prevents it
at the moment.

It would also be nice, if it was possible to initialize the collator
only once for an app instance - after all, in Windows locale change
requires a reboot, so while (AFAIK) one wineserver instance can have
apps in different locales running, locale can't change for an already
running app.
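
A minimal sketch of that once-only initialization, assuming pthreads are
available. `struct collator`, `open_collator()` and `get_collator()` are
hypothetical stand-ins (not ICU or wine names); in the patch the expensive
step would be the ucol_open() call.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an ICU UCollator; not a real ICU type. */
struct collator { int opened; };

static struct collator collator_storage;
static struct collator *cached_collator;
static int open_calls;   /* counts how often the expensive open ran */
static pthread_once_t collator_once = PTHREAD_ONCE_INIT;

/* Placeholder for the expensive step (ucol_open() in the patch). */
static struct collator *open_collator(void)
{
    open_calls++;
    collator_storage.opened = 1;
    return &collator_storage;
}

static void init_collator(void)
{
    cached_collator = open_collator();
}

/* Returns the process-wide collator, opening it on first use only. */
static struct collator *get_collator(void)
{
    int rc = pthread_once(&collator_once, init_collator);
    assert(rc == 0);
    (void)rc;
    return cached_collator;
}
```

Since the patch changes collator attributes per call depending on the
flags, a shared instance would probably have to be cloned (ICU has
ucol_safeClone) or cached per flag combination before being modified; that
part is left out of the sketch.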

So if you have any idea why exactly the hang happens and how to get
around it, or know any *technical* reasons why ICU couldn't be used in
wine, CC me with comments.

--- libs/wine/sortkey.c 2011-10-05 16:44:29.0 +0200
+++ libs/wine/sortkey.c 2011-10-05 21:06:51.0 +0200
@@ -18,6 +18,8 @@
  * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
  */
 #include "wine/unicode.h"
+#include <unicode/ustring.h>
+#include <unicode/ucol.h>
 
 extern int get_decomposition(WCHAR src, WCHAR *dst, unsigned int dstlen);
 extern const unsigned int collation_table[];
@@ -334,10 +336,56 @@
                         const WCHAR *str2, int len2)
 {
     int ret;
+    UErrorCode status = U_ZERO_ERROR;// status1 = U_ZERO_ERROR;
+    UCollator *coll, *coll2;
+    UParseError parse_error;
+    //U_STRING_DECL(word_sort_rule, \\u=''='-', 18);
+    //U_STRING_INIT(word_sort_rule, \\u=''='-', 18);
+    const WCHAR word_sort_rule[] = { '&', '\\', 'u', '0', '0',
+        '0', '0', '=', '\'', '\'', '=', '\'', '-', '\'', 0};
 
     len1 = real_length(str1, len1);
     len2 = real_length(str2, len2);
 
+    coll = ucol_open(NULL, &status);
+    if (U_SUCCESS(status))
+    {
+        if (0 && !(flags & SORT_STRINGSORT))
+        {
+            coll2 = ucol_openRules(word_sort_rule, 14,
+                ucol_getAttribute(coll, UCOL_NORMALIZATION_MODE, &status),
+                ucol_getStrength(coll), &parse_error, &status);
+            if (U_SUCCESS(status))
+            {
+                ucol_close(coll);
+                coll = coll2;
+            }
+        }
+        if (flags & NORM_IGNORECASE)
+            ucol_setStrength(coll, UCOL_SECONDARY);
+        if (flags & NORM_IGNORENONSPACE)
+        {
+            ucol_setStrength(coll, UCOL_PRIMARY);
+            if (!(flags & NORM_IGNORECASE))
+                ucol_setAttribute(coll, UCOL_CASE_LEVEL, UCOL_ON, &status);
+        }
+        if (flags & NORM_IGNOREKANATYPE)
+        {
+            if (ucol_getAttribute(coll, UCOL_HIRAGANA_QUATERNARY_MODE, &status)
+                == UCOL_ON)
+                ucol_setAttribute(coll, UCOL_HIRAGANA_QUATERNARY_MODE, UCOL_OFF, &status);
+        }
+        if (flags & NORM_IGNORESYMBOLS)
+        {
+            ucol_setAttribute(coll, UCOL_ALTERNATE_HANDLING, UCOL_SHIFTED, &status);
+        }
+
+        ret = ucol_strcoll(coll, str1, len1, str2, len2) + 2;
+        ucol_close(coll);
+    }
+    else
+        ret = 0;
+#if 0
     ret = compare_unicode_weights(flags, str1, len1, str2, len2);
     if (!ret)
     {
@@ -346,5 +385,6 @@
         if (!ret && !(flags & NORM_IGNORECASE))
             ret = compare_case_weights(flags, str1, len1, str2, len2);
     }
+#endif
     return ret;
 }
--- configure.ac 2011-10-04 21:46:40.0 +0200
+++ configure.ac 2011-10-05 21:35:57.0 +0200
@@ -913,6 +913,29 @@
 WINE_ERROR_WITH(pthread,[test x$LIBPTHREAD = x],[pthread 
${notice_platform}development files not found.
 Wine cannot support threads without libpthread.])
 
+dnl  Check for icu 
+
+AC_SUBST(ICUINCL,)
+AC_SUBST(ICULIBS,)
+ac_save_CPPFLAGS=$CPPFLAGS
+if test $PKG_CONFIG != false
+then
+ac_icu_libs=`$PKG_CONFIG --libs icu-i18n 2>/dev/null`
+ac_icu_cflags=`$PKG_CONFIG --cflags icu-i18n 2>/dev/null`
+ 

Re: [RFC] Use ICU in wine ?

2011-10-10 Thread Dmitry Timoshkov
Rafał Mużyło galtge...@o2.pl wrote:

> Right now, wine claims to use DUCET data for linguistic sorting,

What's DUCET, and where do you see that Wine claims to use it?

> but by
> section 1.9.2 of that document (as of version 6.0.0), uses it in a wrong
> way.

Could you please be slightly more specific?

> The result of it are bugs such as #10767 and #9583.

It looks like you don't understand what those bugs are about.

> A possible way around it would be by using ICU to get language specific
> tailoring and applying some of wine-specific for the parts addressed by
> #10767.

Once you understand the problems, you may see that adding ICU to Wine's
dependencies would create many more problems than it is supposed to
solve.

-- 
Dmitry.




[RFC] Use ICU in wine ?

2011-10-10 Thread Rafał Mużyło
On Mon, Oct 10, 2011 at 04:16:26PM +0900, Dmitry Timoshkov wrote (and
contradicted himself):
> What's DUCET and where do you see Wine does claim to use it?

DUCET: Default Unicode Collation Element Table
http://bugs.winehq.org/show_bug.cgi?id=10767#c1

> > but by
> > section 1.9.2 of that document (as of version 6.0.0), uses it in a wrong
> > way.

http://www.unicode.org/reports/tr10/#Non-Goals, point 6
"...DUCET does not and cannot actually provide linguistically correct
sorting for every language without further tailoring"

> Could you please be slightly more specific?
>
> Looks like that you don't understand what those bugs are about.

Is the above link specific enough ?

Anyway, if DUCET and the Windows default differ, is dumping that default
even allowed ?

Besides, the only proper way to fix bug #9583 is to implement some kind
of language tailoring.
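
To illustrate what tailoring means here, a toy sketch: a default table
gives each character a primary weight, and a per-language override list
replaces some of those weights. Everything below (`struct tailoring`,
`default_primary()`, `compare_primary()`, the weight values) is made up for
the example and is not wine's or ICU's data layout. It uses the well-known
case of U+00E4 (a with diaeresis), which sorts together with 'a' by default
but after 'z' in Swedish.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical override entry: one code point and its primary weight. */
struct tailoring { uint32_t cp; uint32_t primary; };

/* Toy default table: 'a'..'z' get 10..260, U+00E4 shares the weight of 'a'. */
static uint32_t default_primary(uint32_t cp)
{
    if (cp >= 'a' && cp <= 'z') return (cp - 'a' + 1) * 10;
    if (cp == 0x00E4) return 10;
    return 1000 + cp;   /* everything else: after the letters, by code point */
}

/* Look the code point up in the tailoring first, then fall back to the default. */
static uint32_t primary_weight(uint32_t cp, const struct tailoring *t, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (t[i].cp == cp) return t[i].primary;
    return default_primary(cp);
}

/* Compare two zero-terminated code point strings on primary weights only. */
static int compare_primary(const uint32_t *a, const uint32_t *b,
                           const struct tailoring *t, size_t n)
{
    assert(a && b);
    for (;; a++, b++)
    {
        if (!*a || !*b) return (*a != 0) - (*b != 0);
        uint32_t wa = primary_weight(*a, t, n);
        uint32_t wb = primary_weight(*b, t, n);
        if (wa != wb) return wa < wb ? -1 : 1;
    }
}

/* Swedish-style tailoring: U+00E4 gets a weight just above that of 'z' (260). */
static const struct tailoring swedish[] = { { 0x00E4, 265 } };
```

With no tailoring, the string U+00E4 compares below "z"; with the `swedish`
list it compares above. A real implementation also needs secondary and
tertiary levels and contractions, which this sketch ignores.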





Re: [RFC] Use ICU in wine ?

2011-10-10 Thread Shachar Shemesh
On 10/10/2011 02:48 AM, Rafał Mużyło wrote:
> Right now, wine claims to use DUCET data for linguistic sorting, but by
> section 1.9.2 of that document (as of version 6.0.0), uses it in a wrong
> way. The result of it are bugs such as #10767 and #9583.
>
> A possible way around it would be by using ICU to get language specific
> tailoring and applying some of wine-specific for the parts addressed by
> #10767.

I will not go into the part of the discussion that has to do with
whether ICU will or will not resolve the issues. I will point out that
Wine used to have a soft dependency on ICU (introduced, for my sins, by
me) for the sake of BiDi processing. Everyone involved, myself included,
was all too happy to see it go. ICU is impossible to dynamically link
with, and its size is huge if statically linked.

The good news is that ICU's license seems to be compatible with Wine's
LGPL, so if needed, the relevant part can be copied into Wine.

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com




Re: [RFC] Use ICU in wine ?

2011-10-10 Thread André Hentschel
On 10.10.2011 17:33, Rafał Mużyło wrote:
> Anyway, if DUCET and Windows default differ is dumping that default even
> allowed ?

I don't think so... I was once hacking on our collation stuff too, to fix
http://bugs.winehq.org/show_bug.cgi?id=5163, but I guess dumping the Windows
stuff wouldn't even have helped me.



-- 

Best Regards, André Hentschel