Re: [Apertium-stuff] Apertium and ICU

2019-06-06 Thread Joan Moratinos Jaume
I developed my little program in Windows, using its ICU. Only the names of
the include files change. Functionality is the same as in Linux.
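
A minimal sketch of what that header difference can look like in C++. The
header names are an assumption taken from Microsoft's ICU documentation
(icu.h on newer Windows 10 releases, icucommon.h on older ones) and from the
usual unicode/ layout on Linux; they are not stated in this thread. The macro
ICU_HEADER_NAME and both helper functions are hypothetical names used only
for illustration.

```cpp
#include <string>

// Select the ICU header by platform. Only the include name differs;
// the code that uses ICU afterwards can stay the same.
#if defined(_WIN32)
  #if __has_include(<icu.h>)
    #include <icu.h>              // combined header on newer Windows 10 (assumption)
    #define ICU_HEADER_NAME "icu.h"
  #elif __has_include(<icucommon.h>)
    #include <icucommon.h>        // split header on older Windows 10 (assumption)
    #define ICU_HEADER_NAME "icucommon.h"
  #endif
#elif __has_include(<unicode/uchar.h>)
  #include <unicode/uchar.h>      // upstream ICU layout, e.g. on Linux
  #define ICU_HEADER_NAME "unicode/uchar.h"
#endif

// No ICU header found: leave the name empty so callers can fall back.
#ifndef ICU_HEADER_NAME
  #define ICU_HEADER_NAME ""
#endif

// Hypothetical helper: name of the ICU header that was picked up, if any.
std::string icu_header_in_use() {
    return ICU_HEADER_NAME;
}

// Hypothetical helper: true when some ICU header was found at compile time.
bool icu_available() {
    return !icu_header_in_use().empty();
}
```

The sketch only includes the header and never calls into ICU, so it compiles
even where the ICU libraries are not linked. Real code would still need to
link against the platform's ICU library.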

On Fri, 7 Jun 2019 at 07:30, Tino Didriksen wrote:

> Another data point:
>
> https://docs.microsoft.com/en-us/windows/desktop/intl/international-components-for-unicode--icu-
>
> Windows 10 has been shipping ICU for the past 2 years, and I somehow never
> noticed.
>
> -- Tino Didriksen
>
>
> On Mon, 27 May 2019 at 13:56, Tino Didriksen wrote:
>
>> The PR https://github.com/apertium/apertium/pull/47 wants to add a
>> direct dependency on ICU. I am in favour of this, but figured it should be
>> brought up on the list.
>>
>> Reasoning:
>> - HFST and CG-3 both require ICU, and ICU has been the official Unicode
>> library for 3 years now.
>> - lttoolbox requires libxml2, and libxml2 requires ICU - so Apertium
>> already has a transitive dependency on ICU.
>> - Language development requires libxml2-utils to get xmllint, which again
>> transitively requires ICU.
>>
>> So we might as well embrace ICU entirely - also in other parts of
>> lttoolbox and the wider Apertium tools.
>>
>> -- Tino Didriksen
>>


-- 
Joan Moratinos
jmorati...@gmail.com
___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] Apertium and ICU

2019-06-06 Thread Tino Didriksen
Another data point:
https://docs.microsoft.com/en-us/windows/desktop/intl/international-components-for-unicode--icu-

Windows 10 has been shipping ICU for the past 2 years, and I somehow never
noticed.

-- Tino Didriksen


On Mon, 27 May 2019 at 13:56, Tino Didriksen wrote:

> The PR https://github.com/apertium/apertium/pull/47 wants to add a direct
> dependency on ICU. I am in favour of this, but figured it should be brought
> up on the list.
>
> Reasoning:
> - HFST and CG-3 both require ICU, and ICU has been the official Unicode
> library for 3 years now.
> - lttoolbox requires libxml2, and libxml2 requires ICU - so Apertium
> already has a transitive dependency on ICU.
> - Language development requires libxml2-utils to get xmllint, which again
> transitively requires ICU.
>
> So we might as well embrace ICU entirely - also in other parts of
> lttoolbox and the wider Apertium tools.
>
> -- Tino Didriksen
>
>


Re: [Apertium-stuff] Apertium and ICU

2019-05-27 Thread Tommi A Pirinen
On Mon, May 27, 2019 at 01:56:29PM +0200, Tino Didriksen wrote:
> The PR https://github.com/apertium/apertium/pull/47 wants to add a direct
> dependency on ICU. I am in favour of this, but figured it should be brought
> up on the list.
> 
> Reasoning:
> - HFST and CG-3 both require ICU, and ICU has been the official Unicode
> library for 3 years now.
> - lttoolbox requires libxml2, and libxml2 requires ICU - so Apertium
> already has a transitive dependency on ICU.
> - Language development requires libxml2-utils to get xmllint, which again
> transitively requires ICU.

I think at least HFST and libxml2 have configurable ICU support that can
be turned off with acceptable functionality loss.

> So we might as well embrace ICU entirely - also in other parts of lttoolbox
> and the wider Apertium tools.

I would agree. In the past one could have argued that new dependencies make
things harder to install, and ICU was not the easiest to work with, but
with current packaging that is not such a big concern. I think ICU is
probably still quite big and slow, but we could immediately make use of it
in a few places, such as the OOV tokenisation problems we have seen in
recent issues, and that outweighs the cost.

-- 
Doktor Tommi A Pirinen, Computational Linguist, Universität Hamburg,
Hamburger Zentrum für Sprachkorpora. CLARIN-D developer.
President of ACL SIGUR SIG for Uralic languages.
I tend to follow inline-posting style in desktop e-mail messages.

