Re: [PUSHED][PATCH] fix building dictionaries when PERL_UNICODE environment

2012-02-29 Thread Caolán McNamara
On Fri, 2012-02-24 at 17:28 +0100, Petr Vorel wrote:
> First I thought it would be better to recode all of them to utf8. But
> these files are taken from ispell, aren't they? Do we want to increase
> diversity from upstream?

For hunspell dictionaries, the .aff files declare the encoding of
the .dic files, so the .aff needs to be changed to specify UTF-8 if you
change the encoding of the .dic files. This tool, though, looks like it's
for validating thesaurus files, though they probably have the same thing,
where there's a field somewhere to indicate the encoding they are in.
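
To illustrate the .aff point, here is a minimal Python sketch (not part of the pushed patch) that reads the encoding a hunspell .aff file declares on its SET line. The function name, the file name xx_XX.aff and the ISO8859-1 fallback for a missing SET line are assumptions made for the example.

```python
import os
import tempfile


def declared_aff_encoding(aff_path, default="ISO8859-1"):
    """Return the encoding named on the SET line of a hunspell .aff file.

    The fallback value is an assumption for this sketch.
    """
    # The SET line itself is plain ASCII, and latin-1 can decode any byte,
    # so reading as latin-1 never fails regardless of the real encoding.
    with open(aff_path, encoding="latin-1") as aff:
        for line in aff:
            fields = line.split()
            if len(fields) >= 2 and fields[0] == "SET":
                return fields[1]
    return default


def demo():
    # Hypothetical two-line .aff file written to a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "xx_XX.aff")
        with open(path, "w", encoding="latin-1") as aff:
            aff.write("SET UTF-8\nTRY abc\n")
        return declared_aff_encoding(path)


if __name__ == "__main__":
    print(demo())
```

A recoding script would have to rewrite that SET line together with the .dic contents, which is why recoding means touching both files.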

Some of the hunspell dicts might share ispell lineage I guess, but
either way most of them and the thesauri have their own active upstreams,
so yeah, changing the encoding would just add unnecessary diversity
from their upstreams.

Anyway, the last patch looks pretty minimal and it seems to make no
difference to the typical setup, so I pushed that now to dictionaries.

C.

___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: [PATCH] fix building dictionaries when PERL_UNICODE environment

2012-02-24 Thread Petr Vorel



>> I run into troubles while building dictionaries, because I have on my system
>> set PERL_UNICODE=SDL (perl script
>> clone/dictionaries/dictionaries/util/th_check.pl dies as it's forced to use
>> UTF-8, but not all dictionaries are in this encoding).
>
> Interesting, I'd never seen this environment variable before. :)

Diacritics made me learn it.


> First, setting this to SDL emulates the behaviour of Perl 5.8.0 -
> this was deliberately fixed in 5.8.1 (see the perlrun and perl581delta
> docs), so I would be careful about putting it in your environment.
> It's likely to break more than just LibreOffice.

Thank you for pointing out the background of L. Well, I don't consider L
harmful, but I don't really need it. But it's broken even with
PERL_UNICODE=SD: the i implied by D is the source of the trouble. So we need
to remove i and D from the env or set the env explicitly. 0 and S are
working settings.
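
The reasoning about which letters matter can be sketched in Python. The bit values below follow the -C table in perlrun as I read it; the function names are made up for this example, and the check only reports whether the "default input layer" bit (i) is set, not whether L would make it conditional on the locale.

```python
# Letter -> bit value, per the -C / PERL_UNICODE table in perlrun.
FLAG_BITS = {
    "I": 1,    # STDIN is UTF-8
    "O": 2,    # STDOUT is UTF-8
    "E": 4,    # STDERR is UTF-8
    "S": 7,    # I + O + E
    "i": 8,    # UTF-8 is the default layer for input streams
    "o": 16,   # UTF-8 is the default layer for output streams
    "D": 24,   # i + o
    "A": 32,   # @ARGV is UTF-8
    "L": 64,   # make the above conditional on the locale
    "a": 256,  # UTF-8 cache debugging
}
DEFAULT_INPUT_LAYER = FLAG_BITS["i"]


def perl_unicode_mask(value):
    """Turn a PERL_UNICODE value (letters or a number) into a bit mask."""
    if value == "":
        value = "SDL"  # perlrun: the empty string behaves like SDL
    if value.isdigit():
        return int(value)
    mask = 0
    for letter in value:
        mask |= FLAG_BITS[letter]  # unknown letters raise KeyError
    return mask


def affects_opened_files(value):
    """True if files opened by the script get a UTF-8 input layer by default."""
    return bool(perl_unicode_mask(value) & DEFAULT_INPUT_LAYER)


if __name__ == "__main__":
    for setting in ("SDL", "SD", "S", "0"):
        print(setting, affects_opened_files(setting))
```

Under these assumptions SDL and SD both set the i bit, while S and 0 do not, which matches the observation that 0 and S are the working settings.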


> I am tempted to push your patch as a temporary fix, but there must be
> a better way to handle this long-term... I'll check that I can
> reproduce the problem, first.

Have you been successful?


> As an idea, would it be possible to identify and re-encode the
> dictionaries themselves as UTF-8?  After which we could set the
> encoding of the filehandles explicitly in th_check.pl.

First I thought it would be better to recode all of them to utf8. But
these files are taken from ispell, aren't they? Do we want to increase
diversity from upstream? So I think it's better to set the env explicitly.
But prj/tests.mk in dictionaries is IMHO a better place than the Makefile
in the core repo. See the patch.
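
The idea of overriding the variable for one command only, rather than for the whole build, can be sketched in Python as follows. This is an illustration, not the patch: the helper name is hypothetical, and the child process is the Python interpreter standing in for th_check.pl so that the sketch runs without Perl.

```python
import os
import subprocess
import sys


def run_with_perl_unicode(command, value="0"):
    """Run one command with PERL_UNICODE overridden, leaving our own env alone."""
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["PERL_UNICODE"] = value
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


def demo():
    # Stand-in child that just reports the value it sees.
    child = [sys.executable, "-c", "import os; print(os.environ['PERL_UNICODE'])"]
    return run_with_perl_unicode(child).stdout.strip()


if __name__ == "__main__":
    print(demo())
```

Scoping the override to the single command keeps the user's setting in effect everywhere else in the build.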

Petr
From 272951e21383bfaccc46dadf99733849d7702333 Mon Sep 17 00:00:00 2001
From: Petr Vorel petr.vo...@gmail.com
Date: Fri, 24 Feb 2012 17:11:40 +0100
Subject: [PATCH] Explicitly set PERL_UNICODE environment variable

This prevents the build from failing when PERL_UNICODE contains i or D
---
 dictionaries/prj/tests.mk |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/dictionaries/prj/tests.mk b/dictionaries/prj/tests.mk
index 0eb6008..7048dd6 100644
--- a/dictionaries/prj/tests.mk
+++ b/dictionaries/prj/tests.mk
@@ -25,4 +25,4 @@
 ALLTAR : test1
 test1 .PHONY :
 		@echo Validating thesaurus file
-		$(COMMAND_ECHO) ..$/util$/th_check.pl *.dat
+		$(COMMAND_ECHO) PERL_UNICODE=0 ..$/util$/th_check.pl *.dat
-- 
1.7.9.1



Re: [PATCH] fix building dictionaries when PERL_UNICODE environment

2012-02-20 Thread Tim Retout
Hi!

On 18 February 2012 22:29, Petr Vorel petr.vo...@gmail.com wrote:
> I run into troubles while building dictionaries, because I have on my system
> set PERL_UNICODE=SDL (perl script
> clone/dictionaries/dictionaries/util/th_check.pl dies as it's forced to use
> UTF-8, but not all dictionaries are in this encoding).

Interesting, I'd never seen this environment variable before. :)

First, setting this to SDL emulates the behaviour of Perl 5.8.0 -
this was deliberately fixed in 5.8.1 (see the perlrun and perl581delta
docs), so I would be careful about putting it in your environment.
It's likely to break more than just LibreOffice.

I am tempted to push your patch as a temporary fix, but there must be
a better way to handle this long-term... I'll check that I can
reproduce the problem, first.

As an idea, would it be possible to identify and re-encode the
dictionaries themselves as UTF-8?  After which we could set the
encoding of the filehandles explicitly in th_check.pl.
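
A minimal Python sketch of that re-encoding step, assuming the source encoding of each file is already known (identifying it is the harder part and is not attempted here). The function name and the sample file name are hypothetical.

```python
import os
import tempfile


def reencode_to_utf8(path, source_encoding):
    """Rewrite a text file in place as UTF-8, given its current encoding."""
    # newline="" keeps the file's line endings exactly as they were.
    with open(path, encoding=source_encoding, newline="") as src:
        text = src.read()
    with open(path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


def demo():
    # Hypothetical one-line file stored as ISO8859-1.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.dat")
        with open(path, "wb") as raw:
            raw.write("café\n".encode("iso8859-1"))
        reencode_to_utf8(path, "iso8859-1")
        with open(path, "rb") as raw:
            return raw.read()


if __name__ == "__main__":
    print(demo())
```

Once every file is known to be UTF-8, the reader can name that encoding explicitly when opening it instead of depending on PERL_UNICODE. Any encoding declaration stored inside a file would also need updating, which this sketch does not do.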

Kind regards,

-- 
Tim Retout t...@retout.co.uk


[PATCH] fix building dictionaries when PERL_UNICODE environment

2012-02-18 Thread Petr Vorel

Hi,

I ran into trouble while building dictionaries, because I have
PERL_UNICODE=SDL set on my system (the perl script
clone/dictionaries/dictionaries/util/th_check.pl dies as it's forced to
use UTF-8, but not all dictionaries are in this encoding).
Not sure whether it's a good idea to handle it. If yes, here is a working
solution (I'm sure there must be a better place to reset the variable).


Petr
From 6877bd81333f6373fb80d66e6d6479a40e2294f5 Mon Sep 17 00:00:00 2001
From: Petr Vorel petr.vo...@gmail.com
Date: Sat, 18 Feb 2012 23:19:22 +0100
Subject: [PATCH] fix building dictionaries when PERL_UNICODE environment
 variable is set

---
 Makefile |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index cb99243..7d619e2 100644
--- a/Makefile
+++ b/Makefile
@@ -267,7 +267,7 @@ define dmake_module_rules
 
 $(1): bootstrap fetch
	cd $(1) && unset MAKEFLAGS && \
-$(SOLARENV)/bin/build.pl -P$(BUILD_NCPUS) -- -P$(GMAKE_PARALLELISM)
+PERL_UNICODE=0 $(SOLARENV)/bin/build.pl -P$(BUILD_NCPUS) -- -P$(GMAKE_PARALLELISM) # PERL_UNICODE breaks th_check.pl during validating thesaurus files
 
 $(1).all: bootstrap fetch
	cd $(1) && unset MAKEFLAGS && \
-- 
1.7.9
