RE: failure in the Russian Analyzer in contrib

Vanlerberghe, Luc Fri, 11 Feb 2005 11:12:52 -0800

Yep, that fixed it!
After an svn update, the files on my machine have the same size as
yours.
All the tests in contrib\analyzers pass now.


Thanks,

Luc.

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
Sent: Friday, 11 February 2005 19:25
To: Lucene Developers List
Subject: Re: failure in the Russian Analyzer in contrib

On Feb 11, 2005, at 11:19 AM, Vanlerberghe, Luc wrote:
> I'm suspecting subversion now: the stemsUnicode.txt and
> wordsUnicode.txt files are encoded in UTF-16 (they have the proper
> two-byte byte-order prefix) and have property svn:eol-style set to
> native.
> On my (Windows :( )system the files are 904424 and 1101164 bytes long 
> and are full of "0d 0a 00" byte sequences which in unicode should 
> probably just be "0a 00" or "0d 00 0a 00".
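
The suspected corruption can be reproduced in a few lines. This is only a sketch: it assumes the end-of-line conversion works byte by byte, replacing every 0a with 0d 0a without regard to the UTF-16 encoding. The helper name naive_native_eol and the sample text are made up for illustration and are not part of Subversion.

```python
def naive_native_eol(data: bytes) -> bytes:
    """Hypothetical byte-level LF -> CRLF conversion that ignores the encoding."""
    return data.replace(b"\n", b"\r\n")


# Sample text (made up); little-endian UTF-16 stores "\n" as the pair 0a 00.
text = "стем\nслово\n"
original = text.encode("utf-16-le")

# After the naive conversion each line ending becomes 0d 0a 00,
# so the file grows by exactly one byte per line.
damaged = naive_native_eol(original)

print(len(original), len(damaged))
print(b"\x0d\x0a\x00" in original, b"\x0d\x0a\x00" in damaged)
```

The extra 0d byte also shifts everything after it by one byte, so a UTF-16 reader no longer sees the code units on their original boundaries.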

My files have these sizes:

$ ls -l
total 3608
-rw-r--r--  1 erik  erik   805080 11 Feb 08:30 stemsUnicode.txt
-rw-r--r--  1 erik  erik  1001820 11 Feb 08:30 wordsUnicode.txt
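
As a quick check on the sizes reported in this thread, the sketch below subtracts the sizes listed here from the Windows sizes quoted above. The dictionary names are just for illustration.

```python
# Sizes in bytes as reported in this thread.
windows_sizes = {"stemsUnicode.txt": 904424, "wordsUnicode.txt": 1101164}
unix_sizes = {"stemsUnicode.txt": 805080, "wordsUnicode.txt": 1001820}

# Extra bytes present in the Windows working copy, per file.
extra = {name: windows_sizes[name] - unix_sizes[name] for name in unix_sizes}
print(extra)  # {'stemsUnicode.txt': 99344, 'wordsUnicode.txt': 99344}
```

Both files are larger by the same 99344 bytes, which is consistent with one extra 0d byte per line ending, assuming the two files have the same number of lines (the thread does not state the line counts).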

> Is there a way to do a svn update --raw or something that I can check 
> this?

No, svn doesn't have this type of switch.

> If this is indeed the problem, a possible fix would be to set the
> svn:eol-style to LF or else let svn know that the file is in Unicode
> (perhaps setting the svn:mime-type property to something other than
> the default?)

I have set the svn:eol-style property to LF on both of those files.  
Let me know if that fixes the issue.

        Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

