StandardTermsDictReader.java

Robert Muir Sun, 22 Nov 2009 13:20:27 -0800

On Sun, Nov 22, 2009 at 4:16 PM, Michael McCandless <[email protected]> wrote:


> On Sun, Nov 22, 2009 at 4:06 PM, Robert Muir <[email protected]> wrote:
> > I guess here is where I just say that Unicode and Java are optimized for
> > UTF-16 processing
>
> I agree, though leaving things as UTF-8 works fine for low-level stuff
> (sorting, comparing equality, etc.)?
>

+1
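
For illustration only, here is a minimal sketch of what sorting on raw UTF-8 bytes implies. It assumes terms are compared as unsigned bytes; the class and method names are hypothetical and are not taken from the flex branch. Unsigned UTF-8 byte order follows Unicode code point order, while Java's String.compareTo follows UTF-16 code unit order, so the two disagree once supplementary characters are involved:

```java
import java.nio.charset.StandardCharsets;

public class Utf8OrderSketch {
    // Lexicographic comparison of two byte arrays, treating each byte as unsigned.
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Compare two strings by their UTF-8 encodings (hypothetical helper).
    static int compareUtf8(String x, String y) {
        return compareUnsignedBytes(
            x.getBytes(StandardCharsets.UTF_8),
            y.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";         // U+FF5E, a single UTF-16 code unit
        String supp = "\uD800\uDC00";  // U+10000, encoded as a surrogate pair

        // UTF-8: EF BD 9E sorts before F0 90 80 80 (code point order).
        System.out.println(Integer.signum(compareUtf8(bmp, supp)));  // prints -1
        // UTF-16: 0xFF5E sorts after the lead surrogate 0xD800.
        System.out.println(Integer.signum(bmp.compareTo(supp)));     // prints 1
    }
}
```

Equality checks agree either way; only the relative order of BMP characters at or above U+E000 and supplementary characters changes.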


>
> > and so while I agree with byte[] being available in
> > places like this for flex indexing,
> > I'm already nervous about seeing code / optimizations that only work well
> > with Latin-1, and are very slow / buggy for anything else.
>
> Buggy we should clearly outright fix.
>
> Slower, maybe.  But very slow, I hope not?
>
> What places specifically are you worried about?
>

Places like AutomatonQuery, where I found myself wanting to consider the
option of processing byte[], even though I know this is very bad!
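
One possible reading of that concern, sketched as a toy example: this is not AutomatonQuery's code, and every name below is hypothetical. If a pattern's "any single symbol" step consumes one byte rather than one character, it behaves correctly for ASCII input and silently fails as soon as a character needs more than one UTF-8 byte:

```java
import java.nio.charset.StandardCharsets;

public class ByteVsCharMatchSketch {
    static final int ANY = -1;  // wildcard: matches exactly one symbol

    // Toy matcher: the input must have the same number of symbols as the
    // pattern, and each symbol must be equal unless the pattern has ANY.
    static boolean matches(int[] pattern, int[] input) {
        if (pattern.length != input.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (pattern[i] != ANY && pattern[i] != input[i]) {
                return false;
            }
        }
        return true;
    }

    // Symbols are the unsigned UTF-8 bytes of the string.
    static int[] utf8Symbols(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        int[] out = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = bytes[i] & 0xFF;
        }
        return out;
    }

    // Symbols are the Unicode code points of the string.
    static int[] codePointSymbols(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        int[] pattern = {'a', ANY, 'c'};  // roughly "a.c"

        System.out.println(matches(pattern, utf8Symbols("abc")));            // prints true
        System.out.println(matches(pattern, codePointSymbols("a\u00E9c")));  // prints true
        // U+00E9 is two bytes in UTF-8 (C3 A9), so the byte view has 4 symbols.
        System.out.println(matches(pattern, utf8Symbols("a\u00E9c")));       // prints false
    }
}
```

A byte-level matcher can be made correct, but only if the pattern itself is expressed in terms of valid UTF-8 byte sequences rather than reusing character-level transitions unchanged.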


> Mike
>


-- 
Robert Muir
[email protected]

Re: svn commit: r883088 - in /lucene/java/branches/flex_1458/src/java/org/apache/lucene/index: TermRef.java codecs/standard/StandardTermsDictReader.java
