Re: A codec moment or pickle

2015-02-13 Thread Robert Muir
Heh, I just don't think that's the typical case. It's definitely extreme.

Even so, in many cases using the filesystem (properly warmed) with
compression might still be better. It depends on how you are measuring
latency. Storing your whole index in gigabytes of heap RAM without any
compression on a huge heap has consequences too.


-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: A codec moment or pickle

2015-02-12 Thread Benson Margulies
WHOOPS.

First sentence was, until just before I clicked 'send',

"Hardware has .5T of RAM. Index is relatively small  (20g) ..."





Re: A codec moment or pickle

2015-02-12 Thread Benson Margulies
Robert,

Let me lay out the scenario.

Hardware has .5T of Index is relatively small. Application profiling
shows a significant amount of time spent codec-ing.

Options as I see them:

1. Use DPF complete with the irritation of having to have this
spurious codec name in the on-disk format that has nothing to do with
the on-disk format.
2. 'Officially' use the standard codec, and then use something like
AOP to intercept and encapsulate it with the DPF or something else
like it -- essentially, a do-it-myself alternative to convincing the
community here that this is a use case worthy of support.
3. Find some way to move a significant amount of the data in question
out of Lucene altogether into something else which fits nicely
together with filling memory with a cache so that the amount of
codeccing drops below the threshold of interest.




Re: A codec moment or pickle

2015-02-12 Thread Robert Muir
On Thu, Feb 12, 2015 at 8:51 AM, Benson Margulies  wrote:
> On Thu, Feb 12, 2015 at 8:43 AM, Robert Muir  wrote:
>
>> Honestly I don't agree. I don't know what you are trying to do, but if
>> you want file format backwards compat working, then you need a
>> different FilterCodec to match each lucene codec.
>>
>> Otherwise your codec is broken from a back compat standpoint. Wrapping
>> the latest is an antipattern here.
>>
>
> I understand this logic. It leaves me wandering between:
>
> 1: My old desire to convince you that there should be a way to do
> DirectPostingsFormat's caching without being a codec at all. Unfortunately,
> I got dragged away from the benchmarking that might have been persuasive.

Honestly, benchmarking won't persuade me. I think this is a trap and I
don't want more of these traps.
We already have RAMDirectory(Directory other) which is this exact same
trap. We don't need more duplicates of it.
But this Direct, man oh man is it even worse by far, because it uses
32 and 64 bits for things that really should typically only be like 8
bits with compression, so it just hogs up RAM.
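
The size argument above can be illustrated with a small stdlib-only sketch. It is not the Direct postings format's code and not Lucene's actual encoder; the class name, method names, and the sample doc IDs are hypothetical. It assumes sorted doc IDs stored as deltas in a variable-byte encoding (7 payload bits per byte), and compares that with holding each ID in a fixed 32-bit int.

```java
import java.io.ByteArrayOutputStream;

public class DeltaWidthDemo {

    /** Variable-byte encode: 7 payload bits per byte, high bit set means "more bytes follow". */
    static void writeVarInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /** Bytes used when sorted IDs are stored as variable-byte deltas. */
    static int encodedSize(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int id : sortedDocIds) {
            writeVarInt(out, id - previous); // small gaps fit in a single byte
            previous = id;
        }
        return out.size();
    }

    /** Bytes used when every ID occupies a full 32-bit int. */
    static int plainSize(int[] sortedDocIds) {
        return sortedDocIds.length * Integer.BYTES;
    }

    public static void main(String[] args) {
        int[] ids = {3, 7, 20, 21, 100, 130}; // hypothetical postings list
        System.out.println("delta + varint bytes: " + encodedSize(ids));
        System.out.println("fixed 32-bit bytes:   " + plainSize(ids));
    }
}
```

Every gap in the sample list is below 128, so each one takes a single byte in the encoded form, while the fixed-width form spends four bytes per ID regardless of how small the gaps are.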

There isn't a benchmark on this planet that can convince me it should
get any higher status. On the contrary, I want to send it into a deep
dark dungeon in Siberia.




RE: A codec moment or pickle

2015-02-12 Thread Uwe Schindler
Hi,

FYI, this is the same issue that Locales have/had in ICU! If you try to render
an error message in Locale's constructors, this breaks with an NPE, because the
default Locale is not yet there... I think they implemented some "fallback"
that is guaranteed to be there.

But this would not help you either: you need the default Codec to be available at
the time your custom codec is loaded... Same issue, no idea how to solve this.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: A codec moment or pickle

2015-02-12 Thread Benson Margulies
On Thu, Feb 12, 2015 at 8:43 AM, Robert Muir  wrote:

> Honestly I don't agree. I don't know what you are trying to do, but if
> you want file format backwards compat working, then you need a
> different FilterCodec to match each lucene codec.
>
> Otherwise your codec is broken from a back compat standpoint. Wrapping
> the latest is an antipattern here.
>

I understand this logic. It leaves me wandering between:

1: My old desire to convince you that there should be a way to do
DirectPostingsFormat's caching without being a codec at all. Unfortunately,
I got dragged away from the benchmarking that might have been persuasive.

2: The problem of deprecation. I give someone a jar-of-code that works fine
with Lucene 4.9. It does not work with 4.10. Now, maybe the answer here is
that the codec deprecation is fundamental to the definition of moving from
4.9 to 4.10, so having a codec means that I'm really married to a process
of making releases that mirror Lucene releases.






Re: A codec moment or pickle

2015-02-12 Thread Robert Muir
Honestly I don't agree. I don't know what you are trying to do, but if
you want file format backwards compat working, then you need a
different FilterCodec to match each lucene codec.

Otherwise your codec is broken from a back compat standpoint. Wrapping
the latest is an antipattern here.
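
One way to read this argument is as a toy model, sketched below with the standard library only. None of these types are Lucene's: `Format`, `TaggedFormat`, `Wrapper`, and the names "Wrapped", "Wrapped49", and "Wrapped410" are hypothetical. The assumption is that each stored segment records the name of the codec that wrote it and that reading resolves the codec by that name.

```java
import java.util.HashMap;
import java.util.Map;

public class PinnedWrapperDemo {

    /** Hypothetical stand-in for an on-disk format; not a Lucene type. */
    interface Format {
        String write(String value);
        String read(String stored);
    }

    /** A format that tags what it writes and refuses anything tagged differently. */
    static final class TaggedFormat implements Format {
        private final String tag;
        TaggedFormat(String tag) { this.tag = tag; }

        @Override public String write(String value) { return tag + ":" + value; }

        @Override public String read(String stored) {
            String prefix = tag + ":";
            if (!stored.startsWith(prefix)) {
                throw new IllegalStateException("format " + tag + " cannot read " + stored);
            }
            return stored.substring(prefix.length());
        }
    }

    /** A wrapper that forwards reads and writes to its delegate unchanged. */
    static final class Wrapper implements Format {
        private final Format delegate;
        Wrapper(Format delegate) { this.delegate = delegate; }
        @Override public String write(String value) { return delegate.write(value); }
        @Override public String read(String stored) { return delegate.read(stored); }
    }

    /** Reading looks the format up by the name recorded with the segment. */
    static String readSegment(Map<String, Format> byName, String recordedName, String stored) {
        return byName.get(recordedName).read(stored);
    }

    /** One wrapper name whose delegate follows the newest format: old data no longer reads. */
    static boolean floatingNameBreaks() {
        String stored = new Wrapper(new TaggedFormat("v49")).write("doc"); // written by the old release
        Map<String, Format> newRelease = new HashMap<>();
        newRelease.put("Wrapped", new Wrapper(new TaggedFormat("v410")));
        try {
            readSegment(newRelease, "Wrapped", stored);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** One wrapper name per wrapped format: the old name still resolves to the old format. */
    static String pinnedNamesRead() {
        String stored = new Wrapper(new TaggedFormat("v49")).write("doc");
        Map<String, Format> newRelease = new HashMap<>();
        newRelease.put("Wrapped49", new Wrapper(new TaggedFormat("v49")));
        newRelease.put("Wrapped410", new Wrapper(new TaggedFormat("v410")));
        return readSegment(newRelease, "Wrapped49", stored);
    }

    public static void main(String[] args) {
        System.out.println("floating name breaks: " + floatingNameBreaks());
        System.out.println("pinned name reads: " + pinnedNamesRead());
    }
}
```

In this model the wrapper pinned to a specific format keeps old data readable after an upgrade, at the cost of shipping one wrapper per wrapped format.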





Re: A codec moment or pickle

2015-02-12 Thread Benson Margulies
Based on reading the same comments you read, I'm pretty doubtful that
Codec.getDefault() is going to work. It seems to me that this
situation renders the FilterCodec a bit hard to use, at least given
the 'every release deprecates a codec' sort of pattern.






RE: A codec moment or pickle

2015-02-12 Thread Uwe Schindler
Hi,

How about Codec.getDefault()? It does indeed not necessarily return the newest 
one (if somebody changes the default using Codec.setDefault()), but for your 
use case "wrapping the current default one", it should be fine?

I have not tried this yet, but there might be a chicken-egg problem:
- Your codec will have a separate name and be listed in META-INF as service (I 
assume this). So it gets discovered by the Codec discovery process and is 
instantiated by that.
- On loading the Codec framework, the call to Codec.getDefault() might come at
a time when the codecs are not yet fully initialized (because it will
instantiate your codec while loading the META-INF). This happens before the
Codec class itself is fully statically initialized, so the default codec might
be null...
So relying on Codec.getDefault() in constructors of filter codecs may not work
as expected!
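
The ordering problem described above can be reproduced without Lucene. The following is a minimal stdlib-only sketch, not Lucene's actual classes: `Plugin`, `Builtin`, and `Wrapper` are hypothetical stand-ins for the codec base class, the built-in default, and a filter codec. It assumes that discovery runs in a static initializer that comes before the assignment of the default.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {

    /** Hypothetical stand-in for a codec base class with a static default. */
    static abstract class Plugin {
        final String name;
        Plugin(String name) { this.name = name; }

        // Static initializers run in textual order: discovery first, default second.
        static final List<Plugin> DISCOVERED = discover();
        static Plugin DEFAULT = new Builtin();

        static Plugin getDefault() { return DEFAULT; }

        private static List<Plugin> discover() {
            List<Plugin> found = new ArrayList<>();
            found.add(new Wrapper()); // runs while DEFAULT is still unassigned
            return found;
        }
    }

    static final class Builtin extends Plugin {
        Builtin() { super("Builtin"); }
    }

    /** Stand-in for a filter codec that asks for the default in its constructor. */
    static final class Wrapper extends Plugin {
        final Plugin delegate;
        Wrapper() {
            super("Wrapper");
            this.delegate = Plugin.getDefault();
        }
    }

    /** True if the wrapper captured a null delegate during discovery. */
    static boolean wrapperSawNullDefault() {
        return ((Wrapper) Plugin.DISCOVERED.get(0)).delegate == null;
    }

    public static void main(String[] args) {
        System.out.println("wrapper delegate is null: " + wrapperSawNullDefault());
        System.out.println("default after init: " + Plugin.getDefault().name);
    }
}
```

In this sketch the wrapper is constructed while the base class is still initializing, so it captures a null delegate even though the default is available to any caller that asks after initialization has finished.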

Maybe try it out, was just an idea :-)

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





A codec moment or pickle

2015-02-11 Thread Benson Margulies
I have a class that extends FilterCodec. Written against Lucene 4.9,
it uses the Lucene49Codec.

Dropped into a copy of Solr with Lucene 4.10, it discovers that this
codec is read-only in 4.10. Is there some way to code one of these to
get 'the default codec' and not have to chase versions?
