Re: A codec moment or pickle
heh, I just don't think that's the typical case. It's definitely extreme. Even so, in many cases using the filesystem (properly warmed) with compression might still be better. It depends on how you are measuring latency. Storing your whole index uncompressed in gigabytes of heap RAM on a huge heap has consequences too.

On Thu, Feb 12, 2015 at 4:52 PM, Benson Margulies wrote:
> WHOOPS.
>
> First sentence was, until just before I clicked 'send',
>
> "Hardware has .5T of RAM. Index is relatively small (20g) ..."

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Re: A codec moment or pickle
WHOOPS.

First sentence was, until just before I clicked 'send',

"Hardware has .5T of RAM. Index is relatively small (20g) ..."

On Thu, Feb 12, 2015 at 4:51 PM, Benson Margulies wrote:
> Robert,
>
> Let me lay out the scenario.
>
> Hardware has .5T of Index is relatively small. Application profiling
> shows a significant amount of time spent codec-ing.
Re: A codec moment or pickle
Robert,

Let me lay out the scenario.

Hardware has .5T of Index is relatively small. Application profiling shows a significant amount of time spent codec-ing.

Options as I see them:

1. Use DPF (DirectPostingsFormat), complete with the irritation of having this spurious codec name in the on-disk format when it has nothing to do with the on-disk format.
2. 'Officially' use the standard codec, and then use something like AOP to intercept and encapsulate it with the DPF or something else like it -- essentially, a do-it-myself alternative to convincing the community here that this is a use case worthy of support.
3. Find some way to move a significant amount of the data in question out of Lucene altogether into something else that fits nicely with filling memory with a cache, so that the amount of codec-ing drops below the threshold of interest.
Re: A codec moment or pickle
On Thu, Feb 12, 2015 at 8:51 AM, Benson Margulies wrote:
> I understand this logic. It leaves me wandering between:
>
> 1: My old desire to convince you that there should be a way to do
> DirectPostingsFormat's caching without being a codec at all. Unfortunately,
> I got dragged away from the benchmarking that might have been persuasive.

Honestly, benchmarking won't persuade me. I think this is a trap, and I don't want more of these traps. We already have RAMDirectory(Directory other), which is this exact same trap. We don't need more duplicates of it.

But this Direct, man oh man, is even worse by far, because it uses 32 and 64 bits for things that should typically need only about 8 bits with compression, so it just hogs RAM. There isn't a benchmark on this planet that can convince me it should get any higher status. On the contrary, I want to send it into a deep dark dungeon in Siberia.
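To put rough numbers on the 32-bit versus 8-bit point, here is a minimal sketch that stores sorted doc IDs as variable-byte gaps and compares the size with a plain int array. The class GapEncodingSketch and its methods are hypothetical names made up for illustration. Lucene's real postings formats use more elaborate block packing, so this only shows the order of magnitude, not what any codec actually writes.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration; not part of Lucene. */
class GapEncodingSketch {

    /** Writes v with 7 payload bits per byte; the high bit means "more bytes follow". */
    static void writeVInt(ByteArrayOutputStream out, int v) {
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }

    /** Bytes needed when sorted doc IDs are stored as variable-byte gaps. */
    static int compressedBytes(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int id : sortedDocIds) {
            writeVInt(out, id - previous); // small gaps fit in a single byte
            previous = id;
        }
        return out.size();
    }

    /** Bytes needed when every doc ID is held as a 32-bit int. */
    static int rawBytes(int[] sortedDocIds) {
        return sortedDocIds.length * Integer.BYTES;
    }
}
```

For a dense term whose gaps all stay below 128, each posting costs one byte instead of four, which is the kind of ratio the message above is pointing at.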
RE: A codec moment or pickle
Hi,

FYI, this is the same issue that Locales have/had in ICU! If you try to render an error message in Locale's constructors, it breaks with an NPE, because the default Locale is not there yet... I think they implemented some "fallback" that is guaranteed to be there. But this would not help you either: you need the default Codec to be available at the time your custom codec is loaded... Same issue, no idea how to solve this.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Benson Margulies [mailto:ben...@basistech.com]
> Sent: Thursday, February 12, 2015 11:34 AM
> To: java-user@lucene.apache.org
> Subject: Re: A codec moment or pickle
>
> Based on reading the same comments you read, I'm pretty doubtful that
> Codec.getDefault() is going to work. It seems to me that this situation
> renders the FilterCodec a bit hard to use, at least given the 'every
> release deprecates a codec' sort of pattern.
Re: A codec moment or pickle
On Thu, Feb 12, 2015 at 8:43 AM, Robert Muir wrote:
> Honestly i dont agree. I don't know what you are trying to do, but if
> you want file format backwards compat working, then you need a
> different FilterCodec to match each lucene codec.
>
> Otherwise your codec is broken from a back compat standpoint. Wrapping
> the latest is an antipattern here.

I understand this logic. It leaves me wandering between:

1: My old desire to convince you that there should be a way to do DirectPostingsFormat's caching without being a codec at all. Unfortunately, I got dragged away from the benchmarking that might have been persuasive.

2: The problem of deprecation. I give someone a jar-of-code that works fine with Lucene 4.9. It does not work with 4.10. Now, maybe the answer here is that the codec deprecation is fundamental to the definition of moving from 4.9 to 4.10, so having a codec means that I'm really married to a process of making releases that mirror Lucene releases.
Re: A codec moment or pickle
Honestly, I don't agree. I don't know what you are trying to do, but if you want file format backwards compatibility to work, then you need a different FilterCodec to match each Lucene codec.

Otherwise your codec is broken from a back-compat standpoint. Wrapping the latest is an antipattern here.

On Thu, Feb 12, 2015 at 5:33 AM, Benson Margulies wrote:
> Based on reading the same comments you read, I'm pretty doubtful that
> Codec.getDefault() is going to work. It seems to me that this
> situation renders the FilterCodec a bit hard to use, at least given
> the 'every release deprecates a codec' sort of pattern.
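The back-compat argument above can be sketched with a toy model. The assumption behind it is that a segment records the name of the codec that wrote it and that the reader looks that name up again, so a wrapper whose name stays the same while its delegate changes will try to decode old data with the wrong format. Every class below (ToyFormatVersion, ToyPinnedCodec, ToyCodecRegistry) and every codec name is hypothetical, and none of this is Lucene API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for one on-disk format version; not a Lucene class. */
class ToyFormatVersion {
    final String tag;

    ToyFormatVersion(String tag) { this.tag = tag; }

    String encode(String text) { return tag + ":" + text; }

    String decode(String bytes) {
        if (!bytes.startsWith(tag + ":")) {
            throw new IllegalStateException("format mismatch, expected " + tag);
        }
        return bytes.substring(tag.length() + 1);
    }
}

/** A wrapper codec that is tied to exactly one format version. */
class ToyPinnedCodec {
    final String name;
    final ToyFormatVersion delegate;

    ToyPinnedCodec(String name, ToyFormatVersion delegate) {
        this.name = name;
        this.delegate = delegate;
    }
}

/** Looks codecs up by the name recorded alongside the data. */
class ToyCodecRegistry {
    private final Map<String, ToyPinnedCodec> byName = new HashMap<>();

    void register(ToyPinnedCodec codec) { byName.put(codec.name, codec); }

    String read(String recordedCodecName, String bytes) {
        return byName.get(recordedCodecName).delegate.decode(bytes);
    }
}
```

With one wrapper per version (say "MyFilter49" and "MyFilter410"), data written under the older name still finds the older delegate. With a single "MyFilter" that always wraps the newest version, the same lookup reaches a delegate that cannot decode the old data.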
Re: A codec moment or pickle
Based on reading the same comments you read, I'm pretty doubtful that Codec.getDefault() is going to work. It seems to me that this situation renders the FilterCodec a bit hard to use, at least given the 'every release deprecates a codec' sort of pattern.

On Thu, Feb 12, 2015 at 3:20 AM, Uwe Schindler wrote:
> Hi,
>
> How about Codec.getDefault()?
> [...]
> So relying on Codec.getDefault() in constructors of filter codecs may not
> work as expected!
RE: A codec moment or pickle
Hi,

How about Codec.getDefault()? It does not necessarily return the newest one (if somebody changes the default using Codec.setDefault()), but for your use case, "wrapping the current default one", it should be fine?

I have not tried this yet, but there might be a chicken-and-egg problem:
- Your codec will have a separate name and be listed in META-INF as a service (I assume this). So it gets discovered by the Codec discovery process and is instantiated by that.
- On loading the Codec framework, the call to Codec.getDefault() might happen at a time when the codecs are not yet fully initialized (because the framework instantiates your codec while loading META-INF). This happens before the Codec class itself is fully statically initialized, so the default codec might be null...

So relying on Codec.getDefault() in constructors of filter codecs may not work as expected!

Maybe try it out, was just an idea :-)

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Benson Margulies [mailto:bimargul...@gmail.com]
> Sent: Thursday, February 12, 2015 2:11 AM
> To: java-user@lucene.apache.org
> Subject: A codec moment or pickle
>
> I have a class that extends FilterCodec. Written against Lucene 4.9, it uses
> the Lucene49Codec.
>
> Dropped into a copy of Solr with Lucene 4.10, it discovers that this codec is
> read-only in 4.10. Is there some way to code one of these to get 'the default
> codec' and not have to chase versions?
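The chicken-and-egg ordering described above can be reproduced in plain Java. This is a minimal sketch that assumes the registry instantiates all providers in a static initializer before it assigns its default. ToyCodec, ToyStandardCodec, ToyWrappingCodec and InitOrderDemo are hypothetical stand-ins, not Lucene classes, and the real loading mechanism may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a codec registry; this is not Lucene's Codec class. */
abstract class ToyCodec {
    // Static initializers run in textual order.
    // Step 1: instantiate every registered codec (stands in for the META-INF scan).
    private static final List<ToyCodec> LOADED = loadAll();
    // Step 2: only now is the default assigned.
    private static ToyCodec defaultCodec = LOADED.get(0);

    final String name;

    ToyCodec(String name) { this.name = name; }

    static ToyCodec getDefault() { return defaultCodec; }

    static ToyCodec forName(String name) {
        for (ToyCodec c : LOADED) {
            if (c.name.equals(name)) return c;
        }
        throw new IllegalArgumentException("unknown codec: " + name);
    }

    private static List<ToyCodec> loadAll() {
        List<ToyCodec> codecs = new ArrayList<>();
        codecs.add(new ToyStandardCodec());
        codecs.add(new ToyWrappingCodec()); // its constructor calls getDefault()
        return codecs;
    }
}

class ToyStandardCodec extends ToyCodec {
    ToyStandardCodec() { super("Standard"); }
}

class ToyWrappingCodec extends ToyCodec {
    final ToyCodec delegateSeenInConstructor;

    ToyWrappingCodec() {
        super("Wrapping");
        // Runs during step 1, while defaultCodec still holds its initial null.
        delegateSeenInConstructor = ToyCodec.getDefault();
    }
}

class InitOrderDemo {
    public static void main(String[] args) {
        ToyWrappingCodec w = (ToyWrappingCodec) ToyCodec.forName("Wrapping");
        System.out.println("seen in constructor: " + w.delegateSeenInConstructor); // prints "seen in constructor: null"
        System.out.println("default afterwards: " + ToyCodec.getDefault().name);   // prints "default afterwards: Standard"
    }
}
```

The wrapper is constructed while the registry class is still initializing, so it captures null even though the default is perfectly valid a moment later.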
A codec moment or pickle
I have a class that extends FilterCodec. Written against Lucene 4.9, it uses the Lucene49Codec.

Dropped into a copy of Solr with Lucene 4.10, it discovers that this codec is read-only in 4.10. Is there some way to code one of these to get 'the default codec' and not have to chase versions?