I didn't know the bulk API was so important. Which bulk API do you mean (e.g. the postings one or the terms dict)?

On Mon, Aug 15, 2011 at 11:17 PM, Robert Muir <[email protected]> wrote:
> On Mon, Aug 15, 2011 at 10:49 PM, Mark Miller <[email protected]> wrote:
>> Just throwing this out there, but:
>>
>> I think it would be really cool if we could get 4.0 out by the end of the
>> year.
>>
>> With such a large release, I think it would also make a lot of sense if we
>> tried a more formal beta release, just to increase the amount of usage
>> before we officially sign off on a final 4.0.
>>
>
> I agree with the beta idea, I think it's really necessary actually: we
> are just being honest at that point that it's a real point-zero
> release.
>
> On the other hand, besides the GSOC stuff, I think we should
> accomplish a few things first to ensure we can actually make the 4.x
> release useful and issue minor releases off of it:
> * fix the bulk API: otherwise we only have "flexible indexing, as long
> as you don't mind flexible == slower". This is really important. I
> don't think we have to implement a bunch of new compression algorithms,
> but the whole postings APIs are suboptimal, and biased towards
> Lucene's current format: the bulk APIs aren't low level enough to give
> good performance, the payloads APIs assume you can ask for a payload
> at any time (they assume basically that you are going to 'steal bits'
> from the positions like we do today), etc etc.
> * round out docvalues, especially merging with different docvalues
> types and things like that. Arguably these are nocommits... I think
> you will get an exception during merge? I also think it's bad we still
> don't use docvalues for norms nor the faceting module; fixing these
> kinds of real world uses is probably a great way to round this out.
> * figure out the packaging system for modules such that things like
> clover/hudson/javadocs etc all work across them (not quite today). We
> also need to look at all the minor things like CHANGES.txt and such...
> there are too many of these. Furthermore, at least I wanted the
> analyzers modularization to move forward to a point where we can
> remove the Version crap and you just use the old jar file. I don't
> feel like we are even close to that.
> * fix codec naming: I think it's silly to name a codec "Standard" and
> use the codec header for backwards compatibility; easier to name the
> codec "Standard40" and just package this codec in the next release for
> backwards compatibility, e.g. if we want to introduce a new index
> format we make it "Standard42". This is just my opinion though, it's
> not the only way to solve the index backwards compat here but I think
> it's easiest.
>
> I have a ton more pet peeves, but I think these are the biggest. It
> probably sounds like a lot but I think it's totally stupid to release
> 4.0 if we cannot "grow" the 4.x branch with 4.1, 4.2, etc while we
> work on 5.x. Otherwise we are just jumping from 4.0 to 5.0 and that's a
> sign we just shouldn't have released at all.
>
> --
> lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
