improved locality of "near" documents could be used to avoid loading some
segments during the retrieval phase for certain use cases (e.g. spatial
search).


On Wed, Nov 16, 2016 at 09:45, Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

http://shaierera.blogspot.com/2013/04/index-sorting-with-lucene.html
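
To make the compression side of this concrete, here is a minimal sketch of why contiguous placement of "similar" documents can matter when stored data is compressed chunk by chunk. It is not Lucene's codec: it uses plain java.util.zip.Deflater, and the class name, the 8-documents-per-chunk size, and the synthetic corpus (8 hypothetical sources, each sharing 2000 characters of boilerplate) are all assumptions made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical demo class; not part of Lucene.
class LocalityCompressionDemo {

    // Deflate one chunk independently and return the compressed size in bytes.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Split the documents into fixed-size chunks (in their given order),
    // compress each chunk on its own, and sum the compressed sizes.
    static int totalCompressedSize(List<String> docs, int docsPerChunk) {
        int total = 0;
        for (int start = 0; start < docs.size(); start += docsPerChunk) {
            int end = Math.min(start + docsPerChunk, docs.size());
            String chunk = String.join("\n", docs.subList(start, end));
            total += deflatedSize(chunk.getBytes(StandardCharsets.UTF_8));
        }
        return total;
    }

    // Pseudo-random lowercase text, standing in for per-source boilerplate.
    static String randomText(Random random, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + random.nextInt(26)));
        }
        return sb.toString();
    }

    // Same set of documents either way; only the order differs.
    // grouped = true  : all documents of one source are adjacent.
    // grouped = false : sources are interleaved round-robin.
    static List<String> buildCorpus(boolean grouped) {
        int sources = 8;
        int docsPerSource = 8;
        Random random = new Random(42); // fixed seed so both orders share content
        String[] boilerplate = new String[sources];
        for (int s = 0; s < sources; s++) {
            boilerplate[s] = randomText(random, 2000);
        }
        List<String> docs = new ArrayList<>();
        int outer = grouped ? sources : docsPerSource;
        int inner = grouped ? docsPerSource : sources;
        for (int i = 0; i < outer; i++) {
            for (int j = 0; j < inner; j++) {
                int s = grouped ? i : j;
                int d = grouped ? j : i;
                docs.add(boilerplate[s] + " doc " + d);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        int grouped = totalCompressedSize(buildCorpus(true), 8);
        int interleaved = totalCompressedSize(buildCorpus(false), 8);
        System.out.println("grouped bytes:     " + grouped);
        System.out.println("interleaved bytes: " + interleaved);
    }
}
```

In this toy setup the grouped order puts the repeated boilerplate inside a single chunk, where the compressor can reference it, while the interleaved order spreads it across chunks that are compressed separately. How much of this carries over to a real index depends on the codec's chunking and on how similar the documents actually are.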

On Wed, Nov 16, 2016 at 11:15 AM, Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Can IndexSort help here?
> ------------------------------
> From: Erick Erickson <erickerick...@gmail.com>
> Sent: 11/16/2016 9:29
> To: java-user <java-user@lucene.apache.org>
> Subject: Re: Possible to cause documents to be contiguous after
> forceMerge?
>
> Well, codecs are pluggable, so if you can show that you'd get
> an improvement (however you measure it) and that whatever
> you have in mind wouldn't penalize the general case, you could
> submit it as a proposal/patch.
>
> Best,
> Erick
>
> On Tue, Nov 15, 2016 at 6:21 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> > On Tue, Nov 15, 2016 at 6:16 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> You can make no assumptions about locality in terms of where separate
> >> documents land on disk. I suppose if you have the whole corpus at index
> >> time you could index these "similar" documents contiguously. T
> >>
> >
> > Wow... that's shockingly frightening. There are a ton of optimizations if
> > you can trick the underlying content store into preserving locality.
> >
> > Not trying to be overly negative, so another way to phrase it is that at
> > least there's room for improvement!
> >
> >
> >> My base question is why you'd care about compressing 500G. Disk space
> >> is so cheap that the expense of trying to control this dwarfs any
> >> imaginable $avings, unless you're talking about a lot of 500G indexes.
> >> In other words this seems like an XY problem, you're asking about
> >> compressing when you are really concerned with something else.
> >>
> >
> > 500GB per day... additionally, disk is cheap, but IOPS are not. The more
> > we can keep in RAM and on SSD the better.
> >
> > And we're trying to get as much in RAM, then SSD, as possible... plus we
> > have about 2 years of content.  It adds up ;)
> >
> > Kevin
>
