Hi Jack, since I am new to Solr, could you explain these two things that you said:
1) When most people say "index size" they are referring to all fields, collectively, not individual fields. (What do you mean by "Segments are on a per-field basis", and by all fields versus individual fields?)

2) More cores might make the worst-case scenario worse, since it will maximize the amount of data processed at a given moment.

2013/4/13 Erick Erickson <erickerick...@gmail.com>

> bq: disk space is three times
>
> True, I keep forgetting about compound since I never use it...
>
> On Wed, Apr 10, 2013 at 11:05 AM, Walter Underwood
> <wun...@wunderwood.org> wrote:
> > Correct, except the worst case maximum for disk space is three times.
> > --wunder
> >
> > On Apr 10, 2013, at 6:04 AM, Erick Erickson wrote:
> >
> >> You're mixing up disk and RAM requirements when you talk
> >> about having twice the disk size. Solr does _NOT_ require
> >> twice the index size of RAM to optimize, it requires twice
> >> the size on _DISK_.
> >>
> >> In terms of RAM requirements, you need to create an index,
> >> run realistic queries at the installation and measure.
> >>
> >> Best
> >> Erick
> >>
> >> On Tue, Apr 9, 2013 at 10:32 PM, bigjust <bigj...@lambdaphil.es> wrote:
> >>>
> >>>>> On 4/9/2013 7:03 PM, Furkan KAMACI wrote:
> >>>>>> These are really good metrics for me:
> >>>>>> You say that RAM size should be at least index size, and it is
> >>>>>> better to have a RAM size twice the index size (because of worst
> >>>>>> case scenario).
> >>>>>> On the other hand let's assume that I have a RAM size that is
> >>>>>> bigger than twice of indexes at machine. Can Solr use that extra
> >>>>>> RAM or is it a approximately maximum limit (to have twice size of
> >>>>>> indexes at machine)?
> >>>>> What we have been discussing is the OS cache, which is memory that
> >>>>> is not used by programs. The OS uses that memory to make everything
> >>>>> run faster. The OS will instantly give that memory up if a program
> >>>>> requests it.
> >>>>>
> >>>>> Solr is a java program, and java uses memory a little differently,
> >>>>> so Solr most likely will NOT use more memory when it is available.
> >>>>> In a "normal" directly executable program, memory can be allocated
> >>>>> at any time, and given back to the system at any time.
> >>>>> With Java, you tell it the maximum amount of memory the program is
> >>>>> ever allowed to use. Because of how memory is used inside Java,
> >>>>> most long-running Java programs (like Solr) will allocate up to the
> >>>>> configured maximum even if they don't really need that much memory.
> >>>>> Most Java virtual machines will never give the memory back to the
> >>>>> system even if it is not required.
> >>>>> Thanks,
> >>>>> Shawn
> >>>>>
> >>> Furkan KAMACI <furkankam...@gmail.com> writes:
> >>>
> >>>> I am sorry but you said:
> >>>>
> >>>> *you need enough free RAM for the OS to cache the maximum amount of
> >>>> disk space all your indexes will ever use*
> >>>>
> >>>> I have made an assumption my indexes at my machine. Let's assume that
> >>>> it is 5 GB. So it is better to have at least 5 GB RAM? OK, Solr will
> >>>> use RAM up to how much I define it as a Java processes. When we think
> >>>> about the indexes at storage and caching them at RAM by OS, is that
> >>>> what you talk about: having more than 5 GB - or - 10 GB RAM for my
> >>>> machine?
> >>>>
> >>>> 2013/4/10 Shawn Heisey <s...@elyograg.org>
> >>>>
> >>>
> >>> 10 GB. Because when Solr shuffles the data around, it could use up to
> >>> twice the size of the index in order to optimize the index on disk.
> >>>
> >>> -- Justin
> >
> > --
> > Walter Underwood
> > wun...@wunderwood.org
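
For reference, the rules of thumb quoted in this thread can be put into numbers. This is only a rough sketch: the function name, the dictionary keys, and the 2 GB heap are made up for illustration and are not part of Solr. As Erick points out, real RAM needs have to be measured with realistic queries.

```python
def estimate_resources(index_gb, heap_gb):
    """Back-of-the-envelope figures from the thread's rules of thumb.

    index_gb: size of all indexes on disk (assumed known).
    heap_gb:  maximum Java heap given to Solr via -Xmx (an assumed value).
    """
    return {
        # Optimizing rewrites the index, so disk use can reach twice its size.
        "disk_optimize_gb": 2 * index_gb,
        # Worst case mentioned in the thread: three times the index size.
        "disk_worst_case_gb": 3 * index_gb,
        # RAM: the Java heap plus enough free memory for the OS to cache
        # the whole index. This is an upper-end target, not a requirement.
        "ram_for_full_caching_gb": heap_gb + index_gb,
    }


if __name__ == "__main__":
    # The 5 GB index from the question, with a hypothetical 2 GB heap.
    print(estimate_resources(index_gb=5, heap_gb=2))
```

The factor of two (or three) applies to disk space during an optimize, not to RAM. The RAM figure is simply heap plus index size, so that the OS cache can hold the entire index.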