Thanks Erick.

On 7/21/16 8:25 AM, Erick Erickson wrote:
bq: map index files so "reading from disk" will be as simple and quick
as reading from memory hence would not incur any significant
performance degradation.

Well, if
1> the read has already been done. The first time a page of the file is
accessed, it must be read from disk.
2> You have enough physical memory that _all_ of the files can be held
in memory at once.

<2> is a little tricky since the big slowdown eventually comes from
swapping. But in an LRU scheme, that may be OK if the oldest pages
are the stored=true data, which are accessed only to return the top N,
not to satisfy the search.
I suspect swapping as well. But, for my understanding: are the index files memory-mapped automatically at startup time?
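For context: Lucene's MMapDirectory (which NRTCachingDirectoryFactory typically delegates to on 64-bit platforms) maps index files when a segment is opened, but the mapping itself reads nothing — pages are faulted in from disk only on first access, and later accesses hit the OS page cache. A minimal Python sketch of that OS-level mechanism (not Solr code; the file here is just a stand-in for an index segment):

```python
import mmap
import os
import tempfile

# Create a small stand-in for an index file.
path = os.path.join(tempfile.mkdtemp(), "segment.dat")
with open(path, "wb") as f:
    f.write(b"x" * 4096 * 4)  # four OS pages

with open(path, "rb") as f:
    # Mapping is cheap: only address space is reserved, no data is read yet.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # The first touch of a page causes a page fault and a disk read;
    # later touches of the same page are served from the OS page cache.
    first = mm[0:10]
    mm.close()

print(first)  # b'xxxxxxxxxx'
```

So nothing is "loaded at startup"; the cache warms as queries touch pages, which is why the first queries after a restart (or after a new searcher opens) are the slow ones.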

What are your QTimes anyway? Define "optimal"....

I'd really push back on this statement: "We have a requirement to have
updates available immediately (NRT)". Truly? You can't set
expectations that 5 seconds will be needed (or 10?). Often this is an
artificial requirement that does no real service to the user, it's
just something people think they want. If this means you're sending a
commit after every document, it's actually a really bad practice
that'll get you into trouble eventually. Plus you won't be able to do
any autowarming, which reads data from disk into OS memory and
smooths out any spikes.

We are not performing a "commit" after every update; here is the configuration for softCommit and hardCommit:

<autoCommit>
       <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
       <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
       <maxTime>${solr.autoSoftCommit.maxTime:120000}</maxTime>
</autoSoftCommit>
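For what the configuration above implies: the hard commit (openSearcher=false) only flushes segments and rolls the transaction log every 15 s; visibility of updates is governed by the soft commit, which opens a new searcher. A quick sketch of the bounds (values copied from the config defaults above):

```python
# Latencies implied by the autocommit defaults above.
hard_commit_ms = 15_000    # solr.autoCommit.maxTime: flush + tlog rollover
soft_commit_ms = 120_000   # solr.autoSoftCommit.maxTime: opens a new searcher

# Updates become searchable only when a new searcher opens,
# i.e. at the next soft commit.
worst_case_visibility_s = soft_commit_ms / 1000
print(worst_case_visibility_s)  # 120.0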

I am seeing QTimes (for searches) swing between 2 and 10 seconds. With debug=true, some queries showed slowness caused by faceting. Since we adjusted indexing, facet times have improved, but the basic query QTime is still high, so I am wondering where else to look. Is there a way to debug (instrument) a query on a Solr node?
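One place to start is Solr's debug parameter: debug=timing returns per-component timings (query, facet, highlight, etc.) in the response, and shards.info=true adds per-shard QTime in SolrCloud, which helps spot a single slow replica. A sketch of such a request URL (host, port, and collection name are placeholders for your deployment):

```python
from urllib.parse import urlencode

# Hypothetical host/collection; adjust for your deployment.
base = "http://localhost:8983/solr/mycollection/select"
params = {
    "q": "*:*",
    "rows": 10,
    "debug": "timing",      # per-component timings: query, facet, highlight, ...
    "shards.info": "true",  # per-shard QTime breakdown in SolrCloud
}
url = base + "?" + urlencode(params)
print(url)
```

Fetching that URL with curl (or any HTTP client) and comparing the "timing" section across slow and fast runs usually shows whether the time goes to the query itself or to a single component.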


FWIW,
Erick

On Thu, Jul 21, 2016 at 8:18 AM, Rallavagu <rallav...@gmail.com> wrote:
Solr 5.4.1 with embedded jetty with cloud enabled

We have a Solr deployment (approximately 3 million documents) with both
write and search operations happening. We have a requirement to have updates
available immediately (NRT). It is configured with the default
"solr.NRTCachingDirectoryFactory" for the directory factory. Considering the
fact that every time there is an update, caches are invalidated and re-built,
I assume that "solr.NRTCachingDirectoryFactory" would memory-map index files
so "reading from disk" will be as simple and quick as reading from memory and
hence would not incur any significant performance degradation. Am I right in
my assumption? We have allocated a significant amount of RAM (48G total
physical memory, 12G heap; total index size on disk is 15G) but I am not sure
if I am seeing optimal QTimes (for searches). Any inputs are welcome. Thanks
in advance.
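As a sanity check on those numbers, a back-of-envelope budget (assuming the Solr heap is the only other large consumer of RAM on the box):

```python
# Back-of-envelope memory budget from the numbers above.
physical_gb = 48  # total RAM
heap_gb = 12      # Solr JVM heap
index_gb = 15     # on-disk index size

# Roughly what the OS has left for the page cache (ignoring other processes).
os_cache_gb = physical_gb - heap_gb
fits = index_gb <= os_cache_gb
print(os_cache_gb, fits)  # 36 True
```

So once warmed, the entire index should fit in the page cache with room to spare; if swapping is still observed, something else on the host is likely consuming memory.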
