Re: Solr 4.2.1 higher memory footprint vs Solr 3.5

2013-06-05 Thread SandeepM
/So we see the jagged edge waveform which keeps climbing (GC cycles don't
completely collect memory over time).  Our test has a short capture from
real traffic and we are replaying that via solrmeter./

Any idea why the memory climbs over time?  The GC should clean up after data
is shipped back.  Could there be a memory leak in SOLR?
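
One rough way to tell a growing live set from garbage that simply has not been collected yet is to look only at the low points of the sawtooth, ideally the heap usage right after full collections, and check whether that floor rises. The sketch below assumes heap-used samples taken at a fixed interval; the numbers are invented for illustration and are not measurements from this test, and the helper names are hypothetical.

```python
# Hypothetical heap-used samples in MB, e.g. read off JConsole at a fixed interval.
# These values are made up for illustration; they are not data from this thread.
samples = [900, 1400, 1900, 1000, 1500, 2000, 1150, 1650, 2150, 1300, 1800, 2300, 1450]


def post_gc_floors(values):
    """Return the local minima of the sawtooth (usage right after a collection)."""
    floors = []
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] <= values[i + 1]:
            floors.append(values[i])
    return floors


def slope(values):
    """Least-squares slope of the values against their index (needs >= 2 points)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


floors = post_gc_floors(samples)
growth = slope(floors)
print("post-GC floors (MB):", floors)
print("floor growth per GC cycle (MB):", growth)
```

A floor that stays flat over a long run would point at ordinary garbage waiting for collection, while a floor that keeps rising would suggest retained objects (caches still filling, or a leak) and is the case worth a heap dump.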

Appreciate any help.
Thanks.
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-2-1-higher-memory-footprint-vs-Solr-3-5-tp4067879p4068378.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4.2.1 higher memory footprint vs Solr 3.5

2013-06-04 Thread SandeepM
Thanks Eric and Shawn,

Your explanations help me understand where SOLR may be spending its time.
It sounds like compression can be a CPU and heap hog. (I'll try to confirm
this with the heap dumps.)

Initially we tried to keep the JVM heap size the same on both Solr 3.5 and
4.2.1, at around 3GB, which 3.5 handled well even under a 200QPS load.
Moving to 4.2.1 with the same heap size instantly killed the server.
Doubling the heap to 6GB did not help either.  We were seeing higher CPU and
higher heap usage.

We later reduced the cache sizes and increased the JVM heap to 8GB, and we
see an improvement.  But over time, heap utilization still climbs slowly as
the 200QPS test is allowed to run, which sometimes leads to the max heap
being exceeded, as seen in JConsole.  So we see the jagged edge waveform
which keeps climbing (GC cycles don't completely collect memory over time).
Our test has a short capture from real traffic and we are replaying that via
solrmeter.

Thanks.
Regards,
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-2-1-higher-memory-footprint-vs-Solr-3-5-tp4067879p4068150.html


Solr 4.2.1 higher memory footprint vs Solr 3.5

2013-06-03 Thread SandeepM
Hi,

I am using the same schema for both Solr 3.5 and Solr 4.2.1 and posting the
same data to both servers, and the memory requirements seem to have gone up
sharply during request handling in 4.2.1.
. Requests come in at around 200QPS.
. Document sizes are very large, but that did not seem to be a problem with
3.5 (lots of multivalued fields with large array lengths).
Could you help me understand what change in SOLR 4.2.1 would contribute to
this higher memory requirement?

Also, in a different test, I ran a single query, with no load, to get a list
of all unique IDs.  It completes in <500ms, but the time it takes to ship the
data back to the client seems to be very large.  Any idea what could be
causing this behavior?

Would appreciate any help.

Regards,
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-2-1-higher-memory-footprint-vs-Solr-3-5-tp4067879.html


RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-24 Thread SandeepM

One of our main concerns is that Solr returns the best match based on what it
thinks is best.  It uses the Levenshtein distance metric to determine the
best suggestions.  Can we tune this to put more weight on term frequency
(hits) than on the number of edits?  If we can tune this, the corrected
suggestions would seem more relevant.  Also, if we can do this while keeping
maxCollations = 1 and maxCollationTries = "some reasonable number so that
QTime does not go out of control", that would be great!
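
To make the trade-off concrete, here is a minimal client-side sketch of re-ranking suggestions by a weighted mix of edit distance and term frequency. It is not Solr's API or its internal scoring: the `rerank` function, its weights, and the log-frequency formula are assumptions chosen for illustration, and plain Levenshtein stands in for whatever distance the spellchecker is configured with. The frequencies are the ones returned for "hapyness" elsewhere in this thread.

```python
import math


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def rerank(term, suggestions, edit_weight=1.0, freq_weight=1.0):
    """Order (word, freq) pairs by freq_weight*log10(freq) - edit_weight*edits.

    The scoring formula is a hypothetical example, not what Solr computes.
    """
    def score(item):
        word, freq = item
        return freq_weight * math.log10(max(freq, 1)) - edit_weight * levenshtein(term, word)

    return [word for word, _ in sorted(suggestions, key=score, reverse=True)]


# Suggestions and index frequencies reported for "hapyness" in this thread.
suggestions = [
    ("happyness", 175),
    ("hapiness", 62),
    ("hayness", 1),
    ("happiness", 7788),
    ("harkness", 324),
]

ranked = rerank("hapyness", suggestions)
print(ranked)
```

With equal weights, the much more frequent "happiness" (2 edits) moves ahead of "happyness" (1 edit); raising `edit_weight` pushes the ordering back toward the closest spelling.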

Any insights into this would be great. Thanks for your help.

Regards,
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tp4057176p4058655.html


RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-23 Thread SandeepM
James, Is there a way to determine how many times the collations were tried?  
Is there a parameter that can be issued that can return this in debug
information?  This would be very helpful.
Appreciate your help with this.

Thanks.
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tp4057176p4058400.html


Re: spellcheck: change in behavior and QTime

2013-04-23 Thread SandeepM
I apologize for the length of the previous message.

I do see a problem with spellcheck becoming faster (notice QTime).  I also
see an increase in the number of cache hits if spellcheck=false is run one
time followed by the original spellcheck query.  Seems like spellcheck=false
alters the behavior of spellcheck. 

http://host/solr/select?spellcheck=true&spellcheck.q=cucoo's+nest&df=spell 
http://host/solr/select?spellcheck=false&spellcheck.q=cucoo's+nest&df=spell  
http://host/solr/select?spellcheck=true&spellcheck.q=cucoo's+nest&df=spell 
<--- see a faster response and increase in the number of query cache hits.

Thanks.
-- Sandeep





--
View this message in context: 
http://lucene.472066.n3.nabble.com/spellcheck-change-in-behavior-and-QTime-tp4058014p4058402.html


RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-22 Thread SandeepM
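Chocolat Factry

responseHeader: status=0, QTime=77

spellcheck suggestions:
  chocolat  (numFound=1, startOffset=0, endOffset=8, origFreq=615)
    chocolate   freq=6544
  factry    (numFound=5, startOffset=9, endOffset=15, origFreq=6)
    factory     freq=23614
    factor      freq=5128
    factus      freq=290
    factum      freq=178
    factae      freq=102
  correctlySpelled: false
  collation: "chocolate factory" (hits=85)
    chocolat -> chocolate
    factry -> factory
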

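Pursut Hapyness

responseHeader: status=0, QTime=16

spellcheck suggestions:
  pursut    (numFound=5, startOffset=0, endOffset=6, origFreq=0)
    pursuit     freq=1209
    pursue      freq=108
    pursit      freq=1
    perdut      freq=94
    purdue      freq=70
  hapyness  (numFound=5, startOffset=7, endOffset=15, origFreq=0)
    happyness   freq=175
    hapiness    freq=62
    hayness     freq=1
    happiness   freq=7788
    harkness    freq=324
  correctlySpelled: false
  collation: "pursuit happyness" (hits=10)
    pursut -> pursuit
    hapyness -> happyness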


Spellcheck is run separately, and we are not using any q along with
spellcheck.

Our search query also queries other fields, not just spellcheck, and
therefore does not give a good representation of QTime.  We use grouping
in the search query.
For Chocolate Factory, I get a search QTime of 198ms.
For Pursuit Happyness, I get a search QTime of 318ms.

Would appreciate your insights.
Thanks.
-- Sandeep




--
View this message in context: 
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tp4057176p4058086.html


RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-22 Thread SandeepM
James, Thanks.  That was very helpful. That helped me understand count and
alternativeTermCount a bit more.

I also have the following case as pointed out earlier...
My query: 

http://host/solr/select?q=&spellcheck.q=chocolat%20factry&spellcheck=true&df=spell&fl=&indent=on&wt=xml&rows=10&version=2.2&echoParams=explicit

In this case, the intent is to correct "chocolat factry" to "chocolate
factory", which exists in my spell field index. I see a QTime for the above
query of somewhere between 350 and 400ms.

I run a similar query, replacing the spellcheck terms with "pursut hapyness"
("pursuit happyness" actually exists in my spell field), and I see a
QTime of 15-17ms.

Both queries produce collations correctly, and picking the first suggestions
and applying them as the collation finds what I am looking for, but there is
an order of magnitude difference in QTime.  There is one edit per term in
both cases, or 2 edits in each query. The lengths of the words in both
queries seem identical. I'd like to understand why there is this vast
difference in QTime.  Also, "Chocolate factory" and "Pursuit happyness" are
both indexed as is in the spell field.

I would appreciate any help with this since I am not sure how I can get any
meaningful performance numbers and attribute the slowness to anything in
particular. 

Thanks.
Regards,
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tp4057176p4058048.html


spellcheck: change in behavior and QTime

2013-04-22 Thread SandeepM
I am using the same setup (solrconfig.xml and schema.xml) as stated in my
prior message:
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tt4057176.html#a4057389
I am using SOLR 4.2.1.  I just wanted to report something weird that I am
seeing and would like to find out if anyone else is seeing this behavior.
Since I don't understand the details of what is happening, I'd like to know
why the behavior changes and whether we can do anything to get a better QTime
upfront.

I see a change in behavior when running queries against the server due to
which the QTime also changes.

QUERY:
?spellcheck=true
&spellcheck.q=cucoo's+nest
&df=spell
&fq= It's the same every time and, I believe, moot.

Here is what I have to do:
1.  Run the query.
2.  Run the same query with spellcheck=false
3.  Run the original query (spellcheck=true)

QTime from each of the above stages:
1.  40ms (multiple runs with spellcheck=true.)
2.  10ms (spellcheck = false is run just once)
3.  20ms (after changing back to spellcheck=true again and running multiple
times.)
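
The three steps above can be scripted so the sequence is easy to repeat. This is only a sketch: `http://host/solr/select` is the placeholder host used in this thread, `wt=json` is added here as an assumption so that QTime can be read from the JSON response header, the fq parameter is left out, and the helper names are hypothetical. The block only builds the URLs and parses a sample response body; fetching is left to whatever HTTP client is at hand.

```python
import json
from urllib.parse import urlencode

BASE = "http://host/solr/select"  # placeholder host, as in the URLs quoted in this thread


def spellcheck_url(enabled, phrase="cucoo's nest"):
    """Build the select URL with spellcheck switched on or off."""
    params = {
        "spellcheck": "true" if enabled else "false",
        "spellcheck.q": phrase,
        "df": "spell",
        "wt": "json",  # assumption: JSON output, to make QTime easy to extract
    }
    return BASE + "?" + urlencode(params)


def qtime(body):
    """Read QTime (in ms) from a JSON response body."""
    return json.loads(body)["responseHeader"]["QTime"]


# Step 1: spellcheck=true, step 2: spellcheck=false, step 3: spellcheck=true again.
steps = [spellcheck_url(True), spellcheck_url(False), spellcheck_url(True)]
for number, url in enumerate(steps, start=1):
    print(number, url)

# Made-up response body, only to show how QTime would be read.
sample_body = '{"responseHeader": {"status": 0, "QTime": 40}}'
print("QTime:", qtime(sample_body))
```

Recording QTime for each step across several runs, with the cache statistics captured before and after, would show whether the drop in step 3 lines up with the extra queryResultCache hit.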

Cache details at each of the above times:
1.  filterCache

class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=1024, initialSize=512, minSize=921,
acceptableSize=972, cleanupThread=false, autowarmCount=128,
regenerator=org.apache.solr.search.SolrIndexSearcher$2@7ce3d64e)
src: $URL: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_2/solr/core/src/java/org/apache/solr/search/FastLRUCache.java $

stats:
lookups: Was: 30, Now: 35, Delta: 5
hits: Was: 25, Now: 30, Delta: 5
hitratio: Was: 0.83, Now: 0.85
inserts: 5
evictions: 0
size: 5
warmupTime: 0
cumulative_lookups: Was: 30, Now: 35, Delta: 5
cumulative_hits: Was: 25, Now: 30, Delta: 5
cumulative_hitratio: Was: 0.83, Now: 0.85
cumulative_inserts: 5
cumulative_evictions: 0

queryResultCache

class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=40960, initialSize=10240,
minSize=36864, acceptableSize=38912, cleanupThread=false,
autowarmCount=2560,
regenerator=org.apache.solr.search.SolrIndexSearcher$3@520adaf0)
src: $URL: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_2/solr/core/src/java/org/apache/solr/search/FastLRUCache.java $

stats:
lookups: Was: 8, Now: 10, Delta: 2
hits: Was: 3, Now: 4, Delta: 1
hitratio: Was: 0.37, Now: 0.40
inserts: Was: 5, Now: 6, Delta: 1
evictions: 0
size: Was: 6, Now: 7, Delta: 1
warmupTime: 0
cumulative_lookups: Was: 8, Now: 10, Delta: 2
cumulative_hits: Was: 3, Now: 4, Delta: 1
cumulative_hitratio: Was: 0.37, Now: 0.40
cumulative_inserts: Was: 5, Now: 6, Delta: 1
cumulative_evictions: 0

2.  filterCache

class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=1024, initialSize=512, minSize=921,
acceptableSize=972, cleanupThread=false, autowarmCount=128,
regenerator=org.apache.solr.search.SolrIndexSearcher$2@7ce3d64e)
src: $URL: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_2/solr/core/src/java/org/apache/solr/search/FastLRUCache.java $

stats:
lookups: Was: 35, Now: 40, Delta: 5
hits: Was: 30, Now: 35, Delta: 5
hitratio: Was: 0.85, Now: 0.87
inserts: 5
evictions: 0
size: 5
warmupTime: 0
cumulative_lookups: Was: 35, Now: 40, Delta: 5
cumulative_hits: Was: 30, Now: 35, Delta: 5
cumulative_hitratio: Was: 0.85, Now: 0.87
cumulative_inserts: 5
cumulative_evictions: 0

queryResultCache

class:

RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-19 Thread SandeepM
James,
Thanks for the reply.  I see your point and, sure enough, reducing
maxCollationTries does reduce the time; however, it may then not produce
results.  It seems the time is spent on the collation re-runs.  Is there any
way we can activate caching for collations?  The same query repeatedly takes
the same amount of time.  My query caches are activated; however, I don't
believe they get used for spellchecks.
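
To illustrate why the number of tries drives the time, here is a small sketch. It assumes that each collation try combines one suggestion per misspelled term and is then checked against the index with its own query; that is my reading of the behavior, not something confirmed in this thread. The `candidate_collations` function and its `max_tries` cap are hypothetical and only mimic the idea of maxCollationTries.

```python
from itertools import islice, product


def candidate_collations(per_term_suggestions, max_tries):
    """Combine one suggestion per term, stopping after max_tries candidates."""
    combos = product(*per_term_suggestions)
    return [" ".join(combo) for combo in islice(combos, max_tries)]


# Per-term suggestions returned for "chocolat factry" in this thread.
per_term = [
    ["chocolate"],
    ["factory", "factor", "factus", "factum", "factae"],
]

tries = candidate_collations(per_term, max_tries=10)
print(len(tries), "candidate collations:", tries)

# A lower cap means fewer re-run queries, but also fewer chances to find a hit.
capped = candidate_collations(per_term, max_tries=3)
print(len(capped), "candidate collations with a cap of 3")
```

Under this assumption, every candidate that fails to match costs one extra query, so a phrase whose early candidates return no hits would take longer than one whose first candidate succeeds.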
Thanks.
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tp4057176p4057389.html


DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-18 Thread SandeepM
Hi!

I am using SOLR 4.2.1.

My solrconfig.xml contains the following:

  
   text_spell

 
   MySpellchecker
   spell
   solr.DirectSolrSpellChecker
   internal
   0.5
   2
   1
   5
   3
   0.01
   
 
 



  10
  id
  MySpellchecker
  on
  false
  10
  10
  35
  true
  true
  false
  10
  1
  AND


  MySpellcheck

  

schema.xml with the spell field looks like:


















 
My query:
http://host/solr/select?q=&spellcheck.q=chocolat%20factry&spellcheck=true&df=spell&fl=&indent=on&wt=xml&rows=10&version=2.2&echoParams=explicit

In this case, the intent is to correct "chocolat factry" to "chocolate
factory", which exists in my spell field index. I see a QTime for the above
query of somewhere between 350 and 400ms.

I run a similar query, replacing the spellcheck terms with "pursut hapyness"
("pursuit happyness" actually exists in my spell field), and I see a
QTime of 15-17ms.

Both queries produce collations correctly, but there is an order of magnitude
difference in QTime.  There is one edit per term in both cases, or 2 edits in
each query. The lengths of the words in both queries seem identical. I'd
like to understand why there is this vast difference in QTime.  I would
appreciate any help with this since I am not sure how I can get any
meaningful performance numbers and attribute the slowness to anything in
particular.

I also see a vast difference in QTime in another case.  Replace the search
terms in the above query with "over cuckoo's nest", "over cuccoo's nst",
etc.  "over cuckoo's nest" exists in my indexed spell field, so it
should be found almost immediately.  Yet this query fails to produce any
collation and takes 10 seconds, while the second query, "over cuccoo's nst",
corrects the phrase and returns in 24ms. Something does not sound right
here.

I would appreciate help with these.

Thanks in advance.
Regards,
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tp4057176.html