Re: Index Version and Epoch Time?

2011-07-04 Thread Chris Hostetter

: The index version shown on the dashboard is the time at which the most
: recent index segment was created. I'm not sure why it has a value older than
: a month if a commit has happened after that time.

I'm fairly certain that's false.

Last time I checked, newly created indexes are assigned a version based on 
index time, but after that each commit simply increments the version - so 
index versions are only suitable for comparing if one instance of an index 
is newer or older than another instance of the same index -- it doesn't 
tell you anything about the relative age.
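
To illustrate the difference between ordering and age, here is a toy Python model of the behaviour described above. It is not Lucene or Solr code: the class ToyIndex, its methods and the sample timestamp are made up for illustration, and the millisecond-timestamp seed is an assumption based only on the description in this thread.

```python
import time


class ToyIndex:
    """Toy model: the version is seeded from the clock once, then incremented."""

    def __init__(self, created_ms=None):
        # Assumption: a new index takes its first version from its creation time.
        self.version = int(time.time() * 1000) if created_ms is None else created_ms

    def commit(self):
        # Each later commit only bumps the version by one; no clock is involved.
        self.version += 1
        return self.version


def is_newer(version_a, version_b):
    """Compare two versions of the SAME index; says nothing about wall-clock age."""
    return version_a > version_b


if __name__ == "__main__":
    idx = ToyIndex(created_ms=1306886400000)  # hypothetical creation timestamp
    first = idx.version
    idx.commit()
    idx.commit()
    print(idx.version - first)           # number of commits since creation
    print(is_newer(idx.version, first))  # ordering is all the version gives you
```

Under this model a version can keep looking like a month-old timestamp even though commits happened yesterday.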


-Hoss


Re: Index Version and Epoch Time?

2011-07-04 Thread Shalin Shekhar Mangar
On Tue, Jul 5, 2011 at 12:03 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:


> : The index version shown on the dashboard is the time at which the most
> : recent index segment was created. I'm not sure why it has a value older
> : than a month if a commit has happened after that time.
>
> I'm fairly certain that's false.
>
> Last time I checked, newly created indexes are assigned a version based on
> index time, but after that each commit simply increments the version - so
> index versions are only suitable for comparing if one instance of an index
> is newer or older than another instance of the same index -- it doesn't
> tell you anything about the relative age.


Thanks for clearing that up, Hoss. I only looked at a place where an IndexCommit
was being created and it used System.currentTimeMillis, hence the confusion.
Anyway, what the version represents is not guaranteed, except that it will
uniquely identify a commit point, so users should not make any assumptions
about it.

-- 
Regards,
Shalin Shekhar Mangar.


Index Version and Epoch Time?

2011-06-28 Thread Pranav Prakash
Hi,

I am not sure what the index number value represents. It looks like an epoch
time, but in my case it points to one month back. However, I can see
documents that were added last week in the index.

Even after I did a commit, the index number did not change. Isn't it
supposed to change on every commit? If not, is there a way to look up the
last index time?

Also, this page
http://wiki.apache.org/solr/SolrReplication#Replication_Dashboard shows a
Replication Dashboard. How is this dashboard invoked? Is there any URL which
needs to be called?


*Pranav Prakash*

temet nosce

Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com |
Google http://www.google.com/profiles/pranny


Re: Index Version and Epoch Time?

2011-06-28 Thread Shalin Shekhar Mangar
On Tue, Jun 28, 2011 at 4:18 PM, Pranav Prakash pra...@gmail.com wrote:


> I am not sure what the index number value represents. It looks like an epoch
> time, but in my case it points to one month back. However, I can see
> documents that were added last week in the index.


The index version shown on the dashboard is the time at which the most
recent index segment was created. I'm not sure why it has a value older than
a month if a commit has happened after that time.


> Even after I did a commit, the index number did not change. Isn't it
> supposed to change on every commit? If not, is there a way to look up the
> last index time?


Yeah, it changes after every commit which added/deleted a document.


> Also, this page
> http://wiki.apache.org/solr/SolrReplication#Replication_Dashboard shows a
> Replication Dashboard. How is this dashboard invoked? Is there any URL
> which needs to be called?


If you have configured replication correctly, the admin dashboard should
show a Replication link right next to the Schema Browser link. The path
should be /admin/replication/index.jsp
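
The replication handler also answers plain HTTP commands such as indexversion and details, which may be a handier way to read the index version than the JSP page. The sketch below only builds the URLs and sends nothing. The host, port and base path are hypothetical, and it assumes the handler is registered under the name /replication.

```python
from urllib.parse import urlencode


def replication_url(base_url, command):
    """Build a ReplicationHandler URL such as .../replication?command=indexversion."""
    return base_url.rstrip("/") + "/replication?" + urlencode({"command": command})


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # hypothetical host, port and path
    print(replication_url(base, "indexversion"))
    print(replication_url(base, "details"))
```

Opening either URL in a browser, or fetching it with any HTTP client, should return the handler's response for that core.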

-- 
Regards,
Shalin Shekhar Mangar.


Re: Index Version and Epoch Time?

2011-06-28 Thread Pranav Prakash
Hi,

I am facing multiple issues with Solr and I am not sure what happens in each
case. I am quite new to Solr and there are some scenarios I'd like to
discuss with you.

We have a huge volume of documents to be indexed, somewhere around 5 million.
We have a full indexer script, which essentially picks up all the documents
from the database and updates them into Solr, and an incremental script,
which adds new documents to Solr. The relevant parts of my config file look
like this:

<unlockOnStartup>false</unlockOnStartup>
<deletionPolicy class="solr.SolrDeletionPolicy">
  <!-- Keep only optimized commit points -->
  <str name="keepOptimizedOnly">false</str>
  <!-- The maximum number of commit points to be kept -->
  <str name="maxCommitsToKeep">1</str>
</deletionPolicy>
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10</maxDocs>
  </autoCommit>
</updateHandler>
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">${enable.master:false}</str>
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>
  </lst>
  <lst name="slave">
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">http://hostname:port/solr/core0/replication</str>
  </lst>
</requestHandler>

Sometimes the full indexer script breaks while adding documents to
Solr. The script adds the documents and then commits the operation. So, when
the script breaks, we have a large amount of data which has been updated but
not committed. Next, the incremental index script executes, figures out all
the new entries, and adds them to Solr. It works successfully and commits the
operation.

   - Will the commit by incremental indexer script also commit the
   previously uncommitted changes made by full indexer script before it broke?

Sometimes during execution, Solr's avg response time (avg resp time
for the last 10 requests, read from the log file) goes as high as 9000ms (I am
still unclear why; any ideas on how to start hunting for the problem?), so the
watchdog process restarts Solr (because it causes requests to pile up in a
queue at the application server, which causes the app server to crash). On my
local environment, I performed the same experiment by adding docs to Solr,
killing the process and restarting it. I found that the uncommitted changes
were applied and searchable. However, the updates were uncommitted. Could you
explain how this is happening, or whether there is a configuration that can
be adjusted for this? Also, what would the index state be if, after
restarting Solr, a commit is applied or a commit is not applied?

I'd be happy to provide any other information that might be needed.

*Pranav Prakash*

temet nosce

Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com |
Google http://www.google.com/profiles/pranny



Re: Index Version and Epoch Time?

2011-06-28 Thread Jonathan Rochkind

On 6/28/2011 1:38 PM, Pranav Prakash wrote:

> - Will the commit by incremental indexer script also commit the
> previously uncommitted changes made by full indexer script before it broke?


Yes, as long as the Solr instance hasn't crashed.  Anything added but 
not yet committed sticks around and will be committed on the next 'commit'. 
There are no 'transactions' for adding docs in Solr; even if multiple 
processes are adding, if any one of them issues a 'commit' they'll all be 
committed.
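
As a rough picture of the point above, here is a toy Python model of a single core with one shared buffer of pending adds. ToySolr and the client names are invented for illustration; this is not how Solr is implemented, only a sketch of the "no per-client transactions" behaviour.

```python
class ToySolr:
    """Toy model of one Solr core: a single shared buffer of pending adds."""

    def __init__(self):
        self.pending = []      # added but not yet committed
        self.searchable = []   # visible to searches after a commit

    def add(self, client, doc_id):
        # Adds from every client land in the same pending buffer.
        self.pending.append((client, doc_id))

    def commit(self):
        # A commit is not scoped to the caller: it flushes everything pending.
        self.searchable.extend(self.pending)
        self.pending.clear()


if __name__ == "__main__":
    solr = ToySolr()
    solr.add("full_indexer", "doc1")   # full indexer breaks before committing
    solr.add("full_indexer", "doc2")
    solr.add("incremental", "doc3")
    solr.commit()                      # issued by the incremental script
    print([doc for _, doc in solr.searchable])
```

In this model the incremental script's commit makes the full indexer's leftover documents searchable as well.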



> Sometimes during execution, Solr's avg response time (avg resp time
> for the last 10 requests, read from the log file) goes as high as 9000ms (I am
> still unclear why; any ideas on how to start hunting for the problem?),


It could be a Java garbage collection issue. I have found it useful to 
start the JVM with Solr in it using some parameters to tune garbage 
collection. I use these JVM options:
    -server -XX:+AggressiveOpts -d64 -XX:+UseConcMarkSweepGC -XX:+UseCompressedOops


You've still got to make sure Solr has enough memory for what you're 
doing with it, which with your 5 million doc index might be more than you 
expect. On the other hand, giving a JVM too _much_ heap can cause 
slowdowns too, although I think the -XX:+UseConcMarkSweepGC should 
ameliorate that to some extent.


Possibly more likely, it could instead be Solr readying the new indexes. 
Do you issue commits in the middle of 'execution', and could the 
slowdown happen right after a commit?  When a commit is issued to Solr, 
Solr's got to switch in new indexes with the newly added documents, and 
'warm' those indexes in various ways, which can be a CPU (as well as 
RAM) intensive thing. For these purposes a replication from master 
counts as a commit (because it is one), and an optimize can count too 
(because it's close enough).


This can be especially a problem if you issue multiple commits very 
close together -- Solr's still working away at readying the index from 
the first commit when the second comes in, and now Solr's trying to 
ready two indexes at once (one of which will never be used because it's 
already outdated).  Or even more than two if you issue a bunch of 
commits in rapid succession.
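
One possible client-side mitigation, not something from this thread's configs, is to throttle how often an indexing script is allowed to send a commit. The sketch below is a minimal Python helper under that assumption. CommitThrottle and min_interval are hypothetical names, and actually sending the commit to Solr is left to the caller.

```python
import time


class CommitThrottle:
    """Allow a commit only if min_interval seconds have passed since the last one."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the logic can be tested
        self.last_commit = None     # time of the last allowed commit, if any

    def should_commit(self):
        now = self.clock()
        if self.last_commit is None or now - self.last_commit >= self.min_interval:
            self.last_commit = now
            return True
        # Too soon: skip this commit; the docs stay pending until a later one.
        return False


if __name__ == "__main__":
    throttle = CommitThrottle(min_interval=60)
    if throttle.should_commit():
        print("send commit to Solr here")
```

Skipped commits are not lost work: anything added stays pending and is picked up by the next commit that does go through.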






> I found that the uncommitted changes were
> applied and searchable. However, the updates were uncommitted.


There is in general no way that uncommitted adds could be searchable, so 
that's probably not happening.  What is probably happening instead is 
that a commit _is_ happening.  One way a commit can happen even if you 
aren't manually issuing one is through the auto-commit settings in 
solrconfig.xml: committing any pending adds after X documents, or after T 
seconds, can both be configured. If they are configured, that could be 
causing commits to happen when you don't realize it, which could also 
trigger the slowdown due to a commit mentioned above.
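
The config posted earlier in this thread does set autoCommit with maxDocs at 10, so this seems a likely explanation. A toy Python model of a document-count auto-commit is sketched below. ToyUpdateHandler is an invented name, and the real trigger logic in Solr may differ in detail.

```python
class ToyUpdateHandler:
    """Toy model of auto-commit after a fixed number of pending documents."""

    def __init__(self, max_docs):
        self.max_docs = max_docs
        self.pending = 0        # added but not yet committed
        self.committed = 0      # searchable after a commit
        self.auto_commits = 0   # commits nobody issued by hand

    def add(self):
        self.pending += 1
        if self.pending >= self.max_docs:
            # Threshold reached: commit everything pending automatically.
            self.committed += self.pending
            self.pending = 0
            self.auto_commits += 1


if __name__ == "__main__":
    handler = ToyUpdateHandler(max_docs=10)
    for _ in range(25):
        handler.add()
    print(handler.committed, handler.pending, handler.auto_commits)
```

With a threshold this low, nearly everything added before a crash would already have been committed, and commits would also be happening very frequently during bulk indexing.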


Jonathan