Re: Index Version and Epoch Time?
: The index version shown on the dashboard is the time at which the most
: recent index segment was created. I'm not sure why it has a value older than
: a month if a commit has happened after that time.

I'm fairly certain that's false. Last time I checked, newly created indexes are assigned a version based on the index time, but after that each commit simply increments the version. So index versions are only suitable for checking whether one instance of an index is newer or older than another instance of the same index; they don't tell you anything about the relative age.

-Hoss
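A minimal Python sketch of the behaviour described above, as a toy model rather than actual Lucene/Solr code. The class name `ToyIndex`, the helper `is_newer` and the creation timestamp (late May 2011, in milliseconds) are made up for illustration; the only things taken from the thread are that the first version is seeded from the clock and that later commits just increment it.

```python
import time


class ToyIndex:
    """Toy model of the versioning described in the thread; not Lucene/Solr code."""

    def __init__(self, created_at_ms=None):
        # Assumption: a brand-new index seeds its version from the clock (ms).
        if created_at_ms is None:
            created_at_ms = int(time.time() * 1000)
        self.version = created_at_ms

    def commit(self):
        # Later commits only bump the counter; the clock is not consulted again.
        self.version += 1
        return self.version


def is_newer(version_a, version_b):
    """Compare two versions taken from the same index lineage."""
    return version_a > version_b


# Hypothetical index created about a month ago, then committed three times.
index = ToyIndex(created_at_ms=1306540800000)
before = index.version
for _ in range(3):
    index.commit()
after = index.version
```

Read as an epoch time, `after` still points at the creation date, which would be consistent with a version that looks a month old even though commits happened last week.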
Re: Index Version and Epoch Time?
On Tue, Jul 5, 2011 at 12:03 AM, Chris Hostetter hossman_luc...@fucit.org wrote:

> : The index version shown on the dashboard is the time at which the most
> : recent index segment was created. I'm not sure why it has a value older than
> : a month if a commit has happened after that time.
>
> I'm fairly certain that's false. Last time I checked, newly created indexes
> are assigned a version based on the index time, but after that each commit
> simply increments the version. So index versions are only suitable for
> checking whether one instance of an index is newer or older than another
> instance of the same index; they don't tell you anything about the relative age.

Thanks for clearing that up, Hoss. I only looked at a place where IndexCommit was being created and it used System.currentTimeMillis, hence the confusion.

Anyway, what the version represents is not guaranteed, except that it will uniquely identify a commit point, so users should not make any assumptions about it.

--
Regards,
Shalin Shekhar Mangar.
Index Version and Epoch Time?
Hi,

I am not sure what the index number value is. It looks like an epoch time, but in my case it points to one month back. However, I can see documents in the index that were added last week.

Even after I did a commit, the index number did not change. Isn't it supposed to change on every commit? If not, is there a way to find the last index time?

Also, this page http://wiki.apache.org/solr/SolrReplication#Replication_Dashboard shows a Replication Dashboard. How is this dashboard invoked? Is there a URL that needs to be called?

*Pranav Prakash*
temet nosce
Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com | Google http://www.google.com/profiles/pranny
Re: Index Version and Epoch Time?
On Tue, Jun 28, 2011 at 4:18 PM, Pranav Prakash pra...@gmail.com wrote:

> I am not sure what is the index number value? It looks like an epoch time,
> but in my case, this points to one month back. However, i can see documents
> which were added last week, to be in the index.

The index version shown on the dashboard is the time at which the most recent index segment was created. I'm not sure why it has a value older than a month if a commit has happened after that time.

> Even after I did a commit, the index number did not change? Isn't it
> supposed to change on every commit? If not, is there a way to look into the
> last index time?

Yeah, it changes after every commit which added/deleted a document.

> Also, this page
> http://wiki.apache.org/solr/SolrReplication#Replication_Dashboard shows a
> Replication Dashboard. How is this dashboard invoked? Is there any URL which
> needs to be called?

If you have configured replication correctly, the admin dashboard should show a Replication link right next to the Schema Browser link. The path should be /admin/replication/index.jsp

--
Regards,
Shalin Shekhar Mangar.
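Besides the dashboard, the ReplicationHandler can be asked for the version over HTTP with `command=indexversion`. Below is a rough Python sketch of building that request and reading the reply. It assumes the handler is registered at `/replication`, that `wt=json` is accepted, and that the reply carries top-level `indexversion` and `generation` fields; the host, port, core name and the sample body are all made up, so check them against your own installation.

```python
import json
from urllib.parse import urlencode


def replication_url(core_url, command):
    """Build a ReplicationHandler URL (assumes the handler lives at /replication)."""
    query = urlencode({"command": command, "wt": "json"})
    return f"{core_url.rstrip('/')}/replication?{query}"


def parse_index_version(body):
    """Pull version and generation out of a JSON reply (field names assumed)."""
    data = json.loads(body)
    return data["indexversion"], data["generation"]


# Hypothetical host, port and core name.
url = replication_url("http://hostname:8983/solr/core0", "indexversion")

# Made-up reply body, only to show the parsing step; no request is sent here.
sample_body = '{"indexversion": 1306540800003, "generation": 4}'
version, generation = parse_index_version(sample_body)
```

Fetching `url` before and after a commit and comparing the two versions is one way to see whether the commit actually produced a new commit point.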
Re: Index Version and Epoch Time?
Hi,

I am facing multiple issues with Solr and I am not sure what happens in each case. I am quite new to Solr and there are some scenarios I'd like to discuss with you.

We have a huge volume of documents to be indexed, somewhere around 5 million. We have a full indexer script, which essentially picks up all the documents from the database and updates them into Solr, and an incremental script, which adds new documents to Solr.

The relevant areas of my config file look like this:

    <unlockOnStartup>false</unlockOnStartup>

    <deletionPolicy class="solr.SolrDeletionPolicy">
      <!-- Keep only optimized commit points -->
      <str name="keepOptimizedOnly">false</str>
      <!-- The maximum number of commit points to be kept -->
      <str name="maxCommitsToKeep">1</str>
    </deletionPolicy>

    <updateHandler class="solr.DirectUpdateHandler2">
      <autoCommit>
        <maxDocs>10</maxDocs>
      </autoCommit>
    </updateHandler>

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="enable">${enable.master:false}</str>
        <str name="replicateAfter">startup</str>
        <str name="replicateAfter">commit</str>
      </lst>
      <lst name="slave">
        <str name="enable">${enable.slave:false}</str>
        <str name="masterUrl">http://hostname:port/solr/core0/replication</str>
      </lst>
    </requestHandler>

Sometimes the full indexer script breaks while adding documents to Solr. The script adds the documents and then commits the operation. So, when the script breaks, we have a lot of data which has been updated but not committed. Next, the incremental index script executes, figures out all the new entries, and adds them to Solr. It works successfully and commits the operation.

- Will the commit by the incremental indexer script also commit the previously uncommitted changes made by the full indexer script before it broke?
Sometimes, during execution, Solr's average response time (avg resp time for the last 10 requests, read from the log file) goes as high as 9000ms (I am still unclear why; any ideas how to start hunting for the problem?), so the watchdog process restarts Solr (because the slowdown causes requests to pile up in a queue at the application server, which causes the app server to crash). In my local environment, I performed the same experiment by adding docs to Solr, killing the process and restarting it. I found that the uncommitted changes were applied and searchable. However, the updates were uncommitted. Could you explain how this happens, or is there a configuration that can be adjusted for this? Also, what would the index state be if, after restarting Solr, a commit is applied or a commit is not applied?

I'd be happy to provide any other information that might be needed.

*Pranav Prakash*
temet nosce
Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com | Google http://www.google.com/profiles/pranny

On Tue, Jun 28, 2011 at 20:55, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

> On Tue, Jun 28, 2011 at 4:18 PM, Pranav Prakash pra...@gmail.com wrote:
>
> > I am not sure what is the index number value? It looks like an epoch time,
> > but in my case, this points to one month back. However, i can see documents
> > which were added last week, to be in the index.
>
> The index version shown on the dashboard is the time at which the most
> recent index segment was created. I'm not sure why it has a value older than
> a month if a commit has happened after that time.
>
> > Even after I did a commit, the index number did not change? Isn't it
> > supposed to change on every commit? If not, is there a way to look into the
> > last index time?
>
> Yeah, it changes after every commit which added/deleted a document.
>
> > Also, this page
> > http://wiki.apache.org/solr/SolrReplication#Replication_Dashboard shows a
> > Replication Dashboard. How is this dashboard invoked? Is there any URL which
> > needs to be called?
>
> If you have configured replication correctly, the admin dashboard should
> show a Replication link right next to the Schema Browser link. The path
> should be /admin/replication/index.jsp
>
> --
> Regards,
> Shalin Shekhar Mangar.
Re: Index Version and Epoch Time?
On 6/28/2011 1:38 PM, Pranav Prakash wrote:

> - Will the commit by incremental indexer script also commit the
> previously uncommitted changes made by full indexer script before it broke?

Yes, as long as the Solr instance hasn't crashed. Anything added but not yet committed sticks around and will be committed on the next 'commit'. There are no 'transactions' for adding docs in Solr: even if multiple processes are adding, if any one of them issues a 'commit', all of their pending adds will be committed.

> Sometimes, while during execution, Solr's avg response time (avg resp
> time for last 10 requests, read from log file) goes as high as 9000ms
> (which I am still unclear why, any ideas how to start hunting for the
> problem?),

It could be a Java garbage collection issue. I have found it useful to start the JVM with Solr in it using some parameters to tune garbage collection. I use these JVM options:

    -server -XX:+AggressiveOpts -d64 -XX:+UseConcMarkSweepGC -XX:+UseCompressedOops

You've still got to make sure Solr has enough memory for what you're doing with it, which with your 5 million doc index might be more than you expect. On the other hand, giving a JVM too _much_ heap can cause slowdowns too, although I think -XX:+UseConcMarkSweepGC should ameliorate that to some extent.

Possibly more likely, it could instead be Solr readying the new indexes. Do you issue commits in the middle of 'execution', and could the slowdown happen right after a commit? When a commit is issued to Solr, Solr has to switch in new indexes containing the newly added documents, and 'warm' those indexes in various ways, which can be a CPU (as well as RAM) intensive thing. (For these purposes a replication from master counts as a commit (because it is), and an optimize can count too (because it's close enough).)
This can be especially a problem if you issue multiple commits very close together. Solr is still working away at readying the index from the first commit when the second comes in, and now Solr is trying to get two indexes ready at once (one of which will never be used because it's already outdated). Or even more than two, if you issue a bunch of commits in rapid succession.

> I found that the uncommitted changes were applied and searchable.
> However, the updates were uncommitted.

There is in general no way that uncommitted adds could be searchable, so that's probably not happening. What is probably happening instead is that a commit _is_ happening. One way a commit can happen even if you aren't manually issuing one is through the auto-commit settings in solrconfig.xml. Committing any pending adds after X documents, or after T seconds, can both be configured. If they are configured, that could be causing commits to happen when you don't realize it, which could also trigger the slowdown due to a commit mentioned in the previous paragraph.

Jonathan
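The commit semantics discussed in this reply can be pictured with a small toy model in Python. It is only a sketch under stated assumptions, not how Solr is implemented: the class `ToySolr` and its methods are invented, a killed process is assumed to lose whatever was still pending, and auto-commit is reduced to a single "commit after N pending docs" rule modelled on the `maxDocs` setting from the config posted earlier in the thread.

```python
class ToySolr:
    """Toy model of add/commit visibility; not how Solr is implemented."""

    def __init__(self, auto_commit_max_docs=None):
        self.auto_commit_max_docs = auto_commit_max_docs
        self.pending = []      # added but not yet committed
        self.searchable = []   # committed, visible to searches

    def add(self, doc):
        self.pending.append(doc)
        limit = self.auto_commit_max_docs
        # Auto-commit: a commit nobody asked for once enough docs are pending.
        if limit is not None and len(self.pending) >= limit:
            self.commit()

    def commit(self):
        # No per-client transactions: one commit flushes everything pending.
        self.searchable.extend(self.pending)
        self.pending.clear()

    def kill_and_restart(self):
        # Assumption of this model: uncommitted adds do not survive a kill.
        self.pending.clear()


# Scenario 1: the full indexer breaks (Solr keeps running), then the
# incremental indexer adds one doc and commits.
shared = ToySolr()
for i in range(5):
    shared.add(f"full-{i}")
shared.add("incremental-0")
shared.commit()
shared_visible = len(shared.searchable)

# Scenario 2: auto-commit every 10 docs, 25 adds, then the process is killed.
auto = ToySolr(auto_commit_max_docs=10)
for i in range(25):
    auto.add(f"full-{i}")
auto.kill_and_restart()
auto_visible = len(auto.searchable)
```

In the second scenario no explicit commit was ever issued, yet 20 of the 25 documents end up searchable after the restart, which is the kind of surprise an auto-commit setting can produce.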