[
https://issues.apache.org/jira/browse/SOLR-2326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182775#comment-13182775
]
Matt Traynham edited comment on SOLR-2326 at 1/9/12 8:34 PM:
-------------------------------------------------------------
Hey Yury, I recently started seeing this same issue and thought I'd share what
I found while debugging my 3.3 branch.
I have found that a core reload does break the commits call made afterwards,
but a second reload fixes it again. This is because the reload forcefully opens
a new writer, and a lock exception occurs every other time.
During the inform method of ReplicationHandler, if you have configured
replicateAfter startup, the direct update handler will call forceOpenWriter().
{code:title=ReplicationHandler.java}
if (replicateAfter.contains("startup")) {
  replicateOnStart = true;
  RefCounted<SolrIndexSearcher> s = core.getNewestSearcher(false);
  try {
    IndexReader reader = s==null ? null : s.get().getReader();
    if (reader!=null && reader.getIndexCommit() != null &&
        reader.getIndexCommit().getGeneration() != 1L) {
      try {
        if(replicateOnOptimize){
          Collection<IndexCommit> commits = IndexReader.listCommits(reader.directory());
          for (IndexCommit ic : commits) {
            if(ic.isOptimized()){
              if(indexCommitPoint == null ||
                  indexCommitPoint.getVersion() < ic.getVersion()) indexCommitPoint = ic;
            }
          }
        } else{
          indexCommitPoint = reader.getIndexCommit();
        }
      } finally {
        // We don't need to save commit points for replication, the SolrDeletionPolicy
        // always saves the last commit point (and the last optimized commit point, if needed)
        /***
        if(indexCommitPoint != null){
          core.getDeletionPolicy().saveCommitPoint(indexCommitPoint.getVersion());
        }
        ***/
      }
    }
    if (core.getUpdateHandler() instanceof DirectUpdateHandler2) {
      ((DirectUpdateHandler2) core.getUpdateHandler()).forceOpenWriter();
    } else {
      LOG.warn("The update handler being used is not an instance or sub-class of DirectUpdateHandler2. " +
          "Replicate on Startup cannot work.");
    }
    // ... excerpt ends here; the rest of the enclosing try and if blocks is not shown
{code}
forceOpenWriter() takes the lock, opens a new writer, and unlocks. If a lock
already exists, the following exception is thrown and no new writer is created:
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out
{code:title=DirectUpdateHandler2.java}
public void forceOpenWriter() throws IOException {
  iwCommit.lock();
  try {
    openWriter();
  } finally {
    iwCommit.unlock();
  }
}
{code}
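To make the failure path concrete, here is a minimal sketch of the same lock / try / finally shape using only the JDK. Everything in it is a hypothetical stand-in rather than Solr or Lucene code: WriterLockSketch, indexLockHeld, holdIndexLock() and the simulated exception message are assumptions made for illustration. In particular, it assumes that the lock that "already exists" is a separate index-level lock, modelled here with an AtomicBoolean.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the update handler; this is not Solr's class.
class WriterLockSketch {
    // Assumption: models an index-level lock that a previous writer may still hold.
    private final AtomicBoolean indexLockHeld = new AtomicBoolean(false);
    // Stand-in for iwCommit in the excerpt above.
    private final ReentrantLock iwCommit = new ReentrantLock();
    private boolean writerOpen = false;

    void holdIndexLock() { indexLockHeld.set(true); }
    void releaseIndexLock() { indexLockHeld.set(false); }

    // Fails if the index-level lock is already taken, so no writer is created.
    private void openWriter() throws IOException {
        if (!indexLockHeld.compareAndSet(false, true)) {
            throw new IOException("Lock obtain timed out (simulated)");
        }
        writerOpen = true;
    }

    // Same shape as forceOpenWriter(): the finally block always releases iwCommit,
    // but the exception from openWriter() still propagates to the caller.
    void forceOpenWriter() throws IOException {
        iwCommit.lock();
        try {
            openWriter();
        } finally {
            iwCommit.unlock();
        }
    }

    boolean isWriterOpen() { return writerOpen; }
    boolean isCommitLockHeld() { return iwCommit.isLocked(); }

    public static void main(String[] args) {
        WriterLockSketch handler = new WriterLockSketch();
        handler.holdIndexLock(); // simulate a stale lock left by an earlier writer
        try {
            handler.forceOpenWriter();
        } catch (IOException e) {
            System.out.println("open failed: " + e.getMessage());
        }
        System.out.println("writer open: " + handler.isWriterOpen());
    }
}
```

The point of the sketch is only that the finally block cleans up the commit lock while leaving the handler without a writer, so any initialization that depends on the writer being constructed is skipped.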
The openWriter method goes on to create a new SolrIndexWriter along with a few
other objects such as IndexFileDeleter and IndexDeletionPolicyWrapper; the
latter is what actually holds the commit points.
{code:title=IndexDeletionPolicyWrapper.java}
private volatile Map<Long, IndexCommit> solrVersionVsCommits = new ConcurrentHashMap<Long, IndexCommit>();

/**
 * Internal use for Lucene... do not explicitly call.
 */
public void onInit(List list) throws IOException {
  List<IndexCommitWrapper> wrapperList = wrap(list);
  deletionPolicy.onInit(wrapperList);
  updateCommitPoints(wrapperList);
  cleanReserves();
}

private void updateCommitPoints(List<IndexCommitWrapper> list) {
  Map<Long, IndexCommit> map = new ConcurrentHashMap<Long, IndexCommit>();
  for (IndexCommitWrapper wrapper : list) {
    if (!wrapper.isDeleted())
      map.put(wrapper.getVersion(), wrapper.delegate);
  }
  solrVersionVsCommits = map;
  latestCommit = ((list.get(list.size() - 1)).delegate);
}

/**
 * Gets the commit points for the index.
 * This map instance may change between commits and commit points may be deleted.
 * It is recommended to reserve a commit point for the duration of usage
 *
 * @return a Map of version to commit points
 */
public Map<Long, IndexCommit> getCommits() {
  return solrVersionVsCommits;
}
{code}
The problem is that if the writer is never created correctly, the onInit method
of the IndexDeletionPolicyWrapper is never called and the solrVersionVsCommits
map stays empty. If anyone has any input on a solution, that would be greatly
appreciated.
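A small JDK-only sketch of that symptom follows. It is not the Solr code: CommitMapSketch, the String values standing in for IndexCommit, and the indexVersion() helper are all hypothetical. The fallback to 0 when no commit point is known is an assumption chosen to mirror the behavior reported in the issue description, not a statement about how ReplicationHandler is implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the deletion policy wrapper; not Solr's class.
class CommitMapSketch {
    // Starts out empty, like solrVersionVsCommits, until onInit runs.
    private volatile Map<Long, String> versionVsCommits = new ConcurrentHashMap<Long, String>();
    private volatile Long latestVersion = null;

    // Stand-in for onInit(): in this sketch it is only invoked when a writer
    // is opened successfully. Expects a non-empty list of commit versions.
    void onInit(List<Long> versions) {
        Map<Long, String> map = new ConcurrentHashMap<Long, String>();
        for (Long v : versions) {
            map.put(v, "commit-" + v);
        }
        versionVsCommits = map;
        latestVersion = versions.get(versions.size() - 1);
    }

    Map<Long, String> getCommits() {
        return versionVsCommits;
    }

    // Assumption: report 0 when no commit point has been recorded.
    long indexVersion() {
        Long v = latestVersion;
        return (v != null && versionVsCommits.containsKey(v)) ? v : 0L;
    }

    public static void main(String[] args) {
        CommitMapSketch policy = new CommitMapSketch();
        // Writer creation failed, so onInit was never called.
        System.out.println("before onInit: " + policy.indexVersion());
        // A later successful writer open populates the map.
        policy.onInit(Arrays.asList(5L, 7L));
        System.out.println("after onInit: " + policy.indexVersion());
    }
}
```

Under these assumptions, any caller that reads the map between a failed writer open and the next successful one sees no commit points at all, which would be consistent with a second reload making the problem go away.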
Thanks,
Matt
> Replication command indexversion fails to return index version
> --------------------------------------------------------------
>
> Key: SOLR-2326
> URL: https://issues.apache.org/jira/browse/SOLR-2326
> Project: Solr
> Issue Type: Bug
> Components: replication (java)
> Environment: Branch 3x latest
> Reporter: Eric Pugh
> Assignee: Mark Miller
> Fix For: 3.6, 4.0
>
>
> To test this, I took the /example/multicore/core0 solrconfig and added a
> simple replication handler:
>   <requestHandler name="/replication" class="solr.ReplicationHandler" >
>     <lst name="master">
>       <str name="replicateAfter">commit</str>
>       <str name="replicateAfter">startup</str>
>       <str name="confFiles">schema.xml</str>
>     </lst>
>   </requestHandler>
> When I query the handler for details I get back the indexVersion that I
> expect:
> http://localhost:8983/solr/core0/replication?command=details&wt=json&indent=true
> But when I ask for just the indexVersion I get back a 0, which prevents the
> slaves from pulling updates:
> http://localhost:8983/solr/core0/replication?command=indexversion&wt=json&indent=true