[ https://issues.apache.org/jira/browse/LUCENE-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673610#action_12673610 ]
Karl Wettin commented on LUCENE-1537:
-------------------------------------

I haven't tried it out yet, but I have a few comments and questions on the patch:

{code}
Index: contrib/instantiated/src/java/org/apache/lucene/store/instantiated/InstantiatedIndexReader.java
+
+  public Object clone() {
+    try {
+      doCommit();
+      InstantiatedIndex clonedIndex = index.cloneWithDeletesNorms();
+      return new InstantiatedIndexReader(clonedIndex);
+    } catch (IOException ioe) {
+      throw new RuntimeException("", ioe);
+    }
+  }

Index: contrib/instantiated/src/java/org/apache/lucene/store/instantiated/InstantiatedIndex.java
+
+  InstantiatedIndex cloneWithDeletesNorms() {
+    InstantiatedIndex clone = new InstantiatedIndex();
+    clone.version = System.currentTimeMillis();
+    clone.documentsByNumber = documentsByNumber;
+    clone.deletedDocuments = new HashSet<Integer>(deletedDocuments);
+    clone.termsByFieldAndText = termsByFieldAndText;
+    clone.orderedTerms = orderedTerms;
+    clone.normsByFieldNameAndDocumentNumber = new HashMap<String, byte[]>(normsByFieldNameAndDocumentNumber);
+    clone.fieldSettings = fieldSettings;
+    return clone;
+  }
{code}

Perhaps we should move deleted documents to the reader? It might be a bit of work to hook it up with the term enum etc., but it could be worth looking into. I think it makes more sense to keep the same instance of InstantiatedIndex and only produce a cloned InstantiatedIndexReader. It is reader#clone that gets called, so cloning the store as well sounds like an invitation for future bugs.

I see there are some leftovers from your attempt to handle non-optimized readers:

{code}
-    documentsByNumber = new InstantiatedDocument[sourceIndexReader.numDocs()];
+    documentsByNumber = new InstantiatedDocument[sourceIndexReader.maxDoc()];

     // create documents
     for (int i = 0; i < sourceIndexReader.numDocs(); i++) {
{code}

I think that if you switch to maxDoc for the array size, the loop should also run to maxDoc and skip any deleted documents.
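To illustrate the point about the loop bound, here is a minimal, self-contained sketch. It does not use the Lucene classes: SourceReader and ArrayReader are hypothetical stand-ins that only mirror the maxDoc(), numDocs() and isDeleted(int) calls of IndexReader, and documents are simplified to strings. With one deletion, numDocs() is 3 while maxDoc() is 4, so a loop bounded by numDocs() would never reach document 3.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MaxDocLoopSketch {

    /** Hypothetical stand-in for the few reader calls the loop needs. */
    interface SourceReader {
        int maxDoc();              // one greater than the largest document number
        int numDocs();             // number of documents that are not deleted
        boolean isDeleted(int n);  // true if document n is marked as deleted
        String document(int n);    // document content, simplified to a String
    }

    /** Hypothetical in-memory reader: a fixed array plus a set of deleted numbers. */
    static final class ArrayReader implements SourceReader {
        private final String[] docs;
        private final Set<Integer> deleted;

        ArrayReader(String[] docs, Set<Integer> deleted) {
            this.docs = docs;
            this.deleted = deleted;
        }

        public int maxDoc() { return docs.length; }
        public int numDocs() { return docs.length - deleted.size(); }
        public boolean isDeleted(int n) { return deleted.contains(n); }
        public String document(int n) { return docs[n]; }
    }

    /**
     * Sizes the array by maxDoc() and also loops to maxDoc(), so document
     * numbers stay aligned with the source. Deleted slots are left as null.
     */
    static String[] copyDocuments(SourceReader source) {
        String[] documentsByNumber = new String[source.maxDoc()];
        for (int i = 0; i < source.maxDoc(); i++) {
            if (source.isDeleted(i)) {
                continue; // skip deleted documents; the slot stays null
            }
            documentsByNumber[i] = source.document(i);
        }
        return documentsByNumber;
    }

    public static void main(String[] args) {
        SourceReader reader = new ArrayReader(
                new String[] {"a", "b", "c", "d"},
                new HashSet<Integer>(Arrays.asList(1)));
        // prints [a, null, c, d]
        System.out.println(Arrays.toString(copyDocuments(reader)));
    }
}
```

Leaving null slots is only one option and is an assumption of this sketch; any code that later iterates over the array would then need a null check or a deleted-documents lookup.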

{code}
-    for (InstantiatedDocument document : getDocumentsByNumber()) {
+    //for (InstantiatedDocument document : getDocumentsByNumber()) {
+    for (InstantiatedDocument document : getDocumentsNotDeleted()) {
       for (Field field : (List<Field>) document.getDocument().getFields()) {
         if (field.isTermVectorStored() && field.isStoreOffsetWithTermVector()) {
           TermPositionVector termPositionVector = (TermPositionVector) sourceIndexReader.getTermFreqVector(document.getDocumentNumber(), field.name());
@@ -312,7 +325,15 @@
   public InstantiatedDocument[] getDocumentsByNumber() {
     return documentsByNumber;
   }
-
+
+  public List<InstantiatedDocument> getDocumentsNotDeleted() {
+    List<InstantiatedDocument> list = new ArrayList<InstantiatedDocument>(documentsByNumber.length-deletedDocuments.size());
+    for (int x=0; x < documentsByNumber.length; x++) {
+      if (!deletedDocuments.contains(x)) list.add(documentsByNumber[x]);
+    }
+    return list;
+  }
+
{code}

Since the source never contains any deleted documents, this doesn't really do anything except consume a bit of resources, does it?

{code}
-      int maxVal = getAssociatedDocuments()[max].getDocument().getDocumentNumber();
+      InstantiatedTermDocumentInformation itdi = getAssociatedDocuments()[max];
+      InstantiatedDocument id = itdi.getDocument();
+      int maxVal = id.getDocumentNumber();
+      //int maxVal = getAssociatedDocuments()[max].getDocument().getDocumentNumber();
{code}

Is this refactoring just for debugging purposes? I find it harder to read than the original one-liner.

> InstantiatedIndexReader.clone
> -----------------------------
>
>                 Key: LUCENE-1537
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1537
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/*
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Assignee: Karl Wettin
>            Priority: Trivial
>             Fix For: 2.9
>
>         Attachments: LUCENE-1537.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> This patch will implement IndexReader.clone for InstantiatedIndexReader.