[jira] Closed: (LUCENENET-384) QueryParsers exception on Windows 2008 Server
[ https://issues.apache.org/jira/browse/LUCENENET-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy closed LUCENENET-384. -- Resolution: Cannot Reproduce

Hi Rida, if you still have the same bug (with recent versions), you can open a new issue with more info. I'm closing this one. DIGY

QueryParsers exception on Windows 2008 Server - Key: LUCENENET-384 URL: https://issues.apache.org/jira/browse/LUCENENET-384 Project: Lucene.Net Issue Type: Bug Environment: Lucene.Net 2.0.0.4 OS: Windows 2008 Server / 32bit Reporter: Rida Al-Masri Priority: Blocker

I have developed an application that uses Lucene.Net 2.0.0.4, and it works very well on Windows XP and Windows 2003 Server. But when I tried to use this application on Windows 2008 Server / 32bit, it raises Lucene.Net.QueryParsers.ParseException for all the supplied queries. Your attention to this issue is highly appreciated.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (LUCENENET-375) Getting assert in SegmentReader.cs (Lucene.net_2_9_2)
[ https://issues.apache.org/jira/browse/LUCENENET-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy resolved LUCENENET-375. Resolution: Fixed Patch applied to 2.9.2 branch and trunk. DIGY

Getting assert in SegmentReader.cs (Lucene.net_2_9_2) - Key: LUCENENET-375 URL: https://issues.apache.org/jira/browse/LUCENENET-375 Project: Lucene.Net Issue Type: Bug Reporter: Digy Attachments: SegmentReader.patch

Reported by *Patrick Ng* and *Kevin Miller* in mailing lists. No feedback yet :( Reason: the Java version of Norm.Clone is implemented in a synchronized method, but the synchronization was somehow omitted in Lucene.Net. DIGY
[jira] Resolved: (LUCENENET-376) Ver.2.9.2 SpanOrQuery.ToString() bug
[ https://issues.apache.org/jira/browse/LUCENENET-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy resolved LUCENENET-376. Resolution: Fixed Assignee: Digy Patch applied to 2.9.2 and trunk DIGY

Ver.2.9.2 SpanOrQuery.ToString() bug Key: LUCENENET-376 URL: https://issues.apache.org/jira/browse/LUCENENET-376 Project: Lucene.Net Issue Type: Bug Reporter: Andrei Iliev Assignee: Digy Attachments: SpanOrQuery.patch

Bad conversion from Java code:

System.Collections.IEnumerator i = clauses.GetEnumerator();
while (i.MoveNext())
{
    SpanQuery clause = (SpanQuery) i.Current;
    buffer.Append(clause.ToString(field));
    if (i.MoveNext()) // bug: advances the enumerator just to test for a next element
    {
        buffer.Append(", ");
    }
}

Should be changed to something like:

int j = 0;
while (i.MoveNext())
{
    j++;
    SpanQuery clause = (SpanQuery) i.Current;
    buffer.Append(clause.ToString(field));
    if (j < clauses.Count)
    {
        buffer.Append(", ");
    }
}
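The counter-based fix keeps iteration and separator logic independent. Here is a minimal, self-contained Java sketch of the same pattern (the names are illustrative, not from the actual Lucene.Net patch), since the buggy loop was itself a conversion from Java:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CommaJoin {
    // Append ", " after every element except the last, using a counter
    // rather than advancing the iterator a second time (which would
    // silently skip elements, as in the reported bug).
    static String join(List<String> clauses) {
        StringBuilder buffer = new StringBuilder();
        int j = 0;
        for (Iterator<String> i = clauses.iterator(); i.hasNext();) {
            j++;
            buffer.append(i.next());
            if (j < clauses.size()) {
                buffer.append(", ");
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b", "c"))); // a, b, c
    }
}
```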
[jira] Commented: (SOLR-2315) analysis.jsp highlight matches no longer works
[ https://issues.apache.org/jira/browse/SOLR-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982044#action_12982044 ] Pradeep commented on SOLR-2315: --- I have one-week-old code. It works for me.

analysis.jsp highlight matches no longer works Key: SOLR-2315 URL: https://issues.apache.org/jira/browse/SOLR-2315 Project: Solr Issue Type: Bug Components: web gui Reporter: Hoss Man Fix For: 3.1, 4.0

As noted by Teruhiko Kurosaka on the mailing list, at some point since Solr 1.4, highlight matches stopped working on analysis.jsp -- on both the 3x and trunk branches.

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Solr-trunk - Build # 1374 - Failure
Build: https://hudson.apache.org/hudson/job/Solr-trunk/1374/ All tests passed Build Log (for compile errors): [...truncated 19409 lines...]
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982057#action_12982057 ] Uwe Schindler commented on LUCENE-2858: --- Any comments about removing write access from IndexReaders? I think setNorms() will be removed soon, but how about the others? I would propose to also make all IndexReaders simply *readers*, not writers.

Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0

With current trunk, whenever you open an IndexReader on a directory you get back a DirectoryReader, which is a composite reader. The interface of IndexReader now has lots of methods that simply throw UOE (in fact, more than 50% of the commonly used methods are unusable now). This confuses users and makes the API hard to understand. This issue should split atomic readers from reader collections with a separate API. After that, you are no longer able to get a TermsEnum from those composite readers without wrapping. We currently have helper classes for wrapping (SlowMultiReaderWrapper - please rename, the name is really ugly; or Multi*); those should be retrofitted to implement the correct classes (SlowMultiReaderWrapper would be an atomic reader but takes a composite reader as ctor param; maybe it could also simply take a List<AtomicReader>). In my opinion, composite readers could maybe implement some collection APIs and also have the ReaderUtil functionality built in directly (possibly as a view in the util.Collection sense). In general, composite readers do not really need to look like the previous IndexReaders; they could simply be a collection of SegmentReaders with some functionality like reopen. On the other side, atomic readers do not need reopen logic anymore? When a segment changes, you need a new atomic reader? - maybe because of deletions that's not the best idea, but we should investigate. Maybe make the whole reopen logic simpler to use (at least on the collection reader level). We should decide about good names; I have no preference at the moment.
[jira] Issue Comment Edited: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982057#action_12982057 ] Uwe Schindler edited comment on LUCENE-2858 at 1/15/11 5:26 AM: Any comments about removing write access from IndexReaders? I think setNorms() will be removed soon, but how about the others like deleteDocument()? I would propose to also make all IndexReaders simply *readers*, not writers.

was (Author: thetaphi): Any comments about removing write access from IndexReaders? I think setNorms() will be removed soo, but how about the others? I would propose to also make all IndexReaders simply *readers* not writers?

Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0
[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982061#action_12982061 ] Dawid Weiss commented on SOLR-2282: --- I think I nailed it. I did whitebox-inspect Carrot2 code and thought it impossible for a concurrency bug to creep in (in particular with a simple controller), but what we didn't take into account is that the Carrot2 infrastructure itself allows a scenario in which a single object instance is bound to multiple components at runtime (and is then effectively shared in a multi-threaded context). This code happens to be in Solr's code base, not in Carrot2. The bug happens because of the following series of events:

1) The controller in Solr itself is initialized with a single instance of new LuceneLanguageModelFactory() -- this factory is then injected into all components at runtime.
2) The base class of LuceneLanguageModelFactory is DefaultLanguageModelFactory, which has an object-local cache of stemmers and tokenizers. In Carrot2 3.4.2, factories are component-bound anyway, so a factory can reuse its resources. In the trunk version, this is no longer the case (factories simply create new objects as they are requested).
3) Because of the tokenizers/stemmers cache, tokenizers and stemmers can be used in parallel when two requests are made at the same time.

I think this should be fairly repeatable on all computers, regardless of the number of cores/speed; it's just a matter of time. Clustering takes relatively longer than tokenization, so for two tokenizations to overlap (and screw up internal data structures) is a rare event (and yet, as we could see, frequent enough to manifest itself during tests).

{noformat}
// Customize the language model factory. The implementation we provide here
// is included in the code base of Solr, so that it's possible to refactor
// the Lucene APIs the factory relies on if needed.
initAttributes.put(PreprocessingPipeline.languageModelFactory, new LuceneLanguageModelFactory());
this.controller.init(initAttributes);
{noformat}

The fix for the problem would be to: 1) upgrade to a trunk/future Carrot2 version (because of different memory management in factories), or 2) pass a class instead of an instance to the initialization parameters. So this should do:

{noformat}
// Customize the language model factory. The implementation we provide here
// is included in the code base of Solr, so that it's possible to refactor
// the Lucene APIs the factory relies on if needed.
initAttributes.put(PreprocessingPipeline.languageModelFactory, LuceneLanguageModelFactory.class);
this.controller.init(initAttributes);
{noformat}

Works on my machine :) But I'll let Staszek review this again so that we're sure it's really this.

Distributed Support for Search Result Clustering Key: SOLR-2282 URL: https://issues.apache.org/jira/browse/SOLR-2282 Project: Solr Issue Type: New Feature Components: contrib - Clustering Affects Versions: 1.4, 1.4.1 Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Minor Fix For: 3.1, 4.0 Attachments: SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch

Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.
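The instance-vs-class distinction above can be sketched in plain Java with a hypothetical mini-controller (not the Carrot2 API): binding a shared instance hands every component the same object and its caches, while binding a class gives each request a fresh, unshared factory.

```java
public class InjectionDemo {
    // A stand-in for a stateful factory with per-object caches.
    static class Factory {
        final StringBuilder cache = new StringBuilder();
    }

    // Resolve an "init attribute" binding the way a hypothetical controller
    // might: a Class is instantiated per request, an instance is reused as-is.
    static Factory resolve(Object binding) {
        try {
            if (binding instanceof Class) {
                return (Factory) ((Class<?>) binding).getDeclaredConstructor().newInstance();
            }
            return (Factory) binding; // shared instance: same object for every component
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Factory shared = new Factory();
        // Instance binding: both "requests" see the same object (and its caches).
        System.out.println(resolve(shared) == resolve(shared));               // true
        // Class binding: each request gets its own factory, nothing is shared.
        System.out.println(resolve(Factory.class) == resolve(Factory.class)); // false
    }
}
```

This is why passing `LuceneLanguageModelFactory.class` removes the shared tokenizer/stemmer cache from the concurrent path.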
[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982062#action_12982062 ] Dawid Weiss commented on SOLR-2282: --- One more side comment for those interested. I used my favorite technique for debugging such things -- created another project in Eclipse (AspectJ-enabled), created a runtime weaving launch config in Eclipse that started that particular test, and wrote this aspect:

{noformat}
package com.carrotsearch.aspects;

import java.util.HashMap;

/**
 * Check for multithreaded access in supposedly single-threaded objects.
 */
public aspect Solr2282 {
    pointcut guardedMethods() :
        execution(* org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.*(..));

    private HashMap<Object, Thread> t = new HashMap<Object, Thread>();

    Object around() : guardedMethods() {
        Object tokenizer = thisJoinPoint.getThis();
        Thread current = Thread.currentThread();
        try {
            synchronized (Solr2282.class) {
                Thread owner = t.get(tokenizer);
                if (owner != null && owner != current) halt();
                t.put(tokenizer, current);
            }
            return proceed();
        } catch (Throwable e) {
            halt();
            return null;
        } finally {
            synchronized (Solr2282.class) {
                Thread owner = t.get(tokenizer);
                if (owner != null && owner != current) halt();
                t.remove(tokenizer);
            }
        }
    }

    private void halt() {
        System.out.println("## HALT!");
    }
}
{noformat}

and placed a VM-halting breakpoint on the sysout inside halt()... Once I got two threads running on the same tokenizer instance, it was a matter of inspecting which objects were shared and how this could possibly happen. Aspect-oriented programming never really won me over, but as a debugging/performance analysis tool it simply rocks.
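The check the aspect performs can also be expressed without AspectJ as a small standalone guard that a supposedly single-threaded object calls on entry and exit; a hedged sketch (names are illustrative, not from Solr or Carrot2):

```java
import java.util.concurrent.atomic.AtomicReference;

// Flags the moment a second thread enters a supposedly single-threaded
// object while the first is still inside, mirroring the aspect's
// owner-thread bookkeeping with a single atomic slot per guarded object.
public class SingleThreadGuard {
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    void enter() {
        Thread current = Thread.currentThread();
        Thread prev = owner.getAndSet(current);
        if (prev != null && prev != current) {
            throw new IllegalStateException(
                "concurrent access: " + prev.getName() + " vs " + current.getName());
        }
    }

    void exit() {
        owner.set(null);
    }

    public static void main(String[] args) {
        SingleThreadGuard guard = new SingleThreadGuard();
        guard.enter(); // single-threaded use passes silently
        guard.exit();
        System.out.println("ok");
    }
}
```

A real instrumentation would call enter()/exit() around each guarded method, which is exactly the plumbing AspectJ's around() advice provides for free.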
Re: Release schedule Lucene 4?
This is unfortunately hard to say! There's tons of good stuff in 4.0, so we'd really like to release sooner rather than later. But then there's also a lot of work remaining, e.g. we have 3 feature branches in flight right now that we need to wrap up and land on trunk:

* realtime (gives us concurrent flushing during indexing)
* docvalues (adds column-stride fields)
* bulkpostings (gives a good search speedup for int-block codecs)

Plus many open Jira issues. So it's hard to predict when all of this will be done. Mike

On Fri, Jan 14, 2011 at 12:31 PM, Gregor Heinrich gre...@arbylon.net wrote: Dear Lucene team, I am wondering whether there is an updated Lucene release schedule for the v4.0 stream. Any earliest/latest alpha/beta/stable date? And if not yet, where to track such info? Thanks in advance from Germany, gregor
Re: CorruptIndexException when indexing
Different ramBufferSizeMB during indexing should never cause corruption! Can you try setting the ram buffer to 256 MB in your test env and see if that makes the corruption go away? This could also be a hardware issue in your test env. If you run CheckIndex on the corrupt index, does it always fail in the same way? Mike

On Fri, Jan 14, 2011 at 6:43 AM, Li Li fancye...@gmail.com wrote: hi all, we have confronted this problem 3 times when testing. The exception stack is:

Exception in thread "Lucene Merge Thread #2" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: docs out of order (7286 <= 7286)
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:355)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:319)
Caused by: org.apache.lucene.index.CorruptIndexException: docs out of order (7286 <= 7286)
    at org.apache.lucene.index.FormatPostingsDocsWriter.addDoc(FormatPostingsDocsWriter.java:75)
    at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:880)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:818)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:756)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:187)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5354)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4937)

Or:

Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.ArrayIndexOutOfBoundsException: 330
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:355)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:319)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 330
    at org.apache.lucene.util.BitVector.get(BitVector.java:102)
    at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:238)
    at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:168)
    at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
    at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:870)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:818)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:756)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:187)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5354)

We did some minor modification based on Lucene 2.9.1 and Solr 1.4.0: we modified the frq file to store 4 bytes for the positions of the term occurrences in these documents (accessing full positions in the prx file is time-consuming and can't meet our needs). I can't tell whether it's our bug or Lucene's own bug. I searched the mailing list and found the mail "problem during index merge" posted on 2010-10-21. It's similar to our case. It seems the docList in the frq file is wrongly stored. When merging, when it's decoded, the wrong docID may be larger than maxDocs (of the BitVector deletedDocs), which causes the second exception. Or the docID delta is less than 0 (it reads wrongly), which causes the first exception. We are still continuing testing, to turn off our modification and open infoStream in solr-config.xml.

We found a strange phenomenon. When we test, it sometimes hits exceptions, but in our production environment it never hits any. The hardware and software environments are the same. We checked carefully, and the only difference is this line in solr-config.xml:

<ramBufferSizeMB>32</ramBufferSizeMB> in the testing environment
<ramBufferSizeMB>256</ramBufferSizeMB> in the production environment

The indexed document count for each machine is also roughly the same: 10M+ documents. I can't be sure the indices in the production env are correct, because even if some terms' docLists are wrong, if the doc delta is > 0 and there are no deleted documents, it will not hit the 2 exceptions. We checked the search results in the production env and don't find any strange results. Can a too-small ramBufferSizeMB result in index corruption?
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982102#action_12982102 ] Robert Muir commented on LUCENE-2858: - bq. I think setNorms() will be removed soon

Why do you think this? On the norms cleanup issue, I only removed setNorm(float), because it's completely useless. All it did was call Similarity.getDefault().encode(float) + setNorm(byte).

Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0
[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislaw Osinski updated SOLR-2282: Attachment: SOLR-2282-concurrency-branch_3x.patch SOLR-2282-concurrency-trunk.patch

Thanks for debugging this, Dawid! I think solution 2) you suggested would be best, because it applies both to version 3.4.2 of Carrot2 (currently used by Solr) and the 3.5.0 version (not yet released). I'm attaching patches for Solr trunk and branch_3x that fix the concurrency issue and correct a typo in a log message output by {{LuceneLanguageModelFactory}}.

Distributed Support for Search Result Clustering Key: SOLR-2282 URL: https://issues.apache.org/jira/browse/SOLR-2282 Project: Solr Issue Type: New Feature Components: contrib - Clustering Affects Versions: 1.4, 1.4.1 Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Minor Fix For: 3.1, 4.0 Attachments: SOLR-2282-concurrency-branch_3x.patch, SOLR-2282-concurrency-trunk.patch, SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
[jira] Resolved: (LUCENE-2862) Track total term freq per term
[ https://issues.apache.org/jira/browse/LUCENE-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2862. Resolution: Fixed

Track total term freq per term -- Key: LUCENE-2862 URL: https://issues.apache.org/jira/browse/LUCENE-2862 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.0 Attachments: LUCENE-2862.patch

Right now we track docFreq for each term (how many docs have the term), but the totalTermFreq (total number of occurrences of this term, i.e. the sum of freq() for each doc that has the term) is also a useful stat (for flex scoring, PulsingCodec, etc.).
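The distinction between the two statistics can be illustrated with a toy computation in plain Java (independent of Lucene's actual API; the representation of documents as word lists is an assumption for the example):

```java
import java.util.Arrays;
import java.util.List;

public class TermStats {
    // docFreq: number of documents containing the term at least once.
    static long docFreq(List<List<String>> docs, String term) {
        return docs.stream().filter(doc -> doc.contains(term)).count();
    }

    // totalTermFreq: total occurrences of the term across all documents,
    // i.e. the sum of the per-document freq() values.
    static long totalTermFreq(List<List<String>> docs, String term) {
        return docs.stream().flatMap(List::stream).filter(term::equals).count();
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("fast", "fox", "fox"),
            Arrays.asList("lazy", "dog"),
            Arrays.asList("fox"));
        System.out.println(docFreq(docs, "fox"));       // 2 (docs 1 and 3)
        System.out.println(totalTermFreq(docs, "fox")); // 3 (2 + 0 + 1)
    }
}
```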
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982107#action_12982107 ] Uwe Schindler commented on LUCENE-2858: --- I was talking about replacing norms by CSF; maybe it's just not soon.

Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982115#action_12982115 ] Robert Muir commented on LUCENE-2858: - Ah, ok. Sorry, I was confused. Still, I think we would need this method (somewhere) even with CSF, so that people can change the norms and they instantly take effect for searches.

Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0
Re: Release schedule Lucene 4?
Well … we can decide on a list of features we want in 4.0 (e.g., the 3 you mention above), estimate the time it would take to finish them, and then give a release date(s). That will get us to a release faster than if we wait for all JIRA issues to end + the separate branches we work on. We should also decide on a release date for 3x. And as usual, we should release more often than we do today :). Shai

On Saturday, January 15, 2011, Michael McCandless luc...@mikemccandless.com wrote: This is unfortunately hard to say! There's tons of good stuff in 4.0, so we'd really like to release sooner rather than later. But then there's also a lot of work remaining, e.g. we have 3 feature branches in flight right now that we need to wrap up and land on trunk: * realtime (gives us concurrent flushing during indexing) * docvalues (adds column-stride fields) * bulkpostings (gives a good search speedup for int-block codecs) Plus many open Jira issues. So it's hard to predict when all of this will be done. Mike
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982126#action_12982126 ] Earwin Burrfoot commented on LUCENE-2858: - bq. Any comments about removing write access from IndexReaders? I think setNorms() will be removed soon, but how about the others like deleteDocument()? I would propose to also make all IndexReaders simply readers, not writers? Voting with all my extremities - yes!! Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0 With current trunk, whenever you open an IndexReader on a directory you get back a DirectoryReader, which is a composite reader. The interface of IndexReader now has lots of methods that simply throw UOE (in fact more than 50% of the commonly used methods are unusable now). This confuses users and makes the API hard to understand. This issue should split atomic readers from reader collections with a separate API. After that, you are no longer able to get a TermsEnum from those composite readers without wrapping. We currently have helper classes for wrapping (SlowMultiReaderWrapper - please rename, the name is really ugly; or Multi*); those should be retrofitted to implement the correct classes (SlowMultiReaderWrapper would be an atomic reader but takes a composite reader as ctor param; maybe it could also simply take a List<AtomicReader>). In my opinion, maybe composite readers could implement some collection APIs and also have the ReaderUtil methods directly built in (possibly as a view in the util.Collection sense). In general composite readers do not really need to look like the previous IndexReaders; they could simply be a collection of SegmentReaders with some functionality like reopen. On the other side, atomic readers do not need reopen logic anymore? 
When a segment changes, you need a new atomic reader? - maybe because of deletions that's not the best idea, but we should investigate. Maybe make the whole reopen logic simpler to use (at least on the collection reader level). We should decide about good names; I have no preference at the moment.
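The atomic/composite split proposed in the issue description can be sketched with a pair of tiny interfaces. This is an illustrative model only, under assumed hypothetical names (AtomicReaderSketch, CompositeReaderSketch, etc.), not the actual Lucene API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, heavily simplified model of the proposed split: atomic
// readers expose per-segment data directly; composite readers only expose
// their subreaders, so no method ever needs to throw
// UnsupportedOperationException.
interface AtomicReaderSketch {
    int maxDoc(); // stand-in for terms/postings/norms access
}

interface CompositeReaderSketch {
    List<AtomicReaderSketch> getSequentialSubReaders();
}

class SegmentReaderSketch implements AtomicReaderSketch {
    private final int maxDoc;
    SegmentReaderSketch(int maxDoc) { this.maxDoc = maxDoc; }
    public int maxDoc() { return maxDoc; }
}

// A composite is little more than a collection of atomic readers; callers
// who need a single merged view wrap it explicitly (the
// SlowMultiReaderWrapper role) instead of the composite pretending to be
// atomic.
class DirectoryReaderSketch implements CompositeReaderSketch {
    private final List<AtomicReaderSketch> subs;
    DirectoryReaderSketch(List<? extends AtomicReaderSketch> subs) {
        this.subs = new ArrayList<>(subs);
    }
    public List<AtomicReaderSketch> getSequentialSubReaders() {
        return Collections.unmodifiableList(subs);
    }
}
```

With this shape, the more than 50% of IndexReader methods that currently throw UOE simply never appear on the composite type.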
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982132#action_12982132 ] Earwin Burrfoot commented on LUCENE-2858: - bq. Still, i think we would need this method (somewhere) even with CSF, so that people can change the norms and they instantly take effect for searches. This still puzzles me. I can strain my imagination, and get people who just need to change norms without reindexing. But doing this and *requiring* instant turnaround? Kid me not :) Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982134#action_12982134 ] Michael McCandless commented on LUCENE-2858: I don't think we should remove setNorm/deleteDocuments, even from the composite reader class. Deleting docs from IR has advantages over deleting from IW: the change is live to any searches running on that IR; you get an immediate count of how many docs were deleted; you can delete by docID. setNorm is also useful in that it can be used to boost docs (globally), live, if that reader is being used for searching. When/if we cut over norms to doc values we'll have to decide what to do about setNorm... At a higher level, for this strong typing of atomic vs composite IRs, we shouldn't try to change functionality -- let's just do a rote refactoring, such that methods that now throw UOE on IR are moved to the atomic reader only. Separately we can think about whether existing functions should be dropped... Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0
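Why deleting through a reader is "live" comes down to who owns the deleted-docs bitset. A minimal sketch (not Lucene code; class and method names here are invented for illustration): the reader consults its own bitset on every search, so flipping a bit is instantly visible and the caller learns exactly how many docs were deleted.

```java
import java.util.BitSet;

// Illustrative sketch: a reader-owned liveDocs bitset makes deletions
// immediately visible to any search running against this reader.
class LiveDeleteSketch {
    private final BitSet liveDocs;

    LiveDeleteSketch(int maxDoc) {
        liveDocs = new BitSet(maxDoc);
        liveDocs.set(0, maxDoc); // all docs start out live
    }

    // Delete by docID; returns true if the doc was still live,
    // so the caller can keep an exact deletion count.
    boolean deleteDocument(int docID) {
        boolean wasLive = liveDocs.get(docID);
        liveDocs.clear(docID);
        return wasLive;
    }

    // Searches consult liveDocs, so deletions take effect instantly.
    int numLiveDocs() { return liveDocs.cardinality(); }
    boolean isLive(int docID) { return liveDocs.get(docID); }
}
```

Deleting via IW, by contrast, buffers the delete until a flush, which is why IR deletion offers the live-visibility and exact-count advantages Mike lists.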
[jira] Updated: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-2657: Attachment: LUCENE-2657.patch This patch cuts over {{ant generate-maven-artifacts}} to directly use {{mvn}}. After applying this patch, Maven 2.2.1 (and maybe 3.0.X? - untested) must be installed on your machine in order to run {{ant generate-maven-artifacts}}. Other changes in this patch: # Dropped all use of Maven Ant Tasks. # The top-level {{ant generate-maven-artifacts}} now works and is the best way to perform this task, since it will create a single timestamp for all artifacts; this target can also be run from either {{solr/}} or {{lucene/}}. # Removed the {{generate-maven-artifacts}} target from {{modules/build.xml}}, and transferred the responsibility for generating {{modules/*}} maven artifacts to {{lucene/build.xml}}. # The {{solr/src/webapp/}} module no longer installs or deploys its (empty) sources jar. # Remote Maven artifact deployment is no longer included in the Ant build - this can be performed by the Maven build. # {{mvn clean}} no longer removes {{solr/dist/}} or {{lucene/dist/}}, for two reasons: ## The Ant build populates {{dist/}} with things that the Maven build should not remove. Removing just {{dist/maven/}} won't work, because: ## I couldn't find a nice/simple way to remove a directory just once in the reactor build. The previous patch attempted to do this from the lucene core and solr core modules, but that solution was deleting their deployed parent POMs, since the reactor build orders the Solr and Lucene parent POMs before the core modules (the parent relationship requires this). 
Left to do: # Add Ant targets to test the Maven artifacts # Backport to branch_3x Replace Maven POM templates with full POMs, and change documentation accordingly Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Fix For: 3.1, 4.0 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch The current Maven POM templates only contain dependency information, the bare bones necessary for uploading artifacts to the Maven repository. The full Maven POMs in the attached patch include the information necessary to run a multi-module Maven build, in addition to serving the same purpose as the current POM templates. Several dependencies are not available through public maven repositories. A profile in the top-level POM can be activated to install these dependencies from the various {{lib/}} directories into your local repository. From the top-level directory: {code} mvn -N -Pbootstrap install {code} Once these non-Maven dependencies have been installed, to run all Lucene/Solr tests via Maven's surefire plugin, and populate your local repository with all artifacts, from the top level directory, run: {code} mvn install {code} When one Lucene/Solr module depends on another, the dependency is declared on the *artifact(s)* produced by the other module and deposited in your local repository, rather than on the other module's un-jarred compiler output in the {{build/}} directory, so you must run {{mvn install}} on the other module before its changes are visible to the module that depends on it. 
To create all the artifacts without running tests: {code} mvn -DskipTests install {code} I almost always include the {{clean}} phase when I do a build, e.g.: {code} mvn -DskipTests clean install {code}
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982152#action_12982152 ] Marvin Humphrey commented on LUCENE-2858: - Deleting docs from IR has advantages over deleting from IW: the change is live to any searches running on that IR; you get an immediate count of how many docs were deleted; you can delete by docID. Alternate plan: * Move responsibility for deletions to a pluggable DeletionsReader subcomponent of SegmentReader. * Have the default DeletionsReader be read-only. * People who need the esoteric functionality described above can use a subclass of DeletionsReader. Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0
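Marvin's alternate plan can be sketched in a few lines. The class names below are hypothetical stand-ins (no such classes exist in Lucene); the point is only the shape: deletions live in a pluggable subcomponent whose default is read-only, and only an opt-in subclass exposes mutation.

```java
import java.util.BitSet;

// Default deletions component: read-only view of deleted docs.
class DeletionsReaderSketch {
    protected final BitSet deleted = new BitSet();
    boolean isDeleted(int docID) { return deleted.get(docID); }
}

// Opt-in subclass for the "esoteric" live-mutation use case.
class WritableDeletionsReaderSketch extends DeletionsReaderSketch {
    void delete(int docID) { deleted.set(docID); }
}

// The segment reader delegates to whichever deletions component it was
// constructed with; most users never see the writable variant.
class SegmentReaderWithDeletions {
    private final DeletionsReaderSketch deletions;
    SegmentReaderWithDeletions(DeletionsReaderSketch deletions) {
        this.deletions = deletions;
    }
    boolean isDeleted(int docID) { return deletions.isDeleted(docID); }
}
```

The design choice is that mutability becomes a property of the injected component rather than of the reader API itself.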
[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982166#action_12982166 ] Earwin Burrfoot commented on LUCENE-2858: - APIs have to be there still. All that commity, segment-deletery, mutabley stuff (that spans both atomic and composite readers). So, while your plan is viable, it won't remove that much cruft. Separate SegmentReaders (and other atomic readers) from composite IndexReaders -- Key: LUCENE-2858 URL: https://issues.apache.org/jira/browse/LUCENE-2858 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 4.0
Lucene-3.x - Build # 243 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-3.x/243/ All tests passed Build Log (for compile errors): [...truncated 21065 lines...]
[jira] Resolved: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-2282. -- Resolution: Fixed Thanks everyone! trunk: Committed revision 1059426. 3x: Committed revision 1059428. Distributed Support for Search Result Clustering Key: SOLR-2282 URL: https://issues.apache.org/jira/browse/SOLR-2282 Project: Solr Issue Type: New Feature Components: contrib - Clustering Affects Versions: 1.4, 1.4.1 Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Minor Fix For: 3.1, 4.0 Attachments: SOLR-2282-concurrency-branch_3x.patch, SOLR-2282-concurrency-trunk.patch, SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.
[jira] Commented: (SOLR-1191) NullPointerException in delta import
[ https://issues.apache.org/jira/browse/SOLR-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982193#action_12982193 ] Gunnlaugur Thor Briem commented on SOLR-1191: - I see this still happening on the tip of the 3.1 branch: {quote} Jan 15, 2011 9:47:39 PM org.apache.solr.handler.dataimport.DataImporter doDeltaImport SEVERE: Delta Import Failed java.lang.NullPointerException at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:860) at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:282) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:176) at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:356) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:374) {quote} Same kind of problem: {quote} pk=ds.id deltaQuery=SELECT id FROM [...] {quote} and same kind of workaround: {quote} pk=ds.id deltaQuery=SELECT id AS "ds.id" FROM [...] {quote} NullPointerException in delta import Key: SOLR-1191 URL: https://issues.apache.org/jira/browse/SOLR-1191 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 1.3, 1.4 Environment: OS: Windows Linux. Java: 1.6 DB: MySQL SQL Server Reporter: Ali Syed Assignee: Noble Paul Fix For: 1.4 Seeing a few of these NullPointerExceptions during delta imports. Once this happens, delta import stops working and keeps giving the same error. 
java.lang.NullPointerException at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:622) at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:240) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159) at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:337) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:376) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:355) Running delta import for a particular entity fixes the problem and delta import starts working again. Here is the log just before and after the exception 05/27 11:59:29 86987686 INFO btpool0-538 org.apache.solr.core.SolrCore - [localhost] webapp=/solr path=/dataimport params={command=delta-import&optimize=false} status=0 QTime=0 05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.SolrWriter - Read dataimport.properties 05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.DataImporter - Starting Delta Import 05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.SolrWriter - Read dataimport.properties 05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Starting delta collection. 
05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Running ModifiedRowKey() for Entity: content 05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed ModifiedRowKey for Entity: content rows obtained : 0 05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed DeletedRowKey for Entity: content rows obtained : 0 05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed parentDeltaQuery for Entity: content 05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Running ModifiedRowKey() for Entity: job 05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Creating a connection for entity job with URL: jdbc:sqlserver://localhost;databaseName=TestDB 05/27 11:59:29 86987704 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Time taken for getConnection(): 12 05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed ModifiedRowKey for Entity: job rows obtained : 0 05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed DeletedRowKey for Entity: job rows obtained : 0 05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed parentDeltaQuery for Entity: job 05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Delta Import completed successfully 05/27 11:59:29 86987707 INFO
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982200#action_12982200 ] Michael Busch commented on LUCENE-2324: --- I just committed fixes for some failing tests. Eg. the addIndexes() problem is now fixed. The problem was that I had accidentally removed the following line in DW.addIndexes(): {code} // Update SI appropriately info.setDocStore(info.getDocStoreOffset(), newDsName, info.getDocStoreIsCompoundFile()); {code} info.setDocStore() calls clearFiles(), which empties a SegmentInfo-local cache of all filenames that belong to the corresponding segment. Since addIndexes() changes the segment name, it is important to refill that cache with the new file names. This was a sneaky bug. We should probably call clearFiles() explicitly there in addIndexes(). For now I added a comment. Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out See LUCENE-2293 for motivation and more details. I'm copying here Mike's summary he posted on 2293: Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). 
The segments can flush independently, letting us make much better concurrent use of IO & CPU.
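The clearFiles() interplay Michael describes can be sketched in miniature. This is an illustrative model, not the real SegmentInfo class: an object caches the file names derived from its segment name, so any mutation that changes the name must invalidate that cache or stale names survive the rename, which is exactly the sneaky addIndexes() bug above.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a SegmentInfo-style object with a lazily
// computed, name-derived filename cache.
class SegmentInfoSketch {
    private String name;
    private List<String> cachedFiles; // must be cleared when name changes

    SegmentInfoSketch(String name) { this.name = name; }

    List<String> files() {
        if (cachedFiles == null) {
            cachedFiles = new ArrayList<>();
            cachedFiles.add(name + ".fdt"); // illustrative extensions
            cachedFiles.add(name + ".fdx");
        }
        return cachedFiles;
    }

    void clearFiles() { cachedFiles = null; }

    // Any setter that affects file names must invalidate the cache,
    // the way info.setDocStore() does via clearFiles().
    void setName(String newName) {
        this.name = newName;
        clearFiles();
    }
}
```

Without the clearFiles() call in setName(), files() would keep returning names for the old segment after a rename.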
[jira] Resolved: (LUCENE-2867) Change contrib QP API that uses CharSequence as string identifier
[ https://issues.apache.org/jira/browse/LUCENE-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adriano Crestani resolved LUCENE-2867. -- Resolution: Fixed Assignee: Adriano Crestani Thanks for reviewing the patch, Simon! The patch was applied on revision 1059436. Change contrib QP API that uses CharSequence as string identifier - Key: LUCENE-2867 URL: https://issues.apache.org/jira/browse/LUCENE-2867 Project: Lucene - Java Issue Type: Improvement Components: contrib/* Affects Versions: 3.0.3 Reporter: Adriano Crestani Assignee: Adriano Crestani Priority: Minor Fix For: 3.0.4 Attachments: lucene_2867_adriano_crestani_2011_01_13.patch There are some API methods on contrib queryparser that expect CharSequence as identifier. This is wrong, since it may lead to incorrect or misleading behavior, as shown on LUCENE-2855. To avoid this problem, these APIs will be changed to enforce the use of String instead of CharSequence in version 4. This patch already deprecates the old API methods and adds new substitute methods that use only String.
[jira] Updated: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-2324: - Attachment: test.out Are you also merging trunk in? svn up yields a lot of updates. There are new test failures in: TestSnapshotDeletionPolicy Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982205#action_12982205 ] Jason Rutherglen commented on LUCENE-2324: -- The TestStressIndexing2 errors remind me of what I saw when working on LUCENE-2680. I'll take a look. They weren't there in the previous revisions of this branch. Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch
Lucene-trunk - Build # 1427 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1427/ All tests passed Build Log (for compile errors): [...truncated 16762 lines...]
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982213#action_12982213 ] Jason Rutherglen commented on LUCENE-2324: -- In DW.flushAllThreads we're accessing indexWriter.segmentInfos while we're not synced on IW, so the segment infos vector may be changing as we're accessing it. I'm not sure how we can reasonably solve this, I don't think cloning segment infos will work. In trunk, doFlush is sync'ed on IW and so doesn't run into these problems. Perhaps for the flush all threads case we should simply sync on IW? Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out See LUCENE-2293 for motivation and more details. I'm copying here Mike's summary he posted on 2293: Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). The segments can flush independently, letting us make much better concurrent use of IO CPU. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
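The locking question in the comment above can be sketched as follows. This is a hypothetical illustration, not Lucene's real API: `WriterSketch` stands in for IndexWriter, and the list stands in for its segmentInfos. The point is that a flush which reads the shared segment list must hold the same monitor that mutating paths (like trunk's doFlush) hold, rather than trying to clone the list unlocked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; names do not match Lucene's real classes.
class WriterSketch {
    // stands in for IndexWriter.segmentInfos
    private final List<String> segmentInfos = new ArrayList<>();

    // mutation path: always under the writer's monitor, as trunk's doFlush is
    synchronized void addSegment(String name) {
        segmentInfos.add(name);
    }

    // the flush-all-threads case: sync on the writer before touching
    // segmentInfos, then hand back a stable snapshot to iterate over
    synchronized List<String> snapshotForFlush() {
        return Collections.unmodifiableList(new ArrayList<>(segmentInfos));
    }
}
```

Without the `synchronized` on the reader, another thread could grow the list mid-iteration; with it, the snapshot is taken under the same lock as every mutation, which is the "simply sync on IW" option the comment raises.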
[jira] Updated: (SOLR-1191) NullPointerException in delta import
[ https://issues.apache.org/jira/browse/SOLR-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunnlaugur Thor Briem updated SOLR-1191:
----------------------------------------

Attachment: SOLR-1191.patch

Patch to resolve this. It resolves deltaQuery columns against pk when they differ by prefix (and reports the error more helpfully when no column matches, or more than one column matches). No unit test, sorry (but there's not much deltaQuery coverage anyway). All existing unit tests pass, and this is working fine for me in production.

NullPointerException in delta import
------------------------------------

Key: SOLR-1191
URL: https://issues.apache.org/jira/browse/SOLR-1191
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Affects Versions: 1.3, 1.4
Environment: OS: Windows, Linux. Java: 1.6. DB: MySQL, SQL Server
Reporter: Ali Syed
Assignee: Noble Paul
Fix For: 1.4
Attachments: SOLR-1191.patch

Seeing a few of these NullPointerExceptions during delta imports. Once this happens, delta import stops working and keeps giving the same error.

java.lang.NullPointerException
at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:622)
at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:240)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:337)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:376)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:355)

Running delta import for a particular entity fixes the problem and delta import starts working again.
Here is the log just before and after the exception:

05/27 11:59:29 86987686 INFO btpool0-538 org.apache.solr.core.SolrCore - [localhost] webapp=/solr path=/dataimport params={command=delta-importoptimize=false} status=0 QTime=0
05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.SolrWriter - Read dataimport.properties
05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.DataImporter - Starting Delta Import
05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.SolrWriter - Read dataimport.properties
05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Starting delta collection.
05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Running ModifiedRowKey() for Entity: content
05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed ModifiedRowKey for Entity: content rows obtained : 0
05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed DeletedRowKey for Entity: content rows obtained : 0
05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed parentDeltaQuery for Entity: content
05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Running ModifiedRowKey() for Entity: job
05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Creating a connection for entity job with URL: jdbc:sqlserver://localhost;databaseName=TestDB
05/27 11:59:29 86987704 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Time taken for getConnection(): 12
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed ModifiedRowKey for Entity: job rows obtained : 0
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed DeletedRowKey for Entity: job rows obtained : 0
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed parentDeltaQuery for Entity: job
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Delta Import completed successfully
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Starting delta collection.
05/27 11:59:29 86987709 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Running ModifiedRowKey() for Entity: user
05/27 11:59:29 86987709 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Creating a connection for entity user with URL: jdbc:sqlserver://localhost;databaseName=TestDB
05/27 11:59:29 86987716 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Time taken for getConnection(): 7
05/27 11:59:29 86987873 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed
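The idea behind the attached patch, as the update above describes it, can be sketched like this. This is an illustrative rewrite under assumed names (`PkResolver`, `resolve`), not the actual SOLR-1191 patch code: match the configured pk against deltaQuery columns that differ only by a table/alias prefix, and fail loudly on zero or ambiguous matches.

```java
import java.util.List;

// Hypothetical sketch of the prefix-resolution idea from the patch
// description; class and method names are illustrative, not Solr's.
class PkResolver {
    static String resolve(String pk, List<String> deltaQueryColumns) {
        String match = null;
        for (String col : deltaQueryColumns) {
            // exact match, or a column of the form "prefix.pk"
            if (col.equals(pk) || col.endsWith("." + pk)) {
                if (match != null) {
                    throw new IllegalStateException(
                            "More than one deltaQuery column matches pk '" + pk + "'");
                }
                match = col;
            }
        }
        if (match == null) {
            throw new IllegalStateException(
                    "No deltaQuery column matches pk '" + pk + "'");
        }
        return match;
    }
}
```

So a pk of `ID` would resolve against a delta row keyed by `JOB.ID`, instead of finding no value and surfacing as the NullPointerException in collectDelta.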
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982220#action_12982220 ]

Jason Rutherglen commented on LUCENE-2324:
------------------------------------------

DW.deleteTerms iterates on DWPTs without acquiring the ThreadState.lock; instead, DWPT.deleteTerms is synced (on DWPT). I think if a flush is occurring then deletes can get in at the same time? I don't think BufferedDeletes supports that?

Per thread DocumentsWriters that write their own private segments
-----------------------------------------------------------------

Key: LUCENE-2324
URL: https://issues.apache.org/jira/browse/LUCENE-2324
Project: Lucene - Java
Issue Type: Improvement
Components: Index
Reporter: Michael Busch
Assignee: Michael Busch
Priority: Minor
Fix For: Realtime Branch
Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out

See LUCENE-2293 for motivation and more details. I'm copying here Mike's summary he posted on 2293:

Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process, and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores, and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). The segments can flush independently, letting us make much better concurrent use of IO and CPU.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
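The interleaving asked about in the comment above can be sketched as follows. This is a hypothetical illustration under assumed names (not the branch's real DWPT classes): each per-thread writer synchronizes deletes only on its own monitor, so nothing orders a delete relative to a flush of that writer protected by a separate ThreadState lock.

```java
import java.util.List;

// Illustrative sketch only; names do not match the realtime branch's classes.
class PerThreadWriterSketch {
    private boolean pendingDelete;

    // synced on this per-thread writer only (the "DWPT monitor"), not on
    // any ThreadState lock a concurrently flushing thread might hold
    synchronized void deleteTerm(String term) {
        pendingDelete = true;
    }

    // a flush drains the pending state; because deletes take a different
    // lock than the flusher, a delete can land mid-flush
    synchronized boolean drainPendingDelete() {
        boolean had = pendingDelete;
        pendingDelete = false;
        return had;
    }

    // stands in for DW.deleteTerms: per-writer sync, no global ordering
    static void deleteTermEverywhere(List<PerThreadWriterSketch> writers, String term) {
        for (PerThreadWriterSketch w : writers) {
            w.deleteTerm(term);
        }
    }
}
```

Each individual hand-off is safe, but there is no point at which the delete is ordered before or after an in-progress flush of a given writer, which is the window the comment is questioning.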
[jira] Commented: (SOLR-1191) NullPointerException in delta import
[ https://issues.apache.org/jira/browse/SOLR-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982221#action_12982221 ]

Gunnlaugur Thor Briem commented on SOLR-1191:
---------------------------------------------

Neglected to mention: that patch is against branch_3x.

NullPointerException in delta import
------------------------------------

Key: SOLR-1191
URL: https://issues.apache.org/jira/browse/SOLR-1191
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Affects Versions: 1.3, 1.4
Environment: OS: Windows, Linux. Java: 1.6. DB: MySQL, SQL Server
Reporter: Ali Syed
Assignee: Noble Paul
Fix For: 1.4
Attachments: SOLR-1191.patch

Seeing a few of these NullPointerExceptions during delta imports. Once this happens, delta import stops working and keeps giving the same error.

java.lang.NullPointerException
at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:622)
at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:240)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:337)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:376)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:355)

Running delta import for a particular entity fixes the problem and delta import starts working again.
Here is the log just before and after the exception:

05/27 11:59:29 86987686 INFO btpool0-538 org.apache.solr.core.SolrCore - [localhost] webapp=/solr path=/dataimport params={command=delta-importoptimize=false} status=0 QTime=0
05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.SolrWriter - Read dataimport.properties
05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.DataImporter - Starting Delta Import
05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.SolrWriter - Read dataimport.properties
05/27 11:59:29 86987687 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Starting delta collection.
05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Running ModifiedRowKey() for Entity: content
05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed ModifiedRowKey for Entity: content rows obtained : 0
05/27 11:59:29 86987690 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed DeletedRowKey for Entity: content rows obtained : 0
05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed parentDeltaQuery for Entity: content
05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Running ModifiedRowKey() for Entity: job
05/27 11:59:29 86987692 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Creating a connection for entity job with URL: jdbc:sqlserver://localhost;databaseName=TestDB
05/27 11:59:29 86987704 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Time taken for getConnection(): 12
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed ModifiedRowKey for Entity: job rows obtained : 0
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed DeletedRowKey for Entity: job rows obtained : 0
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed parentDeltaQuery for Entity: job
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Delta Import completed successfully
05/27 11:59:29 86987707 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Starting delta collection.
05/27 11:59:29 86987709 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Running ModifiedRowKey() for Entity: user
05/27 11:59:29 86987709 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Creating a connection for entity user with URL: jdbc:sqlserver://localhost;databaseName=TestDB
05/27 11:59:29 86987716 INFO Thread-4162 org.apache.solr.handler.dataimport.JdbcDataSource - Time taken for getConnection(): 7
05/27 11:59:29 86987873 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed ModifiedRowKey for Entity: user rows obtained : 46
05/27 11:59:29 86987873 INFO Thread-4162 org.apache.solr.handler.dataimport.DocBuilder - Completed DeletedRowKey for Entity: user rows obtained : 0
05/27 11:59:29 86987873 INFO Thread-4162