Re: Documentation Brainstorming
Paul Elschot <[EMAIL PROTECTED]> wrote on 30/05/2007 23:57:47:

> On Thursday 31 May 2007 05:52, Erik Hatcher wrote:
> >
> > On May 30, 2007, at 9:33 PM, Grant Ingersoll wrote:
> > >> I'd rather see each jar get its own javadoc,
> > >> or at the very least, indicate which jar each
> > >> class is defined in for the ones that aren't
> > >> part of the core.
> > >>
> > >
> > > Yeah, I don't like that all the contribs are built in together.
> > > What do others think? I would vote for separating them out.
> >
> > I concur with having the contrib docs separated. I may have been the
> > one (or at least assisted with it) who got the documentation build to
> > fold it altogether as that was the goal at the time. It'd be much
> > easier, build-wise, if all artifacts were kept entirely separate for
> > all the various contrib libraries and the core, as well as the demo.
> >
>
> Currently it is not clear in the javadocs whether a class belongs
> to core or contrib. Having separate javadocs would probably
> improve that.
> I have no experience in linking between javadoc "packages",
> so I have no suggestion on how to make such a separation.

I am all for separation. Though it is sometimes useful to have it all together, - perhaps two versions: all, and by module (core, contrib/x, contrib/y, etc.)? Or is this too cluttered - we already have it by release...

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Updated: (LUCENE-887) Interruptible segment merges
[ https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Busch updated LUCENE-887:

    Attachment: ExtendedIndexWriter.java

Here is the code I originally wrote to add a shutdown function to IndexWriter. This patch contains a class called ExtendedIndexWriter that (as you might guess ;) ) extends IndexWriter and adds a shutdown() method. This method may be called by any thread at any time, even while other threads are adding documents. Three scenarios might happen:

1) shutdown() is called while there is no ongoing merge or addDocument:
   In this case the buffered documents are flushed to disk without triggering cascading merges. (I will commit a protected method flushRamSegments(boolean triggerMerge) to IndexWriter to support this.)

2) shutdown() is called while there is an ongoing merge:
   In this case an IOException is thrown by the extended FSOutputStream, which makes the IndexWriter roll back the transaction. Thereafter flushRamSegments(false) is called to flush buffered docs if there are any.

3) shutdown() is called while other threads are in addDocument:
   This is the tricky one. We don't want to throw the IOException before addDocument has finished analyzing and indexing the document, because otherwise this document would be lost. Since buildDocument() is not synchronized we cannot rely on IndexWriter's mutex to wait for those threads to finish addDocument. Therefore I add a variable that counts how many threads are in addDocument(). A different mutex is used to increment, decrement and check this variable. shutdown() waits until indexing of those docs is done and then continues as in case 1) or 2).

I suggest whoever is interested should just look at the code. I'm sure there will be a lot of questions. There's still a lot of work that has to be done here, like writing testcases and examining how this works in the new autoCommit=false mode (I wrote this code before that new feature was committed).

And we still have to decide whether this shutdown functionality should go into the Lucene core.

> Interruptible segment merges
>
> Key: LUCENE-887
> URL: https://issues.apache.org/jira/browse/LUCENE-887
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Index
> Reporter: Michael Busch
> Priority: Minor
> Fix For: 2.2
> Attachments: ExtendedIndexWriter.java
>
> Adds the ability to IndexWriter to interrupt an ongoing merge. This might be
> necessary when Lucene is, e.g., running as a service and has to stop indexing
> within a certain period of time due to a shutdown request.
> A solution would be to add a new method shutdown() to IndexWriter which
> satisfies the following two requirements:
> - if a merge is happening, abort it
> - flush the buffered docs but do not trigger a merge
> See also discussions about this feature on java-dev:
> http://www.gossamer-threads.com/lists/lucene/java-dev/49008

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
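The counting scheme from scenario 3 can be sketched roughly as follows. This is not the attached ExtendedIndexWriter: the class ShutdownSketch, its fields, and the string "documents" are hypothetical stand-ins, and rejecting documents that arrive after shutdown has begun is an assumption the message above does not state. It only illustrates a counter guarded by its own mutex that shutdown() waits on before flushing.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a writer with a shutdown() method; not Lucene code. */
final class ShutdownSketch {
    // Separate lock for the counter, distinct from the writer's own monitor.
    private final Object counterLock = new Object();
    private int threadsInAddDocument = 0;
    private boolean shuttingDown = false;

    private final List<String> buffered = new ArrayList<>();
    private final List<String> flushed = new ArrayList<>();

    /** Returns false if shutdown has already begun (assumed policy). */
    boolean addDocument(String doc) {
        synchronized (counterLock) {
            if (shuttingDown) {
                return false;
            }
            threadsInAddDocument++;
        }
        try {
            // Stands in for the unsynchronized analysis step.
            String analyzed = doc.trim().toLowerCase();
            synchronized (this) {
                buffered.add(analyzed);
            }
            return true;
        } finally {
            synchronized (counterLock) {
                threadsInAddDocument--;
                counterLock.notifyAll(); // wake a waiting shutdown()
            }
        }
    }

    /** Waits for in-flight addDocument calls, then flushes without merging. */
    void shutdown() {
        synchronized (counterLock) {
            shuttingDown = true;
            boolean interrupted = false;
            while (threadsInAddDocument > 0) {
                try {
                    counterLock.wait();
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting so no document is lost
                }
            }
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
        synchronized (this) {
            flushed.addAll(buffered); // stands in for flushRamSegments(false)
            buffered.clear();
        }
    }

    synchronized List<String> flushedDocs() {
        return new ArrayList<>(flushed);
    }
}
```

The two locks are never held at the same time, so the counter lock cannot deadlock against the writer's monitor.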
[jira] Resolved: (LUCENE-866) Multi-level skipping on posting lists
[ https://issues.apache.org/jira/browse/LUCENE-866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Busch resolved LUCENE-866.

    Resolution: Fixed

Committed.

> Multi-level skipping on posting lists
>
> Key: LUCENE-866
> URL: https://issues.apache.org/jira/browse/LUCENE-866
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.2
> Attachments: fileformats.patch, lucene-866.patch, lucene-866.patch
>
> To accelerate posting list skips (TermDocs.skipTo(int)) Lucene uses skip lists.
> The default skip interval is set to 16. If we want to skip e.g. 100 documents,
> then it is not necessary to read 100 entries from the posting list, but only
> 100/16 = 6 skip list entries plus 100%16 = 4 entries from the posting list. This
> speeds up conjunction (AND) and phrase queries significantly.
>
> However, the skip interval is always a compromise. If you have a very big index
> with huge posting lists and you want to skip over let's say 100k documents, then
> it is still necessary to read 100k/16 = 6250 entries from the skip list. For big
> indexes the skip interval could be set to a higher value, but then after a big
> skip a long scan to the target doc might be necessary.
>
> A solution for this compromise is to have multi-level skip lists that guarantee a
> logarithmic amount of skips to any target in the posting list. This patch
> implements such an approach in the following way:
>
> Example for skipInterval = 3:
>
>                                                     c            (skip level 2)
>                 c                 c                 c            (skip level 1)
>     x     x     x     x     x     x     x     x     x     x      (skip level 0)
> d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d  (posting list)
>     3     6     9     12    15    18    21    24    27    30     (df)
>
> d - document
> x - skip data
> c - skip data with child pointer
>
> Skip level i contains every skipInterval-th entry from skip level i-1. Therefore the
> number of entries on level i is: floor(df / (skipInterval ^ (i + 1))).
>
> Each skip entry on a level i>0 contains a pointer to the corresponding skip entry in
> list i-1. This guarantees a logarithmic amount of skips to find the target document.
>
> Implementation details:
>    * I factored the skipping code out of SegmentMerger and SegmentTermDocs to
>      simplify those classes. The two new classes AbstractSkipListReader and
>      AbstractSkipListWriter implement the skipping functionality.
>    * While AbstractSkipListReader and Writer take care of writing and reading the
>      multiple skip levels, they do not implement an actual skip data format. The two
>      new subclasses DefaultSkipListReader and Writer implement the skip data format
>      that is currently used in Lucene (with two file pointers for the freq and prox
>      file and with payload length information). I added this extra layer to be
>      prepared for flexible indexing and different posting list formats.
>
> File format changes:
>    * I added the new parameter 'maxSkipLevels' to the term dictionary and increased the
>      version of this file. If maxSkipLevels is set to one, then the format of the freq
>      file does not change at all, because we only have one skip level as before. For
>      backwards compatibility maxSkipLevels is set to one automatically if an index
>      without the new parameter is read.
>    * In case maxSkipLevels > 1, then the frq file changes as follows:
>      FreqFile (.frq) --> ^TermCount
>      SkipData --> <^(Min(maxSkipLevels, floor(log(DocFreq/log(skipInterval))) - 1)>, SkipLevel>
>      SkipLevel --> ^DocFreq/(SkipInterval^(Level + 1))
>      Remark: The length of the SkipLevel is not stored for level 0, because 1) it is not
>      needed, and 2) the format of this file does not change for maxSkipLevels=1 then.
>
> All unit tests pass with this patch.
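The arithmetic in the description above can be checked with a small sketch. The class and method names (SkipLevelMath, entriesOnLevel, levelCount, singleLevelReads) are made up for illustration and are not part of the patch; the code only evaluates the stated formula floor(df / skipInterval^(i + 1)) and the single-level read count from the first paragraph.

```java
/** Hypothetical helper that evaluates the formulas quoted in the issue. */
final class SkipLevelMath {

    /** Entries on skip level i: floor(df / skipInterval^(i + 1)). */
    static long entriesOnLevel(long df, int skipInterval, int level) {
        if (skipInterval < 2) {
            throw new IllegalArgumentException("skipInterval must be >= 2");
        }
        long divisor = 1;
        for (int i = 0; i <= level; i++) {
            divisor *= skipInterval;
            if (divisor > df) {
                return 0; // this level (and all higher ones) is empty
            }
        }
        return df / divisor;
    }

    /** Number of non-empty skip levels for a posting list with df entries. */
    static int levelCount(long df, int skipInterval) {
        int levels = 0;
        while (entriesOnLevel(df, skipInterval, levels) > 0) {
            levels++;
        }
        return levels;
    }

    /** Reads needed with one skip level: skip entries plus a posting-list scan. */
    static long singleLevelReads(long distance, int skipInterval) {
        return distance / skipInterval + distance % skipInterval;
    }
}
```

For the skipInterval = 3 example with 32 postings this gives 10, 3 and 1 entries on levels 0, 1 and 2, which matches the diagram.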
[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration
[ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:

    Attachment: (was: LuceneDictionary.java)

> LuceneDictionary skips first word in enumeration
>
> Key: LUCENE-763
> URL: https://issues.apache.org/jira/browse/LUCENE-763
> Project: Lucene - Java
> Issue Type: Bug
> Components: Other
> Affects Versions: 2.0.0
> Environment: Windows Sun JRE 1.4.2_10_b03
> Reporter: Dan Ertman
> Attachments: TestLuceneDictionary.java
>
> The current code for LuceneDictionary will always skip the first word of the
> TermEnum. The reason is that it doesn't initially retrieve TermEnum.term -
> its first call is to TermEnum.next, which moves it past the first term (line 76).
> To see this problem cause a failure, add this test to TestSpellChecker:
> similar = spellChecker.suggestSimilar("eihgt",2);
> assertEquals(1, similar.length);
> assertEquals(similar[0], "eight");
> Because "eight" is the first word in the index, it will fail.
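The failure mode described in the issue can be reproduced without Lucene. In the sketch below, PositionedEnum is a hypothetical stand-in for an enumeration that is already positioned on its first term when created, which is the behaviour the report attributes to TermEnum; collectFixed shows one possible way to avoid the skip and is not necessarily what the attached LuceneDictionary.java does.

```java
import java.util.ArrayList;
import java.util.List;

final class FirstTermSketch {

    /** Hypothetical enumeration that starts positioned ON its first term. */
    static final class PositionedEnum {
        private final List<String> terms;
        private int pos = 0;

        PositionedEnum(List<String> terms) {
            this.terms = terms;
        }

        /** Current term, or null once the enumeration is exhausted. */
        String term() {
            return pos < terms.size() ? terms.get(pos) : null;
        }

        /** Advances to the next term; false when there is none. */
        boolean next() {
            pos++;
            return pos < terms.size();
        }
    }

    /** Buggy pattern: next() is called before term(), so term 0 is never read. */
    static List<String> collectBuggy(PositionedEnum e) {
        List<String> out = new ArrayList<>();
        while (e.next()) {
            out.add(e.term());
        }
        return out;
    }

    /** One possible fix: read the current term first, then advance. */
    static List<String> collectFixed(PositionedEnum e) {
        List<String> out = new ArrayList<>();
        if (e.term() == null) {
            return out; // empty enumeration
        }
        do {
            out.add(e.term());
        } while (e.next());
        return out;
    }
}
```

With the terms "eight", "five", "four", the buggy loop never yields "eight", which is why the suggestSimilar("eihgt", 2) test in the report fails when "eight" is the first word in the index.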
[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration
[ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:

    Attachment: (was: TestLuceneDictionary.java)

> LuceneDictionary skips first word in enumeration
>
> Key: LUCENE-763
> URL: https://issues.apache.org/jira/browse/LUCENE-763
> Project: Lucene - Java
> Issue Type: Bug
> Components: Other
> Affects Versions: 2.0.0
> Environment: Windows Sun JRE 1.4.2_10_b03
> Reporter: Dan Ertman
>
> The current code for LuceneDictionary will always skip the first word of the
> TermEnum. The reason is that it doesn't initially retrieve TermEnum.term -
> its first call is to TermEnum.next, which moves it past the first term (line 76).
> To see this problem cause a failure, add this test to TestSpellChecker:
> similar = spellChecker.suggestSimilar("eihgt",2);
> assertEquals(1, similar.length);
> assertEquals(similar[0], "eight");
> Because "eight" is the first word in the index, it will fail.
[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration
[ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:

    Attachment: TestLuceneDictionary.java

New extended unit test case for class LuceneDictionary

> LuceneDictionary skips first word in enumeration
>
> Key: LUCENE-763
> URL: https://issues.apache.org/jira/browse/LUCENE-763
> Project: Lucene - Java
> Issue Type: Bug
> Components: Other
> Affects Versions: 2.0.0
> Environment: Windows Sun JRE 1.4.2_10_b03
> Reporter: Dan Ertman
> Attachments: TestLuceneDictionary.java
>
> The current code for LuceneDictionary will always skip the first word of the
> TermEnum. The reason is that it doesn't initially retrieve TermEnum.term -
> its first call is to TermEnum.next, which moves it past the first term (line 76).
> To see this problem cause a failure, add this test to TestSpellChecker:
> similar = spellChecker.suggestSimilar("eihgt",2);
> assertEquals(1, similar.length);
> assertEquals(similar[0], "eight");
> Because "eight" is the first word in the index, it will fail.
[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration
[ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:

    Attachment: LuceneDictionary.java

Fixed class LuceneDictionary

> LuceneDictionary skips first word in enumeration
>
> Key: LUCENE-763
> URL: https://issues.apache.org/jira/browse/LUCENE-763
> Project: Lucene - Java
> Issue Type: Bug
> Components: Other
> Affects Versions: 2.0.0
> Environment: Windows Sun JRE 1.4.2_10_b03
> Reporter: Dan Ertman
> Attachments: LuceneDictionary.java, TestLuceneDictionary.java
>
> The current code for LuceneDictionary will always skip the first word of the
> TermEnum. The reason is that it doesn't initially retrieve TermEnum.term -
> its first call is to TermEnum.next, which moves it past the first term (line 76).
> To see this problem cause a failure, add this test to TestSpellChecker:
> similar = spellChecker.suggestSimilar("eihgt",2);
> assertEquals(1, similar.length);
> assertEquals(similar[0], "eight");
> Because "eight" is the first word in the index, it will fail.
Re: [jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration
I knew the boolean flag which was in the class in the first place was used for something ... :-)

Anyway, I have uploaded updated class and unit test files.

Thanks
Christian
[EMAIL PROTECTED]: Project lucene-java (in module lucene-java) failed
To whom it may engage...

This is an automated request, but not an unsolicited one. For more information please visit http://gump.apache.org/nagged.html, and/or contact the folk at [EMAIL PROTECTED]

Project lucene-java has an issue affecting its community integration. This issue affects 4 projects, and has been outstanding for 15 runs. The current state of this project is 'Failed', with reason 'Build Failed'.

For reference only, the following projects are affected by this:
- eyebrowse : Web-based mail archive browsing
- jakarta-lucene : Java Based Search Engine
- jakarta-slide : Content Management System based on WebDAV technology
- lucene-java : Java Based Search Engine

Full details are available at:
http://vmgump.apache.org/gump/public/lucene-java/lucene-java/index.html

That said, some information snippets are provided here.

The following annotations (debug/informational/warning/error messages) were provided:
-DEBUG- Sole output [lucene-core-31052007.jar] identifier set to project name
-DEBUG- Dependency on javacc exists, no need to add for property javacc.home.
-INFO- Failed with reason build failed
-DEBUG- Extracted fallback artifacts from Gump Repository

The following work was performed:
http://vmgump.apache.org/gump/public/lucene-java/lucene-java/gump_work/build_lucene-java_lucene-java.html
Work Name: build_lucene-java_lucene-java (Type: Build)
Work ended in a state of : Failed
Elapsed: 26 secs
Command Line: /opt/jdk1.5/bin/java -Djava.awt.headless=true -Xbootclasspath/p:/usr/local/gump/public/workspace/xml-commons/java/external/build/xml-apis.jar:/usr/local/gump/public/workspace/xml-xerces2/build/xercesImpl.jar org.apache.tools.ant.Main -Dgump.merge=/x1/gump/public/gump/work/merge.xml -Dbuild.sysclasspath=only -Dversion=31052007 -Djavacc.home=/usr/local/gump/packages/javacc-3.1 package
[Working Directory: /usr/local/gump/public/workspace/lucene-java]
CLASSPATH: /opt/jdk1.5/lib/tools.jar:/usr/local/gump/public/workspace/lucene-java/build/classes/java:/usr/local/gump/public/workspace/lucene-java/build/classes/demo:/usr/local/gump/public/workspace/lucene-java/build/classes/test:/usr/local/gump/public/workspace/ant/dist/lib/ant-jmf.jar:/usr/local/gump/public/workspace/ant/dist/lib/ant-swing.jar:/usr/local/gump/public/workspace/ant/dist/lib/ant-apache-resolver.jar:/usr/local/gump/public/workspace/ant/dist/lib/ant-trax.jar:/usr/local/gump/public/workspace/ant/dist/lib/ant-junit.jar:/usr/local/gump/public/workspace/ant/dist/lib/ant-launcher.jar:/usr/local/gump/public/workspace/ant/dist/lib/ant-nodeps.jar:/usr/local/gump/public/workspace/ant/dist/lib/ant.jar:/usr/local/gump/packages/junit3.8.1/junit.jar:/usr/local/gump/public/workspace/xml-commons/java/build/resolver.jar:/usr/local/gump/packages/je-1.7.1/lib/je.jar:/usr/local/gump/public/workspace/jakarta-commons/digester/dist/commons-digester.jar:/usr/local/gump/public/workspace/jakarta-regexp/build/jakarta-regexp-31052007.jar:/usr/local/gump/packages/javacc-3.1/bin/lib/javacc.jar:/usr/local/gump/public/workspace/jline/target/jline-0.9.92-SNAPSHOT.jar:/usr/local/gump/packages/jtidy-04aug2000r7-dev/build/Tidy.jar:/usr/local/gump/public/workspace/junit/dist/junit-31052007.jar:/usr/local/gump/public/workspace/xml-commons/java/external/build/xml-apis-ext.jar
-
[javac] location: class org.apache.lucene.store.db.DbDirectory
[javac] DatabaseEntry key = new DatabaseEntry(new byte[0]);
[javac] ^
[javac] /x1/gump/public/workspace/lucene-java/contrib/db/bdb/src/java/org/apache/lucene/store/db/DbDirectory.java:171: cannot find symbol
[javac] symbol : class DatabaseEntry
[javac] location: class org.apache.lucene.store.db.DbDirectory
[javac] DatabaseEntry data = new DatabaseEntry((byte[]) null);
[javac] ^
[javac] /x1/gump/public/workspace/lucene-java/contrib/db/bdb/src/java/org/apache/lucene/store/db/DbDirectory.java:171: cannot find symbol
[javac] symbol : class DatabaseEntry
[javac] location: class org.apache.lucene.store.db.DbDirectory
[javac] DatabaseEntry data = new DatabaseEntry((byte[]) null);
[javac] ^
[javac] /x1/gump/public/workspace/lucene-java/contrib/db/bdb/src/java/org/apache/lucene/store/db/DbDirectory.java:178: cannot find symbol
[javac] symbol : variable DbConstants
[javac] location: class org.apache.lucene.store.db.DbDirectory
[javac] DbConstants.DB_SET_RANGE | flags) != DbConstants.DB_NOTFOUND)
[javac] ^
[javac] /x1/gump/public/workspace/lucene-java/contrib/db/bdb/src/java/org/apache/lucene/store/db/DbDirectory.java:178: cannot find symbol
[javac] symbol : variable DbConstants
[javac] location: class org.apache.lucene.store.db.DbDirectory
[javac]
Re: svn commit: r543076 - in /lucene/java/trunk: ./ src/java/org/apache/lucene/index/ src/site/src/documentation/content/xdocs/ src/test/org/apache/lucene/index/
On May 31, 2007, at 3:48 AM, [EMAIL PROTECTED] wrote:
+ 7. LUCENE-866: Adds multi-level skip lists to the posting lists. This speeds
+    up most queries that use skipTo(), especially on big indexes with large posting
+    lists. For average AND queries the speedup is about 20%, for queries that
+    contain very frequence and very unique terms the speedup can be over 80%.
+    (Michael Busch)

Minor typo: frequence => frequent.

Erik
[jira] Created: (LUCENE-897) Change how Core and Contrib javadocs are hosted
Change how Core and Contrib javadocs are hosted

Key: LUCENE-897
URL: https://issues.apache.org/jira/browse/LUCENE-897
Project: Lucene - Java
Issue Type: Improvement
Components: Website
Reporter: Grant Ingersoll
Priority: Minor

Change the site javadocs to:
1. Separate contrib javadocs from core javadocs.
2. Optionally, include a unified view as well.
Re: addIndexes()
Steven Parkes wrote:
> Is there any particular reason that the version that takes a Directory[]
> optimizes first?

There was, but unfortunately I can't recall it now. Index merging has changed substantially since then, so, whatever it was, it may no longer apply.

If no one can think of a good reason to optimize any longer, then probably we should remove it, no?

Doug
Re: addIndexes()
On Thu, 31 May 2007, Doug Cutting wrote:

> Steven Parkes wrote:
>> Is there any particular reason that the version that takes a Directory[]
>> optimizes first?
>
> There was, but unfortunately I can't recall it now. Index merging has
> changed substantially since then, so, whatever it was, it may no longer
> apply.
>
> If no one can think of a good reason to optimize any longer, then probably
> we should remove it, no?

No longer optimizing on this call would impact performance in what I'm doing. My usage pattern involves indexing in a MemoryIndex and adding that index to an index backed by a DbDirectory. If the index is not optimized first, the operation becomes very noisy in the database.

In other words, if that change is made, please let us know so that I can adapt my code to explicitly optimize the MemoryIndex first.

Thanks !

Andi..
Re: Documentation Brainstorming
Grant Ingersoll wrote:
>> I'd rather see each jar get its own javadoc, or at the very least,
>> indicate which jar each class is defined in for the ones that aren't
>> part of the core.
>
> Yeah, I don't like that all the contribs are built in together. What do
> others think? I would vote for separating them out.

I like the single javadoc build. The linking is nice, e.g., all Analyzer implementations are linked from Analyzer. It also makes it easier for folks to see everything that's included in the release in one place.

Perhaps the names of the sections should be the name of the jar file, and/or the summary sentence in the package.html for contrib packages should name the jar file. Would that suffice?

However if most folks really wish to split things, then some new navigational pages are required to provide a home for the various javadocs. Ideally this would provide the level of integration that, e.g., Ant's optional tasks have with Ant's core tasks: when browsing core tasks there's always a link to optional tasks, and vice-versa, so the optional stuff is always just a single click away. Putting contrib and core javadoc together achieves this. Achieving it with separate javadocs will be harder.

Doug
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500427 ]

Doug Cutting commented on LUCENE-698:

> If boost is zero, then
> sumOfSquaredWeights() returns zero as well, resulting in a
> queryNorm of Infinity (due to a div by zero if DefaultSimilarity is
> used). Then it multiplies boost and queryNorm and 0*Infinity=NaN.

The bug here to me seems to be that queryNorm is Infinity. A boost of zero has a reasonable interpretation (don't influence scoring), but I don't see how a queryNorm of Infinity is ever useful. So perhaps we can remove the NaN by modifying the default implementation of queryNorm to return 1.0 instead of Infinity when passed zero. Would that cause any harm?

> FilteredQuery ignores boost
>
> Key: LUCENE-698
> URL: https://issues.apache.org/jira/browse/LUCENE-698
> Project: Lucene - Java
> Issue Type: Bug
> Components: Search
> Affects Versions: 2.0.0
> Reporter: Yonik Seeley
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.2
> Attachments: lucene-698.patch
>
> Filtered query ignores its own boost.
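The floating-point behaviour discussed in the comment can be shown in isolation. The sketch below does not call Lucene; QueryNormSketch and its two methods are illustrative names, and the 1/sqrt(sumOfSquaredWeights) formula is my understanding of what DefaultSimilarity computes rather than something quoted in this thread. guardedQueryNorm shows the proposed "return 1.0 when passed zero" change.

```java
/** Standalone illustration of the 0 * Infinity = NaN problem; not Lucene code. */
final class QueryNormSketch {

    /** Unguarded norm: a sum of zero yields Infinity (assumed 1/sqrt formula). */
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    /** Proposed change: return 1.0 instead of Infinity when passed zero. */
    static float guardedQueryNorm(float sumOfSquaredWeights) {
        if (sumOfSquaredWeights == 0.0f) {
            return 1.0f;
        }
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    public static void main(String[] args) {
        float boost = 0.0f;
        System.out.println(boost * queryNorm(0.0f));        // NaN
        System.out.println(boost * guardedQueryNorm(0.0f)); // 0.0
    }
}
```

With the guard, a zero boost produces a zero weight instead of NaN, and non-zero sums are unaffected.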
[jira] Reopened: (LUCENE-885) clean up build files so contrib tests are run more easily
[ https://issues.apache.org/jira/browse/LUCENE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man reopened LUCENE-885:

Officially reopening this bug as I have discovered that it causes the build to fail on Java 1.4.

The problem is that the contrib-crawl logic used by build-contrib and test-contrib is ignorant of the "skip 1.5 contribs" logic used in the javadocs (it is a javadoc-specific property), and the individual 1.5 contribs (i.e. gdata) assume that if you are trying to build them, you must have 1.5.

A patch is already ready to make the property more global, and to make the targets in the gdata build.xml act as NOOPs (echoing a message) based on the value ... just doing some more testing now before committing.

> clean up build files so contrib tests are run more easily
>
> Key: LUCENE-885
> URL: https://issues.apache.org/jira/browse/LUCENE-885
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Build
> Reporter: Hoss Man
> Assignee: Hoss Man
> Fix For: 2.2
> Attachments: LUCENE-885.patch, LUCENE-885.patch
>
> Per mailing list discussion...
> http://www.nabble.com/Tests%2C-Contribs%2C-and-Releases-tf3768924.html#a10655448
> Tests for contribs should be run when "ant test" is used, existing "test"
> target renamed to "test-core"
[jira] Created: (LUCENE-898) contrib/javascript is not packaged into releases
contrib/javascript is not packaged into releases

Key: LUCENE-898
URL: https://issues.apache.org/jira/browse/LUCENE-898
Project: Lucene - Java
Issue Type: Bug
Components: Build
Reporter: Hoss Man
Priority: Trivial

The contrib/javascript directory is (apparently) a collection of javascript utilities for lucene, but it has no build files or any mechanism to package it, so it is excluded from releases.
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500437 ]

Yonik Seeley commented on LUCENE-698:

> the default implementation of queryNorm to return 1.0 instead of Infinity
> when passed zero.

That seems like it should be fine, especially since Similarity.queryNorm is only called at the top level when creating a weight.

> FilteredQuery ignores boost
>
> Key: LUCENE-698
> URL: https://issues.apache.org/jira/browse/LUCENE-698
> Project: Lucene - Java
> Issue Type: Bug
> Components: Search
> Affects Versions: 2.0.0
> Reporter: Yonik Seeley
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.2
> Attachments: lucene-698.patch
>
> Filtered query ignores its own boost.
[jira] Commented: (LUCENE-898) contrib/javascript is not packaged into releases
[ https://issues.apache.org/jira/browse/LUCENE-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500453 ] Erik Hatcher commented on LUCENE-898: - My vote is to remove the javascript contrib area entirely. It doesn't really do all that much that is useful. I'd be surprised if anyone really uses it. > contrib/javascript is not packaged into releases > > > Key: LUCENE-898 > URL: https://issues.apache.org/jira/browse/LUCENE-898 > Project: Lucene - Java > Issue Type: Bug > Components: Build >Reporter: Hoss Man >Priority: Trivial > > the contrib/javascript directory is (apparently) a collection of javascript > utilities for lucene .. but it has no build files or any mechanism to > package it, so it is excluded from releases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Closed: (LUCENE-763) LuceneDictionary skips first word in enumeration
[ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Naber closed LUCENE-763. --- Resolution: Fixed Fix Version/s: 2.2 Thanks, patch applied. > LuceneDictionary skips first word in enumeration > > > Key: LUCENE-763 > URL: https://issues.apache.org/jira/browse/LUCENE-763 > Project: Lucene - Java > Issue Type: Bug > Components: Other >Affects Versions: 2.0.0 > Environment: Windows Sun JRE 1.4.2_10_b03 >Reporter: Dan Ertman > Fix For: 2.2 > > Attachments: LuceneDictionary.java, TestLuceneDictionary.java > > > The current code for LuceneDictionary will always skip the first word of the > TermEnum. The reason is that it doesn't initially retrieve TermEnum.term - > its first call is to TermEnum.next, which moves it past the first term (line > 76). > To see this problem cause a failure, add this test to TestSpellChecker: > similar = spellChecker.suggestSimilar("eihgt",2); > assertEquals(1, similar.length); > assertEquals(similar[0], "eight"); > Because "eight" is the first word in the index, it will fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Commented: (LUCENE-887) Interruptible segment merges
[ https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500457 ] Michael McCandless commented on LUCENE-887: --- This looks great to me! I think we should keep it out of core (ie, as subclasses as you've done here) for now? So, if a shutdown request comes in then currently running addDocument calls are allowed to complete but if a new addDocument call tries to run it will hit an "IndexWriter already closed" IOException. Once the in-flight addDocument calls finish you then flush the ram segments without allowing cascading merge. This actually means you can potentially have too many "level 0" (just flushed) segments in the index but that should not be a big deal since the next merge would clean it up. And it should be rare. In shutdown(), after you call waitForAddDocument(), why not call clearInterrupt before calling flushRamSegments? Isn't the flushRamSegments() call guaranteed to hit the IndexWriterInterruptException if it's using an ExtendedFSDirectory and there are > 0 buffered docs? Also I think it's possible that the addDocument() call from another thread will hit the IndexWriterInterruptException, right? So those other threads should catch this and ignore it (since their doc was in fact successfully added and only the follow-on merge was interrupted)? > Interruptible segment merges > > > Key: LUCENE-887 > URL: https://issues.apache.org/jira/browse/LUCENE-887 > Project: Lucene - Java > Issue Type: New Feature > Components: Index >Reporter: Michael Busch >Priority: Minor > Fix For: 2.2 > > Attachments: ExtendedIndexWriter.java > > > Adds the ability to IndexWriter to interrupt an ongoing merge. This might be > necessary when Lucene is e. g. running as a service and has to stop indexing > within a certain period of time due to a shutdown request. 
> A solution would be to add a new method shutdown() to IndexWriter which > satisfies the following two requirements: > - if a merge is happening, abort it > - flush the buffered docs but do not trigger a merge > See also discussions about this feature on java-dev: > http://www.gossamer-threads.com/lists/lucene/java-dev/49008 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
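The counting scheme discussed in this thread (let in-flight addDocument calls finish, reject new ones, then flush) could be sketched roughly as below. This is only an illustration and not the code from the attached ExtendedIndexWriter.java: the class ShutdownGateSketch and its methods tryEnter(), exit() and shutdown() are hypothetical names, and where the patch is described as throwing an "IndexWriter already closed" IOException, the sketch simply returns false.

```java
/** Hypothetical sketch of a gate that counts threads inside addDocument(). */
class ShutdownGateSketch {
    // A separate mutex, so the gate does not depend on the writer's own monitor.
    private final Object lock = new Object();
    private int inFlight = 0;
    private boolean closed = false;

    /** Called on entry to addDocument(); returns false once shutdown has begun. */
    boolean tryEnter() {
        synchronized (lock) {
            if (closed) {
                return false; // the real code is described as throwing an IOException here
            }
            inFlight++;
            return true;
        }
    }

    /** Called when addDocument() finishes, normally from a finally block. */
    void exit() {
        synchronized (lock) {
            inFlight--;
            lock.notifyAll();
        }
    }

    /** Rejects new calls, then waits until all in-flight calls have finished. */
    void shutdown() {
        synchronized (lock) {
            closed = true;
            while (inFlight > 0) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up waiting
                    return;
                }
            }
        }
        // A real writer would now flush its buffered docs without triggering a merge.
    }

    public static void main(String[] args) throws InterruptedException {
        final ShutdownGateSketch gate = new ShutdownGateSketch();
        Thread indexer = new Thread(() -> {
            if (gate.tryEnter()) {
                try {
                    Thread.sleep(50); // stand-in for analyzing and indexing a document
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    gate.exit();
                }
            }
        });
        indexer.start();
        gate.shutdown();
        indexer.join();
        System.out.println("accepted after shutdown: " + gate.tryEnter());
    }
}
```

Keeping the counter under its own lock mirrors the point made in the issue that the unsynchronized document-building step cannot rely on the writer's mutex.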
[jira] Created: (LUCENE-899) several gdata build targets don't work from contrib/gdata
several gdata build targets don't work from contrib/gdata - Key: LUCENE-899 URL: https://issues.apache.org/jira/browse/LUCENE-899 Project: Lucene - Java Issue Type: Bug Reporter: Hoss Man the contrib/gdata/build.xml file is a little ... odd, and many of the targets don't work at all when called from that directory (only when using build-contrib from the top level) this problem predates LUCENE-885 ...
[EMAIL PROTECTED]:~/svn/lucene-bugs/contrib/gdata-server$ svnversion
542768
[EMAIL PROTECTED]:~/svn/lucene-bugs/contrib/gdata-server$ ant test
Buildfile: build.xml
test:
 [echo] Building gdata-core...
javacc-uptodate-check:
javacc-notice:
common.init:
build-lucene:
init:
compile-core:
 [echo] Use gdata - compile-core task
 [javac] Compiling 5 source files to /home/chrish/svn/lucene-bugs/build/contrib/gdata-server/core/classes/java
Warning: Reference build.path has not been set at runtime, but was found during build file parsing, attempting to resolve. Future versions of Ant may support referencing ids defined in non-executed targets.
Warning: Reference common.build.path has not been set at runtime, but was found during build file parsing, attempting to resolve. Future versions of Ant may support referencing ids defined in non-executed targets.
BUILD FAILED
/home/chrish/svn/lucene-bugs/contrib/gdata-server/build.xml:87: The following error occurred while executing this line:
/home/chrish/svn/lucene-bugs/contrib/gdata-server/src/core/build.xml:49: The following error occurred while executing this line:
/home/chrish/svn/lucene-bugs/common-build.xml:298: /home/chrish/svn/lucene-bugs/contrib/gdata-server/src/core/ext-libs not found.
Total time: 1 second
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Documentation Brainstorming
I like the suggestion of having two views: a unified view and then also a separate view. Slightly more work to setup, but should satisfy both camps. On May 31, 2007, at 1:16 PM, Doug Cutting wrote: I like the single javadoc build. The linking is nice, e.g., all Analyzer implementations are linked from Analyzer. It also makes it easier for folks to see everything that's included in the release in one place. True Perhaps the names of the sections should be the name of the jar file, and/or the summary sentence in the package.html for contrib packages should name the jar file. Would that suffice? I find the lower left frame to be the main pain for me, since it isn't clear there what is in core and what is in contrib. However if most folks really wish to split things, then some new navigational pages are required to provide a home for the various javadocs. Ideally this would provide the level of integration that, e.g., Ant's optional tasks have with Ant's core tasks: when browsing core tasks there's always a link to optional tasks, and vice-versa, so the optional stuff is always just a single click away. Putting contrib and core javadoc together achieves this. Achieving it with separate javadocs will be harder. Makes sense. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Commented: (LUCENE-897) Change how Core and Contrib javadocs are hosted
[ https://issues.apache.org/jira/browse/LUCENE-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500477 ] Grant Ingersoll commented on LUCENE-897: See http://www.gossamer-threads.com/lists/lucene/java-dev/49348 for reference > Change how Core and Contrib javadocs are hosted > --- > > Key: LUCENE-897 > URL: https://issues.apache.org/jira/browse/LUCENE-897 > Project: Lucene - Java > Issue Type: Improvement > Components: Website >Reporter: Grant Ingersoll >Priority: Minor > > Change the site javadocs to: > 1. separate contrib javadocs from core javadocs > 2. Optionally, include a unified view as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Resolved: (LUCENE-885) clean up build files so contrib tests are run more easily
[ https://issues.apache.org/jira/browse/LUCENE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved LUCENE-885. - Resolution: Fixed Committed revision 543257. Compilation and test of the entire tree should work fine now under 1.4 ... note that gdata doesn't actually run its tests (even under 1.5) because of LUCENE-899 ... but this problem predates any work done for this issue, so I'm not going to look into it at this time as it relates to bugs in a specific contrib, and not in changes made to facilitate the building/testing of contribs. > clean up build files so contrib tests are run more easily > - > > Key: LUCENE-885 > URL: https://issues.apache.org/jira/browse/LUCENE-885 > Project: Lucene - Java > Issue Type: Improvement > Components: Build >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 2.2 > > Attachments: LUCENE-885.patch, LUCENE-885.patch > > > Per mailing list discussion... > http://www.nabble.com/Tests%2C-Contribs%2C-and-Releases-tf3768924.html#a10655448 > Tests for contribs should be run when "ant test" is used, existing "test" > target renamed to "test-core" -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: addIndexes()
Hmmm ... something's not meshing for me here. If I understood what you've said, you have a DbD index to which you are addIndexes'ing a memory index? I must have missed something, because addIndexes pre- and post-optimizes the target (DbD) index, not the operand (mem) index. -Original Message- From: Andi Vajda [mailto:[EMAIL PROTECTED] Sent: Thursday, May 31, 2007 10:10 AM To: java-dev@lucene.apache.org Subject: Re: addIndexes() On Thu, 31 May 2007, Doug Cutting wrote: > Steven Parkes wrote: >> Is there any particular reason that the version that takes a Directory[] >> optimizes first? > > There was, but unfortunately I can't recall it now. Index merging has > changed substantially since then, so, whatever it was, it may no longer > apply. If no one can think of a good reason to optimize any longer, then > probably we should remove it, no? No longer optimizing on this call would impact performance in what I'm doing. My usage pattern involves indexing in a MemoryIndex and adding that index to an index backed by a DbDirectory. If the index is not optimized first, the operation becomes very noisy in the database. In other words, if that change is made, please let us know so that I can adapt my code to explicitly optimize the MemoryIndex first. Thanks ! Andi.. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
enabling java assertions in the tests
While testing LUCENE-866 I realized that Java assertions are disabled when *I* run 'ant test'. Others did have the assertion executed and causing that NPE. So I am not sure if this is general problem or only a Windows one. Compile wise we are ok, having "-source 1.4". At runtime, assertions can be enabled by running "java -ea". Using ant, setting "ANT_ARGS=-ea" is supposed to have the same effect, but it doesn't, at least not for me. Adding: to the task would enable assertions during tests regardless of ANT_OPTS variable (and hopefully on all OSs). Anyone sees a problem with adding this? Btw, I think we can/should use Java asserts more (there are currently only 4 active asserts under trunk/java). Doron - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: enabling java assertions in the tests
"Doron Cohen" <[EMAIL PROTECTED]> wrote: > > While testing LUCENE-866 I realized that Java assertions > are disabled when *I* run 'ant test'. I noticed this too; in my patch on LUCENE-843 I've turned on assertions for all unit tests (I'm using a lot of asserts in that patch) as well. > Others did have the assertion executed and causing that > NPE. So I am not sure if this is general problem or only > a Windows one. > > Compile wise we are ok, having "-source 1.4". > At runtime, assertions can be enabled by running "java -ea". > Using ant, setting "ANT_ARGS=-ea" is supposed to have the > same effect, but it doesn't, at least not for me. > > Adding: > > > > to the task would enable assertions during tests > regardless of ANT_OPTS variable (and hopefully on all OSs). I had added under the task and it also seems to work, but I like your solution better (it's clearer). > Anyone sees a problem with adding this? > > Btw, I think we can/should use Java asserts more (there are > currently only 4 active asserts under trunk/java). I agree! The asserts have been very helpful in my debugging in LUCENE-843. Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration
[ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500521 ] Steven Parkes commented on LUCENE-763: -- Can we also update the javadocs to reflect the different semantics between terms() and terms(term)? Here's some possible verbiage. (Also tweaks the "after the given term" which I think isn't correct?)
{noformat}
Index: src/java/org/apache/lucene/index/IndexReader.java
===
--- src/java/org/apache/lucene/index/IndexReader.java (revision 543284)
+++ src/java/org/apache/lucene/index/IndexReader.java (working copy)
@@ -539,16 +539,21 @@
     setNorm(doc, field, Similarity.encodeNorm(value));
   }
 
-  /** Returns an enumeration of all the terms in the index.
-   * The enumeration is ordered by Term.compareTo(). Each term
-   * is greater than all that precede it in the enumeration.
+  /** Returns an enumeration of all the terms in the index. The
+   * enumeration is ordered by Term.compareTo(). Each term is greater
+   * than all that precede it in the enumeration. Note that after
+   * calling {@link #terms()}, {@link TermEnum#next()} must be called
+   * on the resulting enumeration before calling other methods such as
+   * {@link TermEnum#term()}.
    * @throws IOException if there is a low-level IO error
    */
   public abstract TermEnum terms() throws IOException;
 
-  /** Returns an enumeration of all terms after a given term.
-   * The enumeration is ordered by Term.compareTo(). Each term
-   * is greater than all that precede it in the enumeration.
+  /** Returns an enumeration of all terms starting at a given term. If
+   * the given term does not exist, the enumeration is positioned at the
+   * first term greater than the supplied term. The enumeration is
+   * ordered by Term.compareTo(). Each term is greater than all that
+   * precede it in the enumeration.
    * @throws IOException if there is a low-level IO error
    */
   public abstract TermEnum terms(Term t) throws IOException;
{noformat}
> LuceneDictionary skips first word in enumeration > > > Key: LUCENE-763 > URL: https://issues.apache.org/jira/browse/LUCENE-763 > Project: Lucene - Java > Issue Type: Bug > Components: Other >Affects Versions: 2.0.0 > Environment: Windows Sun JRE 1.4.2_10_b03 >Reporter: Dan Ertman > Fix For: 2.2 > > Attachments: LuceneDictionary.java, TestLuceneDictionary.java > > > The current code for LuceneDictionary will always skip the first word of the > TermEnum. The reason is that it doesn't initially retrieve TermEnum.term - > its first call is to TermEnum.next, which moves it past the first term (line > 76). > To see this problem cause a failure, add this test to TestSpellChecker: > similar = spellChecker.suggestSimilar("eihgt",2); > assertEquals(1, similar.length); > assertEquals(similar[0], "eight"); > Because "eight" is the first word in the index, it will fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
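The difference in positioning described in this comment, and the way it leads to the skipped first word, can be illustrated with a small stand-alone sketch. Nothing here is Lucene code: TermEnumSketch, SketchEnum, all() and startingAt() are hypothetical stand-ins that only mimic the semantics as described (an unpositioned enumeration needs next() first, a positioned one already sits on its first term).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the two enumeration flavours; not Lucene's TermEnum. */
class TermEnumSketch {

    static class SketchEnum {
        private final String[] terms; // assumed to be sorted
        private int pos;

        private SketchEnum(String[] sortedTerms, int pos) {
            this.terms = sortedTerms;
            this.pos = pos;
        }

        /** Mimics terms(): positioned before the first term. */
        static SketchEnum all(String[] sortedTerms) {
            return new SketchEnum(sortedTerms, -1);
        }

        /** Mimics terms(t): positioned at the first term >= start. */
        static SketchEnum startingAt(String[] sortedTerms, String start) {
            int i = 0;
            while (i < sortedTerms.length && sortedTerms[i].compareTo(start) < 0) {
                i++;
            }
            return new SketchEnum(sortedTerms, i);
        }

        boolean next() {
            pos++;
            return pos < terms.length;
        }

        String term() {
            return (pos >= 0 && pos < terms.length) ? terms[pos] : null;
        }
    }

    /** Calls next() before term(): correct for all(), skips a term for startingAt(). */
    static List<String> collectNextFirst(SketchEnum e) {
        List<String> out = new ArrayList<>();
        while (e.next()) {
            out.add(e.term());
        }
        return out;
    }

    /** Reads term() before next(): the pattern a positioned enumeration needs. */
    static List<String> collectTermFirst(SketchEnum e) {
        List<String> out = new ArrayList<>();
        if (e.term() == null) {
            return out;
        }
        do {
            out.add(e.term());
        } while (e.next());
        return out;
    }

    public static void main(String[] args) {
        String[] terms = {"eight", "five", "four"};
        System.out.println(collectNextFirst(SketchEnum.all(terms)));
        System.out.println(collectNextFirst(SketchEnum.startingAt(terms, "")));
        System.out.println(collectTermFirst(SketchEnum.startingAt(terms, "")));
    }
}
```

In this model the second call loses "eight", which matches the symptom reported in the issue when "eight" is the first word in the index.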
Re: enabling java assertions in the tests
On Friday 01 June 2007 00:30, Doron Cohen wrote: > > While testing LUCENE-866 I realized that Java assertions > are disabled when *I* run 'ant test'. > Others did have the assertion executed and causing that > NPE. So I am not sure if this is general problem or only > a Windows one. Indeed, see below. I'm running Linux and java 1.6.0. > Compile wise we are ok, having "-source 1.4". > At runtime, assertions can be enabled by running "java -ea". > Using ant, setting "ANT_ARGS=-ea" is supposed to have the > same effect, but it doesn't, at least not for me. > > Adding: > > > > > to the task would enable assertions during tests > regardless of ANT_OPTS variable (and hopefully on all OSs). > > Anyone sees a problem with adding this? My common-build.xml has this added in the junit task: Regards, Paul Elschot - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: enabling java assertions in the tests
I think that having assertions is of no value if they are never turned on :) I suggest going carefully in adding assertions. There are a lot of places where assertions are inappropriate (e.g. checking parameters on a public method). I think Sun's document gives good guidelines: http://java.sun.com/j2se/1.4.2/docs/guide/lang/assert.html -- DM Smith On May 31, 2007, at 6:30 PM, Doron Cohen wrote: While testing LUCENE-866 I realized that Java assertions are disabled when *I* run 'ant test'. Others did have the assertion executed and causing that NPE. So I am not sure if this is general problem or only a Windows one. Compile wise we are ok, having "-source 1.4". At runtime, assertions can be enabled by running "java -ea". Using ant, setting "ANT_ARGS=-ea" is supposed to have the same effect, but it doesn't, at least not for me. Adding: to the task would enable assertions during tests regardless of ANT_OPTS variable (and hopefully on all OSs). Anyone sees a problem with adding this? Btw, I think we can/should use Java asserts more (there are currently only 4 active asserts under trunk/java). Doron - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: enabling java assertions in the tests
DM Smith wrote on 31/05/2007 15:59:05: > I think that having assertions is of no value if they are never > turned on :) > > I suggest going carefully in adding assertions. There are a lot of > places where assertions are inappropriate (e.g. checking parameters > on a public method). > > I think Sun's document gives good guidelines: > > http://java.sun.com/j2se/1.4.2/docs/guide/lang/assert.html Perhaps the most important guideline regarding assertions is that they should *never* have side effects, otherwise correctness is broken when assertions are disabled. Doron - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
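To make the side-effect guideline concrete, here is a minimal sketch. The class AssertSideEffectSketch and its methods are made up for illustration and do not come from the Lucene code base. It contrasts an assert that performs work with one that only checks a value computed beforehand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of why assert expressions must not do real work. */
class AssertSideEffectSketch {

    /** Anti-pattern: the removal only happens when the JVM runs with -ea. */
    static int removeInsideAssert(List<String> items) {
        assert items.remove("stale");
        return items.size();
    }

    /** Safe: the removal always happens; the assert only checks its result. */
    static int removeThenAssert(List<String> items) {
        boolean removed = items.remove("stale");
        assert removed : "expected 'stale' to be present";
        return items.size();
    }

    /** Reports whether assertions are enabled; the side effect here is deliberate. */
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // assignment, not comparison: runs only under -ea
        return enabled;
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("fresh", "stale"));
        List<String> b = new ArrayList<>(Arrays.asList("fresh", "stale"));
        System.out.println("assertions enabled: " + assertionsEnabled());
        System.out.println("size after removeInsideAssert: " + removeInsideAssert(a));
        System.out.println("size after removeThenAssert:   " + removeThenAssert(b));
    }
}
```

Running it once with "java -ea" and once without shows the first method returning different sizes, while the second is stable either way.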
[jira] Created: (LUCENE-900) Enable Java asserts in the Junit tests
Enable Java asserts in the Junit tests -- Key: LUCENE-900 URL: https://issues.apache.org/jira/browse/LUCENE-900 Project: Lucene - Java Issue Type: Test Components: Build Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor For background see http://www.mail-archive.com/java-dev@lucene.apache.org/msg10307.html -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: enabling java assertions in the tests
Paul Elschot <[EMAIL PROTECTED]> wrote on 31/05/2007 16:21:09: > > Adding: > > > > > > > > > > to the task would enable assertions during tests > > regardless of ANT_OPTS variable (and hopefully on all OSs). > My common-build.xml has this added in the junit task: > > > > > This enables the asserts for all lucene-java packages, but not for any external jars being used. I think this is cleaner because the no-args form would enable asserts also in external jars and might add noise to our tests. I'll open an issue and patch it like this! - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: addIndexes()
On Thu, 31 May 2007, Steven Parkes wrote: Hmmm ... something's not meshing for me here. If I understood what you've said, you have a DbD index to which you are addIndexes'ing a memory index? I must have missed something, because addIndexes pre- and post-optimizes the target (Dbd) index, not the operand (mem) index. I stand corrected. I'm using an IndexWriter opened on a RAMDirectory to do the indexing for a given transaction. Then I call addIndexes([writer]) on the IndexWriter backed by the DbDirectory to persist this. This approach has turned out to be considerably faster and less noisy in the database (the amount of random access changes) than indexing into the DbDirectory backed index directly and then optimizing it. The docs for addIndexes() say "After this completes, the index is optimized." I mistakenly thought that there was discussion here about making this no longer be the case. Andi.. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Commented: (LUCENE-887) Interruptible segment merges
[ https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500550 ] Michael Busch commented on LUCENE-887: -- > This looks great to me! Thanks for reviewing! > So, if a shutdown request comes in then currently running addDocument > calls are allowed to complete but if a new addDocument call tries to > run it will hit an "IndexWriter already closed" IOException. Once the > in-flight addDocument calls finish you then flush the ram segments > without allowing cascading merge. Exactly. > This actually means you can potentially have too many "level 0" (just > flushed) segments in the index but that should not be a big deal since > the next merge would clean it up. And it should be rare. Yes, unless another shutdown request comes while the first merge after restarting the system is happening (which should be very unlikely), this will be cleaned up. Also, once the system is up again the IndexWriter will delete left over file fragments from an aborted merge. > In shutdown(), after you call waitForAddDocument(), why not call > clearInterrupt before calling flushRamSegments? Isn't the > flushRamSegments() call guaranteed to hit the > IndexWriterInterruptException if it's using an ExtendedFSDirectory and > there are > 0 buffered docs? Hmm I think I did it this way in case we aren't using an ExtendedFSDirectory, because then the flush would just succeed without an IndexWriterInterruptException and we save an instanceof check here. But you are right, we can just call clearInterrupt, but only if (d instanceof ExtendedFSDirectory) == true. That's probably simpler. Thereafter it is safe to call close() because the buffer is empty, so the call of flushRamSegments in close() won't do anything. > Also I think it's possible that the addDocument() call from another > thread will hit the IndexWriterInterruptException, right? 
So those > other threads should catch this and ignore it (since their doc was in > fact successfully added and only the follow-on merge was interrupted)? Hmm I'm not sure if I understand this. I catch the IndexWriterInterruptException in addDocument() and in the catch block flushAfterInterrupt() is called which clears the interrupt flag. So IndexWriterInterruptException shouldn't be thrown again and addDocument() should just return normally? Or am I missing something? Could you give an example? > Interruptible segment merges > > > Key: LUCENE-887 > URL: https://issues.apache.org/jira/browse/LUCENE-887 > Project: Lucene - Java > Issue Type: New Feature > Components: Index >Reporter: Michael Busch >Priority: Minor > Fix For: 2.2 > > Attachments: ExtendedIndexWriter.java > > > Adds the ability to IndexWriter to interrupt an ongoing merge. This might be > necessary when Lucene is e. g. running as a service and has to stop indexing > within a certain period of time due to a shutdown request. > A solution would be to add a new method shutdown() to IndexWriter which > satisfies the following two requirements: > - if a merge is happening, abort it > - flush the buffered docs but do not trigger a merge > See also discussions about this feature on java-dev: > http://www.gossamer-threads.com/lists/lucene/java-dev/49008 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500552 ] Michael Busch commented on LUCENE-698: -- > So perhaps we can remove the NaN by modifying the default implementation of > queryNorm to return 1.0 instead of Infinity when passed zero. Would that > cause any harm? Yes, I believe this should work, too. This would prevent the NaN score when DefaultSimilarity is used. It will then be the responsibility of people who implement their own Similarity to take care of this in a similar way. I'll open a new issue for fixing the DefaultSimilarity. > FilteredQuery ignores boost > --- > > Key: LUCENE-698 > URL: https://issues.apache.org/jira/browse/LUCENE-698 > Project: Lucene - Java > Issue Type: Bug > Components: Search >Affects Versions: 2.0.0 >Reporter: Yonik Seeley >Assignee: Michael Busch >Priority: Minor > Fix For: 2.2 > > Attachments: lucene-698.patch > > > Filtered query ignores its own boost. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Created: (LUCENE-901) DefaultSimilarity.queryNorm() should never return Infinity
DefaultSimilarity.queryNorm() should never return Infinity -- Key: LUCENE-901 URL: https://issues.apache.org/jira/browse/LUCENE-901 Project: Lucene - Java Issue Type: Bug Components: Search Reporter: Michael Busch Priority: Trivial Currently DefaultSimilarity.queryNorm() returns Infinity if sumOfSquaredWeights=0. This can result in a score of NaN (e. g. in TermScorer) if boost=0.0f. A simple fix would be to return 1.0f in case zero is passed in. See LUCENE-698 for discussions about this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
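The arithmetic behind this report can be reproduced in isolation. The sketch below assumes that the default queryNorm is computed as 1/sqrt(sumOfSquaredWeights), which is what the issue text implies; the class QueryNormSketch and both method names are hypothetical and are not the DefaultSimilarity source.

```java
/** Hypothetical sketch of the Infinity/NaN problem; not the DefaultSimilarity source. */
class QueryNormSketch {

    /** Assumed current behaviour: 1/sqrt(x), which is Infinity for x == 0. */
    static float queryNormCurrent(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    /** Proposed fix from the issue: return 1.0f when zero is passed in. */
    static float queryNormGuarded(float sumOfSquaredWeights) {
        if (sumOfSquaredWeights == 0.0f) {
            return 1.0f;
        }
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    public static void main(String[] args) {
        float boost = 0.0f;
        float sumOfSquaredWeights = boost * boost; // zero when the boost is zero

        float current = queryNormCurrent(sumOfSquaredWeights);
        float guarded = queryNormGuarded(sumOfSquaredWeights);

        // Multiplying a zero weight by an infinite norm is where the NaN appears.
        System.out.println("current norm = " + current + ", weight * norm = " + (boost * current));
        System.out.println("guarded norm = " + guarded + ", weight * norm = " + (boost * guarded));
    }
}
```

The NaN comes from the IEEE 754 rule that zero times infinity is undefined, so any guard that keeps the norm finite avoids it.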
[jira] Commented: (LUCENE-898) contrib/javascript is not packaged into releases
[ https://issues.apache.org/jira/browse/LUCENE-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500569 ] Michael Busch commented on LUCENE-898: -- > My vote is to remove the javascript contrib area entirely. +1. It also seems that this package is unmaintained. No files have been changed since February 2005, when it was moved from the sandbox to contrib. > contrib/javascript is not packaged into releases > > > Key: LUCENE-898 > URL: https://issues.apache.org/jira/browse/LUCENE-898 > Project: Lucene - Java > Issue Type: Bug > Components: Build >Reporter: Hoss Man >Priority: Trivial > > the contrib/javascript directory is (apparently) a collection of javascript > utilities for lucene .. but it has no build files or any mechanism to > package it, so it is excluded from releases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Commented: (LUCENE-901) DefaultSimilarity.queryNorm() should never return Infinity
[ https://issues.apache.org/jira/browse/LUCENE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500576 ] Hoss Man commented on LUCENE-901: - I'm not sure if I agree with this concept. Do we really want the curve of values from queryNorm to have a step drop down from really *huge* values when sumOfSquaredWeights is "near" zero to "1" when sumOfSquaredWeights becomes so close to zero it can only be represented as 0.0f ? Float.MAX_VALUE seems like a better choice than 1, but I haven't really thought through whether or not that will still trigger NaN scores. > DefaultSimilarity.queryNorm() should never return Infinity > -- > > Key: LUCENE-901 > URL: https://issues.apache.org/jira/browse/LUCENE-901 > Project: Lucene - Java > Issue Type: Bug > Components: Search >Reporter: Michael Busch >Priority: Trivial > > Currently DefaultSimilarity.queryNorm() returns Infinity if > sumOfSquaredWeights=0. > This can result in a score of NaN (e. g. in TermScorer) if boost=0.0f. > A simple fix would be to return 1.0f in case zero is passed in. > See LUCENE-698 for discussions about this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
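Both concerns raised in this comment can be checked with plain float arithmetic. The sketch below is an assumption-laden illustration: it again takes queryNorm to be 1/sqrt(sumOfSquaredWeights), and QueryNormStepSketch and guardedWith() are hypothetical names. It only looks at the single multiplication by a zero weight, not at the rest of the scoring code.

```java
/** Hypothetical sketch comparing candidate return values for a zero input. */
class QueryNormStepSketch {

    /** 1/sqrt(x) for non-zero x, and the supplied fallback when x is exactly zero. */
    static float guardedWith(float sumOfSquaredWeights, float valueAtZero) {
        if (sumOfSquaredWeights == 0.0f) {
            return valueAtZero;
        }
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    public static void main(String[] args) {
        float[] inputs = {1e-10f, 1e-30f, 0.0f};

        // With a fallback of 1.0f the values grow as the input shrinks, then drop to 1.
        for (float x : inputs) {
            System.out.println("fallback 1.0f:      x=" + x + " -> " + guardedWith(x, 1.0f));
        }

        // With Float.MAX_VALUE the curve keeps rising, and a zero weight still gives 0, not NaN.
        for (float x : inputs) {
            System.out.println("fallback MAX_VALUE: x=" + x + " -> " + guardedWith(x, Float.MAX_VALUE));
        }
        System.out.println("0.0f * Float.MAX_VALUE = " + (0.0f * Float.MAX_VALUE));
    }
}
```

A finite fallback cannot produce NaN when multiplied by zero, though Float.MAX_VALUE could still overflow to Infinity if it is later multiplied by a factor greater than one.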
[jira] Commented: (LUCENE-898) contrib/javascript is not packaged into releases
[ https://issues.apache.org/jira/browse/LUCENE-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500575 ] Otis Gospodnetic commented on LUCENE-898: - I think the files have not changed in a while because they work. I believe Kelvin Tan (the author) used/uses this stuff somewhere. I'm typically for cleaning things up, but somehow I feel that this javascript stuff should be left alone (it ain't broken, is it?). > contrib/javascript is not packaged into releases > > > Key: LUCENE-898 > URL: https://issues.apache.org/jira/browse/LUCENE-898 > Project: Lucene - Java > Issue Type: Bug > Components: Build >Reporter: Hoss Man >Priority: Trivial > > the contrib/javascript directory is (apparently) a collection of javascript > utilities for lucene .. but it has no build files or any mechanism to > package it, so it is excluded from releases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]