[Lucene.Net] [jira] [Closed] (LUCENENET-397) Resolution of the legal issues
[ https://issues.apache.org/jira/browse/LUCENENET-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Troy Howard closed LUCENENET-397. - Resolution: Not A Problem This version of Luke.Net is dependent on WPF which conflicts with our desire for this to be cross platform. Legal clearance is not necessary. Resolution of the legal issues -- Key: LUCENENET-397 URL: https://issues.apache.org/jira/browse/LUCENENET-397 Project: Lucene.Net Issue Type: Sub-task Components: Lucene.Net Contrib Reporter: Scott Lombard Assignee: Troy Howard Priority: Blocker Labels: Luke.Net Fix For: Lucene.Net 2.9.4 Resolution of the legal issues around ingesting the code into Lucene.Net. Coordinate with Aaron Powell to obtain software grant paperwork. Per Stefan Bodewig (Incubating Mentor): All it takes is: * attach the code to a JIRA ticket. * have software grants signed by all contributors to the original code base. * write a single page for the Incubator site * start a vote on Incubator general and wait for 72 hours. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Closed] (LUCENENET-398) Prepare the code for ingestion
[ https://issues.apache.org/jira/browse/LUCENENET-398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Troy Howard closed LUCENENET-398. - Resolution: Not A Problem Not including this code because of its WPF dependency. Prepare the code for ingestion -- Key: LUCENENET-398 URL: https://issues.apache.org/jira/browse/LUCENENET-398 Project: Lucene.Net Issue Type: Sub-task Components: Lucene.Net Contrib Reporter: Scott Lombard Assignee: Sergey Mirvoda Labels: Luke.Net Fix For: Lucene.Net 2.9.4 Prepare the source to be imported into the Lucene.Net repository. The staging area is a Bitbucket fork at: https://bitbucket.org/thoward/luke.net-incbuating of the original codebase at: https://bitbucket.org/slace/luke.net See the tasks on the (forked) Bitbucket site for source-code related issues that need to be addressed prior to ingesting into the Lucene.Net codebase.
[jira] [Commented] (LUCENE-3096) MultiSearcher does not work correctly with Not on NumericRange
[ https://issues.apache.org/jira/browse/LUCENE-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033478#comment-13033478 ] Uwe Schindler commented on LUCENE-3096: --- This is a well-known bug (LUCENE-2756), which is unfixable (query rewrite across different searchers is wrong) without totally changing the way queries are rewritten. To work around the bug, you should use a MultiReader on your IndexReaders and use a simple IndexSearcher on top of that MultiReader: {code} IndexReader[] readers = new IndexReader[2]; /* one slot per index */ readers[0] = IndexReader.open(directory); readers[1] = IndexReader.open(otherdirectory); ... IndexSearcher searcher = new IndexSearcher(new MultiReader(readers)); {code} MultiSearcher and ParallelMultiSearcher were deprecated in 3.1 because of this and will disappear in the coming Lucene 4.0. ParallelMultiSearcher functionality is now available through IndexSearcher in 3.1 (it parallelizes across index segments, LUCENE-2837). I will close this as won't fix if nobody objects. MultiSearcher does not work correctly with Not on NumericRange -- Key: LUCENE-3096 URL: https://issues.apache.org/jira/browse/LUCENE-3096 Project: Lucene - Java Issue Type: Bug Components: Search Affects Versions: 3.0.2 Reporter: John Wang Hi, Keith My colleague xiaoyang and I just confirmed that this is actually due to a Lucene bug in MultiSearcher. In particular, if we search with Not on NumericRange and we use MultiSearcher, we get wrong search results (however, if we use IndexSearcher, the result is correct). Basically the NotOfNumericRange has no effect with MultiSearcher. We suspect the cause is the createWeight() function in MultiSearcher and hope you can help us fix this Lucene bug. I attached the code to reproduce this case. Please check it out. In the attached code, I have two separate functions: (1) testNumericRangeSingleSearcher(Query query) where I create 6 documents, with a field called id= 1,2,3,4,5,6 respectively.
Then I search by the query which is +MatchAllDocs -NumericRange(3,3). The expected result then should be 5 hits since the document 3 is MUST_NOT. (2) testNumericRangeMultiSearcher(Query query) where i create 2 RamDirectory(), each of which has 3 documents, 1,2,3; and 4,5,6. Then I search by the same query as above using multiSearcher. The expected result should also be 5 hits. However, from (1), we get 5 hits = expected results, while in (2) we get 6 hits != expected results. We also experimented this with our zoie/bobo open source tools and get the same results because our multi-bobo-browser is built on multi-searcher in lucene. I already emailed the lucene community group. Hopefully we can get some feedback soon. If you have any further concern, pls let me know! Thank you very much! Code: (based on lucene 3.0.x) import java.io.IOException; import java.io.PrintStream; import java.text.DecimalFormat; import org.apache.lucene.analysis.WhitespaceAnalyzer; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; import org.apache.lucene.document.NumericField; import org.apache.lucene.index.CorruptIndexException; import org.apache.lucene.index.IndexWriter; import org.apache.lucene.index.Term; import org.apache.lucene.search.BooleanQuery; import org.apache.lucene.search.FieldCache; import org.apache.lucene.search.IndexSearcher; import org.apache.lucene.search.MatchAllDocsQuery; import org.apache.lucene.search.MultiSearcher; import org.apache.lucene.search.NumericRangeQuery; import org.apache.lucene.search.Query; import org.apache.lucene.search.ScoreDoc; import org.apache.lucene.search.Searchable; import org.apache.lucene.search.Sort; import org.apache.lucene.search.SortField; import org.apache.lucene.search.TermQuery; import org.apache.lucene.search.TopDocs; import org.apache.lucene.search.BooleanClause.Occur; import org.apache.lucene.store.Directory; import org.apache.lucene.store.LockObtainFailedException; import 
org.apache.lucene.store.RAMDirectory; import com.convertlucene.ConvertFrom2To3; public class TestNumericRange { public final static void main(String[] args) { try { BooleanQuery query = new BooleanQuery(); query.add(NumericRangeQuery.newIntRange("numId", 3, 3, true, true), Occur.MUST_NOT); query.add(new MatchAllDocsQuery(), Occur.MUST); testNumericRangeSingleSearcher(query); testNumericRangeMultiSearcher(query); } catch(Exception e) { e.printStackTrace(); } } public static void testNumericRangeSingleSearcher(Query query) throws CorruptIndexException, LockObtainFailedException, IOException { String[] ids = {"1", "2", "3", "4", "5", "6"}; Directory directory = new RAMDirectory(); IndexWriter writer =
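For reference, the hit counts the reporter expects can be checked without Lucene. The following is only a plain-Java model of +MatchAllDocs -NumericRange(3,3) over the ids 1..6, not Lucene code; the class and method names (ExpectedHitsSketch, countHitsIn, countHitsAcross) are made up for illustration. It shows that splitting the documents across two indexes should not change the total.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

// Plain-Java model of the reported query; no Lucene classes are used.
class ExpectedHitsSketch {

    // +MatchAllDocs -NumericRange(3,3): keep every id except those in [3, 3].
    static final IntPredicate QUERY = id -> !(id >= 3 && id <= 3);

    // Counts the ids in one "index" that satisfy the query.
    static long countHitsIn(int[] ids) {
        return Arrays.stream(ids).filter(QUERY).count();
    }

    // Counts hits across several "indexes" by summing the per-index counts.
    static long countHitsAcross(List<int[]> indexes) {
        return indexes.stream().mapToLong(ids -> countHitsIn(ids)).sum();
    }

    public static void main(String[] args) {
        long single = countHitsIn(new int[] {1, 2, 3, 4, 5, 6});
        long split = countHitsAcross(
                Arrays.asList(new int[] {1, 2, 3}, new int[] {4, 5, 6}));
        System.out.println("single index: " + single); // prints "single index: 5"
        System.out.println("two indexes: " + split);   // prints "two indexes: 5"
    }
}
```

Both calls give 5, which is the result the report says IndexSearcher returns and MultiSearcher does not.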
[jira] [Commented] (LUCENE-3096) MultiSearcher does not work correctly with Not on NumericRange
[ https://issues.apache.org/jira/browse/LUCENE-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033479#comment-13033479 ] Uwe Schindler commented on LUCENE-3096: --- This was already reported and answered on the java-user@lao list: [http://www.gossamer-threads.com/lists/lucene/java-user/123996] MultiSearcher does not work correctly with Not on NumericRange -- Key: LUCENE-3096 URL: https://issues.apache.org/jira/browse/LUCENE-3096 Project: Lucene - Java Issue Type: Bug Components: Search Affects Versions: 3.0.2 Reporter: John Wang Hi, Keith My colleague xiaoyang and I just confirmed that this is actually due to a Lucene bug in MultiSearcher. In particular, if we search with Not on NumericRange and we use MultiSearcher, we get wrong search results (however, if we use IndexSearcher, the result is correct). Basically the NotOfNumericRange has no effect with MultiSearcher. We suspect the cause is the createWeight() function in MultiSearcher and hope you can help us fix this Lucene bug. I attached the code to reproduce this case. Please check it out. In the attached code, I have two separate functions: (1) testNumericRangeSingleSearcher(Query query) where I create 6 documents, with a field called id= 1,2,3,4,5,6 respectively. Then I search by the query which is +MatchAllDocs -NumericRange(3,3). The expected result then should be 5 hits since the document 3 is MUST_NOT. (2) testNumericRangeMultiSearcher(Query query) where I create 2 RamDirectory(), each of which has 3 documents, 1,2,3; and 4,5,6. Then I search by the same query as above using multiSearcher. The expected result should also be 5 hits. However, from (1), we get 5 hits = expected results, while in (2) we get 6 hits != expected results. We also tried this with our zoie/bobo open source tools and get the same results because our multi-bobo-browser is built on multi-searcher in lucene. I already emailed the lucene community group.
Hopefully we can get some feedback soon. If you have any further concern, please let me know! Thank you very much! Code: (based on lucene 3.0.x) import java.io.IOException; import java.io.PrintStream; import java.text.DecimalFormat; import org.apache.lucene.analysis.WhitespaceAnalyzer; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; import org.apache.lucene.document.NumericField; import org.apache.lucene.index.CorruptIndexException; import org.apache.lucene.index.IndexWriter; import org.apache.lucene.index.Term; import org.apache.lucene.search.BooleanQuery; import org.apache.lucene.search.FieldCache; import org.apache.lucene.search.IndexSearcher; import org.apache.lucene.search.MatchAllDocsQuery; import org.apache.lucene.search.MultiSearcher; import org.apache.lucene.search.NumericRangeQuery; import org.apache.lucene.search.Query; import org.apache.lucene.search.ScoreDoc; import org.apache.lucene.search.Searchable; import org.apache.lucene.search.Sort; import org.apache.lucene.search.SortField; import org.apache.lucene.search.TermQuery; import org.apache.lucene.search.TopDocs; import org.apache.lucene.search.BooleanClause.Occur; import org.apache.lucene.store.Directory; import org.apache.lucene.store.LockObtainFailedException; import org.apache.lucene.store.RAMDirectory; import com.convertlucene.ConvertFrom2To3; public class TestNumericRange { public final static void main(String[] args) { try { BooleanQuery query = new BooleanQuery(); query.add(NumericRangeQuery.newIntRange("numId", 3, 3, true, true), Occur.MUST_NOT); query.add(new MatchAllDocsQuery(), Occur.MUST); testNumericRangeSingleSearcher(query); testNumericRangeMultiSearcher(query); } catch(Exception e) { e.printStackTrace(); } } public static void testNumericRangeSingleSearcher(Query query) throws CorruptIndexException, LockObtainFailedException, IOException { String[] ids = {"1", "2", "3", "4", "5", "6"}; Directory directory = new RAMDirectory(); IndexWriter writer = new
IndexWriter(directory, new WhitespaceAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED); for (int i = 0; i < ids.length; i++) { Document doc = new Document(); doc.add(new Field("id", ids[i], Field.Store.YES, Field.Index.NOT_ANALYZED)); doc.add(new NumericField("numId").setIntValue(Integer.valueOf(ids[i]))); writer.addDocument(doc); } writer.close(); IndexSearcher searcher = new IndexSearcher(directory); TopDocs docs = searcher.search(query, 10); System.out.println("SingleSearcher: testNumericRange: hitNum: " + docs.totalHits); for(ScoreDoc doc : docs.scoreDocs)
[jira] [Commented] (SOLR-2516) Solr should not cache Searchers
[ https://issues.apache.org/jira/browse/SOLR-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033480#comment-13033480 ] Uwe Schindler commented on SOLR-2516: - I totally agree. On the Lucene side, the documentation/wiki should also be fixed so that it no longer says that creating an IndexSearcher is heavy and that searchers should be cached; rather, it should explain that the IndexReaders are heavy. Solr should not cache Searchers --- Key: SOLR-2516 URL: https://issues.apache.org/jira/browse/SOLR-2516 Project: Solr Issue Type: Bug Components: search Reporter: John Wang Only IndexReaders should be cached (that is where the data resides). A Searcher is a thin execution wrapper around a reader and thus should not be cached. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
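The caching argument above can be sketched in plain Java. This is only an illustration under stated assumptions: HeavyReader and ThinSearcher are hypothetical stand-ins, not Lucene or Solr classes, and the "cost" of opening a reader is modeled by a counter. The expensive object is cached per path, while a new wrapper is created on every request.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins; these are not Lucene or Solr classes.
class ReaderCacheSketch {

    // Models the expensive object: opening it would load index data.
    static final class HeavyReader {
        static final AtomicInteger OPENED = new AtomicInteger();
        final String path;

        HeavyReader(String path) {
            this.path = path;
            OPENED.incrementAndGet(); // counts how often the costly open happens
        }
    }

    // Models the cheap object: it only holds a reference to a reader.
    static final class ThinSearcher {
        final HeavyReader reader;

        ThinSearcher(HeavyReader reader) {
            this.reader = reader;
        }
    }

    private static final Map<String, HeavyReader> READERS = new ConcurrentHashMap<>();

    // Reuses the cached reader for a path, but builds a fresh searcher per call.
    static ThinSearcher newSearcher(String path) {
        HeavyReader reader = READERS.computeIfAbsent(path, HeavyReader::new);
        return new ThinSearcher(reader);
    }

    public static void main(String[] args) {
        ThinSearcher a = newSearcher("/tmp/index");
        ThinSearcher b = newSearcher("/tmp/index");
        System.out.println("distinct searchers: " + (a != b));
        System.out.println("shared reader: " + (a.reader == b.reader));
        System.out.println("readers opened: " + HeavyReader.OPENED.get());
    }
}
```

Two requests for the same path get two searcher objects that share one reader, so only the reader needs a cache entry.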
[jira] [Updated] (LUCENE-3093) Build failed in the flexscoring branch because of Javadoc warnings
[ https://issues.apache.org/jira/browse/LUCENE-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mark Nemeskey updated LUCENE-3093: Attachment: LUCENE-3093.patch Patch to fix the issue. Build failed in the flexscoring branch because of Javadoc warnings -- Key: LUCENE-3093 URL: https://issues.apache.org/jira/browse/LUCENE-3093 Project: Lucene - Java Issue Type: Bug Components: Javadocs Environment: N/A Reporter: David Mark Nemeskey Priority: Minor Attachments: LUCENE-3093.patch Original Estimate: 24h Remaining Estimate: 24h Ant build log: [javadoc] Standard Doclet version 1.6.0_24 [javadoc] Building tree for all the packages and classes... [javadoc] /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/src/java/org/apache/lucene/search/Similarity.java:93: warning - Tag @link: can't find tf(float) in org.apache.lucene.search.Similarity [javadoc] /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/src/java/org/apache/lucene/search/TFIDFSimilarity.java:588: warning - @param argument term is not a parameter name. [javadoc] /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/src/java/org/apache/lucene/search/TFIDFSimilarity.java:588: warning - @param argument docFreq is not a parameter name. [javadoc] /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/src/java/org/apache/lucene/search/TFIDFSimilarity.java:618: warning - @param argument terms is not a parameter name. [javadoc] Generating /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/org/apache/lucene/store/instantiated//package-summary.html... [javadoc] Copying file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/doc-files/classdiagram.png to directory /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/org/apache/lucene/store/instantiated/doc-files... 
[javadoc] Copying file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/doc-files/HitCollectionBench.jpg to directory /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/org/apache/lucene/store/instantiated/doc-files... [javadoc] Copying file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/doc-files/classdiagram.uxf to directory /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/org/apache/lucene/store/instantiated/doc-files... [javadoc] Generating /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/serialized-form.html... [javadoc] Copying file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/prettify/stylesheet+prettify.css to file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/stylesheet+prettify.css... [javadoc] Building index for all the packages and classes... [javadoc] Building index for all classes... [javadoc] Generating /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/help-doc.html... [javadoc] 4 warnings -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
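The two kinds of warning in the log (an @param tag naming something that is not a parameter, and a @link to a method signature that does not exist) can be illustrated with a small class. This is a hypothetical sketch and not the Similarity or TFIDFSimilarity code from the branch; the class JavadocTagSketch and the method boostedTf are invented, and tf(float) only borrows the name from the warning.

```java
// Hypothetical class for illustration; not the Similarity code from the branch.
class JavadocTagSketch {

    /**
     * Computes a damped term-frequency factor.
     *
     * @param freq raw frequency; the name matches the parameter below, so
     *             javadoc does not warn. Writing {@code @param term} here
     *             would produce "argument term is not a parameter name".
     * @return the square root of {@code freq}
     */
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    /**
     * Scales {@link #tf(float)} by a boost. The link resolves because a
     * method with exactly that name and parameter type exists in this class.
     *
     * @param freq  raw frequency passed on to {@link #tf(float)}
     * @param boost multiplier applied to the result
     * @return boosted term-frequency factor
     */
    static float boostedTf(float freq, float boost) {
        return tf(freq) * boost;
    }

    public static void main(String[] args) {
        System.out.println(boostedTf(4.0f, 2.0f)); // prints 4.0
    }
}
```

If tf(float) were renamed or its parameter type changed, the {@link #tf(float)} references would have to be updated in the same change to keep the build free of warnings.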
[jira] [Updated] (LUCENE-3093) Build failed in the flexscoring branch because of Javadoc warnings
[ https://issues.apache.org/jira/browse/LUCENE-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mark Nemeskey updated LUCENE-3093: Lucene Fields: [New, Patch Available] (was: [New]) Remaining Estimate: 1h (was: 24h) Original Estimate: 1h (was: 24h) Build failed in the flexscoring branch because of Javadoc warnings -- Key: LUCENE-3093 URL: https://issues.apache.org/jira/browse/LUCENE-3093 Project: Lucene - Java Issue Type: Bug Components: Javadocs Environment: N/A Reporter: David Mark Nemeskey Priority: Minor Attachments: LUCENE-3093.patch Original Estimate: 1h Remaining Estimate: 1h
Re: 3.2.0 (or 3.1.1)
+1 for 3.2. Mike http://blog.mikemccandless.com On Sat, May 14, 2011 at 12:32 AM, Shai Erera ser...@gmail.com wrote: +1 for 3.2! And also, we should adopt that approach going forward (no more bug fix releases for the stable branch, except for the last release before 4.0 is out). That means updating the release TODO with e.g., not creating a branch for 3.2.x, only tag it. When 4.0 is out, we branch 3.x.y out of the last 3.x tag. Shai On Saturday, May 14, 2011, Ryan McKinley ryan...@gmail.com wrote: On Fri, May 13, 2011 at 6:40 PM, Grant Ingersoll gsing...@apache.org wrote: It's been just over 1 month since the last release. We've all said we want to get to about a 3 month release cycle (if not more often). I think this means we should start shooting for a next release sometime in June. Which, in my mind, means we should start working on wrapping up issues now, IMO. Here's what's open for 3.2 against: Lucene: https://issues.apache.org/jira/browse/LUCENE/fixforversion/12316070 Solr: https://issues.apache.org/jira/browse/SOLR/fixforversion/12316172 Thoughts? +1 for 3.2 with a new feature freeze pretty soon
[jira] [Assigned] (LUCENE-3093) Build failed in the flexscoring branch because of Javadoc warnings
[ https://issues.apache.org/jira/browse/LUCENE-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-3093: --- Assignee: Robert Muir Build failed in the flexscoring branch because of Javadoc warnings -- Key: LUCENE-3093 URL: https://issues.apache.org/jira/browse/LUCENE-3093 Project: Lucene - Java Issue Type: Bug Components: Javadocs Environment: N/A Reporter: David Mark Nemeskey Assignee: Robert Muir Priority: Minor Attachments: LUCENE-3093.patch Original Estimate: 1h Remaining Estimate: 1h
[jira] [Commented] (LUCENE-3093) Build failed in the flexscoring branch because of Javadoc warnings
[ https://issues.apache.org/jira/browse/LUCENE-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033502#comment-13033502 ] Robert Muir commented on LUCENE-3093: - Thanks David! I'll commit soon. Build failed in the flexscoring branch because of Javadoc warnings -- Key: LUCENE-3093 URL: https://issues.apache.org/jira/browse/LUCENE-3093 Project: Lucene - Java Issue Type: Bug Components: Javadocs Environment: N/A Reporter: David Mark Nemeskey Assignee: Robert Muir Priority: Minor Attachments: LUCENE-3093.patch Original Estimate: 1h Remaining Estimate: 1h
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8041 - Failure
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/8041/ No tests ran. Build Log (for compile errors): [...truncated 1875 lines...] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: compile-core: compile-test-framework: compile-test: [javac] Compiling 1 source file to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/test contrib-build.init: compile-memory: compile-highlighter: compile-analyzers-common: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/build/classes/java [javac] Compiling 123 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/build/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/demohtml/SimpleCharStream.java:211: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] public int getColumn() { [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/demohtml/SimpleCharStream.java:220: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] public int getLine() { [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 2 warnings compile: [echo] Building grouping... common.init: build-lucene: jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. 
[echo] clover: common.compile-core: compile-core: compile-test-framework: compile-test: [javac] Compiling 1 source file to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/test init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/grouping/build/classes/java [javac] Compiling 7 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/grouping/build/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/grouping/src/java/org/apache/lucene/search/grouping/FirstPassGroupingCollector.java:232: cannot find symbol [javac] symbol : method pollLast() [javac] location: class java.util.TreeSet<org.apache.lucene.search.grouping.CollectedSearchGroup> [javac] final CollectedSearchGroup bottomGroup = orderedGroups.pollLast(); [javac] ^ [javac] 1 error [...truncated 11 lines...]
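A note on the compile error above: pollLast() is declared on java.util.NavigableSet, which TreeSet implements only since Java 6, so "cannot find symbol" is what one would expect if this job compiles against a Java 5 class library (an assumption; the log does not state the JDK version). A minimal sketch of an equivalent that uses only SortedSet methods follows; the class PollLastSketch and the helper are hypothetical and are not the change made in FirstPassGroupingCollector.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper; not the fix committed to FirstPassGroupingCollector.
class PollLastSketch {

    // Removes and returns the greatest element, or null if the set is empty,
    // using only SortedSet methods that predate Java 6.
    static <E> E pollLast(SortedSet<E> set) {
        if (set.isEmpty()) {
            return null;
        }
        E last = set.last();
        set.remove(last);
        return last;
    }

    public static void main(String[] args) {
        SortedSet<Integer> groups = new TreeSet<Integer>();
        groups.add(10);
        groups.add(30);
        groups.add(20);
        System.out.println(pollLast(groups)); // prints 30
        System.out.println(groups);           // prints [10, 20]
    }
}
```

Like NavigableSet.pollLast(), the helper returns null for an empty set instead of throwing, so callers can keep the same null check.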
[JENKINS-MAVEN] Lucene-Solr-Maven-3.x #122: POMs out of sync
Build: https://builds.apache.org/hudson/job/Lucene-Solr-Maven-3.x/122/ No tests ran. Build Log (for compile errors): [...truncated 6744 lines...] jar-analyzers: common.init: build-lucene: contrib-build.init: lucene-jar-uptodate: jar-lucene: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/build/contrib/demo/classes/java [javac] Compiling 2 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/build/contrib/demo/classes/java compile-core: jar-core: [jar] Building jar: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/build/contrib/demo/lucene-demo-3.2-SNAPSHOT.jar jar: compile-test: [echo] Building demo... jar-analyzers: common.init: build-lucene: contrib-build.init: lucene-jar-uptodate: jar-lucene: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: compile-core: compile-test-framework: common.compile-test: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/build/contrib/demo/classes/test [javac] Compiling 1 source file to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/build/contrib/demo/classes/test [copy] Copying 16 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/build/contrib/demo/classes/test build-artifacts-and-tests: [echo] Building grouping... common.init: build-lucene: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. 
[echo]
clover:
compile-core:
[mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/contrib/grouping/build/classes/java
[javac] Compiling 7 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/contrib/grouping/build/classes/java
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/contrib/grouping/src/java/org/apache/lucene/search/grouping/FirstPassGroupingCollector.java:230: cannot find symbol
[javac] symbol : method pollLast()
[javac] location: class java.util.TreeSet<org.apache.lucene.search.grouping.CollectedSearchGroup>
[javac] final CollectedSearchGroup bottomGroup = orderedGroups.pollLast();
[javac] ^
[javac] Note: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-3.x/checkout/lucene/contrib/grouping/src/java/org/apache/lucene/search/grouping/CachingCollector.java uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 error
[...truncated 12 lines...]
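The `cannot find symbol` error for `pollLast()` is consistent with the build running on a JDK older than Java 6, the release in which `TreeSet` gained `pollLast()` through `NavigableSet`. That is an inference from the log, not something the log states, and the sketch below is not necessarily how the build was actually fixed. It only shows a helper (the class name `PollLastCompat` is hypothetical) that gets the same behaviour from methods already present on a Java 5 `TreeSet`:

```java
import java.util.TreeSet;

/** Sketch: remove the greatest element of a TreeSet without Java 6's pollLast(). */
final class PollLastCompat {

    /**
     * Removes and returns the greatest element, or null if the set is empty.
     * Uses only SortedSet.last() and Set.remove(), both available before Java 6.
     */
    static <T> T pollLast(TreeSet<T> set) {
        if (set.isEmpty()) {
            return null;
        }
        T last = set.last();
        set.remove(last);
        return last;
    }

    public static void main(String[] args) {
        TreeSet<Integer> orderedGroups = new TreeSet<Integer>();
        orderedGroups.add(3);
        orderedGroups.add(1);
        orderedGroups.add(2);
        Integer bottomGroup = pollLast(orderedGroups);
        // Prints: 3 removed, remaining [1, 2]
        System.out.println(bottomGroup + " removed, remaining " + orderedGroups);
    }
}
```

Returning `null` for an empty set mirrors what `NavigableSet.pollLast()` does, so call sites would not need a separate emptiness check.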
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 8045 - Failure
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/8045/ No tests ran. Build Log (for compile errors): [...truncated 984 lines...] [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: compile-core: jar-core: [jar] Building jar: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/lucene-core-3.2-SNAPSHOT.jar init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/contrib/demo/classes/java [javac] Compiling 2 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/contrib/demo/classes/java compile-core: jar-core: [jar] Building jar: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/contrib/demo/lucene-demo-3.2-SNAPSHOT.jar jar: compile-test: [echo] Building demo... jar-analyzers: common.init: build-lucene: contrib-build.init: lucene-jar-uptodate: jar-lucene: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: compile-core: compile-test-framework: common.compile-test: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/contrib/demo/classes/test [javac] Compiling 1 source file to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/contrib/demo/classes/test [copy] Copying 16 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/contrib/demo/classes/test build-artifacts-and-tests: [echo] Building grouping... common.init: build-lucene: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. 
[echo]
clover:
compile-core:
[mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/grouping/build/classes/java
[javac] Compiling 7 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/grouping/build/classes/java
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/grouping/src/java/org/apache/lucene/search/grouping/FirstPassGroupingCollector.java:230: cannot find symbol
[javac] symbol : method pollLast()
[javac] location: class java.util.TreeSet<org.apache.lucene.search.grouping.CollectedSearchGroup>
[javac] final CollectedSearchGroup bottomGroup = orderedGroups.pollLast();
[javac] ^
[javac] Note: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/grouping/src/java/org/apache/lucene/search/grouping/CachingCollector.java uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 error
[...truncated 12 lines...]
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8042 - Still Failing
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/8042/ No tests ran. Build Log (for compile errors): [...truncated 1843 lines...] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: compile-core: compile-test-framework: compile-test: [javac] Compiling 1 source file to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/test contrib-build.init: compile-memory: compile-highlighter: compile-analyzers-common: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/build/classes/java [javac] Compiling 123 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/build/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/demohtml/SimpleCharStream.java:211: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] public int getColumn() { [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/demohtml/SimpleCharStream.java:220: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] public int getLine() { [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 2 warnings compile: [echo] Building grouping... common.init: build-lucene: jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. 
[echo]
clover:
common.compile-core:
compile-core:
compile-test-framework:
compile-test:
[javac] Compiling 1 source file to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/test
init:
clover.setup:
clover.info:
[echo]
[echo] Clover not found. Code coverage reports disabled.
[echo]
clover:
compile-core:
[mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/grouping/build/classes/java
[javac] Compiling 7 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/grouping/build/classes/java
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/grouping/src/java/org/apache/lucene/search/grouping/FirstPassGroupingCollector.java:232: cannot find symbol
[javac] symbol : method pollLast()
[javac] location: class java.util.TreeSet<org.apache.lucene.search.grouping.CollectedSearchGroup>
[javac] final CollectedSearchGroup bottomGroup = orderedGroups.pollLast();
[javac] ^
[javac] 1 error
[...truncated 11 lines...]
[jira] [Resolved] (LUCENE-3096) MultiSearcher does not work correctly with Not on NumericRange
[ https://issues.apache.org/jira/browse/LUCENE-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-3096.
---
Resolution: Duplicate
Fix Version/s: 3.1

This is a duplicate of LUCENE-2756 and was fixed by deprecating (3.1) and removing (4.0) the broken (Parallel)MultiSearcher in favour of IndexSearcher on top of MultiReader.

MultiSearcher does not work correctly with Not on NumericRange
--
Key: LUCENE-3096
URL: https://issues.apache.org/jira/browse/LUCENE-3096
Project: Lucene - Java
Issue Type: Bug
Components: Search
Affects Versions: 3.0.2
Reporter: John Wang
Fix For: 3.1

Hi, Keith

My colleague xiaoyang and I just confirmed that this is actually due to a Lucene bug in MultiSearcher. In particular, if we search with Not on NumericRange and we use MultiSearcher, we get wrong search results (however, if we use IndexSearcher, the result is correct). Basically the NotOfNumericRange has no impact on MultiSearcher. We suspect the cause is the createWeight() function in MultiSearcher and hope you can help us fix this Lucene bug. I attached the code to reproduce this case. Please check it out.

In the attached code, I have two separate functions:

(1) testNumericRangeSingleSearcher(Query query), where I create 6 documents with a field called id = 1,2,3,4,5,6 respectively. Then I search with the query +MatchAllDocs -NumericRange(3,3). The expected result should be 5 hits, since document 3 is MUST_NOT.

(2) testNumericRangeMultiSearcher(Query query), where I create 2 RAMDirectory(), holding 3 documents each: 1,2,3 and 4,5,6. Then I search with the same query as above using MultiSearcher. The expected result should also be 5 hits.

However, from (1) we get 5 hits = expected results, while in (2) we get 6 hits != expected results. We also tried this with our zoie/bobo open source tools and got the same results, because our multi-bobo-browser is built on MultiSearcher in Lucene. I already emailed the Lucene community group. Hopefully we can get some feedback soon. If you have any further concerns, please let me know! Thank you very much!

Code: (based on lucene 3.0.x)

import java.io.IOException;
import java.io.PrintStream;
import java.text.DecimalFormat;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.store.RAMDirectory;
import com.convertlucene.ConvertFrom2To3;

public class TestNumericRange {

  public final static void main(String[] args) {
    try {
      // +MatchAllDocs -NumericRange(3,3): every document except the one with numId == 3
      BooleanQuery query = new BooleanQuery();
      query.add(NumericRangeQuery.newIntRange("numId", 3, 3, true, true), Occur.MUST_NOT);
      query.add(new MatchAllDocsQuery(), Occur.MUST);
      testNumericRangeSingleSearcher(query);
      testNumericRangeMultiSearcher(query);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  public static void testNumericRangeSingleSearcher(Query query)
      throws CorruptIndexException, LockObtainFailedException, IOException {
    String[] ids = {"1", "2", "3", "4", "5", "6"};
    Directory directory = new RAMDirectory();
    IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED);
    for (int i = 0; i < ids.length; i++) {
      Document doc = new Document();
      doc.add(new Field("id", ids[i], Field.Store.YES, Field.Index.NOT_ANALYZED));
      doc.add(new NumericField("numId").setIntValue(Integer.valueOf(ids[i])));
      writer.addDocument(doc);
    }
    writer.close();
    IndexSearcher searcher = new IndexSearcher(directory);
    TopDocs docs = searcher.search(query, 10);
    System.out.println("SingleSearcher: testNumericRange: hitNum:
[jira] [Commented] (LUCENE-3096) MultiSearcher does not work correctly with Not on NumericRange
[ https://issues.apache.org/jira/browse/LUCENE-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033504#comment-13033504 ]

Uwe Schindler commented on LUCENE-3096:
---
An alternative way to fix this in 3.0 (without giving up MultiSearcher) is to set the rewrite mode of MultiTermQueries (like NumericRangeQuery) to CONSTANT_SCORE_REWRITE. But this only fixes the bug for those queries (as no BooleanQuery is used during rewrite). Altogether, negative queries in MultiSearcher are broken, and whether the bug actually affects search results depends on the index contents.

MultiSearcher does not work correctly with Not on NumericRange
--
Key: LUCENE-3096
URL: https://issues.apache.org/jira/browse/LUCENE-3096
Project: Lucene - Java
Issue Type: Bug
Components: Search
Affects Versions: 3.0.2
Reporter: John Wang
Fix For: 3.1
[jira] [Issue Comment Edited] (LUCENE-3096) MultiSearcher does not work correctly with Not on NumericRange
[ https://issues.apache.org/jira/browse/LUCENE-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033504#comment-13033504 ]

Uwe Schindler edited comment on LUCENE-3096 at 5/14/11 11:35 AM:
-
An alternative way to bypass this in 3.0 (without giving up MultiSearcher) is to set the rewrite mode of MultiTermQueries (like NumericRangeQuery) to CONSTANT_SCORE_REWRITE. But this only fixes the bug for those queries (as no BooleanQuery is used during rewrite). Altogether, negative queries in MultiSearcher are broken, and whether the bug actually affects search results depends on the index contents.

was (Author: thetaphi):
An alternative way to fix this in 3.0 (without giving up MultiSearcher) is to set the rewrite mode of MultiTermQueries (like NumericRangeQuery) to CONSTANT_SCORE_REWRITE. But this only fixes the bug for those queries (as no BooleanQuery is used during rewrite). Altogether, negative queries in MultiSearcher are broken, and whether the bug actually affects search results depends on the index contents.

MultiSearcher does not work correctly with Not on NumericRange
--
Key: LUCENE-3096
URL: https://issues.apache.org/jira/browse/LUCENE-3096
Project: Lucene - Java
Issue Type: Bug
Components: Search
Affects Versions: 3.0.2
Reporter: John Wang
Fix For: 3.1
[jira] [Updated] (LUCENE-3093) Build failed in the flexscoring branch because of Javadoc warnings
[ https://issues.apache.org/jira/browse/LUCENE-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3093:
Fix Version/s: flexscoring branch

Build failed in the flexscoring branch because of Javadoc warnings
--
Key: LUCENE-3093
URL: https://issues.apache.org/jira/browse/LUCENE-3093
Project: Lucene - Java
Issue Type: Bug
Components: Javadocs
Environment: N/A
Reporter: David Mark Nemeskey
Assignee: Robert Muir
Priority: Minor
Fix For: flexscoring branch
Attachments: LUCENE-3093.patch
Original Estimate: 1h
Remaining Estimate: 1h

Ant build log:
[javadoc] Standard Doclet version 1.6.0_24
[javadoc] Building tree for all the packages and classes...
[javadoc] /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/src/java/org/apache/lucene/search/Similarity.java:93: warning - Tag @link: can't find tf(float) in org.apache.lucene.search.Similarity
[javadoc] /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/src/java/org/apache/lucene/search/TFIDFSimilarity.java:588: warning - @param argument term is not a parameter name.
[javadoc] /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/src/java/org/apache/lucene/search/TFIDFSimilarity.java:588: warning - @param argument docFreq is not a parameter name.
[javadoc] /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/src/java/org/apache/lucene/search/TFIDFSimilarity.java:618: warning - @param argument terms is not a parameter name.
[javadoc] Generating /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/org/apache/lucene/store/instantiated//package-summary.html...
[javadoc] Copying file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/doc-files/classdiagram.png to directory /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/org/apache/lucene/store/instantiated/doc-files...
[javadoc] Copying file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/doc-files/HitCollectionBench.jpg to directory /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/org/apache/lucene/store/instantiated/doc-files...
[javadoc] Copying file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/doc-files/classdiagram.uxf to directory /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/org/apache/lucene/store/instantiated/doc-files...
[javadoc] Generating /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/serialized-form.html...
[javadoc] Copying file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/prettify/stylesheet+prettify.css to file /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/stylesheet+prettify.css...
[javadoc] Building index for all the packages and classes...
[javadoc] Building index for all classes...
[javadoc] Generating /home/savior/Development/workspaces/java/Lucene-GSoC/lucene/build/docs/api/all/help-doc.html...
[javadoc] 4 warnings
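The four warnings in the log fall into two kinds: a `@link` whose target signature cannot be found, and `@param` tags naming something the method does not declare. The sketch below shows tags that do resolve. It is a hypothetical class, not Lucene's `Similarity` or `TFIDFSimilarity`; the names `JavadocTagExample`, `tf(int)` and `idf(int, int)` and the formulas are made up for illustration.

```java
/** Hypothetical example class; it only illustrates Javadoc tags that resolve cleanly. */
final class JavadocTagExample {

    /**
     * Term-frequency factor.
     *
     * @param freq number of occurrences of the term in a document
     * @return the square root of {@code freq}
     */
    static float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    /**
     * Inverse-document-frequency factor, meant to be combined with {@link #tf(int)}.
     * The link names the parameter type exactly as declared; writing tf(float)
     * here would produce a "can't find tf(float)" warning.
     *
     * @param docFreq number of documents containing the term
     * @param numDocs total number of documents
     * @return a value that shrinks as {@code docFreq} grows
     */
    static float idf(int docFreq, int numDocs) {
        // Each @param above matches a declared parameter name, so javadoc
        // does not report "is not a parameter name".
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        System.out.println(tf(4) * idf(0, 1));
    }
}
```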
[jira] [Commented] (LUCENE-1421) Ability to group search results by field
[ https://issues.apache.org/jira/browse/LUCENE-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033506#comment-13033506 ]

Martijn van Groningen commented on LUCENE-1421:
---
Michael, I see you have committed it to trunk. Nice work! Only one question: why is the SearchGroup class now package protected? To me, the documentation in overview.html suggests that I can just use it in any package. As for porting this code to the 3x branch: I see that this branch doesn't have modules. Does that mean it will be a Lucene contrib?

Ability to group search results by field
Key: LUCENE-1421
URL: https://issues.apache.org/jira/browse/LUCENE-1421
Project: Lucene - Java
Issue Type: New Feature
Components: Search
Reporter: Artyom Sokolov
Assignee: Michael McCandless
Priority: Minor
Fix For: 3.2, 4.0
Attachments: LUCENE-1421.patch, LUCENE-1421.patch, lucene-grouping.patch

It would be awesome to group search results by specified field. Some functionality was provided for Apache Solr but I think it should be done in Core Lucene. There could be some useful information like total hits about collapsed data like total count and so on.

Thanks,
Artyom
[jira] [Updated] (LUCENE-2959) [GSoC] Implementing State of the Art Ranking for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2959:
Fix Version/s: flexscoring branch

[GSoC] Implementing State of the Art Ranking for Lucene
---
Key: LUCENE-2959
URL: https://issues.apache.org/jira/browse/LUCENE-2959
Project: Lucene - Java
Issue Type: New Feature
Components: Examples, Javadocs, Query/Scoring
Reporter: David Mark Nemeskey
Assignee: Robert Muir
Labels: gsoc2011, lucene-gsoc-11, mentor
Fix For: flexscoring branch
Attachments: LUCENE-2959_mockdfr.patch, implementation_plan.pdf, proposal.pdf

Lucene employs the Vector Space Model (VSM) to rank documents, which compares unfavorably to state of the art algorithms, such as BM25. Moreover, the architecture is tailored specifically to VSM, which makes the addition of new ranking functions a non-trivial task.

This project aims to bring state of the art ranking methods to Lucene and to implement a query architecture with pluggable ranking functions.
[jira] [Updated] (LUCENE-2392) Enable flexible scoring
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2392:
Fix Version/s: (was: 4.0) flexscoring branch

Enable flexible scoring
---
Key: LUCENE-2392
URL: https://issues.apache.org/jira/browse/LUCENE-2392
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: flexscoring branch
Attachments: LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392_take2.patch

This is a first step (nowhere near committable!), implementing the design iterated to in the recent "Baby steps towards making Lucene's scoring more flexible" java-dev thread.

The idea is (if you turn it on for your Field; it's off by default) to store full stats in the index, into a new _X.sts file, per doc (X field) in the index. And then have FieldSimilarityProvider impls that compute doc's boost bytes (norms) from these stats.

The patch is able to index the stats, merge them when segments are merged, and provides an iterator-only API. It also has a starting point for per-field Sims that use the stats iterator API to compute boost bytes. But it's not at all tied into actual searching! There's still tons left to do, eg, how does one configure via Field/FieldType which stats one wants indexed.

All tests pass, and I added one new TestStats unit test.

The stats I record now are:
- field's boost
- field's unique term count (a b c a a b --> 3)
- field's total term count (a b c a a b --> 6)
- total term count per-term (sum of total term count for all docs that have this term)

Still need at least the total term count for each field.
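The two per-field counts in the list above can be reproduced on the issue's own example, the field text `a b c a a b`. This is only a minimal sketch of the arithmetic, assuming whitespace tokenization; the class name `FieldTermStats` is hypothetical and unrelated to the patch's actual classes or its `_X.sts` file format.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Sketch: unique and total term counts for one field value, split on whitespace. */
final class FieldTermStats {
    final int uniqueTermCount;
    final int totalTermCount;

    private FieldTermStats(int uniqueTermCount, int totalTermCount) {
        this.uniqueTermCount = uniqueTermCount;
        this.totalTermCount = totalTermCount;
    }

    static FieldTermStats of(String fieldText) {
        String trimmed = fieldText.trim();
        if (trimmed.length() == 0) {
            return new FieldTermStats(0, 0);
        }
        String[] terms = trimmed.split("\\s+");
        // A set keeps one entry per distinct term; the array length is the total.
        Set<String> unique = new HashSet<String>(Arrays.asList(terms));
        return new FieldTermStats(unique.size(), terms.length);
    }

    public static void main(String[] args) {
        FieldTermStats stats = of("a b c a a b");
        // Prints: unique=3 total=6
        System.out.println("unique=" + stats.uniqueTermCount + " total=" + stats.totalTermCount);
    }
}
```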
[jira] [Created] (LUCENE-3097) Post grouping faceting
Post grouping faceting
--
Key: LUCENE-3097
URL: https://issues.apache.org/jira/browse/LUCENE-3097
Project: Lucene - Java
Issue Type: New Feature
Reporter: Martijn van Groningen
Priority: Minor
Fix For: 3.2, 4.0

This issue focuses on implementing post grouping faceting.
* How to handle multivalued fields. What field value to show with the facet.
* Where the facet counts should be based on
** Facet counts can be based on the normal documents. Ungrouped counts.
** Facet counts can be based on the groups. Grouped counts.
** Facet counts can be based on the combination of group value and facet value. Matrix counts.

And probably more implementation options.

The first two methods are implemented in the SOLR-236 patch. For the first option it calculates a DocSet based on the individual documents from the query result. For the second option it calculates a DocSet for all the most relevant documents of a group. Once the DocSet is computed, the FacetComponent and StatsComponent use the DocSet to create facets and statistics.

This last one is a bit more complex. I think it is best explained with an example. Let's say we search on travel offers:
||hotel||departure_airport||duration||
|Hotel a|AMS|5|
|Hotel a|DUS|10|
|Hotel b|AMS|5|
|Hotel b|AMS|10|

If we group by hotel and have a facet for airport, most end users expect (in my experience, of course) the following airport facet:
AMS: 2
DUS: 1

The above result can't be achieved by the first two methods. You either get counts 2 for both airports or 1 for both airports.

This issue is blocked by LUCENE-3079.
[jira] [Updated] (LUCENE-3097) Post grouping faceting
[ https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated LUCENE-3097:
--
Description:
This issue focuses on implementing post grouping faceting.
* How to handle multivalued fields. What field value to show with the facet.
* What the facet counts should be based on
** Facet counts can be based on the normal documents. Ungrouped counts.
** Facet counts can be based on the groups. Grouped counts.
** Facet counts can be based on the combination of group value and facet value. Matrix counts.

And probably more implementation options.

The first two methods are implemented in the SOLR-236 patch. For the first option it calculates a DocSet based on the individual documents from the query result. For the second option it calculates a DocSet for all the most relevant documents of a group. Once the DocSet is computed, the FacetComponent and StatsComponent use the DocSet to create facets and statistics.

The last one is a bit more complex. I think it is best explained with an example. Let's say we search on travel offers:
||hotel||departure_airport||duration||
|Hotel a|AMS|5|
|Hotel a|DUS|10|
|Hotel b|AMS|5|
|Hotel b|AMS|10|

If we group by hotel and have a facet for airport, most end users expect (in my experience, of course) the following airport facet:
AMS: 2
DUS: 1

The above result can't be achieved by the first two methods. You either get counts AMS:3 and DUS:1 or 1 for both airports.

was:
This issue focuses on implementing post grouping faceting.
* How to handle multivalued fields. What field value to show with the facet.
* What the facet counts should be based on
** Facet counts can be based on the normal documents. Ungrouped counts.
** Facet counts can be based on the groups. Grouped counts.
** Facet counts can be based on the combination of group value and facet value. Matrix counts.

And probably more implementation options.

The first two methods are implemented in the SOLR-236 patch. For the first option it calculates a DocSet based on the individual documents from the query result. For the second option it calculates a DocSet for all the most relevant documents of a group. Once the DocSet is computed, the FacetComponent and StatsComponent use the DocSet to create facets and statistics.

The last one is a bit more complex. I think it is best explained with an example. Let's say we search on travel offers:
||hotel||departure_airport||duration||
|Hotel a|AMS|5|
|Hotel a|DUS|10|
|Hotel b|AMS|5|
|Hotel b|AMS|10|

If we group by hotel and have a facet for airport, most end users expect (in my experience, of course) the following airport facet:
AMS: 2
DUS: 1

The above result can't be achieved by the first two methods. You either get counts 2 for both airports or 1 for both airports.

This issue is blocked by LUCENE-3079.

Post grouping faceting
--
Key: LUCENE-3097
URL: https://issues.apache.org/jira/browse/LUCENE-3097
Project: Lucene - Java
Issue Type: New Feature
Reporter: Martijn van Groningen
Priority: Minor
Fix For: 3.2, 4.0

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
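The three counting modes from the description can be sketched in plain Java over the travel-offer example. This is only an illustration, not the SOLR-236 or Lucene implementation: the class and method names are made up, and "most relevant document of a group" is assumed here, purely for the example, to be the offer with the longest duration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration of ungrouped, grouped and matrix facet counts.
class FacetCountModes {
    static final class Offer {
        final String hotel;    // group field
        final String airport;  // facet field
        final int duration;
        Offer(String hotel, String airport, int duration) {
            this.hotel = hotel;
            this.airport = airport;
            this.duration = duration;
        }
    }

    // The four rows of the example table.
    static final List<Offer> OFFERS = Arrays.asList(
        new Offer("Hotel a", "AMS", 5),
        new Offer("Hotel a", "DUS", 10),
        new Offer("Hotel b", "AMS", 5),
        new Offer("Hotel b", "AMS", 10));

    // Ungrouped counts: every matching document counts once.
    static Map<String, Integer> ungrouped(List<Offer> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Offer o : docs) {
            counts.merge(o.airport, 1, Integer::sum);
        }
        return counts;
    }

    // Grouped counts: only one document per group counts.
    // Assumption: the longest-duration offer stands in for "most relevant".
    static Map<String, Integer> grouped(List<Offer> docs) {
        Map<String, Offer> heads = new LinkedHashMap<>();
        for (Offer o : docs) {
            Offer current = heads.get(o.hotel);
            if (current == null || o.duration > current.duration) {
                heads.put(o.hotel, o);
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Offer head : heads.values()) {
            counts.merge(head.airport, 1, Integer::sum);
        }
        return counts;
    }

    // Matrix counts: each distinct (group value, facet value) pair counts once.
    static Map<String, Integer> matrix(List<Offer> docs) {
        Set<String> seenPairs = new HashSet<>();
        Map<String, Integer> counts = new TreeMap<>();
        for (Offer o : docs) {
            if (seenPairs.add(o.hotel + '\u0000' + o.airport)) {
                counts.merge(o.airport, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("ungrouped: " + ungrouped(OFFERS));
        System.out.println("grouped:   " + grouped(OFFERS));
        System.out.println("matrix:    " + matrix(OFFERS));
    }
}
```

With the example rows this gives AMS=3, DUS=1 for ungrouped counts, 1 for both airports for grouped counts (under the relevance assumption above), and AMS=2, DUS=1 for matrix counts, which is the facet the description says end users expect.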
[jira] [Created] (LUCENE-3098) Grouped total count
Grouped total count
---
Key: LUCENE-3098
URL: https://issues.apache.org/jira/browse/LUCENE-3098
Project: Lucene - Java
Issue Type: New Feature
Reporter: Martijn van Groningen
Fix For: 3.2, 4.0

When grouping, you can currently get two counts:
* Total hit count, which counts all documents that matched the query.
* Total grouped hit count, which counts all documents that have been grouped in the top N groups.

With grouping, the end user gets groups in the search result instead of plain documents, so the total number of groups makes more sense as the total count in many situations.
[jira] [Commented] (LUCENE-3098) Grouped total count
[ https://issues.apache.org/jira/browse/LUCENE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033537#comment-13033537 ]

Martijn van Groningen commented on LUCENE-3098:
---
I think this can be implemented as a separate collector and then executed together with the SecondPassGroupingCollector in the second search. We can use the MultiCollector for that.

Grouped total count
---
Key: LUCENE-3098
URL: https://issues.apache.org/jira/browse/LUCENE-3098
Project: Lucene - Java
Issue Type: New Feature
Reporter: Martijn van Groningen
Fix For: 3.2, 4.0
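As a rough sketch of what such a separate collector would have to track, the following plain-Java class counts hits and distinct group values side by side. It deliberately does not use the Lucene Collector API; `DistinctGroupCounter` and its methods are hypothetical names, and the group value of each hit is assumed to be available as a String.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a collector that sees the group value of every hit.
class GroupTotalCount {
    static final class DistinctGroupCounter {
        private final Set<String> groups = new HashSet<>();
        private int totalHits;

        // Called once per matching document with that document's group value.
        void collect(String groupValue) {
            totalHits++;
            groups.add(groupValue);
        }

        // Number of documents that matched the query.
        int totalHitCount() {
            return totalHits;
        }

        // Number of distinct groups among the matching documents.
        int totalGroupCount() {
            return groups.size();
        }
    }

    // Feeds a list of per-hit group values through the counter.
    static DistinctGroupCounter count(List<String> groupValuesOfHits) {
        DistinctGroupCounter counter = new DistinctGroupCounter();
        for (String groupValue : groupValuesOfHits) {
            counter.collect(groupValue);
        }
        return counter;
    }
}
```

Keeping every distinct group value in a set costs memory proportional to the number of groups, which is one reason to run this as its own collector rather than folding it into the grouping collectors.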
[jira] [Updated] (SOLR-2480) Text extraction of password protected files
[ https://issues.apache.org/jira/browse/SOLR-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2480:
-
Attachment: SOLR-2480.patch

New patch. By convention, the ExtractingRequestHandlerTest class should be in o.a.s.handler.extraction, but curiously it was in o.a.s.handler. I corrected it in this patch.

Text extraction of password protected files
---
Key: SOLR-2480
URL: https://issues.apache.org/jira/browse/SOLR-2480
Project: Solr
Issue Type: Improvement
Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.4.1, 3.1
Reporter: Shinichiro Abe
Assignee: Koji Sekiguchi
Priority: Minor
Fix For: 3.2, 4.0
Attachments: SOLR-2480-idea1.patch, SOLR-2480.patch, SOLR-2480.patch, SOLR-2480.patch, password-is-solrcell.docx

Proposal:
There are password-protected files: PDFs and Office documents in 2007 format/97 format. These files are posted using SolrCell. We cannot read these files if we do not know their password, so text cannot be extracted from them. My requirement is that these files be processed normally, without extracting text and without throwing an exception.

Background:
Currently, when you post a password-protected file, Solr returns a 500 server error. Solr catches the error in ExtractingDocumentLoader and throws TikaException. I use ManifoldCF. If the Solr server responds with 500, ManifoldCF decides that the document should be retried, because it has no idea what happened, and it retries posting many times without ever getting the password.

In another case, my customer posts files with embedded images. Sometimes Solr seems to throw a TikaException of unknown cause. He wants to post just the metadata without extracting text, but the exception stops him from posting.
[jira] [Commented] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033545#comment-13033545 ]

Michael McCandless commented on LUCENE-3092:

{quote}
just an idea: with issues like this that work (but hackishly) and are self contained, I'm not sure we should block them on some huge refactoring like IOContext if they are actually usable
{quote}

+1, progress not perfection.

I'll clean up the patch -- add an example code fragment of how you use it, a test case (aside: it'd be nice if, somehow, we could randomly swap this into our tests... we'd need a newMergeScheduler() method that would tap into this Dir impl if it had been picked, but also, we'd have to get this contrib module on core's classpath...), and a comment saying this class does spooky stuff tracking merges and threads.

NRTCachingDirectory, to buffer small segments in a RAMDir
-
Key: LUCENE-3092
URL: https://issues.apache.org/jira/browse/LUCENE-3092
Project: Lucene - Java
Issue Type: Improvement
Components: Store
Reporter: Michael McCandless
Priority: Minor
Fix For: 3.2, 4.0
Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch

I created this simple Directory impl, whose goal is to reduce IO contention in a frequent-reopen NRT use case.

The idea is, when reopening quickly, but not indexing that much content, you wind up with many small files created over time, which can stress the IO system, e.g. if merges and searching are also fighting for IO.

So, NRTCachingDirectory puts these newly created files into a RAMDir, and only when they are merged into a too-large segment does it write through to the real (delegate) directory.

This lets you spend some RAM to reduce IO.
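The write-through idea can be sketched without any Lucene types. The class below is not the patch's NRTCachingDirectory: the names, the two byte thresholds and the in-memory map standing in for the delegate directory are all assumptions made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: small files stay in RAM, large ones go to the delegate.
class RamCachingStoreSketch {
    private final Map<String, byte[]> ram = new HashMap<>();
    private final Map<String, byte[]> delegate = new HashMap<>(); // stands in for the real directory
    private final long maxFileBytes;   // assumed per-file limit for caching
    private final long maxCachedBytes; // assumed total RAM budget
    private long cachedBytes;

    RamCachingStoreSketch(long maxFileBytes, long maxCachedBytes) {
        this.maxFileBytes = maxFileBytes;
        this.maxCachedBytes = maxCachedBytes;
    }

    void write(String name, byte[] data) {
        delete(name); // drop any previous copy so the byte accounting stays correct
        boolean smallEnough = data.length <= maxFileBytes;
        boolean roomLeft = cachedBytes + data.length <= maxCachedBytes;
        if (smallEnough && roomLeft) {
            ram.put(name, data);
            cachedBytes += data.length;
        } else {
            delegate.put(name, data); // too large: write through
        }
    }

    byte[] read(String name) {
        byte[] cached = ram.get(name);
        return cached != null ? cached : delegate.get(name);
    }

    void delete(String name) {
        byte[] old = ram.remove(name);
        if (old != null) {
            cachedBytes -= old.length;
        }
        delegate.remove(name);
    }

    boolean isCached(String name) {
        return ram.containsKey(name);
    }

    // Moves everything still cached to the delegate, e.g. before a durable commit.
    void flush() {
        delegate.putAll(ram);
        ram.clear();
        cachedBytes = 0;
    }
}
```

A caller that reopens often would see its small, newly written files served from `ram`, while a large merged file bypasses the cache and lands in the delegate directly.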
[jira] [Resolved] (SOLR-2480) Text extraction of password protected files
[ https://issues.apache.org/jira/browse/SOLR-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi resolved SOLR-2480.
--
Resolution: Fixed

trunk: Committed revision 1103120.
3x: Committed revision 1103124.

Text extraction of password protected files
---
Key: SOLR-2480
URL: https://issues.apache.org/jira/browse/SOLR-2480
Project: Solr
Issue Type: Improvement
Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.4.1, 3.1
Reporter: Shinichiro Abe
Assignee: Koji Sekiguchi
Priority: Minor
Fix For: 3.2, 4.0
Attachments: SOLR-2480-idea1.patch, SOLR-2480.patch, SOLR-2480.patch, SOLR-2480.patch, password-is-solrcell.docx
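The behaviour requested in the issue (index the metadata, skip the text, do not fail the request) can be sketched as follows. This is not the committed patch and not the Solr Cell API: `TextExtractor`, `buildDocument` and the `ignoreExtractionErrors` switch are hypothetical names used only to show the control flow.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of lenient extraction: keep metadata when text extraction fails.
class LenientExtraction {
    // Stand-in for whatever performs the actual text extraction.
    interface TextExtractor {
        String extract(byte[] content) throws Exception;
    }

    static Map<String, String> buildDocument(String fileName, byte[] content,
                                             TextExtractor extractor,
                                             boolean ignoreExtractionErrors) {
        Map<String, String> doc = new LinkedHashMap<>();
        // Metadata that is available without reading the protected content.
        doc.put("file_name", fileName);
        doc.put("size", String.valueOf(content.length));
        try {
            doc.put("text", extractor.extract(content));
        } catch (Exception e) {
            if (!ignoreExtractionErrors) {
                // Strict mode: surface the failure, as a 500 response would.
                throw new IllegalStateException("extraction failed for " + fileName, e);
            }
            // Lenient mode: index the metadata only and carry on.
        }
        return doc;
    }
}
```

In lenient mode a crawler such as ManifoldCF would get a normal response for a password-protected file and would have no reason to retry it.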
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033576#comment-13033576 ]

Stefan Matheis (steffkes) commented on SOLR-2399:
-
Used this rainy Saturday to complete the [Replication Page|http://files.mathe.is/solr-admin/10_replication.png]

The possible Actions depend on Master/Slave-Configuration. The Iterations-List contains all Dates, per default only the latest ones [successful failed] are displayed. Yellow Background (in the Index-Section) indicates a difference between the Slave- and the Master-Data.

Short Screencast to see how the progress-bar will look while replicating (WIP-State two days ago): http://screencast.com/t/W1JcBaHO42C

Solr Admin Interface, reworked
--
Key: SOLR-2399
URL: https://issues.apache.org/jira/browse/SOLR-2399
Project: Solr
Issue Type: Improvement
Components: web gui
Reporter: Stefan Matheis (steffkes)
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2399.patch

*The idea was to create a new, fresh (and hopefully clean) Solr Admin Interface.* [Based on this [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]

I've quickly created a Github-Repository (Just for me, to keep track of the changes) » https://github.com/steffkes/solr-admin

Quick Tour: [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png], [Query-Form|http://files.mathe.is/solr-admin/02_query.png], [Plugins|http://files.mathe.is/solr-admin/05_plugins.png], [Logging|http://files.mathe.is/solr-admin/07_logging.png], [Analysis|http://files.mathe.is/solr-admin/04_analysis.png], [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png], [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png], [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png], [Replication|http://files.mathe.is/solr-admin/10_replication.png]

Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
[jira] [Created] (LUCENE-3100) IW.commit() writes but fails to fsync the N.fnx file
IW.commit() writes but fails to fsync the N.fnx file

Key: LUCENE-3100
URL: https://issues.apache.org/jira/browse/LUCENE-3100
Project: Lucene - Java
Issue Type: Bug
Reporter: Michael McCandless
Fix For: 4.0

In making a unit test for NRTCachingDir (LUCENE-3092) I hit this surprising bug!

Because the new N.fnx file is written at the last minute along with the segments file, it's not included in the sis.files() that IW uses to figure out which files to sync.

This bug means one could call IW.commit(), have it return successfully, and then the machine could crash and, when it comes back up, your index could be corrupted.

We should hopefully first fix TestCrash so that it hits this bug (maybe it needs more/better randomization?), then fix the bug.
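The invariant behind this bug is that a commit must fsync every file it wrote, including ones written at the last minute. A minimal sketch of that invariant in plain Java, assuming a writer that records each file name at write time instead of deriving the sync list from a separate file listing; the class is hypothetical and the file names in `demo()` are placeholders, not Lucene's actual naming.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: track every written file so commit() cannot miss one.
class CommitSyncSketch {
    private final Path dir;
    private final Set<String> written = new LinkedHashSet<>();
    private final Set<String> synced = new LinkedHashSet<>();

    CommitSyncSketch(Path dir) {
        this.dir = dir;
    }

    // Every write registers the file, no matter how late it happens.
    void write(String name, String content) {
        try {
            Files.write(dir.resolve(name), content.getBytes(StandardCharsets.UTF_8));
            written.add(name);
            synced.remove(name); // a rewritten file needs a fresh fsync
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // fsync all files written since the last commit.
    void commit() {
        for (String name : written) {
            if (synced.contains(name)) {
                continue;
            }
            try (FileChannel channel = FileChannel.open(dir.resolve(name), StandardOpenOption.WRITE)) {
                channel.force(true); // flush file content and metadata to the device
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            synced.add(name);
        }
    }

    // Files that were written but never fsynced; should be empty after commit().
    Set<String> unsynced() {
        Set<String> result = new LinkedHashSet<>(written);
        result.removeAll(synced);
        return result;
    }

    // Writes three placeholder files, commits, and reports what is left unsynced.
    static Set<String> demo() {
        try {
            Path dir = Files.createTempDirectory("commit-sync");
            CommitSyncSketch writer = new CommitSyncSketch(dir);
            writer.write("_0.cfs", "segment data");
            writer.write("1.fnx", "field numbers");
            writer.write("segments_1", "commit point");
            writer.commit();
            return writer.unsynced();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A crash test in the spirit of TestCrash could then assert that `unsynced()` is empty whenever a commit has returned.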
[jira] [Created] (SOLR-2517) Use varargs for Field(String name, byte[] value)
Use varargs for Field(String name, byte[] value)

Key: SOLR-2517
URL: https://issues.apache.org/jira/browse/SOLR-2517
Project: Solr
Issue Type: Improvement
Affects Versions: 3.1
Reporter: Gabriele Kahlout
Priority: Trivial

So trivial that it might not be worth an issue, but since searcher.getSimilarity().encodeNormValue(..) returns a byte instead of a byte[], varargs would be handy here (or is there extra info that needs to be put in the array?).
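For illustration, here is what the proposed signature change buys, using a stand-in `Field` class rather than the real one: a `byte...` parameter accepts both a single byte and an existing byte[], so callers holding one encoded norm byte do not have to wrap it in an array themselves.

```java
// Hypothetical stand-in for the Field class; only the varargs signature matters here.
class VarargsFieldSketch {
    static final class Field {
        final String name;
        final byte[] value;

        // byte... compiles to byte[], so existing byte[] callers keep working.
        Field(String name, byte... value) {
            this.name = name;
            this.value = value;
        }
    }

    // Passing a single byte: the compiler wraps it in a one-element array.
    static Field fromSingleByte(String name, byte b) {
        return new Field(name, b);
    }

    // Passing an array: it is used as-is, without copying.
    static Field fromArray(String name, byte[] bytes) {
        return new Field(name, bytes);
    }
}
```

Because the array is passed through without a copy in the second case, the change is source-compatible for existing callers.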
[jira] [Updated] (SOLR-2517) Use varargs for Field(String name, byte[] value)
[ https://issues.apache.org/jira/browse/SOLR-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabriele Kahlout updated SOLR-2517:
---
Attachment: SOLR-2517.patch

Use varargs for Field(String name, byte[] value)

Key: SOLR-2517
URL: https://issues.apache.org/jira/browse/SOLR-2517
Project: Solr
Issue Type: Improvement
Affects Versions: 3.1
Reporter: Gabriele Kahlout
Priority: Trivial
Attachments: SOLR-2517.patch
[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3092:
---
Attachment: LUCENE-3092.patch

New patch. I reduced the over-synchronized methods (hopefully not too much!), improved jdocs (added an example usage), added CHANGES entry, and added a test case.

But: the test case currently fails, due to LUCENE-3100.

NRTCachingDirectory, to buffer small segments in a RAMDir
-
Key: LUCENE-3092
URL: https://issues.apache.org/jira/browse/LUCENE-3092
Project: Lucene - Java
Issue Type: Improvement
Components: Store
Reporter: Michael McCandless
Priority: Minor
Fix For: 3.2, 4.0
Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch, LUCENE-3092.patch
Re: svn commit: r1103227 - /lucene/dev/trunk/modules/grouping/build.xml
Thanks Steven!

Mike
http://blog.mikemccandless.com

On Sat, May 14, 2011 at 5:24 PM, sar...@apache.org wrote:
Author: sarowe
Date: Sat May 14 21:24:50 2011
New Revision: 1103227

URL: http://svn.apache.org/viewvc?rev=1103227&view=rev
Log:
LUCENE-1421: Specialize the dist-maven target for the new grouping module, so that generate-maven-artifacts can succeed

Modified:
lucene/dev/trunk/modules/grouping/build.xml

Modified: lucene/dev/trunk/modules/grouping/build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/grouping/build.xml?rev=1103227&r1=1103226&r2=1103227&view=diff
==
--- lucene/dev/trunk/modules/grouping/build.xml (original)
+++ lucene/dev/trunk/modules/grouping/build.xml Sat May 14 21:24:50 2011
@@ -10,4 +10,6 @@
   <import file="../../lucene/contrib/contrib-build.xml"/>
   <property name="working.dir" location="work"/>
+
+  <target name="dist-maven" depends="jar-core,javadocs,contrib-build.dist-maven" />
 </project>