Re: [VOTE] Release PyLucene 3.6.0 rc2
+1 to release. Downloaded on OS X, verified sig/md5, built jcc and pylucene and ran tests. On Mon, May 7, 2012 at 8:20 PM, Andi Vajda va...@apache.org wrote: Please vote to release these artifacts as PyLucene 3.6.0-2. Thanks ! Andi.. ps: the KEYS file for PyLucene release signing is at: http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS http://people.apache.org/~vajda/staging_area/KEYS pps: here is my +1 -- lucidimagination.com
Re: [VOTE] Release PyLucene 3.6.0 rc2
On Fri, 11 May 2012, Robert Muir wrote: +1 to release. Downloaded on OS X, verified sig/md5, built jcc and pylucene and ran tests. This vote has passed. Thank you all who voted ! Andi.. On Mon, May 7, 2012 at 8:20 PM, Andi Vajda va...@apache.org wrote: Please vote to release these artifacts as PyLucene 3.6.0-2. Thanks ! Andi.. ps: the KEYS file for PyLucene release signing is at: http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS http://people.apache.org/~vajda/staging_area/KEYS pps: here is my +1 -- lucidimagination.com
[jira] [Created] (SOLR-3449) QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
Linbin Chen created SOLR-3449:
-
Summary: QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
Key: SOLR-3449
URL: https://issues.apache.org/jira/browse/SOLR-3449
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5, 3.6
Reporter: Linbin Chen
Fix For: 3.6.1
Attachments: SOLR-3449.patch

The index has these segments:
{code}
Segment name=_9, offset=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offset=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offset=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offset=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offset=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offset=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offset=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offset=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offset=[docBase=253, maxDoc=0] idx=8
{code}
A segment with maxDoc=0 may be created by mergeIndexes. (You can make sure a maxDoc=0 segment is not produced by a merge you control, but not when the merging of indexes is outside your control.) When fsv=true is used to get sort values and docId=249 is hit, an ArrayIndexOutOfBoundsException is thrown:
{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}
Reason:
{code}
// idx:             0    1    2    3    4    5    6    7    8
// int[] maxDocs = {245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets =    {0,  245, 248, 248, 249, 249, 250, 250, 253};
{code}
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5. The correct idx is 5.
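To make the failure above concrete, here is a minimal standalone sketch (hypothetical demo class; not the actual org.apache.solr.search.SolrIndexReader source) of a binary search over the segment docBase offsets. Because the maxDoc=0 segments duplicate the docBase of the segment that follows them, an exact match can land on the empty segment:

{code}
public class ReaderIndexDemo {
  // Plain binary search over segment docBase offsets, in the spirit of
  // SolrIndexReader.readerIndex; on an exact match it returns whatever
  // index the search happens to probe, which may be a maxDoc=0 segment.
  static int readerIndex(int doc, int[] offsets) {
    int low = 0;
    int high = offsets.length - 1;
    while (low <= high) {
      int mid = (low + high) >>> 1;
      int midValue = offsets[mid];
      if (doc < midValue) {
        high = mid - 1;
      } else if (doc > midValue) {
        low = mid + 1;
      } else {
        return mid; // exact match: may be the empty segment _d (idx=4)
      }
    }
    return high; // doc falls inside the segment starting at offsets[high]
  }

  public static void main(String[] args) {
    int[] offsets = {0, 245, 248, 248, 249, 249, 250, 250, 253};
    // Prints 4, but segment idx=4 (_d) has maxDoc=0; doc 249 really
    // belongs to idx=5 (_e) -- exactly the mismatch reported above.
    System.out.println(readerIndex(249, offsets));
  }
}
{code}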
[jira] [Updated] (SOLR-3449) QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
[ https://issues.apache.org/jira/browse/SOLR-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Linbin Chen updated SOLR-3449:
--
Attachment: SOLR-3449.patch

QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
---
Key: SOLR-3449
URL: https://issues.apache.org/jira/browse/SOLR-3449
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5, 3.6
Reporter: Linbin Chen
Fix For: 3.6.1
Attachments: SOLR-3449.patch

The index has these segments:
{code}
Segment name=_9, offset=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offset=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offset=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offset=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offset=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offset=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offset=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offset=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offset=[docBase=253, maxDoc=0] idx=8
{code}
A segment with maxDoc=0 may be created by mergeIndexes. (You can make sure a maxDoc=0 segment is not produced by a merge you control, but not when the merging of indexes is outside your control.) When fsv=true is used to get sort values and docId=249 is hit, an ArrayIndexOutOfBoundsException is thrown:
{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}
Reason:
{code}
// idx:             0    1    2    3    4    5    6    7    8
// int[] maxDocs = {245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets =    {0,  245, 248, 248, 249, 249, 250, 250, 253};
{code}
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5. The correct idx is 5.
[jira] [Updated] (SOLR-3449) QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
[ https://issues.apache.org/jira/browse/SOLR-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Linbin Chen updated SOLR-3449:
--
Description:

The index has these segments:
{code}
Segment name=_9, offset=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offset=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offset=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offset=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offset=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offset=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offset=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offset=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offset=[docBase=253, maxDoc=0] idx=8
{code}
A segment with maxDoc=0 may be created by mergeIndexes. (You can make sure a maxDoc=0 segment is not produced by a merge you control, but not when the merging of indexes is outside your control.) When fsv=true is used to get sort values and docId=249 is hit, an ArrayIndexOutOfBoundsException is thrown:
{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}
Reason:
{code}
// idx:             0    1    2    3    4    5    6    7    8
// int[] maxDocs = {245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets =    {0,  245, 248, 248, 249, 249, 250, 250, 253};
{code}
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5. The correct idx is 5.

Patch:
{code}
Index: solr/core/src/java/org/apache/solr/search/SolrIndexReader.java
===
--- solr/core/src/java/org/apache/solr/search/SolrIndexReader.java (revision 1337028)
+++ solr/core/src/java/org/apache/solr/search/SolrIndexReader.java (working copy)
@@ -138,6 +138,16 @@
     } else {
       // exact match on the offset.
+      // skip equal offsets
+      for (int i = mid + 1; i <= high; i++) {
+        if (doc == offsets[i]) {
+          // skip offsets[i] == doc
+          mid = i;
+        } else {
+          // stop skipping offsets
+          break;
+        }
+      }
       return mid;
     }
   }
{code}

was:

The index has these segments:
{code}
Segment name=_9, offset=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offset=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offset=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offset=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offset=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offset=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offset=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offset=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offset=[docBase=253, maxDoc=0] idx=8
{code}
A segment with maxDoc=0 may be created by mergeIndexes. (You can make sure a maxDoc=0 segment is not produced by a merge you control, but not when the merging of indexes is outside your control.) When fsv=true is used to get sort values and docId=249 is hit, an ArrayIndexOutOfBoundsException is thrown:
{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}
Reason:
{code}
// idx:             0    1    2    3    4    5    6    7    8
// int[] maxDocs = {245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets =    {0,  245, 248, 248, 249, 249, 250, 250, 253};
{code}
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5. The correct idx is 5.

QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
---
Key: SOLR-3449
URL: https://issues.apache.org/jira/browse/SOLR-3449
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5, 3.6
Reporter: Linbin Chen
Fix For: 3.6.1
Attachments: SOLR-3449.patch
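The patch's skip-equal-offsets idea, applied to the standalone sketch from above (a hedged illustration, not the committed Solr code):

{code}
// Same binary search as before, but on an exact match we skip forward
// past every following entry that shares this docBase (those are the
// maxDoc=0 segments), so we land on the segment that owns the doc.
static int readerIndexFixed(int doc, int[] offsets) {
  int low = 0;
  int high = offsets.length - 1;
  while (low <= high) {
    int mid = (low + high) >>> 1;
    int midValue = offsets[mid];
    if (doc < midValue) {
      high = mid - 1;
    } else if (doc > midValue) {
      low = mid + 1;
    } else {
      for (int i = mid + 1; i <= high && offsets[i] == doc; i++) {
        mid = i; // skip offsets[i] == doc
      }
      return mid;
    }
  }
  return high;
}

// readerIndexFixed(249, new int[]{0, 245, 248, 248, 249, 249, 250, 250, 253})
// now returns 5 (segment _e), matching the expected idx in the report.
{code}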
[jira] [Commented] (SOLR-3377) eDismax: A fielded query wrapped by parens is not recognized
[ https://issues.apache.org/jira/browse/SOLR-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273099#comment-13273099 ]

Bernd Fehling commented on SOLR-3377:
-

Shoot me an enhanced unit test which covers your requests and I will look into this. But while looking through all the test cases, I think we really need a clear definition of the rules: define a BNF syntax description for edismax and then implement that BNF. This has two advantages: the user knows how to construct a valid query, and we can clean up the patchwork inside edismax. This could also include a fallback mechanism to always return a valid query. How about that?

eDismax: A fielded query wrapped by parens is not recognized
Key: SOLR-3377
URL: https://issues.apache.org/jira/browse/SOLR-3377
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 3.6
Reporter: Jan Høydahl
Assignee: Bernd Fehling
Fix For: 4.0, 3.6.1
Attachments: SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch

As reported by bernd on the user list, a query like this {{q=(name:test)}} will yield 0 hits in 3.6 while it worked in 3.5. It works without the parens.
[jira] [Commented] (SOLR-3447) solrj cannot handle org.apache.solr.common.SolrException when the schema is not correct
[ https://issues.apache.org/jira/browse/SOLR-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273108#comment-13273108 ]

ESSOUSSI Jamel commented on SOLR-3447:
--

Hi, when the schema is not valid and I try to index a Solr document, I get this response from Solr:

{code}
HTTP/1.1 400 Mauvaise Requête (Bad Request)
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Content-Length: 1125
Date: Thu, 10 May 2012 16:00:00 GMT
Connection: close
{code}

The body is a Tomcat HTML error-report page whose relevant content (translated from French) is:

{code}
HTTP Status 400 - ERROR: [doc=280304571883] unknown field 'shop_'
message: ERROR: [doc=280304571883] unknown field 'shop_'
description: The request sent by the client was syntactically incorrect
(ERROR: [doc=280304571883] unknown field 'shop_').
Apache Tomcat/6.0.28
{code}

And when the schema is valid, I get this response:

{code}
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: application/xml;charset=UTF-8
Transfer-Encoding: chunked
Date: Thu, 10 May 2012 15:53:09 GMT

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">3</int></lst>
</response>
{code}

-- Is it normal that the status equals 0 when the schema is good?
-- And if the schema is not good, why is SolrJ not able to handle the SolrException?

Best Regards
-- Jamel ESSOUSSI

solrj cannot handle org.apache.solr.common.SolrException when the schema is not correct
---
Key: SOLR-3447
URL: https://issues.apache.org/jira/browse/SOLR-3447
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 3.6
Environment: jdk 1.6.0_26, tomcat 6.0.35, solrj 3.6.0, solr-core 3.6.0 and solr.war (4.0)
Reporter: ESSOUSSI Jamel
Fix For: 3.6

Hi; I have an incorrect schema (a missing field), and when I add documents (UpdateResponse ur = solrServer.add(docs);), I am not able to catch the exception in SolrJ, and the UpdateResponse cannot handle the result.

Best Regards
-- Jamel ESSOUSSI
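For reference, here is a hedged sketch of the client-side handling one would expect with the SolrJ 3.6 API (the server URL, field names, and class name are placeholders). A status of 0 in the responseHeader simply means the update succeeded; a 400 from the container should surface as an exception rather than in the UpdateResponse:

{code}
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

public class IndexWithErrorHandling {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer solrServer =
        new CommonsHttpSolrServer("http://localhost:8080/solr");
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "280304571883");
    doc.addField("shop_", "a field the schema may not define");
    try {
      UpdateResponse ur = solrServer.add(doc);
      System.out.println("status=" + ur.getStatus()); // 0 means success
    } catch (SolrException e) {
      // HTTP-level rejection, e.g. the 400 "unknown field 'shop_'" above
      System.err.println("Solr rejected the update: " + e.getMessage());
    } catch (SolrServerException e) {
      // transport or response-parsing failure; Tomcat's HTML error page
      // cannot be parsed as a Solr response, so it may end up here
      System.err.println("Request failed: " + e.getMessage());
    }
  }
}
{code}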
[jira] [Closed] (SOLR-3448) Date math in range queries does not handle plus sign
[ https://issues.apache.org/jira/browse/SOLR-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lance Norskog closed SOLR-3448.
---
Resolution: Invalid

Date math in range queries does not handle plus sign
Key: SOLR-3448
URL: https://issues.apache.org/jira/browse/SOLR-3448
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Lance Norskog

This query:
{code}
facet.query=timestamp:[NOW-1YEAR/DAY%20TO%20NOW/DAY+1DAY]
{code}
gives this error:
{code}
Cannot parse '[NOW-1YEAR/DAY TO NOW/DAY 1DAY]': Encountered <RANGE_GOOP> "1DAY" at line 1, column 26. Was expecting one of: "]" ... "}" ...
{code}
Should the fix be to add a backslash in front of +1DAY? That does not work.
[jira] [Commented] (SOLR-3448) Date math in range queries does not handle plus sign
[ https://issues.apache.org/jira/browse/SOLR-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273116#comment-13273116 ]

Lance Norskog commented on SOLR-3448:
-

Oy. It was even worse: the field was not a date field. My Solr was magic, so it made the field anyway instead of complaining to me.

Date math in range queries does not handle plus sign
Key: SOLR-3448
URL: https://issues.apache.org/jira/browse/SOLR-3448
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Lance Norskog

This query:
{code}
facet.query=timestamp:[NOW-1YEAR/DAY%20TO%20NOW/DAY+1DAY]
{code}
gives this error:
{code}
Cannot parse '[NOW-1YEAR/DAY TO NOW/DAY 1DAY]': Encountered <RANGE_GOOP> "1DAY" at line 1, column 26. Was expecting one of: "]" ... "}" ...
{code}
Should the fix be to add a backslash in front of +1DAY? That does not work.
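Note also what the error text reveals: the + reaches the parser as a space ('NOW/DAY 1DAY'), because a literal + in a URL query string is decoded as a space. Date math that adds an interval therefore has to be percent-encoded as %2B. A small sketch (the parameter value is taken from the report; the class name is hypothetical):

{code}
import java.net.URLEncoder;

public class DateMathParam {
  public static void main(String[] args) throws Exception {
    String fq = "timestamp:[NOW-1YEAR/DAY TO NOW/DAY+1DAY]";
    // URLEncoder turns '+' into %2B (and spaces into '+'), so the date
    // math survives the servlet container's URL decoding intact.
    System.out.println("facet.query=" + URLEncoder.encode(fq, "UTF-8"));
    // prints: facet.query=timestamp%3A%5BNOW-1YEAR%2FDAY+TO+NOW%2FDAY%2B1DAY%5D
  }
}
{code}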
[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8
[ https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273124#comment-13273124 ]

Massimo Carro commented on SOLR-3159:
-

I use Solr 4 with Jetty 8: highlighting configured inside the qt definition doesn't work.

Upgrade to Jetty 8
--
Key: SOLR-3159
URL: https://issues.apache.org/jira/browse/SOLR-3159
Project: Solr
Issue Type: Task
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3159-maven.patch

Solr is currently tested (and bundled) with a patched Jetty 6 version. Ideally we can release and test with a standard version. Jetty 6 (at Codehaus) is in maintenance only now; new development and improvements are hosted at Eclipse. Assuming performance is equivalent, I think we should switch to Jetty 8.
[jira] [Created] (SOLR-3450) CoreAdminHandler.handleStatusAction
Trym Møller created SOLR-3450:
-
Summary: CoreAdminHandler.handleStatusAction
Key: SOLR-3450
URL: https://issues.apache.org/jira/browse/SOLR-3450
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0
Environment: Linux version 2.6.32-29-server (buildd@allspice) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #58-Ubuntu SMP Fri Feb 11 21:06:51 UTC 2011 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Reporter: Trym Møller
Priority: Minor

{code}
May 8, 2012 12:49:49 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error handling 'status' action
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:551)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:161)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:360)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:173)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.IllegalArgumentException: /usr/lib/solr-4.0/example/dataDir/index.20120419210203/_kvon_0.frq does not exist
at org.apache.commons.io.FileUtils.sizeOf(FileUtils.java:2053)
at org.apache.commons.io.FileUtils.sizeOfDirectory(FileUtils.java:2089)
at org.apache.solr.handler.admin.CoreAdminHandler.getIndexSize(CoreAdminHandler.java:837)
at org.apache.solr.handler.admin.CoreAdminHandler.getCoreStatus(CoreAdminHandler.java:822)
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:542)
... 21 more
{code}
[jira] [Created] (LUCENE-4049) PrefixQuery (or its superclass MultiTermQuery) ignores index-time boosts
Michael Wyraz created LUCENE-4049:
-
Summary: PrefixQuery (or its superclass MultiTermQuery) ignores index-time boosts
Key: LUCENE-4049
URL: https://issues.apache.org/jira/browse/LUCENE-4049
Project: Lucene - Java
Issue Type: Bug
Components: core/search
Affects Versions: 3.6, 3.5
Environment: Java
Reporter: Michael Wyraz

It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery, but not with PrefixQuery, which ignores the individual values. Test code below:

{code}
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class LuceneTest {
  public static void main(String[] args) throws Exception {
    Directory index = new RAMDirectory();
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
    IndexWriter w = new IndexWriter(index, config);
    addDoc(w, "Hello 1", 1);
    addDoc(w, "Hello 2", 2);
    addDoc(w, "Hello 3", 1);
    w.close();

    IndexReader reader = IndexReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    //Query q = new TermQuery(new Term("f1", "hello"));
    Query q = new PrefixQuery(new Term("f1", "hello"));
    TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
    searcher.search(q, collector);
    for (ScoreDoc hit : collector.topDocs().scoreDocs) {
      Document d = searcher.doc(hit.doc);
      System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
    }
  }

  private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
    Document doc = new Document();
    doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
    doc.setBoost(boost);
    w.addDocument(doc);
  }
}
{code}
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13953 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13953/

1 tests failed.
FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message: ERROR: SolrIndexSearcher opens=79 closes=77

Stack Trace:
java.lang.AssertionError: ERROR: SolrIndexSearcher opens=79 closes=77
at __randomizedtesting.SeedInfo.seed([FF8963706B7FE441]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:212)
at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:101)
at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:742)
at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Build Log (for compile errors): [...truncated 11324 lines...]
Re: svn commit: r1337005 - in /lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene: index/AlcoholicMergePolicy.java util/LuceneTestCase.java
+ public static enum Drink {
+
+   Beer(15), Wine(17), Champagne(21), WhiteRussian(22),
+   SingleMalt(30);
+
+   public long drunk() {
+     return drunkFactor;
+   }

I think this isn't an independent value. This isn't even a Markov chain, as it doesn't depend on the last state of the observed object and the drink to follow -- the full history of drinks consumed so far would have to be considered; their order and quantities matter (i.e., beer after champagne, singlemalt after beer, etc.). Overflows (or so-called burst points) would certainly have to be empirically established, as there is no theoretical model for them known in the literature...

Dawid
Re: G1 Garbage Collector enabled for Java 7 builds
+1. I expect lots of bugs to appear soon... I tried G1 at some point in the past and just couldn't get it to work reliably (it was a longer while ago, though).

Dawid

On Thu, May 10, 2012 at 10:23 PM, Uwe Schindler u...@thetaphi.de wrote:

Hi, I enabled the G1 Garbage Collector for the Java 7 builds (TEST_JVM_FLAGS). If something goes wrong, we have another Java 7 bug... :-) It is not yet enabled by default in Java 7 (and not in u4 either -- Jenkins runs u3), but it might be in the future, so we should test this. Maybe we should have a random garbage collector selector for our builds; I am thinking about that. Currently it's passed as an env var by the Jenkins config.

Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
[jira] [Issue Comment Edited] (SOLR-3450) CoreAdminHandler.handleStatusAction
[ https://issues.apache.org/jira/browse/SOLR-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273166#comment-13273166 ]

Per Steffensen edited comment on SOLR-3450 at 5/11/12 10:53 AM:

My guess is that FileUtils.sizeOfDirectory starts by listing all files in the directory and afterwards works through that list, getting the size of each file and adding it to a total sum. If a file disappears between the time the directory is listed and the time the algorithm tries to find its size, you will end up like this. A file might disappear during an index merge. Only guessing. Might want to be a little more robust here.

Regards, Per Steffensen

was (Author: steff1193): My guess is that FileUtils.sizeOfDirectory starts by listing all files in the directory and afterwards works through that list, getting the size of each file and adding it to a total sum. If a file disappears between the time the directory is listed and the time the algorithm tries to find its size, you will end up like this. A file might disappear during an index merge. Only guessing. Regards, Per Steffensen

CoreAdminHandler.handleStatusAction
---
Key: SOLR-3450
URL: https://issues.apache.org/jira/browse/SOLR-3450
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0
Environment: Linux version 2.6.32-29-server (buildd@allspice) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #58-Ubuntu SMP Fri Feb 11 21:06:51 UTC 2011 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Reporter: Trym Møller
Priority: Minor

{code}
May 8, 2012 12:49:49 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error handling 'status' action
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:551)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:161)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:360)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:173)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.IllegalArgumentException: /usr/lib/solr-4.0/example/dataDir/index.20120419210203/_kvon_0.frq does not exist
at org.apache.commons.io.FileUtils.sizeOf(FileUtils.java:2053)
at org.apache.commons.io.FileUtils.sizeOfDirectory(FileUtils.java:2089)
at org.apache.solr.handler.admin.CoreAdminHandler.getIndexSize(CoreAdminHandler.java:837)
at org.apache.solr.handler.admin.CoreAdminHandler.getCoreStatus(CoreAdminHandler.java:822)
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:542)
... 21 more
{code}
[jira] [Commented] (SOLR-3450) CoreAdminHandler.handleStatusAction
[ https://issues.apache.org/jira/browse/SOLR-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273166#comment-13273166 ]

Per Steffensen commented on SOLR-3450:
--

My guess is that FileUtils.sizeOfDirectory starts by listing all files in the directory and afterwards works through that list, getting the size of each file and adding it to a total sum. If a file disappears between the time the directory is listed and the time the algorithm tries to find its size, you will end up like this. A file might disappear during an index merge. Only guessing.

Regards, Per Steffensen

CoreAdminHandler.handleStatusAction
---
Key: SOLR-3450
URL: https://issues.apache.org/jira/browse/SOLR-3450
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0
Environment: Linux version 2.6.32-29-server (buildd@allspice) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #58-Ubuntu SMP Fri Feb 11 21:06:51 UTC 2011 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Reporter: Trym Møller
Priority: Minor

{code}
May 8, 2012 12:49:49 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error handling 'status' action
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:551)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:161)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:360)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:173)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.IllegalArgumentException: /usr/lib/solr-4.0/example/dataDir/index.20120419210203/_kvon_0.frq does not exist
at org.apache.commons.io.FileUtils.sizeOf(FileUtils.java:2053)
at org.apache.commons.io.FileUtils.sizeOfDirectory(FileUtils.java:2089)
at org.apache.solr.handler.admin.CoreAdminHandler.getIndexSize(CoreAdminHandler.java:837)
at org.apache.solr.handler.admin.CoreAdminHandler.getCoreStatus(CoreAdminHandler.java:822)
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:542)
... 21 more
{code}
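Per's theory suggests an obvious hardening. Here is a hedged sketch (hypothetical helper; not the CoreAdminHandler or commons-io code) of a directory-size computation that tolerates files vanishing mid-scan, e.g. during a merge:

{code}
import java.io.File;

public class SafeDirSize {
  // Sum file sizes, tolerating entries that disappear between the
  // directory listing and the size lookup: File.length() returns 0
  // for a file that no longer exists, so no exception is thrown.
  static long sizeOfDirectorySafe(File dir) {
    long total = 0;
    File[] files = dir.listFiles();
    if (files == null) {
      return 0; // directory itself vanished, or is not a directory
    }
    for (File f : files) {
      if (f.isDirectory()) {
        total += sizeOfDirectorySafe(f);
      } else {
        total += f.length();
      }
    }
    return total;
  }
}
{code}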
[jira] [Resolved] (LUCENE-4049) PrefixQuery (or its superclass MultiTermQuery) ignores index-time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-4049.
-
Resolution: Not A Problem

setRewriteMethod(SCORING_BOOLEAN_QUERY_REWRITE)

PrefixQuery (or its superclass MultiTermQuery) ignores index-time boosts
-
Key: LUCENE-4049
URL: https://issues.apache.org/jira/browse/LUCENE-4049
Project: Lucene - Java
Issue Type: Bug
Components: core/search
Affects Versions: 3.5, 3.6
Environment: Java
Reporter: Michael Wyraz

It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery, but not with PrefixQuery, which ignores the individual values. Test code below:

{code}
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class LuceneTest {
  public static void main(String[] args) throws Exception {
    Directory index = new RAMDirectory();
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
    IndexWriter w = new IndexWriter(index, config);
    addDoc(w, "Hello 1", 1);
    addDoc(w, "Hello 2", 2);
    addDoc(w, "Hello 3", 1);
    w.close();

    IndexReader reader = IndexReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    //Query q = new TermQuery(new Term("f1", "hello"));
    Query q = new PrefixQuery(new Term("f1", "hello"));
    TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
    searcher.search(q, collector);
    for (ScoreDoc hit : collector.topDocs().scoreDocs) {
      Document d = searcher.doc(hit.doc);
      System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
    }
  }

  private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
    Document doc = new Document();
    doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
    doc.setBoost(boost);
    w.addDocument(doc);
  }
}
{code}
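The resolution in context, as a minimal sketch against the Lucene 3.x API: by default MultiTermQuery rewrites to a constant-score query, which deliberately ignores index-time boosts; switching the rewrite method makes the expanded terms score normally. Note that the scoring rewrite expands into a BooleanQuery, so a prefix matching very many terms can hit BooleanQuery.maxClauseCount.

{code}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;

public class BoostAwarePrefix {
  // Build a PrefixQuery whose matches are scored like an expanded
  // BooleanQuery of TermQuerys, so index-time boosts take effect.
  static Query boostAwarePrefixQuery(String field, String prefix) {
    PrefixQuery q = new PrefixQuery(new Term(field, prefix));
    q.setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
    return q;
  }
}
{code}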
[jira] [Updated] (LUCENE-3842) Analyzing Suggester
[ https://issues.apache.org/jira/browse/LUCENE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3842:
Attachment: LUCENE-3842.patch

Updated patch: I fixed the bug in tokenStreamToAutomaton (just use lastEndPos instead).

Analyzing Suggester
---
Key: LUCENE-3842
URL: https://issues.apache.org/jira/browse/LUCENE-3842
Project: Lucene - Java
Issue Type: New Feature
Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Attachments: LUCENE-3842-TokenStream_to_Automaton.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch

Since we added shortest-path wFSA search in LUCENE-3714, and generified the comparator in LUCENE-3801, I think we should look at implementing suggesters that have more capabilities than just basic prefix matching. In particular I think the most flexible approach is to integrate with Analyzer at both build and query time, such that we build a wFST with:

input: analyzed text such as ghost0christmas0past -- byte 0 here is an optional token separator
output: surface form such as the ghost of christmas past
weight: the weight of the suggestion

We make an FST with PairOutputs<weight, output>, but only do the shortest-path operation on the weight side (like the test in LUCENE-3801), at the same time accumulating the output (surface form), which will be the actual suggestion. This allows a lot of flexibility:
* Using even StandardAnalyzer means you can offer suggestions that ignore stopwords, e.g. if you type in "ghost of chr...", it will suggest "the ghost of christmas past"
* we can add support for synonyms/wdf/etc at both index and query time (there are tradeoffs here, and this is not implemented!)
* this is a basis for more complicated suggesters such as Japanese suggesters, where the analyzed form is in fact the reading, so we would add a TokenFilter that copies ReadingAttribute into term text to support that...
* other general things like offering suggestions that are more fuzzy, like using a plural stemmer or ignoring accents or whatever.

According to my benchmarks, suggestions are still very fast with the prototype (e.g. ~100,000 QPS), and the FST size does not explode (it's short of twice that of a regular wFST, but this is still far smaller than TST or JaSpell, etc.).
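To illustrate the "analyzed text with byte-0 separators" input described above, here is a hedged sketch of building such an analyzed key from a surface form with the Lucene TokenStream API (the helper name is hypothetical, and the real patch builds an automaton/FST rather than a String):

{code}
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class AnalyzedKey {
  // Joins the analyzed tokens with a 0 byte, so that under a
  // stopword-removing analyzer "the ghost of christmas past" becomes
  // "ghost\u0000christmas\u0000past" -- the FST input side; the surface
  // form and weight would go into the FST outputs (PairOutputs).
  static String analyzedKey(Analyzer analyzer, String surfaceForm) throws IOException {
    TokenStream ts = analyzer.tokenStream("", new StringReader(surfaceForm));
    CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
    StringBuilder sb = new StringBuilder();
    ts.reset();
    while (ts.incrementToken()) {
      if (sb.length() > 0) {
        sb.append('\u0000'); // the optional token separator
      }
      sb.append(termAtt);
    }
    ts.end();
    ts.close();
    return sb.toString();
  }
}
{code}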
Re: svn commit: r1337005 - in /lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene: index/AlcoholicMergePolicy.java util/LuceneTestCase.java
There is room for improvement :) Yeah, we should introduce an amount and order of drinks. Also the drinking speed is important, and whether non-alcoholic beverages are consumed during the night.

Martijn

On 11 May 2012 12:33, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:

+ public static enum Drink {
+
+   Beer(15), Wine(17), Champagne(21), WhiteRussian(22),
+   SingleMalt(30);
+
+   public long drunk() {
+     return drunkFactor;
+   }

I think this isn't an independent value. This isn't even a Markov chain, as it doesn't depend on the last state of the observed object and the drink to follow -- the full history of drinks consumed so far would have to be considered; their order and quantities matter (i.e., beer after champagne, singlemalt after beer, etc.). Overflows (or so-called burst points) would certainly have to be empirically established, as there is no theoretical model for them known in the literature...

Dawid

--
With kind regards, Martijn van Groningen
[jira] [Updated] (LUCENE-4026) TestIndexWriterReader hang
[ https://issues.apache.org/jira/browse/LUCENE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-4026:
Attachment: LUCENE-4026.patch

Here is a new patch with a testcase that exercises this code more. I got this test to fail 1 in 20K runs with the bug, and it didn't fail with the fix. Still not proof, but better than no test, that's for sure.

TestIndexWriterReader hang
--
Key: LUCENE-4026
URL: https://issues.apache.org/jira/browse/LUCENE-4026
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Simon Willnauer
Attachments: LUCENE-4026.patch, LUCENE-4026.patch

Hung in Jenkins. Seed is D344294F98D3F637 (and the usual nightly flags, -Dtests.nightly=true, -Dtests.multiplier=3, -Dtests.linedocsfile=huge). Didn't try to reproduce yet.
[jira] [Created] (SOLR-3451) Solrj QueryResponse as JSON response
Pavan Kumar created SOLR-3451:
-
Summary: Solrj QueryResponse as JSON response
Key: SOLR-3451
URL: https://issues.apache.org/jira/browse/SOLR-3451
Project: Solr
Issue Type: Bug
Components: clients - java
Reporter: Pavan Kumar

Hi, I have a requirement to get the SolrJ QueryResponse as a JSON response. As per the SolrJ API, it supports BinaryResponseParser and XMLResponseParser. Is there any way to get the SolrJ QueryResponse as a JSON response?
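SolrJ 3.6 indeed ships only binary and XML response parsers, so there is no QueryResponse-as-JSON. A common workaround (a hedged sketch; the URL and query are placeholders) is to bypass SolrJ and request wt=json over plain HTTP, handling the JSON text yourself:

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;

public class JsonQueryDemo {
  public static void main(String[] args) throws Exception {
    String q = URLEncoder.encode("*:*", "UTF-8");
    // wt=json asks Solr's JSONResponseWriter for the raw JSON text
    URL url = new URL("http://localhost:8983/solr/select?q=" + q + "&wt=json");
    BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream(), "UTF-8"));
    StringBuilder json = new StringBuilder();
    for (String line; (line = in.readLine()) != null; ) {
      json.append(line).append('\n');
    }
    in.close();
    System.out.println(json); // the Solr response as JSON
  }
}
{code}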
[jira] [Commented] (LUCENE-4040) Improve QueryParser and supported syntax documentation
[ https://issues.apache.org/jira/browse/LUCENE-4040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273336#comment-13273336 ]

Robert Muir commented on LUCENE-4040:
-

Thanks Mike, I committed this!

Improve QueryParser and supported syntax documentation
--
Key: LUCENE-4040
URL: https://issues.apache.org/jira/browse/LUCENE-4040
Project: Lucene - Java
Issue Type: Improvement
Components: modules/queryparser
Reporter: Chris Male
Priority: Minor
Attachments: LUCENE-4040.patch, LUCENE-4040.patch

In LUCENE-4024 there were some changes to the fuzzy query syntax. Only the Classic QueryParser really documents its syntax, which makes it hard to know whether the changes affected other QPs. Compounding this issue, there are many classes which have no javadocs at all, and I found myself quite confused when I consolidated all the QPs into their module. We should make a concerted effort to improve the documentation so that it is clear what syntax is supported by which QPs, and so that at least the user-facing classes have javadocs. As part of this, I wonder whether we should give the syntax supported by the Classic QueryParser a new name (rather than just "Lucene's query syntax"), since other QPs can and do support other syntax, and then somehow add some typed control over this, so QPs have to declare programmatically that they support the syntax and we can verify that by randomly plugging in implementations into tests.
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13960 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13960/

1 tests failed.
FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message: ERROR: SolrIndexSearcher opens=79 closes=77

Stack Trace:
java.lang.AssertionError: ERROR: SolrIndexSearcher opens=79 closes=77
at __randomizedtesting.SeedInfo.seed([1DE6EE9D997C002D]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:212)
at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:101)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:742)
at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Build Log (for compile errors): [...truncated 11192 lines...]
[jira] [Updated] (LUCENE-3842) Analyzing Suggester
[ https://issues.apache.org/jira/browse/LUCENE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3842:
---
Attachment: LUCENE-3842.patch

Patch, fixing TS2A to insert holes ... this is causing the AnalyzingCompletionTest.testStandard to fail... we have to fix its query-time to insert holes too...

Analyzing Suggester
---
Key: LUCENE-3842
URL: https://issues.apache.org/jira/browse/LUCENE-3842
Project: Lucene - Java
Issue Type: New Feature
Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Attachments: LUCENE-3842-TokenStream_to_Automaton.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch

Since we added shortest-path wFSA search in LUCENE-3714, and generified the comparator in LUCENE-3801, I think we should look at implementing suggesters that have more capabilities than just basic prefix matching. In particular I think the most flexible approach is to integrate with Analyzer at both build and query time, such that we build a wFST with:

input: analyzed text such as ghost0christmas0past -- byte 0 here is an optional token separator
output: surface form such as the ghost of christmas past
weight: the weight of the suggestion

We make an FST with PairOutputs<weight, output>, but only do the shortest-path operation on the weight side (like the test in LUCENE-3801), at the same time accumulating the output (surface form), which will be the actual suggestion. This allows a lot of flexibility:
* Using even StandardAnalyzer means you can offer suggestions that ignore stopwords, e.g. if you type in "ghost of chr...", it will suggest "the ghost of christmas past"
* we can add support for synonyms/wdf/etc at both index and query time (there are tradeoffs here, and this is not implemented!)
* this is a basis for more complicated suggesters such as Japanese suggesters, where the analyzed form is in fact the reading, so we would add a TokenFilter that copies ReadingAttribute into term text to support that...
* other general things like offering suggestions that are more fuzzy, like using a plural stemmer or ignoring accents or whatever.

According to my benchmarks, suggestions are still very fast with the prototype (e.g. ~100,000 QPS), and the FST size does not explode (it's short of twice that of a regular wFST, but this is still far smaller than TST or JaSpell, etc.).
[jira] [Commented] (SOLR-1535) Pre-analyzed field type
[ https://issues.apache.org/jira/browse/SOLR-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273408#comment-13273408 ] Neil Hooey commented on SOLR-1535: -- When I asked Hoss at Lucene Revolution yesterday, he said you could manually set _term frequency_ in a pre-analyzed field, but I couldn't find any reference to it in the JSON parser. Is there a way to specify term frequency for each term in the field? Pre-analyzed field type --- Key: SOLR-1535 URL: https://issues.apache.org/jira/browse/SOLR-1535 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0 Attachments: SOLR-1535.patch, SOLR-1535.patch, SOLR-1535.patch, preanalyzed.patch, preanalyzed.patch PreAnalyzedFieldType provides a functionality to index (and optionally store) content that was already processed and split into tokens using some external processing chain. This implementation defines a serialization format for sending tokens with any currently supported Attributes (eg. type, posIncr, payload, ...). This data is de-serialized into a regular TokenStream that is returned in Field.tokenStreamValue() and thus added to the index as index terms, and optionally a stored part that is returned in Field.stringValue() and is then added as a stored value of the field. This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems.
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273417#comment-13273417 ]

Oleg Shevelyov commented on SOLR-2155:
--

Hi David, does the new 1.0.5 version include polygon search? If not, could you please clarify where to apply the GeoHashPrefixFilter patch? It doesn't match the Solr 3.1 sources, and obviously not higher versions either. I saw you mention that you successfully implemented polygon search, but I still don't get how to make it work. Thanks

Geospatial search using geohash prefixes
Key: SOLR-2155
URL: https://issues.apache.org/jira/browse/SOLR-2155
Project: Solr
Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, Solr2155-for-1.0.2-3.x-port.patch

{panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x located here: https://github.com/dsmiley/SOLR-2155. Look at the introductory readme and download the plugin .jar file. Lucene 4's new spatial module is largely based on this code. The Solr 4 glue for it should come very soon, but as of this writing it's hosted temporarily at https://github.com/spatial4j. For more information on using SOLR-2155 with Solr 3, see http://wiki.apache.org/solr/SpatialSearch#SOLR-2155 This JIRA issue is closed because it won't be committed in its current form. {panel}

There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when location extraction (i.e. via a gazetteer) is performed on free text. None, one, or many geospatial locations might be extracted from any given document, and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the geohash-based work in Lucene/Solr with a geohash-prefix-based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4, depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th.
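To make the subdivision property concrete, here is a hedged sketch of the standard geohash encoding algorithm (illustrative only; not the GeoHashUtils code from the patch). Five bits of alternating longitude/latitude bisections map to one base-32 character, which is why each added character narrows the box and why a point's hash always begins with the hash of any enclosing box -- the prefix property the filter exploits:

{code}
public class GeoHashDemo {
  private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

  // Standard geohash encoding: alternately bisect longitude and latitude,
  // emitting one base-32 character per 5 bits.
  static String encode(double lat, double lon, int precision) {
    double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
    StringBuilder hash = new StringBuilder();
    boolean evenBit = true; // a geohash starts with a longitude bit
    int bit = 0, ch = 0;
    while (hash.length() < precision) {
      if (evenBit) {
        double mid = (lonMin + lonMax) / 2;
        if (lon >= mid) { ch = (ch << 1) | 1; lonMin = mid; }
        else            { ch = (ch << 1);     lonMax = mid; }
      } else {
        double mid = (latMin + latMax) / 2;
        if (lat >= mid) { ch = (ch << 1) | 1; latMin = mid; }
        else            { ch = (ch << 1);     latMax = mid; }
      }
      evenBit = !evenBit;
      if (++bit == 5) {
        hash.append(BASE32.charAt(ch));
        bit = 0;
        ch = 0;
      }
    }
    return hash.toString();
  }

  public static void main(String[] args) {
    // Each longer hash names a smaller box containing the same point,
    // and shares the shorter hash as its prefix ("d", "dr", ...).
    for (int len = 1; len <= 7; len++) {
      System.out.println(encode(42.358, -71.060, len));
    }
  }
}
{code}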
[jira] [Commented] (LUCENE-3842) Analyzing Suggester
[ https://issues.apache.org/jira/browse/LUCENE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273419#comment-13273419 ] Robert Muir commented on LUCENE-3842: - testStandard is also bogus: it has 2 asserts. The first one should pass, but the second one should really only work if you disable position increments in the (mock) stopfilter. Analyzing Suggester --- Key: LUCENE-3842 URL: https://issues.apache.org/jira/browse/LUCENE-3842 Project: Lucene - Java Issue Type: New Feature Components: modules/spellchecker Affects Versions: 3.6, 4.0 Reporter: Robert Muir Attachments: LUCENE-3842-TokenStream_to_Automaton.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch Since we added shortest-path wFSA search in LUCENE-3714, and generified the comparator in LUCENE-3801, I think we should look at implementing suggesters that have more capabilities than just basic prefix matching. In particular I think the most flexible approach is to integrate with Analyzer at both build and query time, such that we build a wFST with: input: analyzed text such as "ghost0christmas0past" -- byte 0 here is an optional token separator output: surface form such as "the ghost of christmas past" weight: the weight of the suggestion We make an FST with PairOutputs<weight,output>, but only do the shortest-path operation on the weight side (like the test in LUCENE-3801), at the same time accumulating the output (surface form), which will be the actual suggestion. This allows a lot of flexibility: * Using even StandardAnalyzer means you can offer suggestions that ignore stopwords, e.g. if you type in "ghost of chr...", it will suggest "the ghost of christmas past" * we can add support for synonyms/wdf/etc at both index and query time (there are tradeoffs here, and this is not implemented!) * this is a basis for more complicated suggesters such as Japanese suggesters, where the analyzed form is in fact the reading, so we would add a TokenFilter that copies ReadingAttribute into term text to support that... * other general things like offering suggestions that are more fuzzy, like using a plural stemmer or ignoring accents or whatever. According to my benchmarks, suggestions are still very fast with the prototype (e.g. ~ 100,000 QPS), and the FST size does not explode (it's short of twice that of a regular wFST, but this is still far smaller than TST or JaSpell, etc). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
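As a rough illustration of the build-time input described in the issue (a hand-written sketch, not the patch's API): the analyzed key is obtained by running the surface form through the Analyzer and joining the surviving tokens with a 0-byte separator; the suggester then stores key -> (weight, surface form) in the wFST and runs shortest-path on the weight side.

{code}
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class SuggestKeyDemo {
  // Joins the analyzed tokens of `surface` with \u0000, so that an analyzer
  // dropping stopwords turns "the ghost of christmas past" into
  // "ghost\u0000christmas\u0000past" -- the wFST input side of the pairing.
  static String analyzedKey(Analyzer analyzer, String surface) throws IOException {
    TokenStream ts = analyzer.tokenStream("f", new StringReader(surface));
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    StringBuilder key = new StringBuilder();
    ts.reset();
    while (ts.incrementToken()) {
      if (key.length() > 0) key.append('\u0000');
      key.append(term);
    }
    ts.end();
    ts.close();
    return key.toString();
  }
}
{code}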
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2507 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2507/ 1 tests failed. REGRESSION: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch Error Message: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #1,6,] Stack Trace: java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #1,6,] at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:849) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:688) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:724) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:735) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at __randomizedtesting.SeedInfo.seed([FB767E191B0392E0]:0) at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480) Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244) at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241) at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:345) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3019) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451) Build Log (for compile errors): [...truncated 12364 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-2371) Update fileformats spec to match how flex's standard codec writes terms
[ https://issues.apache.org/jira/browse/LUCENE-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2371. - Resolution: Fixed Update fileformats spec to match how flex's standard codec writes terms --- Key: LUCENE-2371 URL: https://issues.apache.org/jira/browse/LUCENE-2371 Project: Lucene - Java Issue Type: Bug Components: general/website Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.0 The standard codec changes how the terms index is written (e.g. uses packed ints, writes a whole field's terms at once, etc.)... we have to fix the file formats on the web site to match. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-3374) HttpClient jar not included in distribution
[ https://issues.apache.org/jira/browse/SOLR-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved SOLR-3374. -- Resolution: Fixed HttpClient jar not included in distribution --- Key: SOLR-3374 URL: https://issues.apache.org/jira/browse/SOLR-3374 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 3.6 Reporter: Roger Håkansson Assignee: Sami Siren Priority: Minor Fix For: 3.6.1 Attachments: SOLR-3374.patch In 3.6, CommonsHttpSolrServer is deprecated in favor of HttpSolrServer; however, in the distribution under solrj-lib, none of the required jar files for HttpClient 4.x are included -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13963 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13963/ 1 tests failed. REGRESSION: org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeDelete Error Message: searcher529 wasn't soon enough after soft529: 1336760134608 ! 1336760134169 + 100 (fudge) Stack Trace: java.lang.AssertionError: searcher529 wasn't soon enough after soft529: 1336760134608 ! 1336760134169 + 100 (fudge) at __randomizedtesting.SeedInfo.seed([D97244FD57C53365:1E3EFC604C6DFED5]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeDelete(SoftAutoCommitTest.java:250) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:806) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:867) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:881) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:774) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:696) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:629) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:668) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:813) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:688) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:724) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:735) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log (for compile errors): [...truncated 10378 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1535) Pre-analyzed field type
[ https://issues.apache.org/jira/browse/SOLR-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273501#comment-13273501 ] Andrzej Bialecki commented on SOLR-1535: - Hoss was wrong :) There is no way to do this, as there is no way to do this in TokenStream - you should view the PreAnalyzed field type as a serialized TokenStream (with the added functionality of specifying the stored part independently). Pre-analyzed field type --- Key: SOLR-1535 URL: https://issues.apache.org/jira/browse/SOLR-1535 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0 Attachments: SOLR-1535.patch, SOLR-1535.patch, SOLR-1535.patch, preanalyzed.patch, preanalyzed.patch PreAnalyzedFieldType provides functionality to index (and optionally store) content that was already processed and split into tokens using some external processing chain. This implementation defines a serialization format for sending tokens with any currently supported Attributes (e.g. type, posIncr, payload, ...). This data is de-serialized into a regular TokenStream that is returned in Field.tokenStreamValue() and thus added to the index as index terms, and optionally a stored part that is returned in Field.stringValue() and is then added as a stored value of the field. This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
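For context, the serialized form the comment refers to looks roughly like the following JSON (an illustrative, hand-written sample; the key names follow the PreAnalyzedField wiki page referenced in the edited comment below - t = term text, s/e = start/end offset, i = position increment - and should be checked against the actual parser):

{code}
{
  "v": "1",
  "str": "hello world",
  "tokens": [
    {"t": "hello", "s": 0, "e": 5, "i": 1},
    {"t": "world", "s": 6, "e": 11, "i": 1}
  ]
}
{code}

Note that there is no per-token frequency attribute in this format, which is exactly the limitation described above.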
[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable
[ https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273514#comment-13273514 ] Markus Jelsma commented on SOLR-3221: - I would agree that latency is preferred as the default. Make Shard handler threadpool configurable -- Key: SOLR-3221 URL: https://issues.apache.org/jira/browse/SOLR-3221 Project: Solr Issue Type: Improvement Affects Versions: 3.6, 4.0 Reporter: Greg Bowyer Assignee: Erick Erickson Labels: distributed, http, shard Fix For: 3.6, 4.0 Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch From profiling of monitor contention, as well as observations of the 95th and 99th response times for nodes that perform distributed search (or "aggregator" nodes), it would appear that the HttpShardHandler code currently does a suboptimal job of managing outgoing shard-level requests. Presently the code contained within Lucene 3.5's SearchHandler and Lucene trunk / 3x's ShardHandlerFactory creates arbitrary threads in order to service distributed search requests. This is done presently to limit the size of the threadpool such that it does not consume resources in deployment configurations that do not use distributed search. This unfortunately has two impacts on the response time if the node coordinating the distribution is under high load. The usage of the MaxConnectionsPerHost configuration option results in aggressive activity on semaphores within HttpCommons; it has been observed that the aggregator can have a response time far greater than that of the searchers. The above monitor contention would appear to suggest that in some cases it's possible for liveness issues to occur and for simple queries to be starved of resources simply due to a lack of attention from the viewpoint of context switching, with, as mentioned above, the HttpCommons connections being hotly contended. The fair, queue-based configuration eliminates this, at the cost of throughput. This patch aims to make the threadpool largely configurable, allowing those using Solr to choose the throughput vs. latency balance they desire. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
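For readers wanting the shape of the knobs involved, the configuration this issue introduces looks roughly like the snippet below in solrconfig.xml (all values are illustrative; the parameter names follow the patch and should be verified against the shipped documentation):

{code}
<shardHandlerFactory class="HttpShardHandlerFactory">
  <!-- HTTP settings for outgoing shard requests -->
  <int name="socketTimeout">1000</int>
  <int name="connTimeout">5000</int>
  <int name="maxConnectionsPerHost">20</int>
  <!-- threadpool sizing: keep no idle threads, grow on demand -->
  <int name="corePoolSize">0</int>
  <int name="maximumPoolSize">10</int>
  <int name="maxThreadIdleTime">5</int>
  <!-- -1 = direct hand-off (favors latency); a positive size queues
       requests through a fixed pool (favors throughput) -->
  <int name="sizeOfQueue">-1</int>
  <bool name="fairnessPolicy">false</bool>
</shardHandlerFactory>
{code}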
[jira] [Issue Comment Edited] (SOLR-1535) Pre-analyzed field type
[ https://issues.apache.org/jira/browse/SOLR-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273501#comment-13273501 ] Andrzej Bialecki edited comment on SOLR-1535 at 5/11/12 7:20 PM: -- Hoss was wrong :) There is no way to do this, as there is no way to do this in TokenStream - you should view the PreAnalyzed field type as a serialized TokenStream (with the added functionality of specifying the stored part independently). Edit: I started adding some documentation to http://wiki.apache.org/solr/PreAnalyzedField . was (Author: ab): Hoss was wrong :) There is no way to do this, as there is no way to do this in TokenStream - you should view the PreAnalyzed field type as a serialized TokenStream (with the added functionality of specifying the stored part independently). Pre-analyzed field type --- Key: SOLR-1535 URL: https://issues.apache.org/jira/browse/SOLR-1535 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0 Attachments: SOLR-1535.patch, SOLR-1535.patch, SOLR-1535.patch, preanalyzed.patch, preanalyzed.patch PreAnalyzedFieldType provides functionality to index (and optionally store) content that was already processed and split into tokens using some external processing chain. This implementation defines a serialization format for sending tokens with any currently supported Attributes (e.g. type, posIncr, payload, ...). This data is de-serialized into a regular TokenStream that is returned in Field.tokenStreamValue() and thus added to the index as index terms, and optionally a stored part that is returned in Field.stringValue() and is then added as a stored value of the field. This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4049) PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273525#comment-13273525 ] Michael Wyraz commented on LUCENE-4049: --- Robert, could you please explain it a bit more (maybe with the code above)? I wonder why PrefixQuery behaves unlike the other query types there. PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts - Key: LUCENE-4049 URL: https://issues.apache.org/jira/browse/LUCENE-4049 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.5, 3.6 Environment: Java Reporter: Michael Wyraz It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery but not with PrefixQuery, which ignores the individual values. Test Code below:
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import com.evermind.tools.calendar.StopWatch;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Hello 1", 1);
        addDoc(w, "Hello 2", 2);
        addDoc(w, "Hello 3", 1);
        w.close();
        StopWatch.stop();
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        //Query q = new TermQuery(new Term("f1", "hello"));
        Query q = new PrefixQuery(new Term("f1", "hello"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        for (ScoreDoc hit : collector.topDocs().scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
        }
    }

    private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
        doc.setBoost(boost);
        w.addDocument(doc);
    }
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4049) PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273536#comment-13273536 ] Simon Willnauer commented on LUCENE-4049: - The default rewrite method in PrefixQuery / MTQ is ConstantScore, i.e. it will create a constant-score query returning 1.0f for all matching documents. If you want a real scoring query, change the rewrite method (MultiTermQuery#setRewriteMethod()) to MultiTermQuery#SCORING_BOOLEAN_QUERY_REWRITE. That is, btw, true for all MTQ subclasses. PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts - Key: LUCENE-4049 URL: https://issues.apache.org/jira/browse/LUCENE-4049 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.5, 3.6 Environment: Java Reporter: Michael Wyraz It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery but not with PrefixQuery, which ignores the individual values. Test Code below:
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import com.evermind.tools.calendar.StopWatch;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Hello 1", 1);
        addDoc(w, "Hello 2", 2);
        addDoc(w, "Hello 3", 1);
        w.close();
        StopWatch.stop();
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        //Query q = new TermQuery(new Term("f1", "hello"));
        Query q = new PrefixQuery(new Term("f1", "hello"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        for (ScoreDoc hit : collector.topDocs().scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
        }
    }

    private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
        doc.setBoost(boost);
        w.addDocument(doc);
    }
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
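Applied to the test code above, Simon's suggestion amounts to one extra line before searching (a sketch; note that SCORING_BOOLEAN_QUERY_REWRITE rewrites to a BooleanQuery, so it is subject to the maxClauseCount limit on fields with many distinct terms):

{code}
import org.apache.lucene.search.MultiTermQuery;

PrefixQuery q = new PrefixQuery(new Term("f1", "hello"));
// switch from the constant-score default to a scoring BooleanQuery rewrite,
// so index-time boosts influence the hit scores again
q.setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
{code}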
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273540#comment-13273540 ] David Smiley commented on SOLR-2155: Hi Oleg. No, it was stripped out a long while ago. But come to think of it, now that this issue isn't going to get committed and is also hosted somewhere outside Apache (it's on GitHub), I can re-introduce the polygon support that was formerly there. It's not a priority for me right now, but if you find the last .patch file on this issue that includes the JTS support (in a comment above I mentioned stripping it out, so you can grab the version prior to that), then you could resurrect it. There was just one source file, plus a small hook into my query parser above. JTS did all the work, really. If you want to try and bring it back, then do so and send me a pull-request on GitHub. All said and done, it's a very small amount of work; the integration was done, it just needs to be brought back. Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Assignee: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, Solr2155-for-1.0.2-3.x-port.patch {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x located here: https://github.com/dsmiley/SOLR-2155. Look at the introductory readme and download the plugin .jar file. Lucene 4's new spatial module is largely based on this code. The Solr 4 glue for it should come very soon but as of this writing it's hosted temporarily at https://github.com/spatial4j. For more information on using SOLR-2155 with Solr 3, see http://wiki.apache.org/solr/SpatialSearch#SOLR-2155 This JIRA issue is closed because it won't be committed in its current form. {panel} There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (e.g. via a gazetteer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash-based work in Lucene/Solr with a geohash-prefix-based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4, depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4049) PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273572#comment-13273572 ] Michael Wyraz commented on LUCENE-4049: --- Thank you, this solved the problem. But I had to set it explicitly for PrefixQuery, so this is not the default there. PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts - Key: LUCENE-4049 URL: https://issues.apache.org/jira/browse/LUCENE-4049 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.5, 3.6 Environment: Java Reporter: Michael Wyraz It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery but not with PrefixQuery, which ignores the individual values. Test Code below:
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import com.evermind.tools.calendar.StopWatch;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Hello 1", 1);
        addDoc(w, "Hello 2", 2);
        addDoc(w, "Hello 3", 1);
        w.close();
        StopWatch.stop();
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        //Query q = new TermQuery(new Term("f1", "hello"));
        Query q = new PrefixQuery(new Term("f1", "hello"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        for (ScoreDoc hit : collector.topDocs().scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
        }
    }

    private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
        doc.setBoost(boost);
        w.addDocument(doc);
    }
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-4049) PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273572#comment-13273572 ] Michael Wyraz edited comment on LUCENE-4049 at 5/11/12 8:18 PM: Thank you, this solved the problem. was (Author: mich...@wyraz.de): Thank you, this solved the problem. But I had to set it explicitly for PrefixQuery, so this is not the default there. PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts - Key: LUCENE-4049 URL: https://issues.apache.org/jira/browse/LUCENE-4049 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.5, 3.6 Environment: Java Reporter: Michael Wyraz It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery but not with PrefixQuery, which ignores the individual values. Test Code below:
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import com.evermind.tools.calendar.StopWatch;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Hello 1", 1);
        addDoc(w, "Hello 2", 2);
        addDoc(w, "Hello 3", 1);
        w.close();
        StopWatch.stop();
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        //Query q = new TermQuery(new Term("f1", "hello"));
        Query q = new PrefixQuery(new Term("f1", "hello"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        for (ScoreDoc hit : collector.topDocs().scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
        }
    }

    private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
        doc.setBoost(boost);
        w.addDocument(doc);
    }
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-723) SolrCore aliasing/swapping may lead to confusing JMX
[ https://issues.apache.org/jira/browse/SOLR-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Bowyer updated SOLR-723: - Attachment: (was: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch) SolrCore aliasing/swapping may lead to confusing JMX -- Key: SOLR-723 URL: https://issues.apache.org/jira/browse/SOLR-723 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Henri Biestro Priority: Minor Attachments: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch As mentioned by Yonik in SOLR-647, JMX registers the core with its name. After swapping or re-aliasing the core, the JMX tracking name does not correspond to the actual core anymore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-723) SolrCore aliasing/swapping may lead to confusing JMX
[ https://issues.apache.org/jira/browse/SOLR-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Bowyer updated SOLR-723: - Attachment: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch SolrCore aliasing/swapping may lead to confusing JMX -- Key: SOLR-723 URL: https://issues.apache.org/jira/browse/SOLR-723 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Henri Biestro Priority: Minor Attachments: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch As mentioned by Yonik in SOLR-647, JMX registers the core with its name. After swapping or re-aliasing the core, the JMX tracking name does not correspond to the actual core anymore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-723) SolrCore aliasing/swapping may lead to confusing JMX
[ https://issues.apache.org/jira/browse/SOLR-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Bowyer updated SOLR-723: - Attachment: (was: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch) SolrCore aliasing/swapping may lead to confusing JMX -- Key: SOLR-723 URL: https://issues.apache.org/jira/browse/SOLR-723 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Henri Biestro Priority: Minor Attachments: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch As mentioned by Yonik in SOLR-647, JMX registers the core with its name. After swapping or re-aliasing the core, the JMX tracking name does not correspond to the actual core anymore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
[ https://issues.apache.org/jira/browse/LUCENE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3489: Attachment: LUCENE-3489.patch Attached is a patch generalizing the UseNoExpensiveMemory annotation to @AvoidCodecs, which takes a list of codecs to avoid. This way, tests that cannot work with the Lucene3x codec can just avoid it, using another codec, rather than assuming (in general, it's bad that many of the tests of actual new functionality often don't run at all because of the current assumes) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Components: general/test Affects Versions: 4.0 Reporter: Uwe Schindler Fix For: 4.1 Attachments: LUCENE-3489.patch Followup for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13970 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13970/ 1 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest Error Message: ERROR: SolrIndexSearcher opens=80 closes=78 Stack Trace: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=80 closes=78 at __randomizedtesting.SeedInfo.seed([35A799C36F061472]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:212) at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:101) at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:742) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log (for compile errors): [...truncated 11327 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3171) BlockJoinQuery/Collector
[ https://issues.apache.org/jira/browse/LUCENE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273759#comment-13273759 ] David Webb commented on LUCENE-3171: Is there a wiki page on how to use this? I need to implement an index with nested docs, and an example scheme and query would be awesome. Thanks! BlockJoinQuery/Collector Key: LUCENE-3171 URL: https://issues.apache.org/jira/browse/LUCENE-3171 Project: Lucene - Java Issue Type: Improvement Components: modules/other Reporter: Michael McCandless Fix For: 3.4, 4.0 Attachments: LUCENE-3171.patch, LUCENE-3171.patch, LUCENE-3171.patch I created a single-pass Query + Collector to implement nested docs. The approach is similar to LUCENE-2454, in that the app must index documents in join order, as a block (IW.add/updateDocuments), with the parent doc at the end of the block, except that this impl is one pass. Once you join at indexing time, you can take any query that matches child docs and join it up to the parent docID space, using BlockJoinQuery. You then use BlockJoinCollector, which sorts parent docs by provided Sort, to gather results, grouped by parent; this collector finds any BlockJoinQuerys (using Scorer.visitScorers) and retains the child docs corresponding to each collected parent doc. After searching is done, you retrieve the TopGroups from a provided BlockJoinQuery. Like LUCENE-2454, this is less general than the arbitrary joins in Solr (SOLR-2272) or parent/child from ElasticSearch (https://github.com/elasticsearch/elasticsearch/issues/553), since you must do the join at indexing time as a doc block, but it should be able to handle nested joins as well as joins to multiple tables, though I don't yet have test cases for these. I put this in a new Join module (modules/join); I think as we refactor join impls we should put them here. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
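For anyone landing here before a wiki page exists, a minimal sketch of the flow described in the issue (class and method names as given there; exact signatures may differ between versions, and the "type:parent" marker field is an assumption of this example, not part of the API):

{code}
// index one block: children first, parent last (IndexWriter.addDocuments)
List<Document> block = new ArrayList<Document>();
Document child = new Document();
child.add(new Field("skill", "java", Field.Store.NO, Field.Index.NOT_ANALYZED));
block.add(child);
Document parent = new Document();
parent.add(new Field("type", "parent", Field.Store.NO, Field.Index.NOT_ANALYZED));
block.add(parent);                     // parent doc must be last in the block
writer.addDocuments(block);

// search: join a child-level query up to the parent docID space
Filter parents = new CachingWrapperFilter(
    new QueryWrapperFilter(new TermQuery(new Term("type", "parent"))));
Query childQuery = new TermQuery(new Term("skill", "java"));
BlockJoinQuery join = new BlockJoinQuery(childQuery, parents, BlockJoinQuery.ScoreMode.Max);

// collect parents sorted as desired, with their matching children retained
BlockJoinCollector collector = new BlockJoinCollector(Sort.RELEVANCE, 10, true, false);
searcher.search(join, collector);
TopGroups<Integer> groups = collector.getTopGroups(join, null, 0, 10, 0, true);
{code}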
[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable
[ https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273799#comment-13273799 ] Greg Bowyer commented on SOLR-3221: --- I agree, I was being cowardly when I wrote it because I am not a committer :D Make Shard handler threadpool configurable -- Key: SOLR-3221 URL: https://issues.apache.org/jira/browse/SOLR-3221 Project: Solr Issue Type: Improvement Affects Versions: 3.6, 4.0 Reporter: Greg Bowyer Assignee: Erick Erickson Labels: distributed, http, shard Fix For: 3.6, 4.0 Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch From profiling of monitor contention, as well as observations of the 95th and 99th response times for nodes that perform distributed search (or "aggregator" nodes), it would appear that the HttpShardHandler code currently does a suboptimal job of managing outgoing shard-level requests. Presently the code contained within Lucene 3.5's SearchHandler and Lucene trunk / 3x's ShardHandlerFactory creates arbitrary threads in order to service distributed search requests. This is done presently to limit the size of the threadpool such that it does not consume resources in deployment configurations that do not use distributed search. This unfortunately has two impacts on the response time if the node coordinating the distribution is under high load. The usage of the MaxConnectionsPerHost configuration option results in aggressive activity on semaphores within HttpCommons; it has been observed that the aggregator can have a response time far greater than that of the searchers. The above monitor contention would appear to suggest that in some cases it's possible for liveness issues to occur and for simple queries to be starved of resources simply due to a lack of attention from the viewpoint of context switching, with, as mentioned above, the HttpCommons connections being hotly contended. The fair, queue-based configuration eliminates this, at the cost of throughput. This patch aims to make the threadpool largely configurable, allowing those using Solr to choose the throughput vs. latency balance they desire. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
[ https://issues.apache.org/jira/browse/LUCENE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273811#comment-13273811 ] Uwe Schindler commented on LUCENE-3489: --- I like the annotation. Can we maybe change it to look like @SuppressWarnings, so it does not need codecs={}, or, if there is only one codec, no {} at all? Should not be too hard? Otherwise strong +1! Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Components: general/test Affects Versions: 4.0 Reporter: Uwe Schindler Fix For: 4.1 Attachments: LUCENE-3489.patch Followup for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
[ https://issues.apache.org/jira/browse/LUCENE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273812#comment-13273812 ] Uwe Schindler commented on LUCENE-3489: --- It's easy, just rename codecs to String[] value and you are done. After that you can use @AvoidCodecs("SimpleText") or @AvoidCodecs({"SimpleText","Lucene3x"}) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Components: general/test Affects Versions: 4.0 Reporter: Uwe Schindler Fix For: 4.1 Attachments: LUCENE-3489.patch Followup for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
[ https://issues.apache.org/jira/browse/LUCENE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273812#comment-13273812 ] Uwe Schindler edited comment on LUCENE-3489 at 5/12/12 2:21 AM: It's easy, just rename codecs to String[] value and you are done. After that you can use @AvoidCodecs("SimpleText") or @AvoidCodecs({"SimpleText","Lucene3x"}) See: http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/SuppressWarnings.html was (Author: thetaphi): It's easy, just rename codecs to String[] value and you are done. After that you can use @AvoidCodecs("SimpleText") or @AvoidCodecs({"SimpleText","Lucene3x"}) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Components: general/test Affects Versions: 4.0 Reporter: Uwe Schindler Fix For: 4.1 Attachments: LUCENE-3489.patch Followup for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
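Concretely, Uwe's suggestion amounts to an annotation shaped like this (a sketch of the proposed rename, not the committed patch):

{code}
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Documented
@Inherited
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface AvoidCodecs {
  String[] value(); // codec names to avoid, e.g. "SimpleText", "Lucene3x"
}
{code}

With the member named value(), usage shortens to @AvoidCodecs("SimpleText") for a single codec, or @AvoidCodecs({"SimpleText", "Lucene3x"}) for several.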
[jira] [Created] (LUCENE-4050) Change SegmentInfos format to plain text
Andrzej Bialecki created LUCENE-4050: - Summary: Change SegmentInfos format to plain text Key: LUCENE-4050 URL: https://issues.apache.org/jira/browse/LUCENE-4050 Project: Lucene - Java Issue Type: Improvement Components: core/codecs Reporter: Andrzej Bialecki Fix For: 4.0 I propose to change the format of the SegmentInfos file (segments_NN) to use plain text instead of the current binary format. The SegmentInfos file represents a commit point, and it also declares what codecs were used for writing each of the segments that the commit point consists of. However, this is a chicken-and-egg situation - in theory the format of this file is customizable via Codec.getSegmentInfosFormat, but in practice we have to first discover which codec implementation wrote this file - so the SegmentCoreReaders assumes a certain fixed binary layout of a preamble of this file that contains the codec name... and then the file is read again, only this time using the right Codec. This is ugly. Instead I propose to use a simple plain-text format, either line-oriented properties or JSON, in such a way that newer versions could easily extend it, and which wouldn't require any special Codec to read and parse. Consequently we could remove SegmentInfosFormat altogether, and instead add SegmentInfoFormat (notice the singular) to Codec to read single per-segment SegmentInfo-s in a codec-specific way. E.g. for the Lucene40 codec we could either add another file or we could extend the .fnm file (FieldInfos) to also contain this information. Then the plain-text SegmentInfos would contain just the following information: * list of global files for this commit point (if any) * list of segments for this commit point, and their corresponding codec class names * user data map -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
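To illustrate the three items listed at the end of the proposal, a plain-text commit point along these lines might look like the following JSON (purely a hand-written illustration; the file name "_global.fnx" and all values are hypothetical, not a format from the issue):

{code}
{
  "globalFiles": ["_global.fnx"],
  "segments": [
    {"name": "_0", "codec": "Lucene40"},
    {"name": "_1", "codec": "SimpleText"}
  ],
  "userData": {"commitTime": "1336761600000"}
}
{code}

Any Codec could then be located by class name before a single per-segment byte is read, which removes the preamble bootstrapping described above.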