Lucene Flex
Is SOLR compatible with Lucene 3.x or the Flex branch?
Range Queries, Geospatial
Hi Dev,

I've read a very interesting interview with Ryan:
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Podcasts-and-Videos/Interview-Ryan-McKinley

Another finding is https://issues.apache.org/jira/browse/SOLR-773 (lucene/contrib/spatial).

Is there any more stuff going on for SOLR 1.5 (and existing SOLR 1.4)? I need filtering on two dimensions, like

x:[1 TO 10100] y:[7900 TO 8000]

(that's why I need SOLR :))

Any thoughts? I'd love to implement something quick, simple and efficient if it doesn't exist yet, like an R-Tree (http://en.wikipedia.org/wiki/R-tree) or Geohash (http://en.wikipedia.org/wiki/Geohash).

I haven't tried Local Lucene and SOLR-773 yet.

Thanks,
Fuad Efendi
+1 416-993-2060
http://www.tokenizer.ca/
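For readers who have not met Geohash before, here is a minimal standalone sketch of the public encoding described on the Wikipedia page linked above: longitude and latitude are bisected in turn, and every five bits become one base-32 character. It is not the code from SOLR-773 or Local Lucene; the class and method names are invented for this illustration.

```java
final class GeohashSketch {
    // The Geohash base-32 alphabet (digits plus letters without a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude/longitude pair into a geohash of the given length. */
    static String encode(double lat, double lon, int precision) {
        double latLo = -90.0, latHi = 90.0;
        double lonLo = -180.0, lonHi = 180.0;
        StringBuilder hash = new StringBuilder(precision);
        boolean useLon = true;   // bits alternate, starting with longitude
        int bits = 0;            // bits collected for the current character
        int current = 0;         // value of the current 5-bit group

        while (hash.length() < precision) {
            if (useLon) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { current = (current << 1) | 1; lonLo = mid; }
                else            { current = current << 1;       lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { current = (current << 1) | 1; latLo = mid; }
                else            { current = current << 1;       latHi = mid; }
            }
            useLon = !useLon;
            if (++bits == 5) {
                hash.append(BASE32.charAt(current));
                bits = 0;
                current = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        // Sample point taken from the Wikipedia article on Geohash.
        System.out.println(encode(57.64911, 10.40744, 5)); // prints "u4pru"
    }
}
```

Points that share a long prefix fall into the same cell, so a bounding-box filter can in principle be turned into a handful of prefix queries. Cells that touch a box edge still need an exact check, and two nearby points on opposite sides of a cell boundary may share no prefix at all.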
RE: Range Queries, Geospatial
Most of SOLR-773 is incorporated into trunk at this point w/ the exception of Cartesian Tier filtering (and, more generically, intelligent field type specific spatial filtering via SOLR-1568). I hope to complete that in the next week or so. Geohash is in and supported by Solr right now. There is no R-Tree support.

Thanks Grant; I found geohash, https://issues.apache.org/jira/browse/SOLR-1586 and https://issues.apache.org/jira/browse/SOLR-1302. I believe SOLR 1.4 doesn't have it and I need the version from trunk, right?
[jira] Commented: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
[ https://issues.apache.org/jira/browse/SOLR-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831458#action_12831458 ]

Fuad Efendi commented on SOLR-1764:
-----------------------------------

Funny, it might happen that this is not a problem with JDK 1.6.0_9; or maybe with the latest JDK. As a quick workaround... Also, you may try to use SolrJ with the binary format... I'll try to check that <element>word&amp;word</element> doesn't cause a problem...

While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
-------------------------------------------------------------------------------------------

Key: SOLR-1764
URL: https://issues.apache.org/jira/browse/SOLR-1764
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 1.4
Environment: Windows XP, JBoss 4.2.3 GA
Reporter: Michael McGowan
Priority: Blocker

I get an exception while indexing. It seems that I'm unable to see the root cause of the exception because it is masked by another java.lang.IllegalStateException: Can't overwrite cause exception.

Here is the stacktrace :

16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {} 0 15
16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.IllegalStateException: Can't overwrite cause
    at java.lang.Throwable.initCause(Throwable.java:320)
    at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70)
    at com.ctc.wstx.exc.WstxException.<init>(WstxException.java:46)
    at com.ctc.wstx.exc.WstxIOException.<init>(WstxIOException.java:16)
    at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:536)
    at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:592)
    at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:648)
    at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:319)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:68)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:182)
    at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
    at java.lang.Thread.run(Thread.java:619)
16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update params={wt=xml&version=2.2} status=500 QTime=15
16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.IllegalStateException: Can't overwrite cause
    at java.lang.Throwable.initCause(Throwable.java:320)
    at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70)
    at com.ctc.wstx.exc.WstxException.<init>
[jira] Commented: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
[ https://issues.apache.org/jira/browse/SOLR-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831268#action_12831268 ]

Fuad Efendi commented on SOLR-1764:
-----------------------------------

Michael,

Which version of Java are you using?

I believe something is wrong with the XML (upload) file, and classes from a specific Java version conflict with WoodStox, although SOLR may need improvement too:
http://forums.sun.com/thread.jspa?threadID=5150576
[jira] Issue Comment Edited: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
[ https://issues.apache.org/jira/browse/SOLR-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831268#action_12831268 ]

Fuad Efendi edited comment on SOLR-1764 at 2/9/10 2:48 AM:
-----------------------------------------------------------

Michael,

Which version of Java are you using?

I believe something is wrong with the XML (upload) file, and classes from a specific Java version conflict with WoodStox, although SOLR may need improvement too:
http://forums.sun.com/thread.jspa?threadID=5150576

It says that text nodes such as
<prim name="y">[-A-Z0-9.,()/='+:?!%&amp;*; ]</prim>
can be split (for instance, to process entities), depending on the implementation, and, to be safe, SOLR needs something like
{code}
while (reader.isCharacters()) {
    sb.append(reader.getText());
    reader.next();
}
{code}
[jira] Issue Comment Edited: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
[ https://issues.apache.org/jira/browse/SOLR-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831268#action_12831268 ]

Fuad Efendi edited comment on SOLR-1764 at 2/9/10 2:51 AM:
-----------------------------------------------------------

Michael,

Which version of Java are you using?

I believe something is wrong with the XML (upload) file, and classes from a specific Java version conflict with WoodStox, although SOLR may need improvement too:
http://forums.sun.com/thread.jspa?threadID=5150576

It says that text nodes such as
<prim name="y">[-A-Z0-9.,()/='+:?!%&amp;*; ]</prim>
can be split (for instance, to process entities), depending on the implementation, and, to be safe, SOLR needs something like
{code}
while (reader.isCharacters()) {
    sb.append(reader.getText());
    reader.next();
}
{code}
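The loop suggested in the SOLR-1764 comment can be tried outside Solr with the StAX API that ships with the JDK. The sketch below is only an illustration of joining adjacent text events; it is not Solr's XMLLoader, and the class name, method name and sample element are assumptions made for the example. Whether a given parser actually splits the text depends on the implementation and its settings.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class SplitTextSketch {

    /** Returns the leading text of the first element, joining all adjacent text events. */
    static String firstElementText(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // Skip ahead to the first START_ELEMENT (the root of the document).
            while (reader.next() != XMLStreamConstants.START_ELEMENT) {
                // nothing to do for prolog events
            }
            StringBuilder sb = new StringBuilder();
            int event = reader.next();
            // A parser may report one text node as several consecutive events,
            // for example around an entity such as &amp;, so keep appending.
            while (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA
                    || event == XMLStreamConstants.SPACE) {
                sb.append(reader.getText());
                event = reader.next();
            }
            reader.close();
            return sb.toString();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstElementText("<prim name=\"y\">A&amp;B</prim>"));
    }
}
```

Setting XMLInputFactory.IS_COALESCING to true on the factory is the other standard way to ask a StAX parser to deliver adjacent character data as a single event.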
[jira] Closed: (SOLR-711) SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors
[ https://issues.apache.org/jira/browse/SOLR-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi closed SOLR-711.
----------------------------

Resolution: Fixed

Thanks Shalin for pointing to SOLR-475, which is a very advanced solution to the term counting approach.

SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors
------------------------------------------------------------------------------------------

Key: SOLR-711
URL: https://issues.apache.org/jira/browse/SOLR-711
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.3
Reporter: Fuad Efendi
Fix For: 1.4
Original Estimate: 1680h
Remaining Estimate: 1680h

From [http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:

Scenario:
- 10,000,000 documents in the index;
- 5-10 terms per document;
- 200,000 unique terms for a tokenized field.

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 times slower than traversing 10 - 20,000 documents for smaller DocSets and counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens (200,000 in our scenario).

See SimpleFacets.java:
{code}
public NamedList getFacetTermEnumCounts(
    SolrIndexSearcher searcher, DocSet docs, ...
{code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-667) Alternate LRUCache implementation
[ https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635221#action_12635221 ]

Fuad Efendi commented on SOLR-667:
----------------------------------

Paul, Yonik, thanks for your efforts; BTW 'Concurrent'HashMap uses spinloops for 'safe' updates in order to avoid synchronization (instead of giving up CPU cycles); there are always cases when it is not faster than a simple HashMap with synchronization. LingPipe uses a different approach, see the last comment at SOLR-665.

Also, why are you in-a-loop with LRU? LFU is logically better.

+1 and thanks for sharing.

Alternate LRUCache implementation
---------------------------------

Key: SOLR-667
URL: https://issues.apache.org/jira/browse/SOLR-667
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.3
Reporter: Noble Paul
Fix For: 1.4
Attachments: ConcurrentLRUCache.java, ConcurrentLRUCache.java, ConcurrentLRUCache.java, SOLR-667.patch, SOLR-667.patch, SOLR-667.patch, SOLR-667.patch

The only available SolrCache, i.e. LRUCache, is based on _LinkedHashMap_, which has _get()_ also synchronized. This can cause severe bottlenecks for faceted search. Any alternate implementation which can be faster/better must be considered.
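To make the bottleneck in the SOLR-667 description concrete, here is a small sketch of an LRU cache built on an access-ordered LinkedHashMap. It is not Solr's LRUCache or the attached ConcurrentLRUCache.java; the class name and capacity handling are assumptions for the example. The point it shows is that get() reorders the internal list, so even reads must be serialized.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SynchronizedLruSketch<K, V> {
    private final Map<K, V> map;

    SynchronizedLruSketch(final int capacity) {
        // accessOrder = true: every get() moves the entry to the end of the
        // internal linked list, which is a structural change to shared state.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict the least recently used entry
            }
        };
    }

    // Reads mutate the ordering, so they need the same lock as writes.
    synchronized V get(K key) {
        return map.get(key);
    }

    synchronized void put(K key, V value) {
        map.put(key, value);
    }

    // containsKey() does not count as an access and leaves the order unchanged.
    synchronized boolean containsKey(K key) {
        return map.containsKey(key);
    }

    synchronized int size() {
        return map.size();
    }
}
```

With many search threads hitting the cache at once, every lookup queues on the same monitor, which is the contention the issue wants an alternate implementation to avoid.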
[jira] Created: (SOLR-711) SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors
SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors
------------------------------------------------------------------------------------------

Key: SOLR-711
URL: https://issues.apache.org/jira/browse/SOLR-711
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.3
Reporter: Fuad Efendi
Fix For: 1.4

From [url]http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html[/url]:

Scenario:
- 10,000,000 documents in the index;
- 5-10 terms per document;
- 200,000 unique terms for a tokenized field.

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 times slower than traversing 10 - 20,000 documents for smaller DocSets and counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens (200,000 in our scenario).

See SimpleFacets:
{{
public NamedList getFacetTermEnumCounts(
    SolrIndexSearcher searcher, DocSet docs, String field,
    int offset, int limit, int mincount,
    boolean missing, boolean sort, String prefix)
    throws IOException {...}
}}
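The trade-off in the SOLR-711 description can be shown with plain collections. The sketch below does not use Lucene or Solr classes; the maps stand in for postings lists and per-document term vectors, and all names are hypothetical. Strategy A does one intersection per unique term, so its cost grows with the number of terms (200,000 in the scenario). Strategy B walks only the documents in the DocSet, so its cost grows with the DocSet size times the terms per document.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class FacetCountSketch {

    /** Builds term -> documents (a stand-in for postings) from document -> terms. */
    static Map<String, Set<Integer>> invert(Map<Integer, Set<String>> termsPerDoc) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (Map.Entry<Integer, Set<String>> e : termsPerDoc.entrySet()) {
            for (String term : e.getValue()) {
                postings.computeIfAbsent(term, t -> new HashSet<>()).add(e.getKey());
            }
        }
        return postings;
    }

    /** Strategy A: intersect the DocSet with the document set of every unique term. */
    static Map<String, Integer> countByIntersection(
            Map<String, Set<Integer>> postings, Set<Integer> docSet) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> e : postings.entrySet()) {
            int n = 0;
            for (Integer doc : docSet) {
                if (e.getValue().contains(doc)) {
                    n++;
                }
            }
            if (n > 0) {
                counts.put(e.getKey(), n);
            }
        }
        return counts;
    }

    /** Strategy B: visit only the documents in the DocSet and tally their terms. */
    static Map<String, Integer> countByTraversal(
            Map<Integer, Set<String>> termsPerDoc, Set<Integer> docSet) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Integer doc : docSet) {
            Set<String> terms = termsPerDoc.get(doc);
            if (terms == null) {
                continue; // document has no value in this field
            }
            for (String term : terms) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Both methods return the same counts; only the amount of work differs, which is why the description limits the idea to DocSets much smaller than the number of unique terms.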
[jira] Updated: (SOLR-711) SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors
[ https://issues.apache.org/jira/browse/SOLR-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi updated SOLR-711:
-----------------------------

Comment: was deleted
[jira] Updated: (SOLR-711) SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors
[ https://issues.apache.org/jira/browse/SOLR-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi updated SOLR-711:
-----------------------------

Description:
From [http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:

Scenario:
- 10,000,000 documents in the index;
- 5-10 terms per document;
- 200,000 unique terms for a tokenized field.

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 times slower than traversing 10 - 20,000 documents for smaller DocSets and counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens (200,000 in our scenario).

See SimpleFacets.java:
{code}
public NamedList getFacetTermEnumCounts(
    SolrIndexSearcher searcher, DocSet docs, ...
{code}

was:
From [http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:

Scenario:
- 10,000,000 documents in the index;
- 5-10 terms per document;
- 200,000 unique terms for a tokenized field.

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 times slower than traversing 10 - 20,000 documents for smaller DocSets and counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens (200,000 in our scenario).

See SimpleFacets:
{code:title=SimpleFacets.java|borderStyle=solid}
public NamedList getFacetTermEnumCounts(
    SolrIndexSearcher searcher, DocSet docs, ...
{code}
[jira] Updated: (SOLR-711) SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors
[ https://issues.apache.org/jira/browse/SOLR-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi updated SOLR-711:
-----------------------------

Description:
From [http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:

Scenario:
- 10,000,000 documents in the index;
- 5-10 terms per document;
- 200,000 unique terms for a tokenized field.

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 times slower than traversing 10 - 20,000 documents for smaller DocSets and counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens (200,000 in our scenario).

See SimpleFacets:
{code:title=SimpleFacets.java|borderStyle=solid}
public NamedList getFacetTermEnumCounts(
    SolrIndexSearcher searcher, DocSet docs, ...
{code}

was:
From [url]http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html[/url]:

Scenario:
- 10,000,000 documents in the index;
- 5-10 terms per document;
- 200,000 unique terms for a tokenized field.

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 times slower than traversing 10 - 20,000 documents for smaller DocSets and counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens (200,000 in our scenario).

See SimpleFacets:
{{
public NamedList getFacetTermEnumCounts(
    SolrIndexSearcher searcher, DocSet docs, String field,
    int offset, int limit, int mincount,
    boolean missing, boolean sort, String prefix)
    throws IOException {...}
}}

trivial formatting
Re: Only 3 issues left
such as... using Term Vectors for fast faceting on tokenized... but it is in TODOs of source files! cNET data was so small when SimpleFacets were born, only 40 docs...

Quoting Shalin Shekhar Mangar [EMAIL PROTECTED]:

Only 3.so alienmust...add..more...issues...argh!

On Mon, Aug 18, 2008 at 9:20 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote:

Hi

Look mom, only 3 issues to go!

https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&mode=hide&sorter/order=DESC&sorter/field=priority&resolution=-1&pid=12310230&fixfor=12312486

Out of those, 1 is trivial (lucene jar update), 1 looks committed (the maven one), and only SOLR-646 is serious.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

--
Regards,
Shalin Shekhar Mangar.
[jira] Created: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results
Range queries with 'slong' field type do not retrieve correct results
---------------------------------------------------------------------

Key: SOLR-671
URL: https://issues.apache.org/jira/browse/SOLR-671
Project: Solr
Issue Type: Bug
Environment: SOLR-1.3-DEV

Schema:
<!-- Numeric field types that manipulate the value into a string value that isn't human-readable in its internal form, but with a lexicographic ordering the same as the numeric ordering, so that range queries work correctly. -->
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>

<field name="timestamp" type="slong" indexed="true" stored="true"/>

Reporter: Fuad Efendi

Range queries always return all results (do not filter):

timestamp:[1019386401114 TO 1219386401114]

<lst name="debug">
<str name="rawquerystring">timestamp:[1019386401114 TO 1219386401114]</str>
<str name="querystring">timestamp:[1019386401114 TO 1219386401114]</str>
<str name="parsedquery">timestamp:[1019386401114 TO 1219386401114]</str>
<str name="parsedquery_toString">timestamp:[&#8;&#0;εごᅚ TO &#8;&#0;ѯ刯慚]</str>
...
<str name="QParser">OldLuceneQParser</str>
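The schema comment in SOLR-671 says the sortable field types store numbers as strings whose lexicographic order matches numeric order, which is why parsedquery_toString looks unreadable. The sketch below illustrates that idea only; it is not the encoding SortableLongField actually uses. The class name and the fixed-width hex format are assumptions chosen to keep the example short.

```java
final class SortableLongSketch {

    /**
     * Encodes a long as 16 hex digits. Flipping the sign bit maps the signed
     * range onto the unsigned range, so plain string comparison of the
     * results agrees with numeric comparison of the inputs.
     */
    static String encode(long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    /** Inclusive range check carried out purely on the encoded strings. */
    static boolean inRange(long value, long lower, long upper) {
        String v = encode(value);
        return encode(lower).compareTo(v) <= 0 && v.compareTo(encode(upper)) <= 0;
    }

    public static void main(String[] args) {
        // Plain decimal strings sort incorrectly: "9" comes after "10".
        System.out.println("9".compareTo("10") > 0);
        // Encoded values sort like the numbers themselves.
        System.out.println(encode(9L).compareTo(encode(10L)) < 0);
        System.out.println(inRange(1119386401114L, 1019386401114L, 1219386401114L));
    }
}
```

A range query over such a field compares encoded terms, so a range that seems to match everything may simply cover every indexed value.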
[jira] Updated: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results
[ https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi updated SOLR-671:
-----------------------------
Priority: Blocker (was: Major)
Affects Version/s: 1.3

Original Estimate: 168h
Remaining Estimate: 168h
[jira] Updated: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results
[ https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi updated SOLR-671:
-----------------------------
Priority: Trivial (was: Blocker)
Issue Type: Test (was: Bug)

I executed another query which works fine:
timestamp:[* TO 1000] - 0 results
Finally found it works... Please close.
[jira] Updated: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results
[ https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi updated SOLR-671:
-----------------------------
Priority: Major (was: Trivial)
Issue Type: Bug (was: Test)

Here is test case, similar to Arrays.sort() bug (unsigned...):
{code}
long time1 = System.currentTimeMillis() - 30*24*3600*1000;
long time2 = 30*24*3600*1000;
System.out.println(time1);
System.out.println(time1-time2);

Output:
1219389000674
1221091967970
{code}
(time1-time2) > time1! What happens inside SOLR slong for such queries?
[jira] Issue Comment Edited: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results
[ https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619223#action_12619223 ]

funtick edited comment on SOLR-671 at 8/2/08 7:12 AM:
------------------------------------------------------
Here is test case, similar to Arrays.sort() bug (unsigned...):
{code}
long time1 = System.currentTimeMillis();
long time2 = 30*24*3600*1000;
System.out.println(time1);
System.out.println(time1-time2);

Output:
1219389000674
1221091967970
{code}
(time1-time2) > time1! What happens inside SOLR slong for such queries?

was (Author: funtick):
Here is test case, similar to Arrays.sort() bug (unsigned...):
{code}
long time1 = System.currentTimeMillis() - 30*24*3600*1000;
long time2 = 30*24*3600*1000;
System.out.println(time1);
System.out.println(time1-time2);

Output:
1219389000674
1221091967970
{code}
(time1-time2) > time1! What happens inside SOLR slong for such queries?
[jira] Commented: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results
[ https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619227#action_12619227 ]

Fuad Efendi commented on SOLR-671:
----------------------------------
{code}
long time1 = System.currentTimeMillis();
long time2 = 30*24*3600*1000;
long time3 = time1 - time2;
System.out.println("Time1: " + time1);
System.out.println("Time2: " + time2);
System.out.println("Time3: " + time3);

Time1: 1217686478242
Time2: -1702967296
Time3: 1219389445538
{code}
The bug is obvious: 30*24*3600*1000 is evaluated in int arithmetic and overflows before it is assigned to the long. With a long literal (1000L) the result is correct:
{code}
long time1 = System.currentTimeMillis();
long time2 = 30*24*3600*1000L;
long time3 = time1 - time2;
System.out.println("Time1: " + time1);
System.out.println("Time2: " + time2);
System.out.println("Time3: " + time3);

Time1: 1217686559557
Time2: 2592000000
Time3: 1215094559557
{code}
Close it...
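The overflow behind this report can be reproduced in isolation. The following is a small sketch (class and method names are made up for illustration) that contrasts the int-only product with the long product, and shows `TimeUnit.DAYS.toMillis` as one way to avoid hand-written multiplication.

```java
import java.util.concurrent.TimeUnit;

final class MillisOverflowDemo {
    // All four operands are int, so the product wraps around in 32 bits
    // and only the wrapped value is widened to long.
    static long thirtyDaysIntArithmetic() {
        return 30 * 24 * 3600 * 1000;
    }

    // The long literal promotes the last multiplication to 64-bit arithmetic.
    static long thirtyDaysLongArithmetic() {
        return 30 * 24 * 3600 * 1000L;
    }

    // Library conversion; no manual multiplication involved.
    static long thirtyDaysViaTimeUnit() {
        return TimeUnit.DAYS.toMillis(30);
    }

    public static void main(String[] args) {
        System.out.println("int arithmetic:  " + thirtyDaysIntArithmetic());
        System.out.println("long arithmetic: " + thirtyDaysLongArithmetic());
        System.out.println("TimeUnit:        " + thirtyDaysViaTimeUnit());
    }
}
```

The wrapped value is 2592000000 - 2^32 = -1702967296, which matches the Time2 output in the first snippet of the comment; subtracting it moves the timestamp forward instead of back.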
[jira] Updated: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results
[ https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi updated SOLR-671:
-----------------------------
Priority: Trivial (was: Major)
Issue Type: Test (was: Bug)
Affects Version/s: (was: 1.3)
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619058#action_12619058 ]

Fuad Efendi commented on SOLR-665:
----------------------------------
Guys at LingPipe (Natural Language Processing) http://alias-i.com/ are using excellent Map implementations with an optimistic concurrency strategy:
http://alias-i.com/lingpipe/docs/api/com/aliasi/util/FastCache.html
http://alias-i.com/lingpipe/docs/api/com/aliasi/util/HardFastCache.html

FIFO Cache (Unsynchronized): 9x times performance boost
-------------------------------------------------------

Key: SOLR-665
URL: https://issues.apache.org/jira/browse/SOLR-665
Project: Solr
Issue Type: Improvement
Affects Versions: 1.3
Environment: JRockit R27 (Java 6)
Reporter: Fuad Efendi
Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, SimplestConcurrentLRUCache.java
Original Estimate: 672h
Remaining Estimate: 672h

Attached is a modified version of LRUCache where
1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that access-order reordering (true, the performance bottleneck of LRU) is replaced by insertion order (false), which makes it FIFO
2. Almost all (absolutely unnecessary) synchronized statements are commented out

See discussion at http://www.nabble.com/LRUCache---synchronized%21--td16439831.html

Performance metrics (taken from SOLR Admin):
LRU
Requests: 7638
Average Time-Per-Request: 15300
Average Request-per-Second: 0.06
FIFO:
Requests: 3355
Average Time-Per-Request: 1610
Average Request-per-Second: 0.11

Performance increased 9 times, which roughly corresponds to the number of CPUs in the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)

Current number of documents: 7494689
name: filterCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=1000, initialSize=1000)
stats:
lookups : 15966954582
hits : 16391851546
hitratio : 0.102
inserts : 4246120
evictions : 0
size : 2668705
cumulative_lookups : 16415839763
cumulative_hits : 16411608101
cumulative_hitratio : 0.99
cumulative_inserts : 4246246
cumulative_evictions : 0

Thanks
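The difference between the two LinkedHashMap modes mentioned in item 1 can be seen in a few lines. This is a minimal sketch, not the attached FIFOCache.java; the class name, the tiny capacity of 2, and the workload are chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bounded map: accessOrder=true gives LRU eviction,
// accessOrder=false gives FIFO eviction.
final class BoundedMapSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    BoundedMapSketch(int maxSize, boolean accessOrder) {
        super(16, 0.75f, accessOrder);
        this.maxSize = maxSize;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // drops the entry at the head of the internal linked list.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    // Inserts a and b, reads a, then inserts c into a map of capacity 2.
    static List<String> keysAfterWorkload(boolean accessOrder) {
        BoundedMapSketch<String, Integer> cache = new BoundedMapSketch<>(2, accessOrder);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a"); // moves "a" to the tail only in access-order mode
        cache.put("c", 3);
        return new ArrayList<>(cache.keySet());
    }

    public static void main(String[] args) {
        System.out.println("LRU  keeps " + keysAfterWorkload(true));
        System.out.println("FIFO keeps " + keysAfterWorkload(false));
    }
}
```

In access-order mode even `get()` relinks the entry, so the LinkedHashMap documentation treats it as a structural modification; that is why reads on an LRU built this way need the same lock as writes, while the insertion-order variant leaves the list untouched on reads. Note that this sketch itself is not thread-safe in either mode.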
[jira] Commented: (SOLR-667) Alternate LRUCache implementation
[ https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618750#action_12618750 ]

Fuad Efendi commented on SOLR-667:
----------------------------------
bq. ...safety, where nothing bad ever happens to an object.
When _SOLR_ adds an object to the cache or removes it from the cache, it does not change the object; it manipulates internal arrays of pointers to objects (which are probably atomic, but I don't know such JVM GC internals in-depth...)

Looks heavy with TreeSet...

Alternate LRUCache implementation
---------------------------------

Key: SOLR-667
URL: https://issues.apache.org/jira/browse/SOLR-667
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.3
Reporter: Noble Paul
Attachments: ConcurrentLRUCache.java

The only available SolrCache, i.e. LRUCache, is based on _LinkedHashMap_, which has _get()_ also synchronized. This can cause severe bottlenecks for faceted search. Any alternate implementation which can be faster/better must be considered.
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618766#action_12618766 ]

Fuad Efendi commented on SOLR-665:
----------------------------------
I don't think ConcurrentHashMap will improve performance, and ConcurrentMap is not what SOLR needs:
{code}
V putIfAbsent(K key, V value);
V replace(K key, V value);
boolean replace(K key, V oldValue, V newValue);
{code}
There is also some(...) overhead with _oldValue_ and _the state of the hash table at some point_; additional memory requirements; etc... Can we design something plain-simpler, focused on SOLR-specific requirements? Without all the functionality of Map etc...
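For readers unfamiliar with the three ConcurrentMap methods listed in the comment, here is a short sketch of their documented semantics using ConcurrentHashMap. The class name and the keys and values are hypothetical; the example says nothing about performance, only about what each call returns.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class ConcurrentMapOpsSketch {
    // Runs the conditional operations once and reports what each returned.
    static String demo() {
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
        String first = map.putIfAbsent("q", "v1");      // no mapping yet: stores v1, returns null
        String second = map.putIfAbsent("q", "v2");     // mapping exists: returns v1, map unchanged
        boolean swapped = map.replace("q", "v1", "v3"); // current value is v1: replaced
        boolean stale = map.replace("q", "v1", "v4");   // current value is v3: not replaced
        return first + "," + second + "," + swapped + "," + stale + "," + map.get("q");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each of these calls is an atomic check-then-act on a single key, which is the kind of guarantee the comment argues a simple cache may not need.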
[jira] Commented: (SOLR-667) Alternate LRUCache implementation
[ https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618805#action_12618805 ]

Fuad Efendi commented on SOLR-667:
----------------------------------
Paul, I have never ever suggested to use 'volatile' 'to avoid synchronization' for concurrent programming. I only noticed some extremely stupid code where SOLR uses _double_ synchronization and AtomicLong inside:
{code}
public synchronized Object put(Object key, Object value) {
  if (state == State.LIVE) {
    stats.inserts.incrementAndGet();
  }
  synchronized (map) {
    // increment local inserts regardless of state???
    // it does make it more consistent with the current size...
    inserts++;
    return map.put(key,value);
  }
}
{code}
Each tool has an area of applicability, and even ConcurrentHashMap just slightly intersects with SOLR needs; SOLR does not need a 'consistent view at a point in time' on cached objects. 'volatile' is part of the Java specs, and is implemented differently by different vendors. I use volatile (instead of the more expensive AtomicLong) only and only to prevent the JVM HotSpot optimizer from doing some _not-applicable_ stuff...
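To make the point about double synchronization concrete, here is a sketch of a put path that takes one lock and keeps the counter as a plain long guarded by that lock. It is an illustration only, not Solr's LRUCache and not any of the attached patches; the class name is hypothetical, and the State.LIVE check and shared stats object from the quoted snippet are deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SingleLockCacheSketch {
    private final Map<Object, Object> map = new LinkedHashMap<>();
    private long inserts; // guarded by the monitor of 'map'

    // One lock protects both the map and the counter, so no second
    // monitor and no atomic variable are needed on this path.
    Object put(Object key, Object value) {
        synchronized (map) {
            inserts++;
            return map.put(key, value);
        }
    }

    Object get(Object key) {
        synchronized (map) {
            return map.get(key);
        }
    }

    // Reads the counter under the same lock so the value is visible and exact.
    long inserts() {
        synchronized (map) {
            return inserts;
        }
    }
}
```

Whether the counter may instead be read without the lock (accepting a possibly stale value) is the design question raised in the comment; this sketch takes the conservative option.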
[jira] Commented: (SOLR-667) Alternate LRUCache implementation
[ https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618824#action_12618824 ]

Fuad Efendi commented on SOLR-667:
----------------------------------
Thanks Yonik, I even guess that in some cases synchronization is faster than sun.misc.Unsafe.compareAndSwapLong(this, valueOffset, expect, update);
{code}
public final long incrementAndGet() {
  for (;;) {
    long current = get();
    long next = current + 1;
    if (compareAndSet(current, next))
      return next;
  }
}
{code}
- an extreme level of safety with some level of concurrency... Do we need an exact value for 'stats.inserts' (if it is not synchronized)? It can be a 'long' inside the synchronized block...
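The retry loop quoted in the comment can be exercised through the public AtomicLong API. The sketch below (hypothetical class name, arbitrary thread counts) re-implements the same compare-and-set loop on top of `AtomicLong.compareAndSet` and checks that no increment is lost under contention; it does not measure whether this is faster or slower than a synchronized block.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CasCounterSketch {
    private final AtomicLong value = new AtomicLong();

    // Same shape as the quoted loop: read, compute, retry if another
    // thread changed the value in between.
    long incrementAndGet() {
        for (;;) {
            long current = value.get();
            long next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Starts 'threads' workers that each increment 'perThread' times
    // and returns the final counter value.
    static long countWithThreads(int threads, int perThread) {
        CasCounterSketch counter = new CasCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementAndGet();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.value.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10000));
    }
}
```

A plain `long` incremented without any lock would not give this guarantee, which is the trade-off the comment asks about for a statistics counter.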
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618339#action_12618339 ]

Fuad Efendi commented on SOLR-665:
----------------------------------
Nobble, thanks for feedback! Of course my code is buggy but I only wanted _to illustrate_ simplest idea; I am extremely busy with other stuff (Liferay) and can't focus on SOLR improvements... maybe during weekend.

bq. ...will always evaluate to false. And the reference will always have one value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last)..
I mean (and code too) the same; probably wrong wording

bq. And the static volatile counter is not threadsafe.
Do we _really-really_ need thread safety here? By using 'volatile' I only prevent _some_ JVMs from trying to optimize some code (and cause problems with per-instance variables which never change).

bq. There is no need to use a WeakReference anywhere
Agree...

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains. No other shortcut.
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc.; I am trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!

Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, ConcurrentLRUWeakCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618339#action_12618339 ]

funtick edited comment on SOLR-665 at 7/30/08 7:06 AM:
-------------------------------------------------------
Noble, thanks for feedback! Of course my code is buggy but I only wanted _to illustrate_ simplest idea; I am extremely busy with other stuff (Liferay) and can't focus on SOLR improvements... maybe during weekend.

bq. ...will always evaluate to false. And the reference will always have one value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last)..
I mean (and code too) the same; probably wrong wording

bq. And the static volatile counter is not threadsafe.
Do we _really-really_ need thread safety here? By using 'volatile' I only prevent _some_ JVMs from trying to optimize some code (and cause problems with per-instance variables which never change).

bq. There is no need to use a WeakReference anywhere
Agree...

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains. No other shortcut.
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc.; I am trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!

was (Author: funtick):
Nobble, thanks for feedback! Of course my code is buggy but I only wanted _to illustrate_ simplest idea; I am extremely busy with other stuff (Liferay) and can't focus on SOLR improvements... maybe during weekend.

bq. ...will always evaluate to false. And the reference will always have one value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last)..
I mean (and code too) the same; probably wrong wording

bq. And the static volatile counter is not threadsafe.
Do we _really-really_ need thread safety here? By using 'volatile' I only prevent _some_ JVMs from trying to optimize some code (and cause problems with per-instance variables which never change).

bq. There is no need to use a WeakReference anywhere
Agree...

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains. No other shortcut.
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc.; I am trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618339#action_12618339 ]

funtick edited comment on SOLR-665 at 7/30/08 7:08 AM:
-------------------------------------------------------
Noble, thanks for feedback! Of course my code is buggy but I only wanted _to illustrate_ simplest idea; I am extremely busy with other stuff (Liferay) and can't focus on SOLR improvements... maybe during weekend.

bq. ...will always evaluate to false. And the reference will always have one value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last)..
I mean (and code too) the same; probably wrong wording

bq. And the static volatile counter is not threadsafe.
Do we _really-really_ need thread safety here? By using 'volatile' I only prevent _some_ JVMs from trying to optimize some code (and cause problems).

bq. There is no need to use a WeakReference anywhere
Agree...

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains. No other shortcut.
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc.; I am trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!

was (Author: funtick):
Noble, thanks for feedback! Of course my code is buggy but I only wanted _to illustrate_ simplest idea; I am extremely busy with other stuff (Liferay) and can't focus on SOLR improvements... maybe during weekend.

bq. ...will always evaluate to false. And the reference will always have one value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last)..
I mean (and code too) the same; probably wrong wording

bq. And the static volatile counter is not threadsafe.
Do we _really-really_ need thread safety here? By using 'volatile' I only prevent _some_ JVMs from trying to optimize some code (and cause problems with per-instance variables which never change).

bq. There is no need to use a WeakReference anywhere
Agree...

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains. No other shortcut.
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc.; I am trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!
[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fuad Efendi updated SOLR-665: Attachment: SimplestLRUCache.java
[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fuad Efendi updated SOLR-665: Attachment: (was: ConcurrentLRUWeakCache.java)
[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fuad Efendi updated SOLR-665: Attachment: (was: SimplestLRUCache.java)
[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fuad Efendi updated SOLR-665: Attachment: SimplestConcurrentLRUCache.java
[jira] Created: (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)
SOLR currently does not support caching for (Query, FacetFieldList)
---
Key: SOLR-669
URL: https://issues.apache.org/jira/browse/SOLR-669
Project: Solr
Issue Type: Improvement
Affects Versions: 1.3
Reporter: Fuad Efendi

This is a huge performance bottleneck, and it explains the large difference between QTime and SolrJ's elapsedTime. I quickly browsed SolrIndexSearcher: it caches only (Key, DocSet/DocList of Lucene ids) key-value pairs and has no cache for (Query, FacetFieldList). filterCache stores a DocSet for each 'filter', and facet counts are recalculated from it on every request... Caching them would be a significant performance improvement.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)
[ https://issues.apache.org/jira/browse/SOLR-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fuad Efendi updated SOLR-669: Remaining Estimate: 1680h (was: 0.03h) Original Estimate: 1680h (was: 0.03h)
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
= (Entry<K,V>)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}

bq. Consider the following case: thread A performs a synchronized put, thread B performs an unsynchronized get on the same key. B gets scheduled before A completes, the returned value will be undefined.

The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map structurally!

Who cares? We are not iterating the map!

bq. That said, if you can show conclusively (e.g. with a profiler) that the synchronized access is indeed the bottleneck and incurs a heavy penalty on performance, then I'm all for investigating this further.

*What?!!* Anyway, I believe a simplified ConcurrentLRU backed by ConcurrentHashMap is easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a bug.

No, it is not a bug. It is effectively a checkpoint, like a timer: one timer for all instances. We could use System.currentTimeMillis() instead, but a static volatile long is faster.

About the specific use case: yes... if someone has 0.5 seconds response time for faceted queries I am very happy... I had 15 seconds before going with FIFO.
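The idea of an LRU cache backed by ConcurrentHashMap with a shared counter acting as a logical timer could look roughly like the sketch below. It is not the attached SimplestConcurrentLRUCache.java, whose contents are not shown in this thread. The class name StampedLruCache, the per-instance AtomicLong clock (the comment above describes a static volatile long instead), and the linear-scan eviction are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: reads never lock, eviction scans for the oldest stamp.
class StampedLruCache<K, V> {
    private static final class Entry<V> {
        final V value;
        volatile long stamp; // logical time of the last access
        Entry(V value, long stamp) { this.value = value; this.stamp = stamp; }
    }

    private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<K, Entry<V>>();
    private final AtomicLong clock = new AtomicLong(); // assumed stand-in for the shared counter
    private final int maxSize;

    StampedLruCache(int maxSize) { this.maxSize = maxSize; }

    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        e.stamp = clock.incrementAndGet(); // record the access without reordering anything
        return e.value;
    }

    void put(K key, V value) {
        map.put(key, new Entry<V>(value, clock.incrementAndGet()));
        if (map.size() > maxSize) evictOldest();
    }

    // Only eviction is serialized; it is O(n) per removed entry in this sketch.
    private synchronized void evictOldest() {
        while (map.size() > maxSize) {
            K oldestKey = null;
            long oldest = Long.MAX_VALUE;
            for (Map.Entry<K, Entry<V>> me : map.entrySet()) {
                long s = me.getValue().stamp;
                if (s < oldest) { oldest = s; oldestKey = me.getKey(); }
            }
            if (oldestKey == null) return;
            map.remove(oldestKey);
        }
    }

    int size() { return map.size(); }
}
```

The design trades exact LRU order for cheap reads: a get() only writes one volatile field, and the cost of finding the least recently used entry is paid on insertion. Whether that trade is a net win depends on the read/write mix, which this sketch does not measure.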
[jira] Commented: (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)
[ https://issues.apache.org/jira/browse/SOLR-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618574#action_12618574 ] Fuad Efendi commented on SOLR-669:

This piece of code in SimpleFacets:

{code}
if (sf.multiValued() || ft.isTokenized() || ft instanceof BoolField) {
  // Always use filters for booleans... we know the number of values is very small.
  counts = getFacetTermEnumCounts(searcher, docs, field, offset, limit, mincount,missing,sort,prefix);
} else {
  // TODO: future logic could use filters instead of the fieldcache if
  // the number of terms in the field is small enough.
  counts = getFieldCacheCounts(searcher, docs, field, offset,limit, mincount, missing, sort, prefix);
}
{code}

The else branch is an optimization for single-valued, non-tokenized fields: it uses the 'Lucene FieldCache to get counts for each unique field value in docs'. We should implement *additional* caching to support the term-enum path, which uses _the FilterCache to get the intersection_; FilterCache stores DocSet only and does not store a NamedList of field intersections:

{code}
/**
 * Returns a list of terms in the specified field along with the
 * corresponding count of documents in the set that match that constraint.
 * This method uses the FilterCache to get the intersection count between <code>docs</code>
 * and the DocSet for each term in the filter.
 *
 * @see FacetParams#FACET_LIMIT
 * @see FacetParams#FACET_ZEROS
 * @see FacetParams#FACET_MISSING
 */
public NamedList getFacetTermEnumCounts(SolrIndexSearcher searcher, DocSet docs, String field, int offset, int limit, int mincount, boolean missing, boolean sort, String prefix)
  throws IOException {
  ...
}
{code}
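As a rough illustration of what a (Query, FacetFieldList) cache could mean, the sketch below keys precomputed facet counts by the query string plus the set of facet fields. Everything here is hypothetical: FacetResultCache is not a Solr class, queries are represented as plain strings, and counts as nested maps rather than Solr's NamedList. A real implementation would also have to be discarded whenever a new searcher is opened.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; does not use or mirror any Solr API.
class FacetResultCache {
    // field name -> (field value -> document count)
    private final Map<List<Object>, Map<String, Map<String, Integer>>> cache =
            new ConcurrentHashMap<List<Object>, Map<String, Map<String, Integer>>>();

    // Key = query plus facet fields in sorted order, so field order does not matter.
    static List<Object> key(String query, Collection<String> facetFields) {
        List<String> sorted = new ArrayList<String>(facetFields);
        Collections.sort(sorted);
        return Arrays.<Object>asList(query, sorted);
    }

    Map<String, Map<String, Integer>> get(String query, Collection<String> facetFields) {
        return cache.get(key(query, facetFields));
    }

    void put(String query, Collection<String> facetFields,
             Map<String, Map<String, Integer>> counts) {
        cache.put(key(query, facetFields), counts);
    }
}
```

A lookup that hits this cache would skip the per-term intersection work entirely, which is the saving the issue is asking for; the cost is extra memory per distinct (query, fields) pair.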
[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fuad Efendi updated SOLR-665: Attachment: ConcurrentLRUWeakCache.java

Bug fix.
[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fuad Efendi updated SOLR-665: Attachment: ConcurrentLRUWeakCache.java

Another bug... and AtomicReference is generic... I had never used it before. It could be even 'weaker' if we use 'hashcode', which is long (and atomic), instead of Key, which is an object (and unsafe), provided the distribution of the hashcode is OK...
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617441#action_12617441 ] Fuad Efendi commented on SOLR-665:

Regarding thread safety:
- Yes, we need a synchronized get() method for the LRU cache, because each access _reorders_ the LinkedHashMap.
- There is absolutely no need to synchronize the get() method for FIFO!
- We probably need to deal with insertion, but not by synchronizing it: instead, extend LinkedHashMap and make some 'removal' synchronized... With caches large enough to store all objects we do not need it. We probably do not need to synchronize 'removal' at all - it removes the entry but does not remove/finalize the referenced object.

From JavaDoc: Note that this implementation is not synchronized. If multiple threads access a linked hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.

However, we do not modify the cache structurally during an iteration loop or any other 'structure' access (we do not use Iterator!), so the advice from the JavaDoc is not applicable. We should synchronize only removeEntryForKey of HashMap; unfortunately we can't do that without rewriting HashMap. Probably we can use ConcurrentHashMap as a base class of LinkedHashMap, but I don't know the answer yet... I am guessing that unsynchronized entry removal won't be a significant issue in a multithreaded environment.
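The claim that an LRU get() reorders the map, while a FIFO get() does not, can be observed in a single thread through LinkedHashMap's fail-fast iterator. The sketch below is only an illustration of that one point; the class name AccessOrderDemo is hypothetical, and the result says nothing about whether unsynchronized reads are safe while another thread inserts or removes entries.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; single-threaded on purpose.
class AccessOrderDemo {
    // Returns true if calling get() during iteration trips the fail-fast iterator,
    // i.e. if get() counted as a structural modification.
    static boolean getBreaksIteration(boolean accessOrder) {
        Map<String, Integer> m = new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        try {
            for (String key : m.keySet()) {
                m.get("a"); // in access-order mode this moves "a" to the end of the list
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("insertion order (FIFO): " + getBreaksIteration(false));
        System.out.println("access order (LRU):     " + getBreaksIteration(true));
    }
}
```

In access-order mode the read rewrites the internal linked list, so the iterator detects a modification; in insertion-order mode the same read leaves the structure untouched.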
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617443#action_12617443 ] Fuad Efendi commented on SOLR-665:

- Classes from java.util.concurrent.atomic are designed NOT to need synchronization; the per-instance stats should use AtomicLong instead of {{long}}: {{ private long lookups; private long hits; private long inserts; private long evictions;}}
- The get() method of FIFO does not need any synchronization.
- The get() method of LRU reorders the LinkedHashMap; unsynchronized access may cause orphan Entry objects.
- Synchronized insertion for FIFO won't cause performance degradation, because get() is unsynchronized.
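Replacing the four plain long counters with AtomicLong could look like the following. This is a sketch under the assumption that the counters are only ever incremented and read; the class name CacheStats and the hammer() helper are hypothetical and exist only to show that concurrent increments are not lost without any synchronized block.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stats holder; field names follow the four counters listed above.
class CacheStats {
    private final AtomicLong lookups = new AtomicLong();
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong inserts = new AtomicLong();
    private final AtomicLong evictions = new AtomicLong();

    void recordLookup(boolean hit) {
        lookups.incrementAndGet();
        if (hit) hits.incrementAndGet();
    }
    void recordInsert() { inserts.incrementAndGet(); }
    void recordEviction() { evictions.incrementAndGet(); }

    long lookups() { return lookups.get(); }
    long hits() { return hits.get(); }

    // Updates one shared instance from several threads without taking a lock.
    static CacheStats hammer(int threads, final int perThread) {
        final CacheStats stats = new CacheStats();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < perThread; n++) {
                        stats.recordLookup(n % 2 == 0); // every second lookup is a hit
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stats;
    }
}
```

With plain long fields and no lock, the same loop could lose increments; with AtomicLong the totals are exact, although lookups and hits are still updated independently, so a reader may briefly see them out of step with each other.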
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617445#action_12617445 ]
Fuad Efendi commented on SOLR-665:

bq. bq. absolutely no need to synchronize get() method for FIFO!
bq. A cache is not a read-only object. Gets need to be synchronized because other threads can be changing the cache.

I am familiar with Doug Lea's findings (he wrote his book in 1996, and it became part of java.util.concurrent after 10(!!!) years).

bq. changing the cache

What is a 'cache change'? Changing the stored key or changing the referenced object never happens in SOLR. Removal of an object does happen; more correctly, removal of a key. get(MyKey) will return either null or MyValue, and ConcurrentModificationException will never be thrown in SOLR (we are not using an Iterator!). We can concurrently insert the same (key, value) pairs; that is not a problem.
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617454#action_12617454 ]
Fuad Efendi commented on SOLR-665:

bq. If multiple threads access this map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)

I already commented on this. Just try to avoid 'books' and 'authors'; try to find the meaning in the JavaDocs and browse the Java source instead...
- _structural_ _modification_, the related _ConcurrentModificationException_, and the related _Iterator_: not applicable to the SOLR cache. Will never happen.

We need to synchronize inserts because each insert may calculate the size and remove the 'eldest' entry, and we need to avoid OOMs. We need to synchronize retrieves for LRU because a 'Linked' HashMap with access order changes its links (it reorders the map) on each retrieve. And we do not need to synchronize retrieves from an insertion-ordered LinkedHashMap (FIFO). It is classic... I'd like to research java.util.concurrent more.
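The insert-side behaviour described in this comment (each insert may check the size and drop the eldest entry) can be sketched with LinkedHashMap.removeEldestEntry. This is an illustrative sketch only: FifoCacheSketch and its methods are hypothetical names, not the attached FIFOCache.java. The sketch keeps get() under the lock; whether that lock can safely be dropped for FIFO is the point under debate in this thread, and the sketch does not settle it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class FifoCacheSketch<K, V> {
    private final AtomicLong evictions = new AtomicLong();
    private final Map<K, V> map;

    FifoCacheSketch(final int maxSize) {
        // accessOrder = false: the eldest entry is always the first one inserted.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > maxSize) {
                    evictions.incrementAndGet();
                    return true; // LinkedHashMap removes the eldest entry for us
                }
                return false;
            }
        };
    }

    // Inserts are serialized so the size check and the eviction happen as one step.
    synchronized V put(K key, V value) {
        return map.put(key, value);
    }

    // Kept synchronized here; the thread argues this lock is unnecessary for FIFO.
    synchronized V get(K key) {
        return map.get(key);
    }

    long evictions() {
        return evictions.get();
    }
}
```

Because the map is insertion-ordered, reading "a" before a third insert does not save it from eviction, which is exactly what distinguishes FIFO from LRU.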
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617466#action_12617466 ]
Fuad Efendi commented on SOLR-665:

Why? We have a PoC at least, and we know the bottleneck! We need an improvement: avoid _some_ synchronization; it is extremely easy with FIFO. We may try to improve LRU too. Not everything in Java is extremely good: synchronization, for instance. Even for a single-threaded application it costs an additional 600 CPU cycles (I read this somewhere for SUN Java 1.3 on Windows).

Yonik, please allow some time to think / to discuss. I'll also try to provide a 'concurrent' LRU, but this issue is FIFO related.
[jira] Reopened: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fuad Efendi reopened SOLR-665:

The get() method does not need to be synchronized for a FIFO cache. Unsynchronized object retrieval is not a structural modification of an _Insertion-Ordered_ LinkedHashMap. An unsynchronized cache improves the performance of multithreaded applications linearly with the number of CPUs.
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617481#action_12617481 ]
Fuad Efendi commented on SOLR-665:

bq. ...and at least one of the threads modifies the map structurally, it _must_ be synchronized externally.

- so the thread running _insert_ must be synchronized, and _not_ the thread running _retrieve_. Again, we need to synchronize _insert_ only to avoid memory leaks, nothing more. SOLR does not iterate over the Map.
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617482#action_12617482 ]
Fuad Efendi commented on SOLR-665:

ConcurrentModificationException is thrown only when we iterate over a Map and another thread has modified its structure; with LRU each get() modifies the structure, with FIFO only inserts do...
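The statement that get() is a structural modification only for the access-ordered map can be checked even on a single thread, since the fail-fast iterator notices any modification made during iteration. A minimal sketch follows; IterationDemo and its method are made-up names, and the sketch says nothing about what happens across threads.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

class IterationDemo {
    // Returns true if calling get() while iterating the key set
    // triggers ConcurrentModificationException.
    static boolean getDuringIterationThrows(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.get(key); // relinks the entry when accessOrder is true
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("access order:    " + getDuringIterationThrows(true));
        System.out.println("insertion order: " + getDuringIterationThrows(false));
    }
}
```

The access-ordered map throws on the second step of the loop, while the insertion-ordered map iterates to the end.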
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617520#action_12617520 ]
Fuad Efendi commented on SOLR-665:

Shalin, we are already using _AtomicLong of Java 5_; the JVM is a different story... JRockit R27 is a JVM from BEA-Oracle, and it is JDK 6 powered (rt.jar comes from SUN). I just tried to compare synchronized with unsynchronized and found that synchronization _is_ the main problem for faceting... Another problem: somehow faceting recalculates each time (using filterCache during repeated _recalculations_); queryCache is not enough...
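For the statistics counters mentioned in this thread (lookups, hits, and so on), AtomicLong gives exact totals without a synchronized block. A minimal sketch, assuming a hypothetical CacheStatsSketch class that is not part of Solr:

```java
import java.util.concurrent.atomic.AtomicLong;

class CacheStatsSketch {
    private final AtomicLong lookups = new AtomicLong();
    private final AtomicLong hits = new AtomicLong();

    // Called once per cache lookup; safe to call from many threads without locking.
    void record(boolean hit) {
        lookups.incrementAndGet();
        if (hit) {
            hits.incrementAndGet();
        }
    }

    long lookups() { return lookups.get(); }
    long hits() { return hits.get(); }

    // Runs several threads against one stats object and returns {lookups, hits}.
    static long[] hammer(int threads, final int perThread) {
        final CacheStatsSketch stats = new CacheStatsSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        stats.record(i % 2 == 0); // every second lookup counts as a hit
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { stats.lookups(), stats.hits() };
    }
}
```

With plain long fields and no lock the same loop could lose increments; the atomic counters do not.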
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617523#action_12617523 ]
Fuad Efendi commented on SOLR-665:

We need to invite Doug Lea to this discussion...
http://en.wikipedia.org/wiki/Doug_Lea
http://gee.cs.oswego.edu/dl/index.html

We may simply use java.util.concurrent.locks instead of the heavy synchronized... we may also use the Executor framework instead of single-threaded faceting... We may even base SOLR on the Apache MINA project.
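One way the java.util.concurrent.locks idea could look for a FIFO cache is a read-write lock: readers share the lock, and only inserts take it exclusively. This is a sketch under the assumption that the map is insertion-ordered, so get() does not modify it; ReadWriteFifoCache is a hypothetical name, and no claim is made here that it is faster than synchronized under Solr's workload.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteFifoCache<K, V> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<K, V> map;

    ReadWriteFifoCache(final int maxSize) {
        // Insertion order (accessOrder = false), so reads never relink entries.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Many readers may hold the read lock at the same time.
    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Inserts (and the eviction they may trigger) are exclusive.
    V put(K key, V value) {
        lock.writeLock().lock();
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The same split would not be valid for an access-ordered (LRU) map, because there every get() writes to the linked list.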
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617529#action_12617529 ]
Fuad Efendi commented on SOLR-665:

The concerns are probably due to a misunderstanding of some _contract_...
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}
- in the worst case we will have a pointer to the _old_ table, and even with a _new_ one of smaller size we won't get _any_ ArrayIndexOutOfBounds. There is no _contract_ requiring synchronization on get() of HashMap or LinkedHashMap; it IS application specific.
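The bounds argument in this comment rests on the table length being a power of two, so that the bucket index is obtained by masking the hash. A minimal sketch of that arithmetic only: IndexForSketch is a made-up class, indexFor mirrors the masking used by the HashMap source quoted in the comment, and the sketch says nothing about what one thread can observe while another is resizing.

```java
class IndexForSketch {
    // Masks the hash down to a bucket index; valid only when length is a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Checks a handful of hashes, including negative ones, against one table length.
    static boolean alwaysInBounds(int length) {
        int[] samples = {
            0, 1, -1, 17, Integer.MAX_VALUE, Integer.MIN_VALUE, "filterCache".hashCode()
        };
        for (int h : samples) {
            int i = indexFor(h, length);
            if (i < 0 || i >= length) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(17, 16)); // 1
        System.out.println(indexFor(-1, 16)); // 15
        System.out.println(alwaysInBounds(1024));
    }
}
```

An index computed against a given length is always below that length, which is why an index computed for a smaller table also fits a larger one.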
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617529#action_12617529 ]
funtick edited comment on SOLR-665 at 7/28/08 12:51 PM:

The concerns are probably due to a misunderstanding of some _contract_...
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}
- in the worst case we will have a pointer to the _old_ table, and even with a _new_ one of smaller size we won't get _any_ ArrayIndexOutOfBounds. There is no _contract_ requiring synchronization on get() of HashMap or LinkedHashMap; it IS application specific.
- we will never have _wrong_ results because Entry is immutable
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617529#action_12617529 ]
funtick edited comment on SOLR-665 at 7/28/08 1:04 PM:

The concerns are probably due to a misunderstanding of some _contract_...
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}
- in the worst case we will have a pointer to the _old_ table, and even with a _new_ one of smaller size we won't get _any_ ArrayIndexOutOfBounds.
- There is no _contract_ requiring synchronization on get() of HashMap or LinkedHashMap; it IS application specific.
- we will never have _wrong_ results because Entry is immutable
{code}
/**
 * Transfer all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
{code}
- We won't even have a NullPointerException after src[j] = null.

P.S. Of course, I agree - these are Java internals, not the public Map interface _contract_ - should we avoid using the implementation then, and basing the decision on the specific implementation from SUN? I believe it is specified somewhere in a JSR too...
{code}
 * @author Doug Lea
 * @author Josh Bloch
 * @author Arthur van Hoff
 * @author Neal Gafter
 * @version 1.65, 03/03/05
{code}
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
in JSR too... {code} * @author Doug Lea * @author Josh Bloch * @author Arthur van Hoff * @author Neal Gafter * @version 1.65, 03/03/05 {code}
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617562#action_12617562 ]

Fuad Efendi commented on SOLR-665:
--
{quote}Simply replacing synchronized with java.util.concurrent.locks doesn't increase performance. There needs to be a specific strategy for employing these locks in a way that makes sense.{quote}

I absolutely agree... ConcurrentHashMap offers only a certain acceptable level of safety, and only for specific tasks:

bq. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.

We can try to design specific caches that directly implement the Map or ConcurrentMap interfaces. We should define 'safety' levels (for instance, _null_ is not a problem if the cache already contains this object, added concurrently by another thread; cache elements are explicitly immutable objects; etc.)

FIFO looks simplest, and it _does_ work indeed; for LRU we need reordering on each get(), _OR_ we can weaken it by using weak (approximate) reordering somehow...
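The quoted Javadoc sentence about iterators can be seen in a minimal single-threaded sketch. The class name IteratorSafetyDemo and its helper are hypothetical; the behaviour relied on is the documented one: HashMap iterators are fail-fast, ConcurrentHashMap iterators are weakly consistent and never throw ConcurrentModificationException.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, for illustration only.
class IteratorSafetyDemo {

    /** Adds one new key while iterating and reports whether the iterator failed fast. */
    static boolean throwsOnModification(Map<Integer, Integer> map) {
        for (int i = 0; i < 10; i++) {
            map.put(i, i);
        }
        try {
            for (Integer key : map.keySet()) {
                map.put(-1, key); // the first call adds a new key: a structural modification
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails fast:           "
                + throwsOnModification(new HashMap<Integer, Integer>()));
        System.out.println("ConcurrentHashMap fails fast: "
                + throwsOnModification(new ConcurrentHashMap<Integer, Integer>()));
    }
}
```

This only shows the iterator contract. It says nothing about whether unsynchronized get() on a plain HashMap is safe, which is the point disputed in this thread.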
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
{
    Entry<K,V> next = e.next;
    int i = indexFor(e.hash, newCapacity);
    e.next = newTable[i];
    newTable[i] = e;
    e = next;
} while (e != null);
}
}
}
{code}
- We won't even get a NullPointerException after src[j] = null.

P.S. Of course, I agree: these are Java internals, not the public Map interface _contract_. Should we then avoid relying on the implementation? I believe it is specified somewhere in the JSR too...
{code}
 * @author Doug Lea
 * @author Josh Bloch
 * @author Arthur van Hoff
 * @author Neal Gafter
 * @version 1.65, 03/03/05
{code}
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
;
for (int j = 0; j < src.length; j++) {
    Entry<K,V> e = src[j];
    if (e != null) {
        src[j] = null;
        do {
            Entry<K,V> next = e.next;
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        } while (e != null);
    }
}
}
{code}
- We won't even get a NullPointerException after src[j] = null.

P.S. Of course, I agree: these are Java internals, not the public Map interface _contract_. Should we then avoid relying on the implementation? I believe it is specified somewhere in the JSR too...
{code}
 * @author Doug Lea
 * @author Josh Bloch
 * @author Arthur van Hoff
 * @author Neal Gafter
 * @version 1.65, 03/03/05
{code}

P.P.S. Do not forget to look at the top of this discussion:
{code}
description: xxx Cache(maxSize=1000, initialSize=1000)
size : 2668705
cumulative_inserts : 4246246
{code}
- _cumulative_inserts_ is almost double the _size_, which shows that double inserts are real
- I checked catalina_out: no NullPointerException, ArrayIndexOutOfBoundsException, etc.
- I don't think we should worry _too much_ about a possible change of the Map implementation by SUN :P... by that logic we should use neither java.lang.String nor java.util.Date (some are placed in the wrong packages).
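The double inserts mentioned in the P.P.S. (several threads computing and storing the same key) can be bounded without locking the whole map. The following is a minimal sketch, assuming a ConcurrentHashMap-backed cache. The class SingleInsertCache, its expensiveLoad stand-in and the insert counter are hypothetical and are not Solr code; putIfAbsent is the standard ConcurrentMap method.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not Solr code.
class SingleInsertCache {
    private final ConcurrentMap<String, String> map = new ConcurrentHashMap<String, String>();
    private final AtomicLong inserts = new AtomicLong();

    /** Returns the cached value, loading it on a miss; at most one value is stored per key. */
    String getOrLoad(String key) {
        String cached = map.get(key);
        if (cached != null) {
            return cached;
        }
        String computed = expensiveLoad(key);            // may still run in several threads at once
        String winner = map.putIfAbsent(key, computed);  // only the first caller stores its value
        if (winner == null) {
            inserts.incrementAndGet();
            return computed;
        }
        return winner;
    }

    /** Stand-in for an expensive computation (an assumption for the demo). */
    static String expensiveLoad(String key) {
        return key.toUpperCase();
    }

    long inserts() {
        return inserts.get();
    }

    /** Runs several threads over the same keys and returns the number of stored inserts. */
    static long insertsAfterConcurrentLoad(int threads, final int keys) {
        final SingleInsertCache cache = new SingleInsertCache();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < keys; i++) {
                        cache.getOrLoad("key" + i);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return cache.inserts();
    }

    public static void main(String[] args) {
        System.out.println("inserts: " + insertsAfterConcurrentLoad(8, 50));
    }
}
```

The computation itself can still be duplicated under contention; only the stored insert is guaranteed to happen once per key.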
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617639#action_12617639 ]

Fuad Efendi commented on SOLR-665:
--
This is an extremely simple concurrent LRU; I spent an hour creating it. It is based on ConcurrentHashMap. I don't use java.util.concurrent.locks, and I am trying to focus on the _requirements only_, avoiding implementing unnecessary methods of the Map interface (so I am not following the _contract_ ;) very sorry!)
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentLRU<K, V> {

    protected ConcurrentHashMap<K, ValueWrapper<V>> map;
    protected int maxEntries;

    public ConcurrentLRU(int maxEntries) {
        map = new ConcurrentHashMap<K, ValueWrapper<V>>();
        this.maxEntries = maxEntries;
    }

    public V put(K key, V value) {
        ValueWrapper<V> wrapper = map.put(key, new ValueWrapper<V>(value));
        checkRemove();
        return value;
    }

    void checkRemove() {
        if (map.size() <= maxEntries) return;
        Map.Entry<K, ValueWrapper<V>> eldestEntry = null;
        long eldestAge = Long.MAX_VALUE;
        for (Map.Entry<K, ValueWrapper<V>> entry : map.entrySet()) {
            long popularity = entry.getValue().popularity;
            if (eldestEntry == null || eldestEntry.getValue().popularity > popularity) {
                eldestEntry = entry;
            }
        }
        map.remove(eldestEntry.getKey(), eldestEntry.getValue());
    }

    public V get(Object key) {
        ValueWrapper<V> wrapper = map.get(key);
        return wrapper == null ? null : wrapper.getValue();
    }

    public final static class ValueWrapper<V> {
        static volatile long popularityCounter;
        volatile long popularity;
        V value;

        ValueWrapper(V value) {
            this.value = value;
            popularity = popularityCounter++;
        }

        public boolean equals(Object o) {
            if (!(o instanceof ValueWrapper)) {
                return false;
            }
            return (value == null ? ((ValueWrapper) o).value == null : value.equals(((ValueWrapper) o).value));
        }

        public int hashCode() {
            return value.hashCode();
        }

        public V getValue() {
            popularity = popularityCounter++;
            return value;
        }
    }
}
{code}
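One detail of the posted class is worth a sketch: popularityCounter++ on a volatile long is a read followed by a write, not an atomic increment, so concurrent callers can obtain the same stamp. Below is a minimal variant of the same idea that takes its stamps from an AtomicLong. The class name StampedLruSketch and the per-instance clock are choices made for this illustration and are not the attached code; the linear eviction scan is kept deliberately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical variant of the posted ConcurrentLRU, for illustration only.
class StampedLruSketch<K, V> {

    /** Value plus the logical time of its last access. */
    private static final class Stamped<T> {
        final T value;
        volatile long lastAccess;

        Stamped(T value, long stamp) {
            this.value = value;
            this.lastAccess = stamp;
        }
    }

    private final ConcurrentHashMap<K, Stamped<V>> map = new ConcurrentHashMap<K, Stamped<V>>();
    private final AtomicLong clock = new AtomicLong(); // atomic, unlike ++ on a volatile long
    private final int maxEntries;

    StampedLruSketch(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    V get(K key) {
        Stamped<V> s = map.get(key);
        if (s == null) {
            return null;
        }
        s.lastAccess = clock.incrementAndGet(); // refresh the stamp without locking the map
        return s.value;
    }

    void put(K key, V value) {
        map.put(key, new Stamped<V>(value, clock.incrementAndGet()));
        evictIfNeeded();
    }

    /** Linear scan for the smallest stamp; approximate under concurrent access. */
    private void evictIfNeeded() {
        while (map.size() > maxEntries) {
            K oldestKey = null;
            long oldest = Long.MAX_VALUE;
            for (Map.Entry<K, Stamped<V>> e : map.entrySet()) {
                long stamp = e.getValue().lastAccess;
                if (stamp < oldest) {
                    oldest = stamp;
                    oldestKey = e.getKey();
                }
            }
            if (oldestKey == null) {
                return;
            }
            map.remove(oldestKey);
        }
    }

    boolean containsKey(K key) {
        return map.containsKey(key);
    }

    int size() {
        return map.size();
    }
}
```

A possible usage: with maxEntries set to 2, putting "a" and "b", reading "a", then putting "c" leaves "a" and "c" in the cache, because "b" carries the oldest stamp.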
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
) { if (!(o instanceof ValueWrapper)) { return false; } return (value == null ? ((ValueWrapper) o).value == null : value.equals(((ValueWrapper) o).value)); } public int hashCode() { return value.hashCode(); } public V getValue() { popularity = popularityCounter++; return value; } } } {code}
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
; popularity = popularityCounter++; } public boolean equals(Object o) { if (!(o instanceof ValueWrapper)) { return false; } return (value == null ? ((ValueWrapper) o).value == null : value.equals(((ValueWrapper) o).value)); } public int hashCode() { return value.hashCode(); } public V getValue() { popularity = popularityCounter++; return value; } } } {code}
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617655#action_12617655 ]

Fuad Efendi commented on SOLR-665:
--
bq. The eviction code looks like it would be relatively expensive

but the get() method of LinkedHashMap reorders the whole map!!! (Of course, the CPU load is evenly distributed across several get() calls, so we can't see it.) Other implementations even use Arrays.sort() or something similar. I don't see an easier solution than that... probably some random-access policy with a predictable range of popularity: we can evict anything 'old', not necessarily the 'eldest'...
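The idea of evicting anything 'old' rather than searching for the single 'eldest' entry could look roughly like the sketch below. It assumes each key carries a monotonically increasing access stamp, so that a cutoff such as (current counter minus some window) separates old from recent entries. The class WatermarkEviction and its method are hypothetical; removal through the iterator is supported by ConcurrentHashMap.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, for illustration only.
class WatermarkEviction {

    /**
     * Removes every entry whose access stamp is below the cutoff in a single pass
     * and returns how many entries were removed.
     */
    static <K> int evictOlderThan(ConcurrentHashMap<K, Long> stamps, long cutoff) {
        int removed = 0;
        for (Iterator<Map.Entry<K, Long>> it = stamps.entrySet().iterator(); it.hasNext();) {
            Map.Entry<K, Long> entry = it.next();
            if (entry.getValue() < cutoff) {
                it.remove(); // weakly consistent iterator: safe while other threads write
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> stamps = new ConcurrentHashMap<String, Long>();
        for (long i = 1; i <= 10; i++) {
            stamps.put("k" + i, i);
        }
        System.out.println("removed: " + evictOlderThan(stamps, 6L) + ", left: " + stamps.size());
    }
}
```

One pass of this kind frees many slots at once, so the scan cost is amortized over many later inserts instead of being paid on every put.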
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617657#action_12617657 ]

Fuad Efendi commented on SOLR-665:
--
Lars, I used FIFO because it makes an unsynchronized _get()_ extremely simple:
{code}
map = new LinkedHashMap(initialSize, 0.75f, true);  // LRU cache (and we need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, false); // FIFO (and we do not need synchronized get())
{code}
Yonik, I'll try to improve ConcurrentLRU and share my findings... of course FIFO is not what we need.
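For comparison, the fully synchronized access-order variant referred to above can be sketched in a few lines. This is a minimal illustration, not Solr's LRUCache: the class name SynchronizedLruCache and the stress helper are made up here, and every method simply takes the instance lock.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Solr's LRUCache.
class SynchronizedLruCache<K, V> {
    private final Map<K, V> map;

    SynchronizedLruCache(final int maxSize) {
        // accessOrder=true: every get() relinks the entry, so reads must be locked too
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    synchronized V get(K key) {
        return map.get(key);
    }

    synchronized V put(K key, V value) {
        return map.put(key, value);
    }

    synchronized int size() {
        return map.size();
    }

    /** Hammers one cache from several threads and returns its final size. */
    static int stress(int threads, final int keys, int maxSize) {
        final SynchronizedLruCache<Integer, Integer> cache =
                new SynchronizedLruCache<Integer, Integer>(maxSize);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < keys; i++) {
                        cache.put(i, i);
                        cache.get(i / 2);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("final size: " + stress(4, 1000, 100));
    }
}
```

The single lock keeps the map consistent under any interleaving, at the price of serializing all readers, which is the contention this issue set out to remove.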
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617657#action_12617657 ]

funtick edited comment on SOLR-665 at 7/28/08 7:36 PM:
---
Lars, I used FIFO because it makes an unsynchronized _get()_ extremely simple:
{code}
map = new LinkedHashMap(initialSize, 0.75f, true);  // LRU cache (and we need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, false); // FIFO (and we do not need synchronized get())
{code}
Yonik, I'll try to improve ConcurrentLRU and share my findings... of course FIFO is not what we need.

bq. No it doesn't... think linked-list. It moves a single item, which is pretty fast.

Yes, which is why I wrote 'evenly distributed between several get() so we can't see it': it keeps the list ordered, and we can't unsynchronize it, with all the consequences!!!
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617668#action_12617668 ]

Fuad Efendi commented on SOLR-665:
--
bq. No, it is not! Your analysis seems to ignore the java memory model (partially constructed objects and all that). I don't know how many different ways to say it please do yourself a favor and read up on the java memory model (and the book I previously referenced is great for this). This is hard stuff (at the lowest levels).

OK. Maybe I can get a reference to the wrong object type, or even to an object scheduled for finalization... But we are not inserting 'partially constructed objects' into the Map, are we?

Simplest scenario: thread A tries to read a variable (4 bytes of address in the JVM) pointing to object O. Another thread B concurrently assigns _null_ to that variable. Isn't that already solved at the CPU level? Or maybe, on a 64-bit system, thread B assigns zero to the first 2 bytes and then to the other 2 bytes? I need to study this book...

BTW, I am running the JVM with the '-server' option.
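On the question raised here: the Java Language Specification (section 17.7) states that reads and writes of references are always atomic, whether they are implemented as 32-bit or 64-bit values, so a reference cannot be half-written. What the memory model does not give an unsynchronized reader is visibility and ordering: without a happens-before edge, the reader may see a stale reference or the non-final fields of an object before their constructor writes. A minimal sketch of one safe way to publish a cached value, using final fields and a volatile reference, follows. The class and field names are hypothetical.

```java
// Hypothetical sketch, for illustration only.
class SafePublication {

    /** Immutable value: final fields are covered by the final-field guarantee of the JMM. */
    static final class CachedResult {
        final String query;
        final int hits;

        CachedResult(String query, int hits) {
            this.query = query;
            this.hits = hits;
        }
    }

    // volatile: a write to this field happens-before every later read of it
    private volatile CachedResult latest;

    void publish(String query, int hits) {
        latest = new CachedResult(query, hits);
    }

    CachedResult read() {
        return latest; // either null or a fully initialized CachedResult
    }

    /** Publishes from one thread and reads from another after join(). */
    static boolean demo() {
        final SafePublication holder = new SafePublication();
        Thread writer = new Thread(new Runnable() {
            public void run() {
                holder.publish("solr", 42);
            }
        });
        writer.start();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        CachedResult r = holder.read();
        return r != null && r.hits == 42 && "solr".equals(r.query);
    }

    public static void main(String[] args) {
        System.out.println("published safely: " + demo());
    }
}
```

A plain HashMap offers neither guarantee for its internal table and entry links, which is why its documentation requires external synchronization when one thread modifies it structurally.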
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617673#action_12617673 ]

Fuad Efendi commented on SOLR-665:
--
bq. The it references the map. It says explicitly that the map must be synchronized.

- I agree, thanks for pointing to it. Synchronize!

BTW, Joshua Bloch developed Arrays.sort(), and a bug was found after 9 years. Nothing is perfect.

ConcurrentLRU looks extremely simple and easy to improve. Should we check SUN's bug database before using ConcurrentHashMap? It has some related...
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654 ]

funtick edited comment on SOLR-665 at 7/28/08 8:36 PM:
---
bq. The Solr admin pages will not give you exact measurements.

Yes, and I do not need exact measurements! They give me averageTimePerRequest, which improved almost 10 times on the production server. Should I write JUnit tests and execute them in a single-threaded environment? It would be better to use The Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second.

But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at all?

Yes, I am using highlighting. You can see it at http://www.tokenizer.org

bq. I'm guessing that you're caching the results of all queries in memory such that no disk access is necessary.

{color:red}But this is another bug of SOLR!!! I am using extremely large caches, but SOLR still *recalculates* facet intersections.{color}

bq. Consider the following case: thread A performs a synchronized put, thread B performs an unsynchronized get on the same key. B gets scheduled before A completes, the returned value will be undefined.

The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map structurally!

Who cares? We are not iterating the map! Anyway, I believe a simplified ConcurrentLRU backed by ConcurrentHashMap is easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a bug.

No, it is not a bug. It is effectively a checkpoint, like a timer: one timer for all instances. We could use System.currentTimeMillis() instead, but a static volatile long is faster.

About the specific use case: yes... if someone has a 0.5-second response time for faceted queries, I am very happy... I had 15 seconds before going with FIFO.
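The distinction between 'searches per second' and 'average response time' can be made concrete with a little arithmetic. By Little's Law, sustained throughput equals the number of requests in flight divided by the average response time, so the two figures only determine each other once the concurrency is known. In the sketch below the response times 15300 and 1610 are the figures reported in this issue (assumed to be milliseconds). The concurrency of 8 and the 20 ms response time are purely illustrative assumptions, as are the class and method names.

```java
// Hypothetical helper, for illustration only.
class ThroughputVsLatency {

    /** Little's Law: throughput = requests in flight / average response time. */
    static double throughputPerSecond(int concurrentRequests, double avgResponseMillis) {
        return concurrentRequests * 1000.0 / avgResponseMillis;
    }

    /** Ratio of two average response times. */
    static double speedup(double beforeMillis, double afterMillis) {
        return beforeMillis / afterMillis;
    }

    public static void main(String[] args) {
        // Reported averages from the issue: 15300 (LRU) versus 1610 (FIFO)
        System.out.println("speedup: " + speedup(15300, 1610));
        // Illustrative assumption: 8 requests in flight at 20 ms each
        System.out.println("searches/s: " + throughputPerSecond(8, 20.0));
    }
}
```

Under these assumptions 400 searches per second corresponds to a 20 ms average response, so a server reporting multi-second averages is in a very different regime even at the same concurrency.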
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617654#action_12617654 ]

funtick edited comment on SOLR-665 at 7/28/08 8:41 PM:
---

bq. The Solr admin pages will not give you exact measurements.

Yes, and I do not need exact measurements! It gives me averageTimePerRequest, which improved almost 10 times on the production server. Should I write JUnit tests and execute them in a single-threaded environment? It would be better to use The Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second.

But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at all?

Yes, I am using highlighting. You can see it at http://www.tokenizer.org

bq. I'm guessing that you're caching the results of all queries in memory such that no disk access is necessary.

{color:red} But this is another bug of SOLR!!! I am using extremely large caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large and the most frequently accessed item is inserted just after the cache is created, all accesses will need to traverse all the other entries before getting that item.

Sorry, I didn't understand... Yes, if the cache contains 10 entries and the 'most popular item' is removed... But why 'traverse all the other entries before getting that item'? And why would 9 items be less popular (cumulatively) than a single one (absolutely)?

bq. Consider the following case: thread A performs a synchronized put, thread B performs an unsynchronized get on the same key. B gets scheduled before A completes, the returned value will be undefined.

The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map structurally!

Who cares? We are not iterating the map! Anyway, I believe a simplified ConcurrentLRU backed by ConcurrentHashMap is easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a bug.

No, it is not a bug. It is effectively a checkpoint, like a timer: one timer shared by all instances. We could use System.currentTimeMillis() instead, but a static volatile long is faster.

About the specific use case: yes... if someone has 0.5 seconds response time for faceted queries I am very happy... I had 15 seconds before going with FIFO.

FIFO Cache (Unsynchronized): 9x times performance boost
---
Key: SOLR-665
URL: https://issues.apache.org/jira/browse/SOLR-665
Project: Solr
Issue Type: Improvement
Affects Versions: 1.3
Environment: JRockit R27 (Java 6)
Reporter: Fuad Efendi
Attachments: FIFOCache.java
Original Estimate: 672h
Remaining Estimate: 672h

Attached is a modified version of LRUCache where
1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that reordering/true (the performance bottleneck of LRU) is replaced with insertion-order/false (so that it becomes FIFO)
2. Almost all (absolutely unnecessary) synchronized statements are commented out

See discussion at http://www.nabble.com/LRUCache---synchronized%21--td16439831.html

Performance metrics (taken from SOLR Admin):
LRU
Requests: 7638
Average Time-Per-Request: 15300
Average Request-per-Second: 0.06
FIFO:
Requests: 3355
Average Time-Per-Request: 1610
Average Request-per-Second: 0.11

Performance increased 9 times, which roughly corresponds to the number of CPUs in the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)

Current number of documents: 7494689
name: filterCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=1000, initialSize=1000)
stats:
lookups : 15966954582
hits : 16391851546
hitratio : 0.102
inserts : 4246120
evictions : 0
size : 2668705
cumulative_lookups : 16415839763
cumulative_hits : 16411608101
cumulative_hitratio : 0.99
cumulative_inserts : 4246246
cumulative_evictions : 0

Thanks

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
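The effect of the third LinkedHashMap constructor argument mentioned in the issue description can be sketched as follows. This is only an illustration under stated assumptions: BoundedCache and survivors are hypothetical names, not the attached FIFOCache.java, and the size limit of 3 is arbitrary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; not the attached FIFOCache.java.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    // accessOrder=true  -> LRU: every get() moves the entry to the tail of the internal list
    // accessOrder=false -> FIFO: get() leaves the insertion order untouched
    BoundedCache(int maxSize, boolean accessOrder) {
        super(maxSize, 0.75f, accessOrder);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize; // drop the head of the list once the limit is exceeded
    }

    // Put a, b, c; read a; put d; report which keys survive the single eviction.
    static List<String> survivors(boolean accessOrder) {
        BoundedCache<String, Integer> cache = new BoundedCache<String, Integer>(3, accessOrder);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);
        cache.get("a");    // reorders only when accessOrder is true
        cache.put("d", 4); // triggers one eviction
        return new ArrayList<String>(cache.keySet());
    }

    public static void main(String[] args) {
        System.out.println("LRU  keeps " + survivors(true));
        System.out.println("FIFO keeps " + survivors(false));
    }
}
```

With access order the read of "a" protects it and "b" is evicted; with insertion order "a" is evicted even though it was just read. The flip side is that in access-order mode every get() rewrites the linked list, which is why reads on an LRU map need exclusive locking.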
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617684#action_12617684 ]

Fuad Efendi commented on SOLR-665:
--

bq. Fuad is after fastest-possible reads, everybody is after reasonable behavior in the face of concurrent writes

Thanks, and sorry for the runtime errors; FIFO looks strange at first, but... for a large cache (10 items), the most popular item can be _mistakenly_ removed... but I don't think there are any 'most popular facets' etc.; they are evenly distributed in most cases.

Another issue: SOLR always tries to _recalculate_ _facets_ even with extremely large filterCache and queryResultCache; even the same faceted query always shows the same long response times.
[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617684#action_12617684 ]

funtick edited comment on SOLR-665 at 7/28/08 9:20 PM:
---

bq. Fuad is after fastest-possible reads, everybody is after reasonable behavior in the face of concurrent writes

Thanks, and sorry for the runtime errors; FIFO looks strange at first, but... for a large cache (10 items), the most popular item can be _mistakenly_ removed... but I don't think there are any 'most popular facets' etc.; they are evenly distributed in most cases.

Another issue: SOLR always tries to _recalculate_ _facets_ even with extremely large filterCache and queryResultCache; even the same faceted query always shows the same long response times.

bq. It is if nothing is modifying the map during the get. If something is modifying the map you don't know how the implementation handles the insert of a new value. It might copy the object, and you'd end up with half an object or even an invalid memory location. That's why the javadoc says that you must synchronize accesses if anything modifies the map - this is not limited to iterators.

I agree, of course... However, we are not dealing with an unknown implementation of java.util.Map that is somehow cloning (java.lang.Cloneable) objects or using some weird object introspection etc.
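One middle ground between the two positions in this exchange is to keep the locking the javadoc asks for, but let readers share the lock. The sketch below assumes an insertion-ordered (FIFO) LinkedHashMap, where get() does not modify the map, so concurrent readers can hold a read lock together and only put() needs the exclusive write lock. ReadLockedFifoCache is a hypothetical name; this is not the attached FIFOCache.java and not Solr's SolrCache contract.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: FIFO cache whose reads share a lock instead of serializing.
class ReadLockedFifoCache<K, V> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<K, V> map;

    ReadLockedFifoCache(final int maxSize) {
        // accessOrder=false: get() is a pure read, so a shared read lock is sufficient.
        // With accessOrder=true, get() would relink entries and need the write lock.
        this.map = new LinkedHashMap<K, V>(maxSize, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    void put(K key, V value) {
        lock.writeLock().lock(); // inserts and evictions are structural changes
        try {
            map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Whether this is faster than a plain synchronized block depends on the read/write mix and on lock overhead, so it would need measuring under a realistic load.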
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617692#action_12617692 ]

Fuad Efendi commented on SOLR-665:
--

BTW there is almost no functional difference between LRU and FIFO. And there is a *huge difference* between LRU (Least Recently Used) and LFU (Least Frequently Used). It's easy to implement a ConcurrentLFU based on the provided ConcurrentLRU template; of course, following the main _contract_, org.apache.solr.search.SolrCache.
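To make the LRU/LFU distinction from the comment above concrete, here is a deliberately naive LFU sketch. SimpleLfuCache is a hypothetical name; it is not the ConcurrentLRU template mentioned in the comment and does not implement org.apache.solr.search.SolrCache. Eviction scans all hit counters, which is fine for illustration but too slow for a large cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: evicts the entry with the fewest recorded hits.
class SimpleLfuCache<K, V> {
    private final int maxSize;
    private final Map<K, V> values = new HashMap<K, V>();
    private final Map<K, Long> hits = new HashMap<K, Long>();

    SimpleLfuCache(int maxSize) {
        this.maxSize = maxSize;
    }

    synchronized V get(K key) {
        V value = values.get(key);
        if (value != null) {
            hits.put(key, hits.get(key) + 1); // count every successful lookup
        }
        return value;
    }

    synchronized void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= maxSize) {
            evictLeastFrequentlyUsed();
        }
        values.put(key, value);
        if (!hits.containsKey(key)) {
            hits.put(key, 0L);
        }
    }

    synchronized boolean containsKey(K key) {
        return values.containsKey(key);
    }

    // Linear scan for the smallest hit count; ties are broken arbitrarily.
    private void evictLeastFrequentlyUsed() {
        K victim = null;
        long min = Long.MAX_VALUE;
        for (Map.Entry<K, Long> e : hits.entrySet()) {
            if (e.getValue() < min) {
                min = e.getValue();
                victim = e.getKey();
            }
        }
        values.remove(victim);
        hits.remove(victim);
    }
}
```

For example, with a limit of 2, after put(a), put(b), two reads of a and one read of b, inserting c evicts b. An LRU cache would have evicted a instead, because b was the most recently read entry.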
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617699#action_12617699 ]

Fuad Efendi commented on SOLR-665:
--

Paul, I want to do it... Please understand: since February I have been having constant performance problems with faceted queries (15-20 seconds response time). I ordered a new server with 32 GB and 2x quad-core (8x more power!) but it didn't improve performance; finally I commented out the sync in LRUCache and made it FIFO... I was very impatient with this post, and just tried to share very real stuff...
[jira] Created: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
FIFO Cache (Unsynchronized): 9x times performance boost
---
Key: SOLR-665
URL: https://issues.apache.org/jira/browse/SOLR-665
Project: Solr
Issue Type: Improvement
Affects Versions: 1.3
Environment: JRockit R27 (Java 6)
Reporter: Fuad Efendi

Attached is a modified version of LRUCache where
1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that reordering/true (the performance bottleneck of LRU) is replaced with insertion-order/false (so that it becomes FIFO)
2. Almost all (absolutely unnecessary) synchronized statements are commented out

See discussion at http://www.nabble.com/LRUCache---synchronized%21--td16439831.html

Performance metrics (taken from SOLR Admin):
LRU
Requests: 7638
Average Time-Per-Request: 15300
Average Request-per-Second: 0.06
FIFO:
Requests: 3355
Average Time-Per-Request: 1610
Average Request-per-Second: 0.11

Performance increased 9 times, which roughly corresponds to the number of CPUs in the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)

Current number of documents: 7494689
name: filterCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=1000, initialSize=1000)
stats:
lookups : 15966954582
hits : 16391851546
hitratio : 0.102
inserts : 4246120
evictions : 0
size : 2668705
cumulative_lookups : 16415839763
cumulative_hits : 16411608101
cumulative_hitratio : 0.99
cumulative_inserts : 4246246
cumulative_evictions : 0

Thanks

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fuad Efendi updated SOLR-665:
-
Attachment: FIFOCache.java
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617357#action_12617357 ]

Fuad Efendi commented on SOLR-665:
--

I renamed it to FIFOCache just before opening the issue; on my local system it is still the (modified) LRUCache, which is why the filterCache stats refer to the 'old' name...
[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617359#action_12617359 ]

Fuad Efendi commented on SOLR-665:
--

Almost forgot: I am estimating the performance gains based on a real application in production: multithreaded, Tomcat 6.0.16, JRockit R27 (Java 6), SLES10, two quad-core Opteron 2350 (8 CPUs total), 25Gb for SOLR...

Also, the first queries run for a long time, more than a minute (warming up caches with the faceted query id:[* TO *]). Average Time-Per-Request decreases over time and is now 1591.5232, giving a 10x performance boost. Facets are highly distributed, as you can see from the website and the filterCache stats... HTTP caching is supported, so to see real timing you should execute real queries...

ConcurrentHashMap is not applicable: we are not actually modifying the cached items... And FIFO is without the 'Out' if we have enough memory.
[jira] Commented: (SOLR-653) remove overwrite command from solrj API
[ https://issues.apache.org/jira/browse/SOLR-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616909#action_12616909 ]

Fuad Efendi commented on SOLR-653:
--

Thanks Ryan,

Another question about add(Collection<SolrInputDocument> docs): what may happen if the Collection contains documents with the same uniqueKey?

remove overwrite command from solrj API
-
Key: SOLR-653
URL: https://issues.apache.org/jira/browse/SOLR-653
Project: Solr
Issue Type: Improvement
Components: clients - java
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Fix For: 1.3
Attachments: SOLR-653-remove-solrj-overwrite.patch

The solrj API should not expose the 'overwrite' option. Using it will most likely cause errors.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
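The thread does not answer the question about duplicate keys inside one collection. If you would rather not depend on server-side behavior, one option is to collapse duplicates on the client before sending the batch. The sketch below is an assumption-laden illustration: it uses plain Map objects as stand-ins for documents so that it runs without SolrJ, and UniqueKeyDedup, lastWins and the "id"/"title" fields are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: plain maps stand in for SolrInputDocument here.
class UniqueKeyDedup {

    // Keeps one document per key value; a later document replaces an earlier one.
    // The surviving document stays at the position where its key was first seen.
    static List<Map<String, Object>> lastWins(Collection<Map<String, Object>> docs, String keyField) {
        Map<Object, Map<String, Object>> byKey = new LinkedHashMap<Object, Map<String, Object>>();
        for (Map<String, Object> doc : docs) {
            byKey.put(doc.get(keyField), doc);
        }
        return new ArrayList<Map<String, Object>>(byKey.values());
    }

    // Small factory for the demo documents.
    static Map<String, Object> doc(String id, String title) {
        Map<String, Object> d = new HashMap<String, Object>();
        d.put("id", id);
        d.put("title", title);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> batch = Arrays.asList(
                doc("1", "old title"), doc("2", "other"), doc("1", "new title"));
        for (Map<String, Object> d : lastWins(batch, "id")) {
            System.out.println(d.get("id") + " -> " + d.get("title"));
        }
    }
}
```

The "last one wins" policy is a choice made for this sketch; whether it matches what the server does with duplicates in one request should be verified against the Solr version in use.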
[jira] Commented: (SOLR-653) remove overwrite command from solrj API
[ https://issues.apache.org/jira/browse/SOLR-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616667#action_12616667 ]

Fuad Efendi commented on SOLR-653:
--

Will it overwrite by default? I need to update a document using its primary key. If I use overwrite=false, a new document is added with the same primary key (defined in the schema as <uniqueKey>id</uniqueKey>) and I end up with two different documents with the same _unique_ key, even after commit and optimize. Is there a bug with 'uniqueKey'?

Thanks
solr http 500 look
What about commenting out this piece of outdated code in SolrServlet:

    } catch (Throwable e) {
      SolrException.log(log,e);
      sendErr(500, SolrException.toStr(e), request, response);
    }

For instance, Sun Java 5 does not necessarily have the resources to output an OutOfMemoryError including the stack trace; only JRockit can do it... I understand that historically SOLR developers tried to implement the full power of HTTP, but let's be more pragmatic...
solr http 500 look
What about commenting out this piece of outdated code in SolrServlet:

    } catch (Throwable e) {
      SolrException.log(log,e);
      sendErr(500, SolrException.toStr(e), request, response);
    }

For instance, Sun Java 5 does not necessarily have the resources to output an OutOfMemoryError including the stack trace; only JRockit can do it... I understand that historically SOLR developers tried to implement the full power of HTTP, but let's be more pragmatic...

P.S. It didn't reach SOLR-DEV. Forwarding to SOLR-USER.

Sent: Sunday, July 20, 2008 7:34 PM
To: solr-dev@lucene.apache.org
Subject: solr http 500 look
...
RE: LRUCache - synchronized!?
What, ConcurrentHashMap? I briefly considered it when I threw the caching stuff together... but the key here is that it's an LRUCache using LinkedHashMap, and there is no ConcurrentLinkedHashMap.
-Yonik

There is a lot more... Ask Doug Lea; only a small piece of his work is part of Java 5/6, after 12 years of PoC... As an example, EhCache has different builds depending on the Java version...

P.S. I am unable to post a message regarding OutOfMemoryError (-Xmx8192M -Xms8192M, RAM 16Gb) (something cache related, probably something 'unsynchronized'; it happens sometimes, during high loads)
RE: LRUCache - synchronized!?
:) Have you tried using a ConcurrentHashMap?

- Of course. And having code like

    Object get(Object key) {
      synchronized (map) {
        ... atomic.incrementAndGet() ...
      }
    }

forces me to do more sanity checks... Now I understand why SOLR uses a single overloaded CPU in a 4-CPU system when faceting is enabled. Even the get() method is synchronized...

I need to learn how to use plugin jars; I don't want to alter the source...

Thanks!

On 1-Apr-08, at 6:58 PM, Fuad Efendi wrote:
Can we have anything better? I can't use 4-CPUs :( Thanks!

You can have anything your heart desires... Solr is open-source :) Have you tried using a ConcurrentHashMap?
-Mike
LRUCache - synchronized!?
Can we have anything better? I can't use 4-CPUs :( Thanks! P.S. Blocked on lock LRUCache.java:128 V.555343 7/11/07
RE: Default Logging: OFF
Thanks... I am thinking about making *persistent* runtime configuration options (available via the SOLR admin screen)... That would be the best option... Of course we can write a (global) config file and put it somewhere, even for people who would expect the JVM default configuration to work - many people expect HTTP Client to work without deadlocks ;) in Java 6 (even 1.5.14; Sun) (what about backward compatibility? Write once, run everywhere?!!)

: Currently, default settings output UPDATE messages (thousands per second
: in my case). Is it WARNING/SEVERE setting in SOLR source?

They are INFO level messages, which is the JVM default. One INFO level message per request doesn't seem that excessive to me; almost any HTTP server I can think of is going to have an access log that records even more info per request.

Solr doesn't have any logging configuration options in the solrconfig.xml for a variety of reasons, a few off the top of my head...

* most servlet containers already have a way to configure logging. If Solr had its own way (that trumped the servlet container) that would piss off a lot of people who would expect their container configs to work, and make configuring Solr really confusing.
* even if people use a servlet container that doesn't have a convenient way to configure logging, the JVM has a default logging configuration mechanism which allows for truly global logging configuration (used by every JVM instance on the machine) with overrides possible per JVM. If Solr had its own way (that trumped the servlet container) that would piss off a lot of people who would expect the JVM default configuration to work, and make configuring Solr really confusing.
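The JVM-level mechanism described above is java.util.logging. As a rough sketch of raising the threshold without touching Solr's source, the level can be set on a parent logger so that INFO messages from everything beneath it are dropped while WARNING and SEVERE still get through. The logger name "org.apache.solr" is an assumption based on Solr's package names, and QuietSolrLogging is a hypothetical class name.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: programmatic equivalent of a logging.properties entry.
class QuietSolrLogging {

    // Keep a strong reference: the JDK may hold loggers only weakly, and a
    // garbage-collected logger loses the level that was set on it.
    // The name "org.apache.solr" is an assumption, not taken from the thread.
    static final Logger SOLR_LOGGER = Logger.getLogger("org.apache.solr");

    static void raiseThreshold(Level level) {
        SOLR_LOGGER.setLevel(level);
    }

    public static void main(String[] args) {
        raiseThreshold(Level.WARNING);
        // Child loggers without their own level inherit the parent's level.
        Logger child = Logger.getLogger("org.apache.solr.update");
        System.out.println("INFO enabled:    " + child.isLoggable(Level.INFO));
        System.out.println("WARNING enabled: " + child.isLoggable(Level.WARNING));
    }
}
```

The same effect is normally achieved declaratively with a line such as `org.apache.solr.level = WARNING` in the logging.properties file used by the JVM or the servlet container, which is the route the reply above recommends.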
RE: Default Logging: OFF
Currently, default settings output UPDATE messages (thousands per second in my case). Is it WARNING/SEVERE setting in SOLR source? I disagree that the default should be off. Logging doesn't impact performance that much unless it is set to something like FINEST. I don't think you want to ignore SEVERE and/or WARNING messages.
Default Logging: OFF
Hi, How do I set the *default* SOLR logging level to OFF? My logs quickly become huge...
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12567067#action_12567067 ] Fuad Efendi commented on SOLR-127: -- In my configuration I do not need SOLR caching at all; but I use HTTP caching more effectively. HTTPD memory- and disk- cache is used between Client and Middleware. No any caching between Middleware and SOLR. Middleware responds to HTTPD with 304 if necessary, with correct Last-Modified etc., and request do not reach SOLR. This caching configuration works fine with AJAX too, without SOLR's caching headers. I've seen unnecessary extra-work with this implementation... taking long time... and tried to point on some meanings of response codes (for Web). Make Solr more friendly to external HTTP caches --- Key: SOLR-127 URL: https://issues.apache.org/jira/browse/SOLR-127 Project: Solr Issue Type: Wish Reporter: Hoss Man Assignee: Hoss Man Fix For: 1.3 Attachments: CacheUnitTest.patch, CacheUnitTest.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch an offhand comment I saw recently reminded me of something that really bugged me about the serach solution i used *before* Solr -- it didn't play nicely with HTTP caches that might be sitting in front of it. at the moment, Solr doesn't put in particularly usefull info in the HTTP Response headers to aid in caching (ie: Last-Modified), responds to all HEAD requests with a 400, and doesn't do anything special with If-Modified-Since. 
t the very least, we can set a Last-Modified based on when the current IndexReder was open (if not the Date on the IndexReader) and use the same info to determing how to respond to If-Modified-Since requests. (for the record, i think the reason this hasn't occured to me in the 2+ years i've been using Solr, is because with the internal caching, i've yet to need to put a proxy cache in front of Solr) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
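As a rough sketch of the If-Modified-Since handling the issue describes: compare the time the index reader was opened against the date in the request header and answer 304 when nothing is newer. This is not the code from the attached patches; the class ConditionalGet, the method statusFor, and the choice to ignore unparseable headers are assumptions for illustration. It only uses java.time from the JDK.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;

// Hypothetical helper; not taken from HTTPCaching.patch.
public class ConditionalGet {

    // Returns 304 if the index has not changed since the client's copy, else 200.
    static int statusFor(Instant indexOpened, String ifModifiedSince) {
        if (ifModifiedSince == null) {
            return 200; // unconditional request
        }
        Instant since;
        try {
            since = ZonedDateTime
                    .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
        } catch (DateTimeParseException e) {
            return 200; // assumed policy: ignore a header we cannot parse
        }
        // HTTP dates carry whole seconds only, so compare at that resolution.
        Instant lastModified = indexOpened.truncatedTo(ChronoUnit.SECONDS);
        return lastModified.isAfter(since) ? 200 : 304;
    }

    public static void main(String[] args) {
        Instant opened = Instant.parse("2008-02-05T03:50:00Z");
        System.out.println(statusFor(opened, "Tue, 05 Feb 2008 03:50:00 GMT")); // 304
        System.out.println(statusFor(opened.plusSeconds(600), "Tue, 05 Feb 2008 03:50:00 GMT")); // 200
    }
}
```

The same lastModified value would also be what goes out in the Last-Modified response header, so that a proxy in front of Solr can send it back on the next request.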
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12567072#action_12567072 ] Fuad Efendi commented on SOLR-127: -- Regarding an HTTP-caching load balancer between SOLR and Middleware: you need to deal with an additional internal HTTP cache at the middleware. In most cases Middleware generates content from different sources and can't reroute an If-Modified-Since request to SOLR without internal caching. For instance, if you are using SOLRJ, you have to implement an *additional* cache for SolrDocument...
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12567077#action_12567077 ] Fuad Efendi commented on SOLR-127: -- Thomas, Walter, Finally I agree, thanks! Middleware should not send/reroute If-Modified-Since, and should not implement an internal cache (as in the counter-example I provided): with caching enabled, it will simply retrieve cached content. I do not agree with 400; it is an opening for DoS attacks. A query parsing error should be a 200 with caching response codes. Of course, I know RFC 2616.
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12567064#action_12567064 ] Fuad Efendi commented on SOLR-127: -- I agree. A caching load balancer between SOLR and APP servers is an excellent idea, and it can be a black box without any knowledge of the SOLR API. AJAX can use the web browser's internal cache; FLEX probably too... Question: do we need caching of static (unchanged) content from SOLR, such as 400: Query parsing error?..
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12567081#action_12567081 ] Fuad Efendi commented on SOLR-127: -- Fortunately, we are not using 404 when trying to retrieve a removed document... In the initial design (I believe) SOLR developers simply wrapped all exceptions into 400, and an empty result set is not an exception.
RE: /example/solr/bin is empty in trunk
Try ant example in the base dir to build the example. Thanks, it works
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12566865#action_12566865 ] Fuad Efendi commented on SOLR-127: -- This is an alternative to the initially proposed HTTP caching, and it is extremely easy to implement: simply add a request parameter http.header=If-Modified-Since: Tue, 05 Feb 2008 03:50:00 GMT (better to use other names rather than the http.header parameter; see below...). Let SOLR respond via a standard XML message Not Modified, and avoid using the 304 response code. What do you think? We can even encapsulate MAX-AGE, EXPIRES, and other useful stuff (such as an additional UPDATE-FREQUENCY: 30 days) into XML, and all that stuff can depend on internal Lucene statistics (and not on hard-coded values in SOLR-CONFIG). We should not use HTTP protocol response codes such as 304/400/500 to describe SOLR's external API. Sample: Apache HTTPD front-end, Tomcat (Struts-based middleware), and SOLR (backend). With your initial proposal different users will get different data. Why? Multithreading in Apache HTTPD. At the least, there are some possible fluctuations, the cache is not shared in some configurations, etc. Each thread may get its own copy of last-modified, and different users will see different data. It won't work for most business cases. Without HTTP: is modified? when is next update of BOOKS category? - all caches around the world have the same timestamp for BOOKS category ... ... ...
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12566869#action_12566869 ] Fuad Efendi commented on SOLR-127: -- Of course ETag etc. will synchronize caches; but anyway, why do we need such features of the HTTP specs? HTTP caching is widely used to cache responses from HTTP servers; content (HTML, PDF, JPG, EXE) can be cached at a corporate proxy, and locally in Internet Explorer's internal cache. That is the main idea. *Are SOLR-XML responses roving the world and reaching the internal cache of Mozilla Firefox, or corporate caching proxies?* - No. Clients of SOLR: Middleware. Do they need to act as a caching proxy? Maybe. Just another use case: middleware publishes the current weather together with the response from SOLR; middleware wants to cache responses from SOLR and not rely on requests coming from end users because of frequent weather changes ;) - it depends on the implementation of such middleware; for sure, it will try to cache SolrDocument objects instead of pure XML, and that kind of caching is not HTTP-related.
/example/solr/bin is empty in trunk
Is it correct?.. I want to try distribution/replication in v.2.3
[no subject]
A little inconvenience: the published API is missing some classes such as WordDelimiterFilter... Thanks, Fuad http://www.tokenizer.org
RE: Broken Backward Compatibility
I am not sure, I didn't have time to check all the details... Simply: an old version of an application tried to send a delete request, and could not parse the response due to (possible) schema changes and hardcoded tag values... I am not using a version number in the URL for delete POSTs... Thanks -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 19, 2007 1:11 AM To: solr-dev@lucene.apache.org Subject: Re: Broken Backward Compatibility : The only problem is changed protocols, old client application does not : understand new XML responses from server. It recalls some very old Can you elaborate on what XML responses have changed for you? To the best of my knowledge any solrconfig.xml that worked with Solr 1.1 will continue to work with 1.2 and provide the same XML response for update commands. The XmlResponseWriter for queries supports a version param to ensure that even if it changes over time, you can always get the response format you expect -- but i don't think it had a version bump in Solr 1.2 -Hoss