[jira] [Commented] (SOLR-2020) HttpComponentsSolrServer
[ https://issues.apache.org/jira/browse/SOLR-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257535#comment-13257535 ]

Shawn Heisey commented on SOLR-2020:

If this is a problem for 3.6 as well can you commit the fix there? I plan to upgrade to 3.6 in the near future and I would like to use the new client. If it is a problem on 3.6, I am wondering if this is an important enough problem to release 3.6.1.

> HttpComponentsSolrServer
>
> Key: SOLR-2020
> URL: https://issues.apache.org/jira/browse/SOLR-2020
> Project: Solr
> Issue Type: New Feature
> Components: clients - java
> Affects Versions: 1.4.1
> Environment: Any
> Reporter: Chantal Ackermann
> Assignee: Sami Siren
> Priority: Minor
> Fix For: 3.6, 4.0
> Attachments: HttpComponentsSolrServer.java, HttpComponentsSolrServerTest.java, SOLR-2020-3x.patch, SOLR-2020-HttpSolrServer.patch, SOLR-2020.patch, SOLR-2020.patch, SOLR-2020.patch, SOLR-2020.patch
>
> Implementation of SolrServer that uses the Apache Http Components framework.
> Http Components (http://hc.apache.org/) is the successor of Commons HttpClient and thus HttpComponentsSolrServer would be a successor of CommonsHttpSolrServer, in the future.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile
[ https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255730#comment-13255730 ]

Shawn Heisey commented on SOLR-1972:

Purely hypothetical stuff, probably way beyond my skills: Would it be possible (and useful) to use a Lucene index (RAMDirectory maybe) to store the query time data for performance reasons, or is the current array implementation good enough?

> Need additional query stats in admin interface - median, 95th and 99th percentile
>
> Key: SOLR-1972
> URL: https://issues.apache.org/jira/browse/SOLR-1972
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 1.4
> Reporter: Shawn Heisey
> Priority: Minor
> Attachments: SOLR-1972-branch3x-url_pattern.patch, SOLR-1972-url_pattern.patch, SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, elyograg-1972-3.2.patch, elyograg-1972-3.2.patch, elyograg-1972-trunk.patch, elyograg-1972-trunk.patch
>
> I would like to see more detailed query statistics from the admin GUI. This is what you can get now:
> requests : 809
> errors : 0
> timeouts : 0
> totalTime : 70053
> avgTimePerRequest : 86.59209
> avgRequestsPerSecond : 0.8148785
> I'd like to see more data on the time per request - median, 95th percentile, 99th percentile, and any other statistical function that makes sense to include. In my environment, the first bunch of queries after startup tend to take several seconds each. I find that the average value tends to be useless until it has several thousand queries under its belt and the caches are thoroughly warmed. The statistical functions I have mentioned would quickly eliminate the influence of those initial slow queries.
> The system will have to store individual data about each query. I don't know if this is something Solr does already. It would be nice to have a configurable count of how many of the most recent data points are kept, to control the amount of memory the feature uses. The default value could be something like 1024 or 4096.
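The request above (percentiles over a configurable number of the most recent data points) can be illustrated with a small sketch. This is not code from any SOLR-1972 patch: the class name RecentQueryTimes, its methods, and the nearest-rank percentile definition are all assumptions made for illustration. It keeps a fixed-size ring buffer of elapsed times, so old slow startup queries fall out of the window once enough newer queries have been recorded.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Solr: keeps the most recent N query times. */
class RecentQueryTimes {
    private final long[] window;   // ring buffer of elapsed times in milliseconds
    private int next = 0;          // slot the next sample is written to
    private int count = 0;         // number of valid samples, at most window.length

    RecentQueryTimes(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        window = new long[capacity];
    }

    /** Records one query time, overwriting the oldest sample once the buffer is full. */
    synchronized void record(long elapsedMillis) {
        window[next] = elapsedMillis;
        next = (next + 1) % window.length;
        if (count < window.length) {
            count++;
        }
    }

    /** Nearest-rank percentile for p in (0, 100]; returns -1 when nothing is recorded. */
    synchronized long percentile(double p) {
        if (count == 0) {
            return -1;
        }
        // Until the buffer wraps, the valid samples occupy slots 0..count-1.
        long[] sorted = Arrays.copyOf(window, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * count);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

With a capacity of 1024 or 4096 as suggested in the issue, memory use is bounded at one long per slot. Sorting a copy on every call is the simplest approach; it trades CPU at read time for a very cheap record() path, which may or may not suit a busy admin page.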
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255727#comment-13255727 ]

Shawn Heisey commented on SOLR-2889:

bq. Would you mind posting some information about the results of your work and how much performance gain you made. If you have benchmark results this would be ideal. Did you notice any increase/decrease in memory and CPU demand?

I haven't done any extensive testing. The testing that I did do for SOLR-2906 suggested that the LFU cache did not offer any performance benefit over LRU, but that it didn't really cause a performance detriment either. I think this means that the idea was sound, but any speedups gained from the different methodology were lost because of the basic and non-optimized implementation.

It was not a definitive test - I have two copies of my production distributed index for redundancy purposes, with haproxy doing load balancing between the two. I can set one set of servers to LFU and the other to LRU, but it's production, so the two sets of servers never receive the same queries and I don't really want to try any isolation tests on production equipment.

My testbed is too small for doing tests with all production data - it is one server with all resources smaller than production. I could do some tests with smaller data sets that will fit entirely in RAM, but that will take a lot of planning that I currently don't have time to do.

The LRU cache is highly optimized for speed, but I didn't really understand the optimizations and they don't apply to LFU as far as I can tell. At this time I am still using LRU cache because I don't dare change the production configuration without authorization and I can't leave production servers in test mode for very long.

> Implement Adaptive Replacement Cache
>
> Key: SOLR-2889
> URL: https://issues.apache.org/jira/browse/SOLR-2889
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important. There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.
[jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions
[ https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255705#comment-13255705 ]

Shawn Heisey commented on SOLR-3284:

After looking at existing tests to see how I might implement tests for this new functionality, I couldn't see how to do it. Also, I noticed that there are tests for SolrCloud and something else called ChaosMonkey. All tests in solr/ pass with this patch, but I don't know how SolrCloud might be affected. I would hope that it already handles exceptions properly and therefore wouldn't have any problems, but I have never looked at the code or used SolrCloud.

> StreamingUpdateSolrServer swallows exceptions
>
> Key: SOLR-3284
> URL: https://issues.apache.org/jira/browse/SOLR-3284
> Project: Solr
> Issue Type: Improvement
> Components: clients - java
> Affects Versions: 3.5, 4.0
> Reporter: Shawn Heisey
> Attachments: SOLR-3284.patch
>
> StreamingUpdateSolrServer eats exceptions thrown by lower level code, such as HttpClient, when doing adds. It may happen with other methods, though I know that query and deleteByQuery will throw exceptions. I believe that this is a result of the queue/Runner design. That's what makes SUSS perform better, but it means you sacrifice the ability to programmatically determine that there was a problem with your update. All errors are logged via slf4j, but that's not terribly helpful except with determining what went wrong after the fact.
> When using CommonsHttpSolrServer, I've been able to rely on getting an exception thrown by pretty much any error, letting me use try/catch to detect problems.
> There's probably enough dependent code out there that it would not be a good idea to change the design of SUSS, unless there were alternate constructors or additional methods available to configure new/old behavior. Fixing this is probably not trivial, so it's probably a better idea to come up with a new server object based on CHSS. This is outside my current skillset.
[jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions
[ https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255659#comment-13255659 ]

Shawn Heisey commented on SOLR-3284:

The patch should also apply successfully to 3.6.
[jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions
[ https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255602#comment-13255602 ]

Shawn Heisey commented on SOLR-3284:

If the Solr server goes down in between updates done with the concurrent server, doing further updates will fail, but the calling code will not know that. With the Commons or Http server, an exception is thrown that my code catches.

I don't think that just overriding handleError is enough. If Solr goes down but the machine is still up, you have immediate failure detection because the connection will be refused. If the server goes away entirely, it could take a couple of minutes to fail. You would have to provide methods to check that 1) all background operations are complete and 2) they were error free.

I can no longer remember whether an exception is thrown when trying a commit against a down machine with the concurrent server. IIRC it does throw one in this instance. I definitely believe that it should. Perhaps the current handleError code could update class-level members (with names like "boolean updateErrored" and "SolrServerException updateException") that could be checked and used by the commit method. If they are set, it would reset them and throw an exception (fast-fail) without actually trying the commit. There should probably be a constructor option and a set method to either activate this new behavior or restore the original behavior.

When I first designed my code, I was relying on the exceptions thrown by the commons server when doing the actual update, so it's too late by the time it reaches the commit - it has already updated the position values. I now realize that this is incorrect design, though I might never have figured it out without my attempt to use the concurrent server. It's going to be a bit painful to redesign my code to put off updating position values until after a successful commit operation. It's something I do intend to do.
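The fast-fail idea in the comment above (handleError records a failure, commit checks and resets it before doing any work) might look roughly like the following. This is a standalone sketch, not SolrJ code and not the attached patch: ErrorTrackingUpdater is a hypothetical class, the real commit is passed in as a Runnable only to keep the example self-contained, and an unchecked IllegalStateException stands in for SolrServerException. It also does not cover the first requirement mentioned above, waiting until all background operations are complete.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a queue-based update client; not SolrJ code. */
class ErrorTrackingUpdater {
    // First error reported by a background worker since the last commit attempt.
    private final AtomicReference<Exception> updateException = new AtomicReference<>();

    /** Called from background threads when an update request fails. */
    void handleError(Exception e) {
        updateException.compareAndSet(null, e);   // keep only the first failure
    }

    /** Fails fast if a background update failed; otherwise runs the real commit. */
    void commit(Runnable realCommit) {
        Exception pending = updateException.getAndSet(null);   // read and reset in one step
        if (pending != null) {
            throw new IllegalStateException("update failed before commit", pending);
        }
        realCommit.run();
    }
}
```

A single AtomicReference replaces the separate "updateErrored" flag and "updateException" member suggested in the comment, because a non-null reference already means an error occurred and getAndSet avoids a race between checking and resetting. Calling code would then postpone updating its position values until commit returns normally.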
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254671#comment-13254671 ]

Shawn Heisey commented on SOLR-2889:

After thinking about what ended up being the final code for SOLR-2906, I know that I won't be able to tackle this, but I am wondering whether this is really necessary any more. The timeDecay option on the LFU cache implementation could be viewed as an LRU tweak to the LFU cache, which I think fulfills my original goals even if it's not a true ARC cache. Does that mean this issue should be closed? I can't say. I hope someone really smart is able to provide some serious speed optimization for the new LFU cache.
[jira] [Commented] (SOLR-3333) Create an option that allows a query to be cached, but not used for warming
[ https://issues.apache.org/jira/browse/SOLR-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254664#comment-13254664 ]

Shawn Heisey commented on SOLR-:

I just thought of a localparam syntax for this: {!cache=nowarm}

> Create an option that allows a query to be cached, but not used for warming
>
> Key: SOLR-
> URL: https://issues.apache.org/jira/browse/SOLR-
> Project: Solr
> Issue Type: New Feature
> Affects Versions: 3.5, 4.0
> Reporter: Shawn Heisey
>
> The application that uses my Solr install builds complex filter queries for employees because they have access to everything, whereas most users have access to a small subset.
> Because of this, autowarming on the filterCache can take 30-60 seconds even though autoWarm is set to just 4 queries.
> If we had a way (probably a localparam) to tell Solr to not use those filters when autowarming, but to go ahead and put them in the filterCache and use them until there's a new commit, that would eliminate this problem.
> Employees might have their queries take longer, but regular users would not be affected.
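To make the proposed syntax concrete, here is a small sketch of how application code might tag a filter query. The helper class FilterQueries and the field name "acl" are hypothetical. The value "false" matches the cache=false localparam mentioned elsewhere in this thread; the value "nowarm" is only the proposal from the comment above and, as far as this thread shows, is not something Solr understands.

```java
/** Hypothetical helper for prefixing a filter query with a cache localparam. */
class FilterQueries {
    /** Returns the filter with a {!cache=...} localparam prefix prepended. */
    static String withCacheParam(String cacheValue, String filter) {
        return "{!cache=" + cacheValue + "}" + filter;
    }

    public static void main(String[] args) {
        // Available workaround: skip the filterCache entirely for the big employee filter.
        System.out.println(withCacheParam("false", "acl:(1 OR 2 OR 3)"));
        // Proposed in this issue: cache the filter but leave it out of autowarming.
        System.out.println(withCacheParam("nowarm", "acl:(1 OR 2 OR 3)"));
    }
}
```

The resulting string would be sent as an fq parameter, so only the application code that already knows the user is an employee needs to change.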
[jira] [Commented] (SOLR-3327) Logging UI should indicate which loggers are set vs implicit
[ https://issues.apache.org/jira/browse/SOLR-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253123#comment-13253123 ]

Shawn Heisey commented on SOLR-3327:

Is there any way to give users the slf4j level options and have those translated automatically behind the scenes into the correct levels for the framework that's actually in use?

> Logging UI should indicate which loggers are set vs implicit
>
> Key: SOLR-3327
> URL: https://issues.apache.org/jira/browse/SOLR-3327
> Project: Solr
> Issue Type: Improvement
> Components: web gui
> Reporter: Ryan McKinley
> Priority: Trivial
> Fix For: 4.0
> Attachments: SOLR-3327.patch, logging.png
>
> The new logging UI looks great!
> http://localhost:8983/solr/#/~logging
> It would be nice to indicate which ones are set explicitly vs implicit -- perhaps making the line bold when set=true
[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile
[ https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252724#comment-13252724 ]

Shawn Heisey commented on SOLR-1972:

Let's have everyone pretend for a minute that I have a slightly better grasp of Solr/Lucene internals than I actually do. Perhaps I will one day be able to take what you say and figure out what it means. I am very interested in having these stats available without patching Solr on my own. What would be the right way to go about re-implementing this as a module (along with some unit tests) so the code could be committed?
[jira] [Commented] (SOLR-3333) Create an option that allows a query to be cached, but not used for warming
[ https://issues.apache.org/jira/browse/SOLR-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248602#comment-13248602 ]

Shawn Heisey commented on SOLR-:

I never actually answered your first question. Yes, I do want most entries in the filter cache to be usable for autowarming. Most users have relatively few boolean clauses in their filter queries. Employees are the common exception. We get a few hundred boolean clauses in ours. Plans are being discussed to greatly reduce that, but I'm not sure we'll ever get away from it entirely.
[jira] [Commented] (SOLR-3333) Create an option that allows a query to be cached, but not used for warming
[ https://issues.apache.org/jira/browse/SOLR-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248579#comment-13248579 ]

Shawn Heisey commented on SOLR-:

I would like to have our application code tag those nasty employee filters with something that makes them ineligible for autowarming, but still eligible for caching, which would keep them around until the next commit. I am pretty sure our code is capable of knowing that the user is a special user, typically admin or system.

An update cycle runs once a minute for the index as a whole, but changes are tracked on a per-shard basis. Commits on each shard are only done if something on that particular shard actually changes. The large shards where this is a problem typically go several minutes between commits, and that might extend to an hour or more.

I will talk to our developers about using the cache=false localparam for now, but I am hoping for the ability to use the cache for those nasty filters but not include them for warming. Having recently toyed with the cache code (SOLR-2906), I know this may not be trivial.
[jira] [Commented] (SOLR-3319) Improve DataImportHandler status response
[ https://issues.apache.org/jira/browse/SOLR-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248402#comment-13248402 ]

Shawn Heisey commented on SOLR-3319:

Here are some general ideas, preliminary because I have not taken a close look at the code yet. For reference, here is a completed status response on a full-import from 3.5.0:

{code}
0 0 dih-config.xml idle 1 11287894 0 2012-04-03 17:38:01 Indexing completed. Added/Updated: 11287894 documents. Deleted 0 documents. 2012-04-03 20:16:32 11287894 2:38:31.314 This response format is experimental. It is likely to change in the future.
{code}

I was thinking it might be a good idea to have two response sections in addition to the echoParams section already mentioned - one for a human readable response and one for a relatively terse machine readable response. The human readable version would be fairly open to change, and could include extra verbiage so it's very understandable for a person. The machine readable version would have more elements, each of which is very simple, probably just a numeric value or a true/false indicator.

A design decision needs to be made early - do we include all elements in every response (with the value set to zero, blank, or false), even if they don't apply to the current status? My first instinct is to include all elements, but maybe that's wrong.

> Improve DataImportHandler status response
>
> Key: SOLR-3319
> URL: https://issues.apache.org/jira/browse/SOLR-3319
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 3.5, 4.0
> Reporter: Shawn Heisey
> Priority: Minor
> Fix For: 4.0
>
> The DataImportHandler has some oddities and inconsistencies in its status response that make it difficult to write code that parses DIH status, especially if both full-import and delta-import are required. See SOLR-2729.
> I would like to have a discussion where we come up with a well-defined and consistent format that can be used programmatically as well as be human readable, and then I can implement it, or someone else can if they really want to. I think it would be very useful if the status response included all parameters that went into the import request, like echoParams in the query interface.
[jira] [Commented] (SOLR-3333) Create an option that allows a query to be cached, but not used for warming
[ https://issues.apache.org/jira/browse/SOLR-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248292#comment-13248292 ]

Shawn Heisey commented on SOLR-:

I don't think I can implement this. My knowledge of Solr internals simply isn't strong enough.
[jira] [Commented] (SOLR-3319) Improve DataImportHandler status response
[ https://issues.apache.org/jira/browse/SOLR-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247684#comment-13247684 ] Shawn Heisey commented on SOLR-3319: Here's an idea, at least for 3x, assuming it's not unilaterally killed by the bug-fix-only mode: A configuration knob to use the old response or the new response. It would default to old. For 4.0, that configuration knob seems like a good idea, defaulting to the new response. In 4.1 or 5.0, the old response gets removed. > Improve DataImportHandler status response > - > > Key: SOLR-3319 > URL: https://issues.apache.org/jira/browse/SOLR-3319 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 3.5, 4.0 >Reporter: Shawn Heisey >Priority: Minor > Fix For: 4.0 > > > The DataImportHandler has some oddities and inconsistencies in its status > response that make it difficult to write code that parses DIH status, > especially if both full-import and delta-import are required. See SOLR-2729. > I would like to have a discussion where we come up with a well-defined and > consistent format that can be used programmatically as well as be human > readable, and then I can implement it, or someone else can if they really > want to. I think it would be very useful if the status response included all > parameters that went into the import request, like echoParams in the query > interface. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246991#comment-13246991 ] Shawn Heisey commented on LUCENE-3946: -- Yes, putting rpm_mode=false in ~/.ant/ant.conf works too. I just got a bug filed with Redhat, hopefully they don't complain too much about it actually being CentOS. https://bugzilla.redhat.com/show_bug.cgi?id=810067 > improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3946.patch > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir. suggested work arround is to run "ant --noconfig" but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it) -- This message is automatically generated by JIRA. 
[jira] [Commented] (SOLR-3319) Improve DataImportHandler status response
[ https://issues.apache.org/jira/browse/SOLR-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246930#comment-13246930 ] Shawn Heisey commented on SOLR-3319: I personally would like to see this included in 3x, since that's what I use. How do the rest of you feel about that? > Improve DataImportHandler status response > - > > Key: SOLR-3319 > URL: https://issues.apache.org/jira/browse/SOLR-3319 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 3.5, 4.0 >Reporter: Shawn Heisey >Priority: Minor > Fix For: 3.6, 4.0 > > > The DataImportHandler has some oddities and inconsistencies in its status > response that make it difficult to write code that parses DIH status, > especially if both full-import and delta-import are required. See SOLR-2729. > I would like to have a discussion where we come up with a well-defined and > consistent format that can be used programmatically as well as be human > readable, and then I can implement it, or someone else can if they really > want to. I think it would be very useful if the status response included all > parameters that went into the import request, like echoParams in the query > interface. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2729) DIH status: successful zero-document delta-import missing "" field
[ https://issues.apache.org/jira/browse/SOLR-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246569#comment-13246569 ] Shawn Heisey commented on SOLR-2729: Something that will require a separate issue, perhaps two: I really think that "" is not a good name for the place where this stuff goes, and that "Time Taken " should also be fixed so it has no trailing space. Perhaps the entire status response needs some TLC. Making these changes will break a lot of user code, but it specifically says in the status output that the format is experimental and may change. > DIH status: successful zero-document delta-import missing "" field > -- > > Key: SOLR-2729 > URL: https://issues.apache.org/jira/browse/SOLR-2729 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 3.2 > Environment: Linux idxst0-a 2.6.18-238.12.1.el5.centos.plusxen #1 SMP > Wed Jun 1 11:57:54 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux > java version "1.6.0_26" > Java(TM) SE Runtime Environment (build 1.6.0_26-b03) > Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) >Reporter: Shawn Heisey >Priority: Minor > Fix For: 4.0 > > > If you have a successful delta-import that happens to process zero documents, > the field is not present in the status. I've run into this > situation when the SQL query results in an empty set. A workaround for the > problem is to instead look for the "Time taken " field ... but if you don't > happen to notice that this field has an extraneous space in the name, that > won't work either. > A full-import that processes zero documents has the field present as expected: > Indexing completed. Added/Updated: 0 documents. Deleted 0 > documents. -- This message is automatically generated by JIRA. 
[jira] [Commented] (SOLR-2729) DIH status: successful zero-document delta-import missing "" field
[ https://issues.apache.org/jira/browse/SOLR-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246562#comment-13246562 ] Shawn Heisey commented on SOLR-2729: Found it. In solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java:
{code}
// Do not commit unnecessarily if this is a delta-import and no documents were created or deleted
if (!requestParameters.clean) {
  if (importStatistics.docCount.get() > 0 || importStatistics.deletedDocCount.get() > 0) {
    finish(lastIndexTimeProps);
  }
} else {
  // Finished operation normally, commit now
  finish(lastIndexTimeProps);
}
{code}
The method named finish is where the status message gets updated with the status that says how many documents were added/updated. A fix that would take care of the immediate problem is to move the code that populates the "" part of statusMessages into its own method that is called by finish, then add an else clause to the inner if statement above which calls that method. Does that sound at all reasonable?
> DIH status: successful zero-document delta-import missing "" field > -- > > Key: SOLR-2729 > URL: https://issues.apache.org/jira/browse/SOLR-2729 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 3.2 > Environment: Linux idxst0-a 2.6.18-238.12.1.el5.centos.plusxen #1 SMP > Wed Jun 1 11:57:54 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux > java version "1.6.0_26" > Java(TM) SE Runtime Environment (build 1.6.0_26-b03) > Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) >Reporter: Shawn Heisey >Priority: Minor > Fix For: 4.0 > > > If you have a successful delta-import that happens to process zero documents, > the field is not present in the status. I've run into this > situation when the SQL query results in an empty set. A workaround for the > problem is to instead look for the "Time taken " field ...
but if you don't > happen to notice that this field has an extraneous space in the name, that > won't work either. > A full-import that processes zero documents has the field present as expected: > Indexing completed. Added/Updated: 0 documents. Deleted 0 > documents. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
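The fix proposed in the SOLR-2729 comment above (extract the status-writing step from finish, then call it from a new else clause) can be sketched as follows. This is a simplified stand-in under stated assumptions, not the actual DocBuilder code: the class DeltaStatusSketch, the helper recordSummary, the committed flag, and the key SUMMARY_KEY are all hypothetical names. The real status field name is not reproduced here, so a placeholder key is used.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified model of the control flow quoted from DocBuilder; not the real Solr class. */
public class DeltaStatusSketch {
    // Placeholder key (assumption): stands in for the real status field name.
    static final String SUMMARY_KEY = "summary";

    final Map<String, Object> statusMessages = new LinkedHashMap<String, Object>();
    final AtomicLong docCount = new AtomicLong();
    final AtomicLong deletedDocCount = new AtomicLong();
    boolean committed = false;

    /** Hypothetical helper extracted from finish(): writes only the summary entry. */
    void recordSummary() {
        statusMessages.put(SUMMARY_KEY,
            "Indexing completed. Added/Updated: " + docCount.get()
                + " documents. Deleted " + deletedDocCount.get() + " documents.");
    }

    /** Stand-in for finish(): performs the commit, then records the summary. */
    void finish() {
        committed = true;
        recordSummary();
    }

    /** Mirrors the quoted if/else, with the proposed else clause added. */
    void complete(boolean clean) {
        if (!clean) {
            if (docCount.get() > 0 || deletedDocCount.get() > 0) {
                finish();
            } else {
                // Proposed addition: skip the commit but still report the outcome.
                recordSummary();
            }
        } else {
            // Finished operation normally, commit now.
            finish();
        }
    }
}
```

With this shape, a delta-import that touches zero documents still skips the unnecessary commit, yet the status map carries the same summary entry that a zero-document full-import produces.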
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246247#comment-13246247 ] Shawn Heisey commented on LUCENE-3946: -- Interesting late development. Commenting out "rpm_mode=true" in ant.conf made it work with just "ant test" as the command. If I can figure out how to file a bug with Redhat, I will do so. > improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 3.6 >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0 > > Attachments: LUCENE-3946.patch > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir. suggested work arround is to run "ant --noconfig" but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it) -- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246242#comment-13246242 ] Shawn Heisey commented on LUCENE-3946: -- Robert's comment shows that he saw this: /home/fedora/branch_3x/lucene/build.xml:48: No supported regular expression matcher found: java.lang.ClassNotFoundException: org.apache.tools.ant.util.regexp.Jdk14RegexpMatcher On CentOS 6, there is a ant-apache-regexp package. Poking around the ivy jar with the Classpath Helper in eclipse, I also saw that it will probably need ant-apache-oro and possibly other optional packages. I have installed every optional ant package I could find on mine, and it didn't help. ant-apache-bcel-1.7.1-13.el6.x86_64 ant-javamail-1.7.1-13.el6.x86_64 ant-nodeps-1.7.1-13.el6.x86_64 ant-apache-bsf-1.7.1-13.el6.x86_64 ant-apache-resolver-1.7.1-13.el6.x86_64 ant-commons-net-1.7.1-13.el6.x86_64 ant-contrib-1.0-0.10.b2.el6.noarch ant-commons-logging-1.7.1-13.el6.x86_64 ant-javadoc-1.7.1-13.el6.x86_64 ant-jdepend-1.7.1-13.el6.x86_64 ant-apache-regexp-1.7.1-13.el6.x86_64 ant-trax-1.7.1-13.el6.x86_64 ant-junit-1.7.1-13.el6.x86_64 ant-swing-1.7.1-13.el6.x86_64 ant-jmf-1.7.1-13.el6.x86_64 ant-scripts-1.7.1-13.el6.x86_64 ant-jsch-1.7.1-13.el6.x86_64 ant-apache-oro-1.7.1-13.el6.x86_64 ant-apache-log4j-1.7.1-13.el6.x86_64 ant-1.7.1-13.el6.x86_64 ant-antunit-1.1-4.el6.noarch Checking the java.class.path spit out by the downloaded ant, there appear to be things that it includes that are not available as optional packages. I know from previous experience that filing a bug with CentOS is useless, they'll just tell me to file a bug with Redhat. Since I've never given Redhat a single penny, I will have to research how to file a bug with them. 
> improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 3.6 >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0 > > Attachments: LUCENE-3946.patch > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir. suggested work arround is to run "ant --noconfig" but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246223#comment-13246223 ] Shawn Heisey commented on LUCENE-3946: -- The broken ant in CentOS 6.2: Apache Ant version 1.7.1 compiled on August 24 2010 I don't have a real RHEL 6.x to check this on, it's probably a different date. Downloading, installing, and using ant 1.7.1 fixed it for me. I can actually still call the /usr/bin/ant script in the regular path, but explicitly setting ANT_HOME overrides what it actually uses. When I first found this problem, I did install the apache-ivy package, then when that didn't work, I noticed the ivy-bootstrap option. Trying some of the workarounds mentioned resulted in an error: ncindex@bigindy5 /index/src/trunk/solr $ ant --noconfig resolve Error: Could not find or load main class org.apache.tools.ant.launch.Launcher > improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 3.6 >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0 > > Attachments: LUCENE-3946.patch > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir. 
The suggested workaround is to run "ant --noconfig", but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245781#comment-13245781 ] Shawn Heisey commented on LUCENE-3930: -- bq. Did you start a new terminal session after doing ant ivy-bootstrap? FWIW, I had to restart eclipse before I could continue using the eclipse/ant integration. I did not think of that, as I am not using Eclipse or any other IDE. This is purely commandline via ssh. I did open a new ssh session and try again, no change. > nuke jars from source tree and use ivy > -- > > Key: LUCENE-3930 > URL: https://issues.apache.org/jira/browse/LUCENE-3930 > Project: Lucene - Java > Issue Type: Task > Components: general/build >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3930-skip-sources-javadoc.patch, > LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, > LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, > LUCENE-3930__ivy_bootstrap_target.patch, > LUCENE-3930_includetestlibs_excludeexamplexml.patch, > ant_-verbose_clean_test.out.txt, langdetect-1.1.jar, > noggit-commons-csv.patch, patch-jetty-build.patch, pom.xml > > > As mentioned on the ML thread: "switch jars to ivy mechanism?". -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245698#comment-13245698 ] Shawn Heisey commented on LUCENE-3930: -- After printing out all the echo statements under ivy-availability, it spits out this: ivy-fail: BUILD FAILED /index/src/trunk/build.xml:42: The following error occurred while executing this line: /index/src/trunk/lucene/common-build.xml:584: The following error occurred while executing this line: /index/src/trunk/lucene/common-build.xml:298: Ivy is not available - By adding to the validate section of build.xml, I got it to print out the java classpath, which includes the jar downloaded by the ivy-bootstrap option: [echo] /usr/share/java/ant.jar:/usr/share/java/ant-launcher.jar:/usr/share/java/jaxp_parser_impl.jar:/usr/share/java/xml-commons-apis.jar:/usr/share/java/junit.jar:/usr/share/java/ant/ant-junit.jar:/usr/share/java/ant/ant-nodeps.jar:/usr/lib/jvm/java/lib/tools.jar:/home/ncindex/.ant/lib/ivy-2.2.0.jar:/usr/share/ant/lib/ant-bootstrap.jar:/usr/share/ant/lib/ant-junit.jar:/usr/share/ant/lib/ant-nodeps.jar:/usr/share/ant/lib/ant-launcher.jar:/usr/share/ant/lib/ant.jar > nuke jars from source tree and use ivy > -- > > Key: LUCENE-3930 > URL: https://issues.apache.org/jira/browse/LUCENE-3930 > Project: Lucene - Java > Issue Type: Task > Components: general/build >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3930-skip-sources-javadoc.patch, > LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, > LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, > LUCENE-3930__ivy_bootstrap_target.patch, > LUCENE-3930_includetestlibs_excludeexamplexml.patch, > ant_-verbose_clean_test.out.txt, langdetect-1.1.jar, > noggit-commons-csv.patch, patch-jetty-build.patch, pom.xml > > > As mentioned on the ML thread: "switch jars to ivy mechanism?". -- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245658#comment-13245658 ] Shawn Heisey commented on LUCENE-3930: -- I can't get either branch_3x or trunk to build now, on a system that used to build branch_3x without complaint. It says that ivy is not available, even after doing "ant ivy-bootstrap" to download ivy into the home directory. Specifically I am trying to build solrj from trunk, but I can't even get "ant" in the root directory of the checkout to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org SRPMs. Ant (1.7.1) and junit are installed from package repositories. Building a checkout of lucene_solr_3_5 on the same machine works fine. > nuke jars from source tree and use ivy > -- > > Key: LUCENE-3930 > URL: https://issues.apache.org/jira/browse/LUCENE-3930 > Project: Lucene - Java > Issue Type: Task > Components: general/build >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3930-skip-sources-javadoc.patch, > LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, > LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, > LUCENE-3930__ivy_bootstrap_target.patch, > LUCENE-3930_includetestlibs_excludeexamplexml.patch, > ant_-verbose_clean_test.out.txt, langdetect-1.1.jar, > noggit-commons-csv.patch, patch-jetty-build.patch, pom.xml > > > As mentioned on the ML thread: "switch jars to ivy mechanism?". -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2124) SEVERE exceptions are being logged for expected PingRequestHandler SERVICE_UNAVAILABLE exceptions
[ https://issues.apache.org/jira/browse/SOLR-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232766#comment-13232766 ] Shawn Heisey commented on SOLR-2124: Thank you, James! I am really looking forward to 3.6. > SEVERE exceptions are being logged for expected PingRequestHandler > SERVICE_UNAVAILABLE exceptions > - > > Key: SOLR-2124 > URL: https://issues.apache.org/jira/browse/SOLR-2124 > Project: Solr > Issue Type: Bug >Reporter: Hoss Man >Assignee: James Dyer >Priority: Minor > Fix For: 3.6, 4.0 > > Attachments: SOLR-2124.patch > > > As reported by a user, if you use the PingRequestHandler, and the > corrisponding helthcheck file doesn't exist (and expected situation when a > server is out of rotation) Solr is logging a SEVERE error... > {noformat} > SEVERE: org.apache.solr.common.SolrException: Service disabled > at > org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:48) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1324) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157) > at > org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388) > at > org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) > at > org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) > at > org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418) > at > org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) > at > org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) > at > 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) > at org.mortbay.jetty.Server.handle(Server.java:326) > at > org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) > at > org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923) > at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547) > at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) > at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) > at > org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) > at > org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) > {noformat} > This is in spite of the fact that PingRequestHandler explicitly sets the > "alreadyLogged" boolean to true in the SolrException constructor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3219) StreamingUpdateSolrServer is not quiet at INFO, but CommonsHttpSolrServer is
[ https://issues.apache.org/jira/browse/SOLR-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226584#comment-13226584 ] Shawn Heisey commented on SOLR-3219: It's probably not a bug, then. A side question, if you happen to know... how can I get milliseconds in my timestamps? Can you tell me how to turn off SUSS logging? I have a logging.properties file with the following in it:
# Logging level
.level=INFO
# Write to a file
handlers = java.util.logging.FileHandler
# Write log messages in human readable format:
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
java.util.logging.ConsoleHander.formatter = java.util.logging.SimpleFormatter
# Log to the log subdirectory, with log files named idxbuild_log-n.log
java.util.logging.FileHandler.pattern = ./log/idxbuild_log-%g.log
java.util.logging.FileHandler.append = true
java.util.logging.FileHandler.count = 10
java.util.logging.FileHandler.limit = 4194304
> StreamingUpdateSolrServer is not quiet at INFO, but CommonsHttpSolrServer is > > > Key: SOLR-3219 > URL: https://issues.apache.org/jira/browse/SOLR-3219 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 3.5, 4.0 >Reporter: Shawn Heisey >Priority: Minor > > When using CommonsHttpSolrServer, nothing gets logged by SolrJ at the INFO > level. When using StreamingUpdateSolrServer, I have seen two messages logged > each time it is used: > Mar 08, 2012 4:41:01 PM > org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner run > INFO: starting runner: > org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@6bf28508 > Mar 08, 2012 4:41:01 PM > org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner run > INFO: finished: > org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@6bf28508 > I think one of these behaviors should be considered a bug. My preference is > to move the logging in SUSS out of INFO so it is silent like CHSS.
If the > decision is to leave it at INFO, I'll just live with it. A knob to make it > configurable would be cool, but that's probably a fair amount of work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
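Both questions in the SOLR-3219 comment above (millisecond timestamps, and silencing the SUSS messages) can probably be handled on the client side with plain java.util.logging. The sketch below rests on two assumptions: that SolrJ's messages reach java.util.logging (the SimpleFormatter-style output quoted above suggests so), and that the logger is named after the outer StreamingUpdateSolrServer class rather than its Runner inner class. The class name MillisFormatter and the method raiseThreshold are hypothetical, chosen for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Minimal java.util.logging formatter that prints timestamps with milliseconds. */
public class MillisFormatter extends Formatter {
    // Assumed logger name: the outer class seen in the quoted log output.
    static final String SUSS_LOGGER =
        "org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer";

    @Override
    public String format(LogRecord record) {
        // SimpleDateFormat is not thread-safe, so build one per call.
        SimpleDateFormat stamp =
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT);
        return stamp.format(new Date(record.getMillis()))
            + " " + record.getLevel().getName()
            + " " + record.getLoggerName()
            + " " + formatMessage(record)
            + System.lineSeparator();
    }

    /** Raises the named logger to WARNING so its INFO messages are dropped. */
    static Logger raiseThreshold(String loggerName) {
        Logger logger = Logger.getLogger(loggerName);
        logger.setLevel(Level.WARNING);
        // The caller should keep the returned reference: java.util.logging
        // holds loggers weakly, so an unreferenced logger can lose its level.
        return logger;
    }
}
```

The same effect should be reachable without code from the logging.properties file quoted above: a line of the form org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer.level = WARNING sets the per-logger level, and java.util.logging.FileHandler.formatter = MillisFormatter selects the formatter, provided the class is on the classpath that the LogManager uses.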
[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8
[ https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225685#comment-13225685 ] Shawn Heisey commented on SOLR-3159: bq. The pile of config files in etc are for the various features you can enable – this just includes the ones I think we need I checked out trunk from SVN and took a look at the example etc directory. It only includes jetty.xml and webdefault.xml, so once things on this issue have settled down, I'll compare my config with yours and go ahead with an upgrade on my test server. > Upgrade to Jetty 8 > -- > > Key: SOLR-3159 > URL: https://issues.apache.org/jira/browse/SOLR-3159 > Project: Solr > Issue Type: Task >Reporter: Ryan McKinley >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-3159-maven.patch > > > Solr is currently tested (and bundled) with a patched jetty-6 version. > Ideally we can release and test with a standard version. > Jetty-6 (at codehaus) is just maintenance now. New development and > improvements are now hosted at eclipse. Assuming performance is equivalent, > I think we should switch to Jetty 8. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8
[ https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224505#comment-13224505 ] Shawn Heisey commented on SOLR-3159: I asked this on the mailing list and got zero response. I would like to know if there are any good reasons to upgrade to Jetty 8 with an older release, specifically 3.5.0. Also, the Jetty 8 distribution has a fair number of config files in etc, but the example Solr only has jetty.xml and webdefault.xml. What sort of recommendations do you have as far as config changes when upgrading? I am using the JDK, so would I be OK with the presence of JSP in the pre-trunk version? Solr is not directly reachable from the outside world; it is used on the internal network by our web application. > Upgrade to Jetty 8 > -- > > Key: SOLR-3159 > URL: https://issues.apache.org/jira/browse/SOLR-3159 > Project: Solr > Issue Type: Task >Reporter: Ryan McKinley >Priority: Minor > Fix For: 4.0 > > > Solr is currently tested (and bundled) with a patched jetty-6 version. > Ideally we can release and test with a standard version. > Jetty-6 (at codehaus) is just maintenance now. New development and > improvements are now hosted at eclipse. Assuming performance is equivalent, > I think we should switch to Jetty 8.
[jira] [Commented] (SOLR-2204) Cross-version replication broken by new javabin format
[ https://issues.apache.org/jira/browse/SOLR-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196998#comment-13196998 ] Shawn Heisey commented on SOLR-2204: bq. I think there's an acceptable workaround for that too. You disable replication, then upgrade master, then upgrade slave(s) and then enable replication again. Disabling/enabling polling can be done over HTTP so it's easily scriptable. Yes, there are workarounds, but they force you into a rushed upgrade. While upgrading the master, the index is not being updated. While upgrading the slaves, it's running single-stranded. Ideally you'd want hours or days for semi-production testing, a luxury you don't get if you can't upgrade one slave first and replicate from the older version on the master. If the application has enough volume to require the services of multiple slaves, there will be application downtime. Of course any sane company has built-in maintenance outage times, but how many non-technical management people are actually sane when it comes to this? Mine are extremely intolerant of downtime, even if it's planned. When I upgraded one of my distributed indexes, it was about two weeks before I was ready to declare it a success and upgrade the other one. I had already been testing on a dev box with a limited shard collection for several weeks before that. I did find a few problems, and I was able to simply disable the index and fall back to the 1.4.1 version that I was keeping updated separately. 
> Cross-version replication broken by new javabin format > -- > > Key: SOLR-2204 > URL: https://issues.apache.org/jira/browse/SOLR-2204 > Project: Solr > Issue Type: Bug > Components: replication (java) >Affects Versions: 3.1 > Environment: Linux idxst0-a 2.6.18-194.3.1.el5.centos.plusxen #1 SMP > Wed May 19 09:59:34 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux > java version "1.6.0_20" > Java(TM) SE Runtime Environment (build 1.6.0_20-b02) > Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >Reporter: Shawn Heisey > Fix For: 3.6, 4.0 > > Attachments: SOLR-2204.patch, SOLR-2204.patch > > > Slave server is branch_3x, revision 1027974. Master server is 1.4.1. > Replication fails because of the new javabin format. > SEVERE: Master at: http://HOST:8983/solr/live/replication is not available. > Index fetch failed. Exception: Invalid version or the data in not in > 'javabin' format > Switching Solr's internally generated requests to XML, or adding support for > both javabin versions would get rid of this problem. I do not know how to do > either of these things.
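The workaround quoted in the comment above (disable polling, upgrade, re-enable polling) can be scripted against the slave's replication handler. What follows is a minimal sketch, not code from the issue: it assumes the handler lives at a URL shaped like the one in the error message (http://HOST:8983/solr/live/replication) and uses the ReplicationHandler's `disablepoll` and `enablepoll` commands. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical helper for illustration; not part of Solr. */
class ReplicationPollToggle {

    /** Builds the command URL for a slave's replication handler. */
    static String commandUrl(String handlerUrl, boolean enable) {
        String command = enable ? "enablepoll" : "disablepoll";
        // Append with '&' if the handler URL already carries a query string.
        String separator = handlerUrl.contains("?") ? "&" : "?";
        return handlerUrl + separator + "command=" + command;
    }

    /** Sends the command to the slave and returns the HTTP status code. */
    static int send(String handlerUrl, boolean enable) throws IOException {
        URL url = new URL(commandUrl(handlerUrl, enable));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ReplicationPollToggle <handlerUrl> <enable|disable>");
            return;
        }
        boolean enable = args[1].equals("enable");
        System.out.println("HTTP " + send(args[0], enable));
    }
}
```

Running it with `disable` before the upgrade and `enable` afterwards covers the scripted part of the workaround. It does nothing about the rushed-upgrade concern raised in the comment.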
[jira] [Commented] (SOLR-2204) Cross-version replication broken by new javabin format
[ https://issues.apache.org/jira/browse/SOLR-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196185#comment-13196185 ] Shawn Heisey commented on SOLR-2204: bq. Thinking more about this back compat thing, there is of course a known workaround, namely using XML. I just did that at a customer to be able to use v1.4.0 client towards v3.4 server. Of course that is less efficient and would not work for replication, but then I don't really see the usecase for cross version replication? Imagine this (extremely common) scenario: you've got one master and one or more slaves, and possibly a replication forwarder or two, all running unmodified 1.4.x. You want to upgrade. Your build software is only capable of updating one master server and the person who wrote it found themselves a new job. Without cross-version replication, how do you do this upgrade? I don't think it's possible. I've got two copies of my index, purely for redundancy purposes. As I already mentioned, I solved this problem when I went from 1.4.1 to 3.2.0 by making my software capable of updating two indexes in parallel. It was not a trivial undertaking. > Cross-version replication broken by new javabin format > -- > > Key: SOLR-2204 > URL: https://issues.apache.org/jira/browse/SOLR-2204 > Project: Solr > Issue Type: Bug > Components: replication (java) >Affects Versions: 3.1 > Environment: Linux idxst0-a 2.6.18-194.3.1.el5.centos.plusxen #1 SMP > Wed May 19 09:59:34 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux > java version "1.6.0_20" > Java(TM) SE Runtime Environment (build 1.6.0_20-b02) > Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >Reporter: Shawn Heisey > Fix For: 3.6, 4.0 > > Attachments: SOLR-2204.patch, SOLR-2204.patch > > > Slave server is branch_3x, revision 1027974. Master server is 1.4.1. > Replication fails because of the new javabin format. > SEVERE: Master at: http://HOST:8983/solr/live/replication is not available. 
> Index fetch failed. Exception: Invalid version or the data in not in > 'javabin' format > Switching Solr's internally generated requests to XML, or adding support for > both javabin versions would get rid of this problem. I do not know how to do > either of these things.
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195148#comment-13195148 ] Shawn Heisey commented on SOLR-2906: bq. Did you ever update the Wiki with this new functionality? That'd be awesome Yes, I added LFUCache and the timeDecay option to the SolrCaching Wiki page. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Assignee: Erick Erickson >Priority: Minor > Fix For: 3.6, 4.0 > > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, > SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192357#comment-13192357 ] Shawn Heisey commented on SOLR-1632: Is this something that can be added to branch_3x? With high fuzz and ignore whitespace, the patch applies, but then fails to compile. It also fails to compile when I set fuzz to zero, pay attention to whitespace, and manually fix the patch rejects. I couldn't figure out how to fix the problems. > Distributed IDF > --- > > Key: SOLR-1632 > URL: https://issues.apache.org/jira/browse/SOLR-1632 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.5 >Reporter: Andrzej Bialecki > Attachments: SOLR-1632.patch, SOLR-1632.patch, distrib-2.patch, > distrib.patch > > > Distributed IDF is a valuable enhancement for distributed search across > non-uniform shards. This issue tracks the proposed implementation of an API > to support this functionality in Solr. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174475#comment-13174475 ] Shawn Heisey commented on SOLR-2906: I must be dense. I can figure out how to add the timeDecay option, but I can't figure out what section of code to enable/disable based on the value of timeDecay. I've gone as far as doing a diff on my Nov 24th patch and the Dec 20th patch from Erick. (doing diffs on diffs ... the world is going to explode!) The only differences I can see between the two are in whitespace/formatting. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Assignee: Erick Erickson >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, > TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173345#comment-13173345 ] Shawn Heisey commented on SOLR-2906: bq. Could you add in the optional time decay as Yonik suggests? I agree that it seems like the right thing is to have this on by default. At that point, I think it'll be ready to check in. We can add documentation as we can. I've looked at what Yonik has said and cannot figure out what I'd have to do. I'm not completely ignorant, but there is a lot that I don't know. I am amazed I was able to get this put together at all. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Assignee: Erick Erickson >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, > TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166407#comment-13166407 ] Shawn Heisey commented on SOLR-2906: Some additional info: The index is composed of six large shard cores and a small one, running on two servers. The total index size on each server (CentOS 6, java 1.6.0_29) is about 60GB. Solr/Jetty has an 8GB heap. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > SOLR-2906.patch, TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166399#comment-13166399 ] Shawn Heisey commented on SOLR-2906: I finally got a chance to do some testing in production. I have two distributed index chains, both running 3.5.0 with this patch and the one from SOLR-1972 applied. The chains are updated independently; there is no replication. It's not a truly definitive test, because my queries are load balanced between the two chains and I do have some hardware discrepancies. I cannot create a valid test environment to compare the two caches as they should be compared, with identical queries going to two completely identical servers. On average, commits made on chain A, where filterCache is set to LFU and the servers have 48GB of RAM, happen faster than those on chain B, with FastLRU and 64GB of RAM. One of the two servers on chain A has slightly faster processors than its counterpart on chain B -- 2.83 GHz vs. 2.66 GHz. The other two servers on both chains have 2.5 GHz processors. I suspect that most of the potential gains that the LFU algorithm might be able to provide are swallowed by the very inefficient implementation. If anyone has some thoughts for me to pursue, I will be happy to do so, but I am out of my own ideas. I hope the patch will be committed. It could use a lot of optimization and there's probably cosmetic cleanup to do. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > SOLR-2906.patch, TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156273#comment-13156273 ] Shawn Heisey commented on SOLR-2906: Possibly false alarm. Although I still do not know what causes the discrepancy between inserts and size on the filter cache, I can confirm that exactly the same thing happens when I change it to FastLRUCache, restart Solr, and fire up the benchmarking script. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155296#comment-13155296 ] Shawn Heisey commented on SOLR-2906: I can't reproduce it with LFUCache or FastLRUCache by manually sending invalid queries, so that's the wrong idea. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155290#comment-13155290 ] Shawn Heisey commented on SOLR-2906: I might have figured out the problem, and if I have, the cache code is fine. I just checked the log from my most recent run and have found that there are two errors from invalid filter queries. I think this means that when a filter is invalid, inserts gets incremented but size doesn't. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155280#comment-13155280 ] Shawn Heisey commented on SOLR-2906: The only static warming I have is a *:* search with a sort parameter (and no filter query), which precaches my sort. I do have autowarming configured, but this behavior happens from initial Solr startup. Things seem to behave correctly with warming - size is autowarmCount, inserts are zero.

FastLRUCache behavior:
queryResultCache: size is one higher than inserts
documentCache: size is one higher than inserts
filterCache: size and inserts are identical

LFUCache behavior:
queryResultCache: size is one higher than inserts
documentCache: size is one higher than inserts
filterCache: inserts is higher than size by a variable amount

I've seen 10 (on two different runs) and 2 (on the most recent run) as the difference between inserts and size. The FastLRUCache behavior is seen on my production servers with production queries, the LFUCache behavior is on my test server with a benchmark script providing the queries. I suppose there might be something weird about my canned queries that makes the filterCache behave differently, but the original source of the queries was a production Solr log at level INFO.
> Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155258#comment-13155258 ] Shawn Heisey commented on SOLR-2906: Something odd happens with the filterCache. When things are first starting off, the cache size and the number of inserts don't match up. It's usually off by 10, with more in the inserts. This doesn't seem to happen with the other cache types, also using LFU. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, > TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile
[ https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154710#comment-13154710 ] Shawn Heisey commented on SOLR-1972: Based on filenames, I couldn't find an existing unit test that checks handler statistics, so I couldn't figure out how to make a test for this patch. I am very interested in getting this included in branch_3x. If you have some example code I can look at to create unit tests, I can look into making one. > Need additional query stats in admin interface - median, 95th and 99th > percentile > - > > Key: SOLR-1972 > URL: https://issues.apache.org/jira/browse/SOLR-1972 > Project: Solr > Issue Type: Improvement >Affects Versions: 1.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, > SOLR-1972.patch, elyograg-1972-3.2.patch, elyograg-1972-3.2.patch, > elyograg-1972-trunk.patch, elyograg-1972-trunk.patch > > > I would like to see more detailed query statistics from the admin GUI. This > is what you can get now: > requests : 809 > errors : 0 > timeouts : 0 > totalTime : 70053 > avgTimePerRequest : 86.59209 > avgRequestsPerSecond : 0.8148785 > I'd like to see more data on the time per request - median, 95th percentile, > 99th percentile, and any other statistical function that makes sense to > include. In my environment, the first bunch of queries after startup tend to > take several seconds each. I find that the average value tends to be useless > until it has several thousand queries under its belt and the caches are > thoroughly warmed. The statistical functions I have mentioned would quickly > eliminate the influence of those initial slow queries. > The system will have to store individual data about each query. I don't know > if this is something Solr does already. It would be nice to have a > configurable count of how many of the most recent data points are kept, to > control the amount of memory the feature uses. 
The default value could be > something like 1024 or 4096.
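The request in the issue description (keep a configurable number of recent data points and report the median, 95th and 99th percentile) can be sketched roughly as below. This is an illustration only, not the attached patch: the class name, the ring-buffer layout and the nearest-rank percentile definition are all assumptions chosen for brevity.

```java
import java.util.Arrays;

/** Hypothetical sketch: a fixed-size window of recent query times. */
class RecentQueryTimes {
    private final long[] samples; // ring buffer of elapsed times in ms
    private int count;            // number of valid samples (at most capacity)
    private int next;             // index the next sample is written to

    RecentQueryTimes(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        samples = new long[capacity];
    }

    /** Records one query time, overwriting the oldest once the buffer is full. */
    synchronized void record(long elapsedMillis) {
        samples[next] = elapsedMillis;
        next = (next + 1) % samples.length;
        if (count < samples.length) {
            count++;
        }
    }

    /** Nearest-rank percentile over the retained samples; p is in (0, 100]. */
    synchronized long percentile(double p) {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * count / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        RecentQueryTimes times = new RecentQueryTimes(1024); // default size suggested in the issue
        for (long t : new long[] {5000, 4200, 90, 85, 80, 88, 92}) {
            times.record(t);
        }
        System.out.println("median=" + times.percentile(50)
                + " p95=" + times.percentile(95)
                + " p99=" + times.percentile(99));
    }
}
```

Because old samples fall out of the window, the slow queries seen right after startup stop influencing the figures once enough newer queries have been recorded, which is the effect the description asks for. Sorting a copy on every read is simple rather than fast.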
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154514#comment-13154514 ] Shawn Heisey commented on SOLR-2906: bq. shawn, is it possible to upload a diff file (patch). These are all new files, no files changed. "svn diff" returns nothing. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154465#comment-13154465 ] Shawn Heisey commented on SOLR-2906: All known bugs found and fixed, unit test looks correct and passes. This was created against branch_3x, but trunk probably won't be much different. IMHO, ready for review and possible inclusion. The javadoc and other comments were reviewed and modified, but not closely. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, LFUCache.java, TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154344#comment-13154344 ] Shawn Heisey commented on SOLR-2906: I've re-added lastAccessed to the class, as a tiebreaker when hitcount is equal. The test method prints out leastUsedItems and mostUsedItems. Somehow, item number 50 is included in both. > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, ConcurrentLFUCache.java, > ConcurrentLFUCache.java, FastLFUCache.java, FastLFUCache.java, LFUCache.java, > TestLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
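The tiebreaker described in the comment above (fall back to lastAccessed when hit counts are equal) can be expressed as a two-level ordering. The sketch below is an assumption about the intent, not the code in the attached ConcurrentLFUCache.java; the `Entry` type and its field names are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of an LFU eviction order with a recency tiebreaker. */
class LfuEvictionOrder {

    /** Stand-in for a cache entry; field names are assumptions. */
    static final class Entry {
        final String key;
        final long hits;          // how often the entry was read
        final long lastAccessed;  // larger means more recently used

        Entry(String key, long hits, long lastAccessed) {
            this.key = key;
            this.hits = hits;
            this.lastAccessed = lastAccessed;
        }
    }

    /** Fewest hits first; among equal hit counts, least recently accessed first. */
    static final Comparator<Entry> EVICT_FIRST =
            Comparator.comparingLong((Entry e) -> e.hits)
                      .thenComparingLong(e -> e.lastAccessed);

    /** Returns the keys of the n entries that would be evicted first. */
    static List<String> leastUsed(List<Entry> entries, int n) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(EVICT_FIRST);
        int limit = Math.max(0, Math.min(n, sorted.size()));
        List<String> keys = new ArrayList<>();
        for (Entry e : sorted.subList(0, limit)) {
            keys.add(e.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("a", 3, 1),
                new Entry("b", 1, 5),
                new Entry("c", 1, 2),
                new Entry("d", 7, 9));
        System.out.println(leastUsed(entries, 2));
    }
}
```

A full sort is used here only to make the ordering easy to see; a real cache sweep would want something cheaper.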
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153335#comment-13153335 ] Shawn Heisey commented on SOLR-2906: Would it be possible to adapt the lastAccessed shortcut to LFU, or would it be simply best to remove that section of code? > Implement LFU Cache > --- > > Key: SOLR-2906 > URL: https://issues.apache.org/jira/browse/SOLR-2906 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 3.4 >Reporter: Shawn Heisey >Priority: Minor > Attachments: ConcurrentLFUCache.java, FastLFUCache.java > > > Implement an LFU (Least Frequently Used) cache as the first step towards a > full ARC cache -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153288#comment-13153288 ]

Shawn Heisey commented on SOLR-2906:

bq. But really, it seems like you should disregard all the algorithmic stuff in LRU when implementing LFU. If you think you see a bug in the existing LRU stuff, you're going to have to spell it out for me a bit more.

I can't actually say that there is a bug, but I'm really confused by what lastAccessed(Copy) in the LRU code actually is and how it works, and by what the following pieces of markAndSweep are doing with it, given that wantToKeep and wantToRemove are entry counts:

{code}
long thisEntry = ce.lastAccessedCopy;
if (thisEntry > newestEntry - wantToKeep) {
} else if (thisEntry < oldestEntry + wantToRemove) { // entry in bottom group?
{code}

> Implement LFU Cache
> ---
>
> Key: SOLR-2906
> URL: https://issues.apache.org/jira/browse/SOLR-2906
> Project: Solr
> Issue Type: Sub-task
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
> Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache
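One reading of the excerpt quoted above, offered as a hedged sketch rather than a statement about the actual Solr source: if every lastAccessed value is a distinct stamp drawn from a single incrementing counter, then at most wantToKeep distinct stamps can be greater than newestEntry - wantToKeep. Any entry that passes that test must therefore be among the wantToKeep newest entries, and the same counting argument applies to the bottom group. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class StampShortcut {
  /** Stamps passing "stamp > newest - wantToKeep". Assumes all stamps are distinct. */
  static List<Long> surelyKept(long[] stamps, int wantToKeep) {
    long newest = Long.MIN_VALUE;
    for (long s : stamps) {
      newest = Math.max(newest, s);
    }
    List<Long> kept = new ArrayList<>();
    for (long s : stamps) {
      if (s > newest - wantToKeep) {
        kept.add(s);
      }
    }
    return kept;
  }

  /** Stamps passing "stamp < oldest + wantToRemove". Assumes all stamps are distinct. */
  static List<Long> surelyRemoved(long[] stamps, int wantToRemove) {
    long oldest = Long.MAX_VALUE;
    for (long s : stamps) {
      oldest = Math.min(oldest, s);
    }
    List<Long> removed = new ArrayList<>();
    for (long s : stamps) {
      if (s < oldest + wantToRemove) {
        removed.add(s);
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    // Distinct stamps with gaps, as an incrementing access counter would leave behind.
    long[] stamps = {3, 8, 9, 15, 16, 17, 40, 41};
    System.out.println(surelyKept(stamps, 3));    // [40, 41]
    System.out.println(surelyRemoved(stamps, 3)); // [3]
  }
}
```

Because of the gaps, the range test can classify fewer entries than requested (two kept and one removed here, with three asked for each). It never classifies more than requested, so entries that fall between the two ranges would need some further handling.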
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153249#comment-13153249 ]

Shawn Heisey commented on SOLR-2906:

I've been trying to find my bug. Looking back at the original LRU implementation, I have no idea how it's working. When a CacheEntry is created in the LRU code, one of the values sent in is an incremented stats.accessCounter, which gets called lastAccessed in the new object. When it is later used in markAndSweep, it is used in simple math along with the number of items that we want to keep/remove. This is very confusing, and I can't see how it could ever work. It might be that part of the code is simply skipped because of the way the math happens to work out.

When I changed it around to use hitcounts, those counts were also used in the previously mentioned simple math, and I believe that results in some very weird behavior, such as removing most (or possibly all) of the cache entries.

It appears that this idea is a lot more complicated than I originally thought, and that the current code needs to be at least partially rewritten.

> Implement LFU Cache
> ---
>
> Key: SOLR-2906
> URL: https://issues.apache.org/jira/browse/SOLR-2906
> Project: Solr
> Issue Type: Sub-task
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
> Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache
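The weird behavior described in the comment above is what one would expect if a range test of the form "value < lowest + wantToRemove" is applied to hit counts, because hit counts, unlike stamps from an incrementing counter, are not distinct. The following is a sketch under that assumption, with hypothetical names, and is not taken from the attached files: when many entries share the lowest hit count, all of them fall inside the range together.

```java
import java.util.Arrays;

class HitCountShortcut {
  /** Counts the entries that pass "hits < lowestHits + wantToRemove". */
  static int removedByRangeTest(long[] hits, int wantToRemove) {
    long lowest = Long.MAX_VALUE;
    for (long h : hits) {
      lowest = Math.min(lowest, h);
    }
    int removed = 0;
    for (long h : hits) {
      if (h < lowest + wantToRemove) {
        removed++;
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    // 64 entries: three popular ones, and 61 that were each hit exactly once.
    long[] hits = new long[64];
    Arrays.fill(hits, 1);
    hits[0] = 9;
    hits[1] = 12;
    hits[2] = 30;
    // Only 6 removals were requested, but every entry with hits < 1 + 6 qualifies.
    System.out.println(removedByRangeTest(hits, 6)); // 61
  }
}
```

A range shortcut of this kind is only safe when the compared values are unique. For hit counts, a sweep would need to cap the number of removals explicitly or sort the candidates.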
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153082#comment-13153082 ]

Shawn Heisey commented on SOLR-2906:

I will fully admit that I built the new cache type from the old code without really understanding what the code was doing, and now I am out of my depth.

> Implement LFU Cache
> ---
>
> Key: SOLR-2906
> URL: https://issues.apache.org/jira/browse/SOLR-2906
> Project: Solr
> Issue Type: Sub-task
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
> Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153080#comment-13153080 ]

Shawn Heisey commented on SOLR-2906:

Evictions definitely don't seem to be working right. I finally got a benchmark script going. I watched as the size of the filterCache climbed to the maximum size of 64. On the next insert, it was suddenly only 7 entries, and the eviction counter had incremented from 290 to 348. That seems really aggressive. I seem to have done something wrong.

Once I got through with the benchmark script, I did a commit; at that moment the size was 50 out of 64. After warming (autowarmCount 16), the cache size was 12, and both the hits and lookups were -12 (negative).

> Implement LFU Cache
> ---
>
> Key: SOLR-2906
> URL: https://issues.apache.org/jira/browse/SOLR-2906
> Project: Solr
> Issue Type: Sub-task
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
> Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache
[jira] [Commented] (SOLR-2906) Implement LFU Cache
[ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153002#comment-13153002 ]

Shawn Heisey commented on SOLR-2906:

I used branch_3x to create the above files. I haven't even looked at trunk.

> Implement LFU Cache
> ---
>
> Key: SOLR-2906
> URL: https://issues.apache.org/jira/browse/SOLR-2906
> Project: Solr
> Issue Type: Sub-task
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
> Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152641#comment-13152641 ]

Shawn Heisey commented on SOLR-2889:

My original approach wasn't working well, which is why I said I wasn't going to be able to do it. Today I took a different approach, and the changes were pretty easy. I just made copies of ConcurrentLRUCache.java and FastLRUCache.java, then renamed and massaged them into LFU versions. The heart of what I did was to replace lastAccessed with an AtomicLong named hits.

It does work as a cache for some simple hand-entered queries, but I need to do some more extensive testing to see if evictions and warming are working as expected before I upload it. I think I'll temporarily stick in some println statements to watch what it's doing.

Some other things that need to be done that I'm not sure I'm qualified for (but I will attempt):
- Test code.
- Abstracting out the common parts into parent classes.

> Implement Adaptive Replacement Cache
>
> Key: SOLR-2889
> URL: https://issues.apache.org/jira/browse/SOLR-2889
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important. There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.
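The change described in the comment above (a per-entry AtomicLong named hits in place of lastAccessed) might look roughly like the sketch below. It is not the uploaded code. `HitCountingCache` and its methods are hypothetical, and eviction and warming are deliberately left out so that only the hit counting is shown.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class HitCountingCache<K, V> {
  /** Hypothetical entry type: the cached value plus a thread-safe hit counter. */
  static final class CacheEntry<T> {
    final T value;
    final AtomicLong hits = new AtomicLong();

    CacheEntry(T value) {
      this.value = value;
    }
  }

  private final ConcurrentHashMap<K, CacheEntry<V>> map = new ConcurrentHashMap<>();

  void put(K key, V value) {
    map.put(key, new CacheEntry<>(value)); // a replaced key starts again at zero hits
  }

  /** Returns the cached value, or null on a miss; each successful lookup counts as a hit. */
  V get(K key) {
    CacheEntry<V> e = map.get(key);
    if (e == null) {
      return null;
    }
    e.hits.incrementAndGet(); // atomic, so concurrent readers do not lose updates
    return e.value;
  }

  /** Returns the hit count for a key without counting as a hit itself. */
  long hits(K key) {
    CacheEntry<V> e = map.get(key);
    return e == null ? 0 : e.hits.get();
  }

  public static void main(String[] args) {
    HitCountingCache<String, String> cache = new HitCountingCache<>();
    cache.put("fq=type:book", "docset-1");
    cache.get("fq=type:book");
    cache.get("fq=type:book");
    System.out.println(cache.hits("fq=type:book")); // 2
  }
}
```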
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149949#comment-13149949 ]

Shawn Heisey commented on SOLR-2889:

After a close look, I find overall understanding elusive, and I have been told by my employer not to spend a lot of time on it. It must be relegated to my spare time, which is pretty scarce.

> Implement Adaptive Replacement Cache
>
> Key: SOLR-2889
> URL: https://issues.apache.org/jira/browse/SOLR-2889
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important. There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149236#comment-13149236 ]

Shawn Heisey commented on SOLR-2889:

bq. ConcurrentLRUCache has been moved out of SolrJ and into Solr Core in trunk and 3x in SOLR-2758. Which source code checkout are you looking at?

I'm looking at 3.4.0, the version I'm running.

> Implement Adaptive Replacement Cache
>
> Key: SOLR-2889
> URL: https://issues.apache.org/jira/browse/SOLR-2889
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important. There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149179#comment-13149179 ]

Shawn Heisey commented on SOLR-2889:

FastLRUCache uses ConcurrentLRUCache, which includes a full class for a cache entry. A new member could be added to CacheEntry pretty easily to track usage, but the rest of the code would have to be modified to use it. Making sure it's all thread-safe would probably be the hard part.

LRUCache relies on the alternate sort order on LinkedHashMap, so it would not be as simple to add usage tracking.

Something I noticed along the way: The solrj tree seems like an odd place for ConcurrentLRUCache, because nothing else in that section uses it (directly at least).

> Implement Adaptive Replacement Cache
>
> Key: SOLR-2889
> URL: https://issues.apache.org/jira/browse/SOLR-2889
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important. There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.
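For readers unfamiliar with the "alternate sort order" on LinkedHashMap mentioned in the comment above: the JDK class can iterate in access order instead of insertion order, which is enough to build an LRU cache in a few lines. The sketch below shows that general technique. It is not Solr's LRUCache, and the class name `AccessOrderLru` is made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AccessOrderLru<K, V> extends LinkedHashMap<K, V> {
  private final int maxSize;

  AccessOrderLru(int maxSize) {
    // The third argument selects access order: get() and put() move an entry
    // to the end, so the first entry is always the least recently used one.
    super(16, 0.75f, true);
    this.maxSize = maxSize;
  }

  /** Called by LinkedHashMap after each insertion; returning true drops the eldest entry. */
  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    return size() > maxSize;
  }

  public static void main(String[] args) {
    AccessOrderLru<String, Integer> cache = new AccessOrderLru<>(2);
    cache.put("a", 1);
    cache.put("b", 2);
    cache.get("a");    // "a" becomes the most recently used entry
    cache.put("c", 3); // over capacity: "b" is now the eldest and is evicted
    System.out.println(cache.keySet()); // [a, c]
  }
}
```

The map maintains exactly one ordering (recency) and has no slot for a per-entry counter, which fits the comment's point that usage tracking would be harder to add here than in a design with its own entry class.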
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148880#comment-13148880 ]

Shawn Heisey commented on SOLR-2889:

@lance: To say I'm working on it is very much an overstatement. I have taken a quick look, enough to know that I may be in over my head, requiring a lot of learning before diving in. I will give it a try, but Yoda probably would not be impressed by the results.

> Implement Adaptive Replacement Cache
>
> Key: SOLR-2889
> URL: https://issues.apache.org/jira/browse/SOLR-2889
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important. There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148529#comment-13148529 ]

Shawn Heisey commented on SOLR-2889:

Two things:

1) After some thought, I have concluded that a straight LFU cache might fit my needs perfectly, and it's a baby step towards ARC.

2) I took a quick look at some of the code. The code for cache trimming and warming is in ConcurrentLRUCache.java, but the hits seem to be tracked in {Fast}LRUCache.java. I think this means that the first step would be to refactor things so that we have one or more base classes with common functionality, which are then extended or imported by smaller classes that implement LRU, LFU, and ARC.

Am I on the right track? Does this need another issue?

> Implement Adaptive Replacement Cache
>
> Key: SOLR-2889
> URL: https://issues.apache.org/jira/browse/SOLR-2889
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important. There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.
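The refactoring floated in point 2 of the comment above could take the shape sketched here: a base class owns storage, bookkeeping, and trimming, while small subclasses supply only the eviction ordering. Everything in the sketch (`BoundedCache`, `LruCache`, `LfuCache`, and their members) is hypothetical and greatly simplified. It uses coarse synchronization and a linear scan, and it says nothing about how Solr's classes are actually organized.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

abstract class BoundedCache<K, V> {
  /** Shared per-entry bookkeeping that every policy can draw on. */
  static final class Entry<T> {
    final T value;
    long hits;
    long lastAccessed;

    Entry(T value) {
      this.value = value;
    }
  }

  private final Map<K, Entry<V>> map = new HashMap<>();
  private final int maxSize; // assumed to be at least 1
  private long clock;

  BoundedCache(int maxSize) {
    this.maxSize = maxSize;
  }

  /** Policy hook: the entry that sorts first under this ordering is evicted first. */
  protected abstract Comparator<Entry<V>> evictionOrder();

  synchronized V get(K key) {
    Entry<V> e = map.get(key);
    if (e == null) {
      return null;
    }
    e.hits++;
    e.lastAccessed = ++clock;
    return e.value;
  }

  synchronized void put(K key, V value) {
    if (!map.containsKey(key) && map.size() >= maxSize) {
      // Evict before inserting so that a brand-new entry is never its own victim.
      K victim = null;
      Entry<V> worst = null;
      for (Map.Entry<K, Entry<V>> me : map.entrySet()) {
        if (worst == null || evictionOrder().compare(me.getValue(), worst) < 0) {
          worst = me.getValue();
          victim = me.getKey();
        }
      }
      map.remove(victim);
    }
    Entry<V> e = new Entry<>(value);
    e.lastAccessed = ++clock;
    map.put(key, e);
  }

  /** Membership check that does not count as a hit. */
  synchronized boolean contains(K key) {
    return map.containsKey(key);
  }
}

/** LRU policy: evict the entry with the oldest access stamp. */
class LruCache<K, V> extends BoundedCache<K, V> {
  LruCache(int maxSize) {
    super(maxSize);
  }

  @Override
  protected Comparator<BoundedCache.Entry<V>> evictionOrder() {
    return Comparator.comparingLong((BoundedCache.Entry<V> e) -> e.lastAccessed);
  }
}

/** LFU policy: evict the entry with the fewest hits, oldest access first on ties. */
class LfuCache<K, V> extends BoundedCache<K, V> {
  LfuCache(int maxSize) {
    super(maxSize);
  }

  @Override
  protected Comparator<BoundedCache.Entry<V>> evictionOrder() {
    return Comparator.comparingLong((BoundedCache.Entry<V> e) -> e.hits)
        .thenComparingLong(e -> e.lastAccessed);
  }
}
```

With a capacity of 2 and the sequence put a, put b, get a, get a, get b, put c, the two policies diverge. The LFU variant evicts b (one hit against two), while the LRU variant evicts a (b was touched more recently).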
[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache
[ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148124#comment-13148124 ]

Shawn Heisey commented on SOLR-2889:

@simon: I will certainly take a look, and I am encouraged by yonik's assessment that it's a simple change, but I have to say that I'm a Java newbie. I hope that I can do it, but I'm not super optimistic.

> Implement Adaptive Replacement Cache
>
> Key: SOLR-2889
> URL: https://issues.apache.org/jira/browse/SOLR-2889
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 3.4
> Reporter: Shawn Heisey
> Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important. There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.