Hi Tom,

I believe Solr will automatically use DocValues for faceting if you've defined them in the schema.
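
In case it helps, here is a minimal sketch of what enabling DocValues on the facet field might look like in schema.xml. The field name topicStr and the fact that it is multi-valued come from your log output; the "string" type name and the indexed/stored settings are assumptions on my part, so adjust them to match your actual schema. As you noted, the field has to be re-indexed after the change.

```xml
<!-- schema.xml: hypothetical definition for the facet field in Tom's query.
     type="string", indexed and stored are assumed values, not taken from
     the real schema; docValues="true" is the attribute that matters here. -->
<field name="topicStr" type="string" indexed="true" stored="false"
       multiValued="true" docValues="true"/>
```
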

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Nov 11, 2013 at 11:33 AM, Tom Burton-West <tburt...@umich.edu> wrote:
> Thanks Otis,
>
> I'm looking forward to the presentation videos.
>
> I'll look into using DocValues. Re-indexing 200 million docs will take a
> while though :).
> Will Solr automatically use DocValues for faceting if you have DocValues for
> the field or is there some configuration or parameter that needs to be set?
>
> Tom
>
>
> On Sat, Nov 9, 2013 at 9:57 AM, Otis Gospodnetic
> <otis.gospodne...@gmail.com> wrote:
>>
>> Hi Tom,
>>
>> Check http://blog.sematext.com/2013/11/09/presentation-solr-for-analytics/
>> . It includes info about our experiment with DocValues, which clearly
>> shows lower heap usage, which means you'll get further without getting
>> this OOM. In our experiments we didn't sort, facet, or group, and I
>> see you are faceting, which means that DocValues, which are more
>> efficient than FieldCache, should help you even more than it helped
>> us.
>>
>> The graphs are from SPM, which you could use to monitor your Solr
>> cluster, at least while you are tuning it.
>>
>> Otis
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>>
>> On Fri, Nov 8, 2013 at 2:41 PM, Tom Burton-West <tburt...@umich.edu>
>> wrote:
>> > Hi Yonik,
>> >
>> > I don't know enough about JVM tuning and monitoring to do this in a
>> > clean way, so I just tried setting the max heap at 8GB and then 6GB
>> > to force garbage collection. With it set to 6GB it goes into a long
>> > GC loop and then runs out of heap (see below). The stack trace says
>> > the issue is with DocTermOrds.uninvert:
>> >
>> > Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>> >     at org.apache.lucene.index.DocTermOrds.uninvert(DocTermOrds.java:405)
>> >
>> > I'm guessing the actual peak is somewhere between 6 and 8 GB.
>> >
>> > BTW: is there some documentation somewhere that explains what the stats
>> > output to INFO mean?
>> >
>> > Tom
>> >
>> >
>> > java.lang.OutOfMemoryError: GC overhead limit exceeded</str><str
>> > name="trace">java.lang.RuntimeException: java.lang.OutOfMemoryError: GC
>> > overhead limit exceeded
>> >     at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653)
>> >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366)
>> >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
>> >     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
>> >     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
>> >     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
>> >     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
>> >     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548)
>> >     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>> >     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
>> >     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
>> >     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
>> >     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
>> >     at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
>> >     at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
>> >     at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
>> >     at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
>> >     at java.lang.Thread.run(Thread.java:724)
>> > Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>> >     at org.apache.lucene.index.DocTermOrds.uninvert(DocTermOrds.java:405)
>> >     at org.apache.solr.request.UnInvertedField.<init>(UnInvertedField.java:179)
>> >     at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:664)
>> >     at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:426)
>> >     at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:517)
>> >     at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:252)
>> >     at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
>> >     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
>> >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>> >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
>> >     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
>> >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
>> >     ... 16 more
>> > </str>
>> >
>> > ---
>> > Nov 08, 2013 1:39:26 PM org.apache.solr.request.UnInvertedField <init>
>> > INFO: UnInverted multi-valued field {field=topicStr,
>> > memSize=1,768,101,824,
>> > tindexSize=86,028,
>> > time=45,854,
>> > phase1=41,039,
>> > nTerms=271,987,
>> > bigTerms=0,
>> > termInstances=569,429,716,
>> > uses=0}
>> > Nov 08, 2013 1:39:28 PM org.apache.solr.core.SolrCore execute
>> > INFO: [core] webapp=/dev-3 path=/select
>> > params={facet=true&facet.mincount=100&indent=true&q=ocr:the&facet.limit=30&facet.field=topicStr&wt=xml}
>> > hits=138,605,690 status=0 QTime=49,797
>> >
>> >
>> > On Fri, Nov 8, 2013 at 2:01 PM, Yonik Seeley <yo...@heliosearch.com>
>> > wrote:
>> >>
>> >> On Fri, Nov 8, 2013 at 1:56 PM, Tom Burton-West <tburt...@umich.edu>
>> >> wrote:
>> >> > When testing an index of about 200 million documents, when we do a
>> >> > first faceting on one field (query appended below), the memory use
>> >> > rises from about 2.5 GB to 13GB. If I run GC after the query the
>> >> > memory use goes down to about 3GB and subsequent queries don't
>> >> > significantly increase the memory use.
>> >>
>> >> Is there a way to tell what the real max memory usage is? I assume
>> >> 13GB is just the peak heap usage, but that could include a lot of
>> >> garbage.
>> >>
>> >> -Yonik
>> >> http://heliosearch.com -- making solr shine
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> >> For additional commands, e-mail: dev-h...@lucene.apache.org
>> >>
>> >
>>