Re: Solr Suggest Component and OOM
Has anyone ever been successful in processing 150M records using the Suggester component? Makers of the component, please comment.

On Tue, Jun 26, 2018 at 1:37 AM, Ratnadeep Rakshit wrote:
> The site_address field has all the addresses of the United States. The idea
> is to build something similar to Google Places autosuggest.
>
> Here's an example query:
>
>     curl "http://localhost/solr/addressbook/suggest?suggest.q=1054%20club&wt=json"
>
> [...]
>
> Now when I start building with 25M address records in the addressbook core,
> the process runs smoothly. Heap utilization peaks at 56% of the 20GB
> allotted to Solr. I am not very experienced in measuring Solr performance,
> but it looks like when I increase the record count beyond 25M in the core,
> the build process fails. Querying the suggester still works.
>
> Did that answer your questions correctly?
>
> On Tue, Jun 12, 2018 at 3:17 PM, Alessandro Benedetti <a.benede...@sease.io> wrote:
>> [...]
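For anyone scripting against this suggest handler, the JSON response quoted in this thread can be unpacked per suggester and query. A minimal sketch (the response layout is taken from the sample in the thread; `extract_suggestions` is a hypothetical helper, not part of any Solr client API):

```python
import json

# A trimmed copy of the suggest response quoted in the thread.
raw = """
{
  "responseHeader": {"status": 0, "QTime": 3125},
  "suggest": {
    "mySuggester2": {
      "1054 club": {
        "numFound": 3,
        "suggestions": [
          {"term": "1054 null N COUNTRY CLUB null BLVD null STOCKTON CA 95204 5008",
           "weight": 0,
           "payload": "0023865882|06077|37.970769,-121.310433"}
        ]
      }
    },
    "mySuggester1": {"1054 club": {"numFound": 0, "suggestions": []}}
  }
}
"""

def extract_suggestions(response_text, suggester, query):
    """Return (term, payload) pairs for one suggester/query from a suggest response."""
    body = json.loads(response_text)
    block = body["suggest"][suggester][query]
    return [(s["term"], s["payload"]) for s in block["suggestions"]]

pairs = extract_suggestions(raw, "mySuggester2", "1054 club")
print(pairs[0][1])  # -> 0023865882|06077|37.970769,-121.310433
```

Note the response is keyed first by suggester name, then by the query string itself, so a client has to carry the original `suggest.q` value around to index into the result.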
Re: Solr Suggest Component and OOM
The site_address field has all the addresses of the United States. The idea is to build something similar to Google Places autosuggest.

Here's an example query:

    curl "http://localhost/solr/addressbook/suggest?suggest.q=1054%20club&wt=json"

Response:

    {
      "responseHeader": {
        "status": 0,
        "QTime": 3125,
        "params": {
          "suggest.q": "1054 club",
          "wt": "json"
        }
      },
      "suggest": {
        "mySuggester2": {
          "1054 club": {
            "numFound": 3,
            "suggestions": [{
              "term": "1054 null N COUNTRY CLUB null BLVD null STOCKTON CA 95204 5008",
              "weight": 0,
              "payload": "0023865882|06077|37.970769,-121.310433"
            }, {
              "term": "1054 null E HERITAGE CLUB null CIR null DELRAY BEACH FL 33483 3482",
              "weight": 0,
              "payload": "0117190535|12099|26.445485,-80.069336"
            }, {
              "term": "1054 null null CORAL CLUB null DR 1054 CORAL SPRINGS FL 33071 5657",
              "weight": 0,
              "payload": "0111342342|12011|26.243918,-80.267577"
            }]
          }
        },
        "mySuggester1": {
          "1054 club": {
            "numFound": 0,
            "suggestions": []
          }
        }
      }
    }

Now when I start building with 25M address records in the addressbook core, the process runs smoothly. Heap utilization peaks at 56% of the 20GB allotted to Solr. I am not very experienced in measuring Solr performance, but it looks like when I increase the record count beyond 25M in the core, the build process fails. Querying the suggester still works.

Did that answer your questions correctly?

On Tue, Jun 12, 2018 at 3:17 PM, Alessandro Benedetti wrote:
> Hi,
> first of all, the two suggesters you are using are based on different data
> structures (with different memory utilisation):
>
> - FuzzyLookupFactory -> FST (in memory, stored binary on disk)
> - AnalyzingInfixLookupFactory -> auxiliary Lucene index
>
> Both data structures should be very memory efficient (both in building and
> storage).
> What is the cardinality of the fields you are building suggestions from
> (site_address and site_address_other)?
> What is the memory situation in Solr when you start the suggester build?
> You are allocating much more memory to the JVM Solr process than to the OS,
> which in your situation means the OS cannot fit the entire index (the ideal
> scenario).
>
> I would recommend putting some monitoring in place (there are plenty of
> open source tools for that).
>
> Regards
>
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
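The pipe-delimited payload in the responses above packs several values into one string, so a client has to split it back apart. A minimal sketch (the field meanings — record id, county FIPS code, lat/lon — are my reading of the sample payloads, not something confirmed in the thread):

```python
def parse_payload(payload: str):
    """Split an 'id|fips|lat,lon' payload (format inferred from the thread's samples)."""
    record_id, fips, latlon = payload.split("|")
    lat, lon = (float(x) for x in latlon.split(","))
    return {"id": record_id, "fips": fips, "lat": lat, "lon": lon}

p = parse_payload("0023865882|06077|37.970769,-121.310433")
print(p["lat"], p["lon"])  # -> 37.970769 -121.310433
```

Keeping the id as a string preserves the leading zeros that an int conversion would silently drop.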
Re: Solr Suggest Component and OOM
Can anyone from the Solr team shed some more light?

On Tue, Jun 12, 2018 at 8:13 PM, Ratnadeep Rakshit wrote:
> I observed that the build works if the data size is below 25M. The moment
> the records go beyond that, this OOM error shows up. Solr itself shows 56%
> usage of the 20GB heap during the build. So, are there some settings I
> need to change to handle a larger data size?
>
> On Tue, Jun 12, 2018 at 3:17 PM, Alessandro Benedetti <a.benede...@sease.io> wrote:
>> [...]
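A rough back-of-envelope check on the numbers reported in this thread, assuming heap use during the suggester build scales roughly linearly with record count (a simplification on my part — FST node sharing makes real growth sub-linear, so treat this only as an upper-bound sanity check):

```python
heap_gb = 20.0                 # heap allotted to Solr, per the thread
records_ok = 25_000_000        # build succeeds at this size...
heap_used_frac = 0.56          # ...using ~56% of the heap
target_records = 150_000_000   # the full address dataset

gb_per_record = heap_gb * heap_used_frac / records_ok
projected_gb = gb_per_record * target_records
print(round(projected_gb, 1))  # -> 67.2, far beyond the 20 GB heap
```

Even if the true growth is well under linear, the projection suggests the 150M-record build sits nowhere near the available 20 GB, which is consistent with the OOM appearing just past 25M.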
Re: Solr Suggest Component and OOM
I observed that the build works if the data size is below 25M. The moment the records go beyond that, this OOM error shows up. Solr itself shows 56% usage of the 20GB heap during the build. So, are there some settings I need to change to handle a larger data size?

On Tue, Jun 12, 2018 at 3:17 PM, Alessandro Benedetti wrote:
> Hi,
> first of all, the two suggesters you are using are based on different data
> structures (with different memory utilisation):
>
> - FuzzyLookupFactory -> FST (in memory, stored binary on disk)
> - AnalyzingInfixLookupFactory -> auxiliary Lucene index
>
> [...]
Re: Solr Suggest Component and OOM
Can anyone shed some light on this?

On Tue, Jun 12, 2018 at 12:32 AM, Ratnadeep Rakshit wrote:
> Here's the stack trace:
>
> ERROR - 2018-06-07 09:07:36.030; [   x:addressbook] org.apache.solr.common.SolrException; null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
>     at org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:607)
>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:475)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>     at org.eclipse.jetty.server.Server.handle(Server.java:499)
>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at org.apache.lucene.util.packed.Packed64.<init>(Packed64.java:73)
>     at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:1009)
>     at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:976)
>     at org.apache.lucene.util.packed.GrowableWriter.ensureCapacity(GrowableWriter.java:80)
>     at org.apache.lucene.util.packed.GrowableWriter.set(GrowableWriter.java:88)
>     at org.apache.lucene.util.packed.AbstractPagedMutable.set(AbstractPagedMutable.java:101)
>     at org.apache.lucene.util.fst.NodeHash.addNew(NodeHash.java:152)
>     at org.apache.lucene.util.fst.NodeHash.rehash(NodeHash.java:169)
>     at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:133)
>     at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:215)
>     at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:310)
>     at org.apache.lucene.util.fst.Builder.add(Builder.java:417)
>     at org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.build(AnalyzingSuggester.java:565)
>     at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:193)
>     at org.apache.solr.spelling.suggest.SolrSuggester.build(SolrSuggester.java:176)
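The `Caused by` frames show the FuzzyLookupFactory suggester dying inside the in-heap FST builder, so one obvious lever is the heap itself. A sketch of where that is set on a typical 5.x install (the file location and `SOLR_HEAP` value here are assumptions — adjust to your layout, and note this machine only has 32GB total, so the OS page cache needs headroom too):

```shell
# bin/solr.in.sh (or /etc/default/solr.in.sh on service installs) - assumed path
SOLR_HEAP="24g"    # was 20g; leave the remaining RAM to the OS page cache
```

More heap only buys margin; if the FST for 150M entries genuinely needs more than the box has, the data or the suggester strategy has to change instead.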
Re: Solr Suggest Component and OOM
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)

WARN  - 2018-06-07 09:07:36.053; [   x:addressbook] org.eclipse.jetty.servlet.ServletHandler; Error for /solr/addressbook/suggest
java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.util.packed.Packed64.<init>(Packed64.java:73)
    at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:1009)
    at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:976)
    at org.apache.lucene.util.packed.GrowableWriter.ensureCapacity(GrowableWriter.java:80)
    at org.apache.lucene.util.packed.GrowableWriter.set(GrowableWriter.java:88)
    at org.apache.lucene.util.packed.AbstractPagedMutable.set(AbstractPagedMutable.java:101)
    at org.apache.lucene.util.fst.NodeHash.addNew(NodeHash.java:152)
    at org.apache.lucene.util.fst.NodeHash.rehash(NodeHash.java:169)
    at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:133)
    at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:215)
    at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:310)
    at org.apache.lucene.util.fst.Builder.add(Builder.java:417)
    at org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.build(AnalyzingSuggester.java:565)
    at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:193)
    at org.apache.solr.spelling.suggest.SolrSuggester.build(SolrSuggester.java:176)
    at org.apache.solr.handler.component.SuggestComponent.prepare(SuggestComponent.java:179)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:246)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2102)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)

On Mon, Jun 11, 2018 at 11:34 PM, Christopher Schultz <ch...@christopherschultz.net> wrote:
> Ratnadeep,
>
> On 6/11/18 12:25 PM, Ratnadeep Rakshit wrote:
> > I am using the Solr Suggester component in Solr 5.5 with a lot of address
> > data. My machine has allotted 20GB RAM for Solr, and the machine has 32GB
> > RAM in total.
> >
> > I have an address book core with the following vitals:
> >
> >     "numDocs"=153242074
> >     "segmentCount"=34
> >     "size"=30.29 GB
> >
> > My solrconfig.xml looks something like this:
> > [...]
Solr Suggest Component and OOM
I am using the Solr Suggester component in Solr 5.5 with a lot of address data. My machine has allotted 20GB RAM for Solr, and the machine has 32GB RAM in total.

I have an address book core with the following vitals:

    "numDocs"=153242074
    "segmentCount"=34
    "size"=30.29 GB

My solrconfig.xml looks something like this:

    <searchComponent name="suggest" class="solr.SuggestComponent">
      <lst name="suggester">
        <str name="name">mySuggester1</str>
        <str name="lookupImpl">FuzzyLookupFactory</str>
        <str name="storeDir">suggester_fuzzy_dir</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">site_address</str>
        <str name="suggestAnalyzerFieldType">suggestType</str>
        <str name="payloadField">property_metadata</str>
        <str name="buildOnStartup">false</str>
        <str name="buildOnCommit">false</str>
      </lst>
      <lst name="suggester">
        <str name="name">mySuggester2</str>
        <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
        <str name="indexPath">suggester_infix_dir</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">site_address_other</str>
        <str name="suggestAnalyzerFieldType">suggestType</str>
        <str name="payloadField">property_metadata</str>
        <str name="buildOnStartup">false</str>
        <str name="buildOnCommit">false</str>
      </lst>
    </searchComponent>

The handler is defined like so:

    <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
      <lst name="defaults">
        <str name="suggest">true</str>
        <str name="suggest.count">10</str>
        <str name="suggest.dictionary">mySuggester1</str>
        <str name="suggest.dictionary">mySuggester2</str>
        <str name="suggest.buildOnStartup">false</str>
        <str name="echoParams">explicit</str>
      </lst>
      <arr name="components">
        <str>suggest</str>
      </arr>
    </requestHandler>

*Problem Statement*

Every time I try to build the suggest index using the suggest.build=true URL parameter, I end up with an OutOfMemoryError. I have no clue how I can make this work with the current setup. Can anyone explain why this is happening? And how can I fix this issue?

*StackOverflow:* https://stackoverflow.com/questions/50802122/solr-suggest-component-and-outofmemory-error
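One thing that may reduce peak memory (my suggestion, not something confirmed in this thread): since the handler registers two suggesters, build them one at a time rather than both in one request. Solr's `suggest.dictionary` parameter scopes `suggest.build` to a single named suggester. A small sketch that just assembles the per-dictionary build URLs (host and core taken from the example query in the thread):

```python
from urllib.parse import urlencode

BASE = "http://localhost/solr/addressbook/suggest"  # host/core from the thread's example

def build_url(dictionary: str) -> str:
    """URL that asks the suggest handler to (re)build a single dictionary."""
    return BASE + "?" + urlencode({"suggest.build": "true",
                                   "suggest.dictionary": dictionary})

# Issue these one after the other (e.g. with curl) instead of one combined build.
for d in ("mySuggester1", "mySuggester2"):
    print(build_url(d))
```

Building sequentially means only one lookup structure is at its construction-time peak at any moment, though it does not help if a single FST alone exceeds the heap.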