Re: Indexing Multiple Languages with solr (Arabic English)
Hi, thanks for your post. I do not know how to use the text_ar field type for the Arabic language. What configuration do I need to add in the schema.xml file? Please guide me. AnilJayanti
Re: Indexing Multiple Languages with solr (Arabic English)
It's just a text type. So, just declare another field and, instead of text_general or text_en, use text_ar. Then use copyField from the source text field to it. Go through the tutorial if you haven't yet; it explains some of these things. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Tue, Dec 3, 2013 at 3:12 PM, aniljayanti aniljaya...@yahoo.co.in wrote: [snip: quoted message]
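To make Alex's suggestion concrete, the schema.xml additions would look roughly like this (a sketch: text_ar ships in the stock Solr 4.x example schema, while the source field name content is hypothetical):

  <field name="content" type="text_general" indexed="true" stored="true"/>
  <field name="content_ar" type="text_ar" indexed="true" stored="false"/>
  <copyField source="content" dest="content_ar"/>

In the example schema, text_ar applies Arabic-specific analysis (ArabicNormalizationFilterFactory and ArabicStemFilterFactory plus an Arabic stopword list), so text copied into content_ar is indexed with language-aware tokens while the original field keeps its general analysis.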
Re: post filtering for boolean filter queries
OK, we were able to confirm the behavior regarding not caching the filter query. It works as expected: it does not cache with {!cache=false}. We are still looking into clarifying the cost assignment, i.e. whether it works as expected for long boolean filter queries. On Tue, Dec 3, 2013 at 8:55 AM, Dmitry Kan solrexp...@gmail.com wrote: Hello! We have been experimenting with post filtering lately. Our setup is a filter having a long boolean query; drawing the example from Dublin's Stump the Chump: fq=UserId:(user1 OR user2 OR...OR user1000) The underlying issue impacting performance is that the combination of user ids in the query above is unique per user in the system, and on top of that the combination changes every day. Our idea was to stop caching the filter query with {!cache=false}. Since there is no way to introspect the contents of the filter cache to our knowledge (JMX?), we can't be sure those are not cached. This is because the initial query for each combination takes substantially more time (as if it was *not* cached) than the second and subsequent queries with the same fq (as if it *was* cached). Question is: does post filtering support boolean queries in fq params? Another thing we have been trying is assigning a cost to the fq relatively higher than for other filter queries. Does this feature support boolean queries in fq params as well? -- Dmitry Blog: http://dmitrykan.blogspot.com Twitter: twitter.com/dmitrykan
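To make the syntax concrete, the local params under discussion attach to the filter query like this (a sketch using the UserId example from the thread):

  fq={!cache=false cost=200}UserId:(user1 OR user2 OR ... OR user1000)

{!cache=false} keeps the filter out of the filterCache; a cost of 100 or more additionally requests post-filtering, which only takes effect for query types that implement the PostFilter interface.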
Re: SolrCloud keeps repeating exception 'SolrCoreState already closed'
I just ran into this issue on Solr 4.6 on an EC2 machine while indexing a Wikipedia dump with DIH. I'm trying to isolate exceptions before the SolrCoreState already closed exception. On Sun, Nov 10, 2013 at 11:58 PM, Mark Miller markrmil...@gmail.com wrote: Can you isolate any exceptions that happened just before that exception started repeating? - Mark On Nov 7, 2013, at 9:09 AM, Eric Bus eric@websight.nl wrote: Hi, I'm having a problem with one of my shards. Since yesterday, Solr keeps repeating the same exception over and over for this shard. The web interface for this Solr instance is also not working (it hangs on the Loading indicator).

  Nov 7, 2013 9:08:12 AM org.apache.solr.update.processor.LogUpdateProcessor finish
  INFO: [website1_shard1_replica3] webapp=/solr path=/update params={update.distrib=TOLEADER&wt=javabin&version=2} {} 0 0
  Nov 7, 2013 9:08:12 AM org.apache.solr.common.SolrException log
  SEVERE: java.lang.RuntimeException: SolrCoreState already closed
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:79)
    at org.apache.solr.update.DirectUpdateHandler2.delete(DirectUpdateHandler2.java:276)
    at org.apache.solr.update.processor.RunUpdateProcessor.processDelete(RunUpdateProcessorFactory.java:77)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processDelete(UpdateRequestProcessor.java:55)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalDelete(DistributedUpdateProcessor.java:460)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionDelete(DistributedUpdateProcessor.java:1036)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processDelete(DistributedUpdateProcessor.java:721)
    at org.apache.solr.update.processor.LogUpdateProcessor.processDelete(LogUpdateProcessorFactory.java:121)
    at org.apache.solr.handler.loader.XMLLoader.processDelete(XMLLoader.java:346)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:277)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:662)

I have about 3GB of logfiles for this single message. Reloading the collection does not work. Reloading the specific shard core returns the same exception. The only option seems to be to restart the server. But because it's the leader for a lot of collections, I want to know why this is happening. I've seen this problem before, and I haven't figured out what is causing it. I reported a different problem a few days ago with 'hanging' deleted logfiles. Could this be related? Could the hanging logfiles prevent a new Searcher from opening? I've updated two of my three hosts to 4.5.1, but after only 2 days of uptime I'm still seeing about 11,000 deleted logfiles in the lsof output. Best regards, Eric Bus -- Regards, Shalin Shekhar Mangar.
RE: SolrCloud keeps repeating exception 'SolrCoreState already closed'
Are you currently running Solr under Tomcat or standalone with Jetty? I switched from Tomcat to Jetty and the problems went away. - Eric -----Original Message----- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Tuesday, December 3, 2013 12:44 To: solr-user@lucene.apache.org Subject: Re: SolrCloud keeps repeating exception 'SolrCoreState already closed' I just ran into this issue on Solr 4.6 on an EC2 machine while indexing a Wikipedia dump with DIH. I'm trying to isolate exceptions before the SolrCoreState already closed exception. [snip: quoted thread and stack trace]
Re: Using the flexible query parser in Solr instead of classic
I don't recall hearing any discussion of such a switch. In fact, Solr now has its own copy of the classic Lucene query parser, since Solr needed some features that the Lucene guys did not find acceptable. That said, if you have a proposal to dramatically upgrade the base Solr query parser, as well as edismax, I'm sure people would be interested. I think the intent was to evolve edismax to the point where it would become the default Solr query parser. So maybe that would be the ideal starting point - a version of edismax based on the flexible query parser rather than the classic query parser. -- Jack Krupansky -----Original Message----- From: Karsten R. Sent: Tuesday, December 03, 2013 1:24 AM To: solr-user@lucene.apache.org Subject: Using the flexible query parser in Solr instead of classic Hi folks, last year we built a 3.X Solr QueryParser based on org.apache.lucene.queryparser.flexible.standard.StandardQueryParser, because we had some additions with SpanQueries and PhraseQueries. We are thinking about adapting this for 4.X. At the moment the SolrQueryParser is based on org.apache.lucene.queryparser.classic.QueryParser.jj. Is there a plan for 4.X to switch the LuceneQParser from classic to flexible (org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.jj)? Is there a SOLR task to use the flexible QP? Does anyone else need this? Best regards, Karsten P.S. I only found one (unanswered) thread and no task about Solr and the flexible QP (thread: http://lucene.472066.n3.nabble.com/Using-the-contrib-flexible-query-parser-in-Solr-td819.html)
Re: SolrCloud keeps repeating exception 'SolrCoreState already closed'
No, I am running on the example Jetty. I am re-running the import and haven't hit the problem yet. Still running. On Tue, Dec 3, 2013 at 5:45 PM, Eric Bus eric@websight.nl wrote: Are you currently running Solr under Tomcat or standalone with Jetty? I switched from Tomcat to Jetty and the problems went away. - Eric [snip: quoted thread and stack trace]
Deleting and committing inside a SearchComponent
Hi, is it possible to delete and commit updates to an index inside a custom SearchComponent? I know I can do it with SolrJ, but due to several business logic requirements I need to build the logic inside the search component. I am using Solr 4.5.0. Thank you
Re: Constantly increasing time of full data import
This occurs only in the production environment, so I can't profile it :-) Any clues? DirectUpdateHandler2 config:

  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
Re: Automatically build spellcheck dictionary on replicas
Did you try adding a <str name="buildOnCommit">true</str> parameter to your slave's spellcheck configuration? 03.12.2013, 12:04, Mirko idonthaveenoughinformat...@googlemail.com: Hi all, We use a Solr SpellcheckComponent with a file-based dictionary. We run a master and some replica slave servers. To update the dictionary, we copy the dictionary txt file to the master, from where it is automatically replicated to all slaves. However, it seems we need to run the spellcheck.build query on all servers individually. Is there a way to automatically build the spellcheck dictionary on all servers without calling spellcheck.build on all slaves individually? We use Solr 4.0.0. Thanks, Mirko
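For context, a file-based spellchecker with that parameter is configured roughly like this in solrconfig.xml (a sketch following the FileBasedSpellChecker example on the Solr wiki; the dictionary name and paths are illustrative):

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">file</str>
      <str name="classname">solr.FileBasedSpellChecker</str>
      <str name="sourceLocation">spellings.txt</str>
      <str name="characterEncoding">UTF-8</str>
      <str name="spellcheckIndexDir">./spellcheckerFile</str>
      <str name="buildOnCommit">true</str>
    </lst>
  </searchComponent>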
Re: Automatically build spellcheck dictionary on replicas
Yes, I have that, but it doesn't help. It seems Solr still needs the query with the spellcheck.build parameter to build the spellchecker index. 2013/12/3 Kydryavtsev Andrey werde...@yandex.ru: Did you try adding a <str name="buildOnCommit">true</str> parameter to your slave's spellcheck configuration? [snip: quoted message]
json update moves doc to end
When I search for agenda I get a lot of hits. Now if I update the 2nd result via JSON update, the doc is moved to the end of the index when I search for it again. The field I change is editorschoice, and it never contains the search term agenda, so I don't see why it changes the order. Why does it? Part of the solrconfig requestHandler I use:

  <requestHandler name="/select2" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
      <str name="defType">synonym_edismax</str>
      <str name="synonyms">true</str>
      <str name="qf">plain_text^10 editorschoice^200 title^20 h_*^14 tags^10 thema^15 inhaltstyp^6 breadcrumb^6 doctype^10 contentmanager^5 links^5 last_modified^5 url^5</str>
      <!-- tested: now or newer or empty gets small boost -->
      <str name="bq">(expiration:[NOW TO *] OR (*:* -expiration:*))^6</str>
      <!-- tested -->
      <str name="bf">log(clicks)^8</str>
      <!-- todo: number of links (count urlparse in links query) / frequency of search term (bf = count in title and text) -->
      <str name="df">text</str>
      <str name="fl">*,path,score</str>
      <str name="wt">json</str>
      <str name="q.op">AND</str>
      <!-- Highlighting defaults -->
      <str name="hl">on</str>
      <str name="hl.fl">plain_text,title</str>
      <str name="hl.simple.pre">&lt;b&gt;</str>
      <str name="hl.simple.post">&lt;/b&gt;</str>
      <!-- lst name="invariants" -->
      <str name="facet">on</str>
      <str name="facet.mincount">1</str>
      <str name="facet.field">{!ex=inhaltstyp}inhaltstyp</str>
      <str name="f.inhaltstyp.facet.sort">index</str>
      <str name="facet.field">{!ex=doctype}doctype</str>
      <str name="f.doctype.facet.sort">index</str>
      <str name="facet.field">{!ex=thema_f}thema_f</str>
      <str name="f.thema_f.facet.sort">index</str>
      <str name="facet.field">{!ex=author_s}author_s</str>
      <str name="f.author_s.facet.sort">index</str>
      <str name="facet.field">{!ex=sachverstaendiger_s}sachverstaendiger_s</str>
      <str name="f.sachverstaendiger_s.facet.sort">index</str>
      <str name="facet.field">{!ex=veranstaltung}veranstaltung</str>
      <str name="f.veranstaltung.facet.sort">index</str>
      <str name="facet.date">{!ex=last_modified}last_modified</str>
      <str name="facet.date.gap">+1MONTH</str>
      <str name="facet.date.end">NOW/MONTH+1MONTH</str>
      <str name="facet.date.start">NOW/MONTH-36MONTHS</str>
      <str name="facet.date.other">after</str>
    </lst>
  </requestHandler>
Re: json update moves doc to end
What order, the order if you supply no explicit sort at all? Solr does not make any guarantees about what order documents will come back in if you do not ask for a sort. In general in Solr/Lucene, the only way to update a document is to re-add it as a new document, so that's probably what's going on behind the scenes, and it probably affects the 'default' sort order -- which Solr makes no agreement about anyway; you probably shouldn't even count on it being consistent at all. If you want a consistent sort order, maybe add a field with a timestamp, and ask for results sorted by the timestamp field? And then make sure not to change the timestamp when you do an update that you don't want to change the order. Apologies if I've misunderstood the situation. On 12/3/13 1:00 PM, Andreas Owen wrote: [snip: quoted message and requestHandler config]
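As a sketch of Jonathan's suggestion (indexed_at is a hypothetical date field set once when the document is first indexed and left untouched on later updates), the request would carry an explicit sort:

  q=agenda&sort=indexed_at desc

With an explicit sort, re-adding a document during a JSON update no longer changes where it appears in the results.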
Re: json update moves doc to end
AFAIK, if you don't supply or configure a sort parameter, Solr sorts by score descending. In that case, you may want to understand (or at least view) how each document's score is calculated: you can run the query with debugQuery=true set and see the whole explain output. This great tool helped me a lot: http://explain.solr.pl Best, Andrea On 12/03/2013 07:00 PM, Andreas Owen wrote: [snip: quoted message and requestHandler config]
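To see those per-document explanations, append the debug parameter to a request against the handler from this thread (host and port are illustrative):

  http://localhost:8983/solr/select2?q=agenda&fl=*,score&debugQuery=true

The explain section of the response breaks each score down into its boost and term factors, which is what explain.solr.pl renders graphically.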
RE: json update moves doc to end
So isn't it sorted automatically by relevance (boost value)? If not, should I set it in solrconfig? -----Original Message----- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Tuesday, December 3, 2013 19:07 To: solr-user@lucene.apache.org Subject: Re: json update moves doc to end [snip: quoted message and requestHandler config]
Re: Automatically build spellcheck dictionary on replicas
Yep, sorry, it doesn't work for file-based dictionaries: In particular, you still need to index the dictionary file once by issuing a search with spellcheck.build=true on the end of the URL; if your system doesn't update that dictionary file, then this only needs to be done once. This manual step may be required even if your configuration sets build=true and reload=true. http://wiki.apache.org/solr/FileBasedSpellChecker 03.12.2013, 21:27, Mirko idonthaveenoughinformat...@googlemail.com: Yes, I have that, but it doesn't help. It seems Solr still needs the query with the spellcheck.build parameter to build the spellchecker index. [snip: quoted message]
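The manual step the wiki describes is a one-off request against each slave, along these lines (host, port, and the dictionary name file are illustrative):

  http://localhost:8983/solr/select?q=*:*&spellcheck=true&spellcheck.dictionary=file&spellcheck.build=true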
Re: json update moves doc to end
Try adding debug=all and you'll see exactly how docs are scored. It'll also show you exactly how your query is parsed. Paste that output if it's confusing; it'll help figure out what's going wrong. On Tue, Dec 3, 2013 at 1:37 PM, Andreas Owen a...@conx.ch wrote: So isn't it sorted automatically by relevance (boost value)? If not, should I set it in solrconfig? [snip: quoted thread and requestHandler config]
a core for every user, lots of users... are there issues
We are building a system where there is a core for every user. There will be many tens, or perhaps ultimately hundreds of thousands or millions, of users. We do not need each of those users to have “warm” data in memory; in fact, doing so would consume lots of memory unnecessarily for users that might not have logged in in a long time. So my question is: is the default behavior of Solr to try to keep all of our cores warm, and if so, can we stop it? Also, given the number of cores that we will likely have, is there anything else we should be keeping in mind to maximize performance and minimize memory usage?
Re: post filtering for boolean filter queries
On Tue, Dec 3, 2013 at 4:45 AM, Dmitry Kan solrexp...@gmail.com wrote: ok, we were able to confirm the behavior regarding not caching the filter query. It works as expected. It does not cache with {!cache=false}. We are still looking into clarifying the cost assignment: i.e. whether it works as expected for long boolean filter queries. Yes, filters should be ordered by cost (cheapest first) whenever you use {!cache=false} -Yonik http://heliosearch.com -- making solr shine
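Spelled out, that means giving each non-cached filter an explicit cost so the cheap ones run first (a sketch; the status filter is hypothetical, the UserId filter is from the thread):

  fq={!cache=false cost=10}status:active
  fq={!cache=false cost=200}UserId:(user1 OR user2 OR ... OR user1000)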
Re: a core for every user, lots of users... are there issues
You probably want to look at transient cores, see: http://wiki.apache.org/solr/LotsOfCores But millions will be interesting for a single node, you must have some kind of partitioning in mind? Best, Erick On Tue, Dec 3, 2013 at 2:38 PM, hank williams hank...@gmail.com wrote: We are building a system where there is a core for every user. There will be many tens or perhaps ultimately hundreds of thousands or millions of users. We do not need each of those users to have “warm” data in memory. In fact doing so would consume lots of memory unnecessarily, for users that might not have logged in in a long time. So my question is, is the default behavior of Solr to try to keep all of our cores warm, and if so, can we stop it? Also given the number of cores that we will likely have is there anything else we should be keeping in mind to maximize performance and minimize memory usage?
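A sketch of the old-style solr.xml attributes that page describes (core names and the cache size are illustrative):

  <cores adminPath="/admin/cores" transientCacheSize="100">
    <core name="user12345" instanceDir="user12345" transient="true" loadOnStartup="false"/>
  </cores>

transient="true" allows a core to be unloaded when the LRU cache of transientCacheSize loaded cores fills up, and loadOnStartup="false" keeps it cold until the first request arrives.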
Re: Using Payloads as a Coefficient For Score At a Custom QParser That extends ExtendedDismaxQParser
I've implemented what I want. I can add the payload score into the document score. I've modified ExtendedDismaxQParser, and I can use all the abilities of edismax in my case. I will explain what I did on my blog. Thanks; Furkan KAMACI 2013/12/1 Furkan KAMACI furkankam...@gmail.com: Hi; I use Solr 4.5.1. I have a case: when a user searches for some specific keywords, some documents should be listed much higher than their usual score. I mean, I have probabilities of which documents the user may want to see for given keywords. I have come up with this idea: I can put a new field in my schema. This field holds the keyword and probability as a payload. When a user searches for a keyword, I will calculate the usual document score for the given fields, and I will also make a search on the payloaded field and multiply the total score with that payload. I followed this example: http://sujitpal.blogspot.com/2013/07/porting-payloads-to-solr4.html#! However, that example extends QParser directly, and I want to use the capabilities of edismax. So I found this example: http://digitalpebble.blogspot.com/2010/08/using-payloads-with-dismaxqparser-in.html This one extends dismax, but I could not use payloads with that example. I want to combine the two solutions above. The first solution has this case:

  @Override
  public Similarity get(String name) {
    if (payloads.equals(name) || cscores.equals(name)) {
      return new PayloadSimilarity();
    } else {
      return new DefaultSimilarity();
    }
  }

However, dismax behaves differently; i.e. when you search for cscores:y it changes that into: *+((text:cscores:y text:cscores text:y text:cscoresy)) ()* When I debug it, name is text instead of cscores, and it does not work. My idea is to combine the two examples and extend edismax. Do you have any idea how to extend it for edismax, or what to do for my case? *PS:* I've sent the same question to the Lucene user list too. I ask it here to get an idea from the Solr perspective as well. Thanks; Furkan KAMACI
Re: a core for every user, lots of users... are there issues
On Tue, Dec 3, 2013 at 3:20 PM, Erick Erickson erickerick...@gmail.com wrote: You probably want to look at transient cores, see: http://wiki.apache.org/solr/LotsOfCores But millions will be interesting for a single node, you must have some kind of partitioning in mind? Wow. Thanks for that great link. Yes, we are sharding, so it's not like there would be millions of cores on one machine or even cluster. And since the cores are one per user, this is a totally clean approach. But still, we want to make sure that we are not overloading the machine. Do you have any sense of what a good upper limit might be, or how we might figure that out? [snip: quoted thread] -- blog: whydoeseverythingsuck.com
How to Empty Content of a Field via Solrj?
How can I empty the content of a field in Solr (I use Solr 4.5.1 as SolrCloud) via SolrJ? I mean, if I have this document in my index: field1: abc, field2: def, field3: ghi and I want to empty the content of field2, I want to have: field1: abc, field2: (empty), field3: ghi
Re: How to Empty Content of a Field via Solrj?
I know that I can use atomic updates for such cases, but I want to atomically update a field based on a search result (I want to use that functionality like nested queries). Any other ideas are welcome. 2013/12/3 Furkan KAMACI furkankam...@gmail.com: [snip: quoted message]
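For the simple case (without the nested-query requirement), an atomic update whose set value is null removes the field from the document. A minimal SolrJ 4.x sketch — the ZooKeeper address, collection, and document id are illustrative, and it assumes the prerequisites for atomic updates (an updateLog in solrconfig.xml and all fields stored):

  import java.util.Collections;
  import org.apache.solr.client.solrj.impl.CloudSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class EmptyField {
      public static void main(String[] args) throws Exception {
          CloudSolrServer server = new CloudSolrServer("zkhost:2181");
          server.setDefaultCollection("collection1");
          SolrInputDocument doc = new SolrInputDocument();
          // identify the document by its unique key
          doc.addField("id", "doc1");
          // atomic update: a "set" operation with a null value removes field2
          doc.addField("field2", Collections.singletonMap("set", (Object) null));
          server.add(doc);
          server.commit();
          server.shutdown();
      }
  }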
Re: a core for every user, lots of users... are there issues
Also, I see that the LotsOfCores stuff is for Solr 4.4 and above. What is the state of the 4.4 codebase? Could we start using it now? Is it safe? On Tue, Dec 3, 2013 at 3:33 PM, hank williams hank...@gmail.com wrote: [snip: quoted thread] -- blog: whydoeseverythingsuck.com
Re: a core for every user, lots of users... are there issues
Sorry, I see that we are up to Solr 4.6. I missed that. On Tue, Dec 3, 2013 at 3:53 PM, hank williams hank...@gmail.com wrote: [snip: quoted thread] -- blog: whydoeseverythingsuck.com
Re: post filtering for boolean filter queries
On 12/03/2013 01:55 AM, Dmitry Kan wrote: [snip: quoted message] Dmitry - I went to a talk at LR where this problem came up, and a solution of implementing a custom filter cache only for logged-in users was discussed -- sounds interesting, but maybe some tricky parts to it. -Mike
Re: SolrCloud FunctionQuery inconsistency
: Yes, I am populating ptime using a default of NOW. : : I only store the id, so I can't get ptime values. But from the perspective : of business logic, ptime should not change. If you are populating it using a *schema* default, then the warning text I pasted into my last message would definitely apply to your situation and easily explain the behavior you are seeing -- because schema defaults are applied on a per-node basis, the values wouldn't be guaranteed to be consistent for the entire shard. If you are populating it using an update processor that fills in a default (such as the TimestampUpdateProcessorFactory I linked to in my last message) prior to the distributed update logic, then everything should be working fine, and if you are seeing the order change then the problem is likely unrelated to my wild guess. As Erick said: you have to give us a *lot* more details (exactly what your data looks like, what queries you are doing, what results you see, how those results differ from what you expect, etc...) in order to provide more useful/meaningful advice. https://wiki.apache.org/solr/UsingMailingLists -Hoss http://www.lucidworks.com/
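For reference, the update-processor approach looks roughly like this in solrconfig.xml (a sketch; the chain name is arbitrary, and the key point is that TimestampUpdateProcessorFactory sits before the distributed update step, so one node assigns ptime and every replica receives the same value):

  <updateRequestProcessorChain name="set-ptime">
    <processor class="solr.TimestampUpdateProcessorFactory">
      <str name="fieldName">ptime</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.DistributedUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>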
Re: a core for every user, lots of users... are there issues
bq: Do you have any sense of what a good upper limit might be, or how we might figure that out? As always, it depends (tm). And the biggest thing it depends upon is the number of simultaneous users you have and the size of their indexes. And we've arrived at the black box of estimating size again. Siiihh... I'm afraid that the only way is to test and establish some rules of thumb. The transient core constraint will limit the number of cores loaded at once. If you allow too many cores at once, you'll get OOM errors when all the users pile on at the same time. Let's say you've determined that 100 is the limit for transient cores. What I suspect you'll see is degrading response times if this is too low. Say 110 users are signed on, and say they submit queries perfectly in order, one after the other. Every request will require the core to be opened, and that'll take a bit. So that'll be a flag. Or that's a fine limit, but your users have added more and more documents and you're coming under memory pressure. As you can tell, I don't have any good answers. I've seen between 10M and 300M documents on a single machine. BTW, in a _very_ casual test I found that about 1,000 cores/second were found in discovery mode. While they aren't loaded if they're transient, it's still a consideration if you have tens of thousands. Best, Erick On Tue, Dec 3, 2013 at 3:33 PM, hank williams hank...@gmail.com wrote: [snip: quoted thread]
Re: Deleting and committing inside a SearchComponent
On Tue, Dec 3, 2013, at 03:22 PM, Peyman Faratin wrote: Hi, is it possible to delete and commit updates to an index inside a custom SearchComponent? I know I can do it with SolrJ, but due to several business logic requirements I need to build the logic inside the search component. I am using Solr 4.5.0. That just doesn't make sense. Search components are read only. What are you trying to do? What stuff do you need to change? Could you do it within an UpdateProcessor? Upayavira
Re: json update moves doc to end
By default it sorts by score. If the score is a consistent one, it will order docs as they appear in the index, which effectively means an undefined order. For example, a *:* query doesn't have terms that can be used to score, so every doc will get a score of 1. Upayavira On Tue, Dec 3, 2013, at 06:37 PM, Andreas Owen wrote: So isn't it sorted automatically by relevance (boost value)? If not, should I set it in solrconfig? [snip: quoted thread and requestHandler config]
Re: Deleting and committing inside a SearchComponent
On Dec 3, 2013, at 8:41 PM, Upayavira u...@odoko.co.uk wrote: On Tue, Dec 3, 2013, at 03:22 PM, Peyman Faratin wrote: [snip: original question] That just doesn't make sense. Search components are read only. I can think of many situations where it makes sense. For instance, you search for a document and your index contains many duplicates that differ only by one field, such as the time they were indexed (think news feeds from multiple sources). So after the search we want to delete the duplicate documents that satisfy some policy (here date, but it could be some other policy). What are you trying to do? What stuff do you need to change? Could you do it within an UpdateProcessor? The solution I am working with:

  // look up the configured update chain for this request and build a processor from it
  UpdateRequestProcessorChain processorChain =
      rb.req.getCore().getUpdateProcessingChain(rb.req.getParams().get(UpdateParams.UPDATE_CHAIN));
  UpdateRequestProcessor processor = processorChain.createProcessor(rb.req, rb.rsp);
  ...
  docId = f(); // business logic that picks the duplicate document to delete
  ...
  // issue a delete-by-id through the update chain
  DeleteUpdateCommand cmd = new DeleteUpdateCommand(rb.req);
  cmd.setId(docId.toString());
  processor.processDelete(cmd);

Upayavira
Re: SolrCloud FunctionQuery inconsistency
Thanks, Chris: The schema is:

  <field name="title" type="textComplex" indexed="true" stored="false" multiValued="false" omitNorms="true"/>
  <field name="dkeys" type="textComplex" indexed="true" stored="false" multiValued="false" omitNorms="true"/>
  <field name="ptime" type="date" indexed="true" stored="false" multiValued="false" omitNorms="true"/>

There is no default value for ptime. It is generated by users. There are 4 shards in this SolrCloud, and 2 nodes in each shard. I was trying a query with a function query ({!boost b=dateDeboost(ptime)} channelid:0082 title:abc), which leads to different results from the same shard (using the param shards=shard3). The difference is maxScore, which is not consistent: the maxScore is either score A or score B. And at the same time, new docs are being indexed. In my opinion, the maxScore should be the same between queries in a very short time, or at least it should not keep alternating between score A and score B. And quite by accident, the sort result is even inconsistent (say, a doc appears in this query but not in another query, over and over). It did appear once, but did not reappear again. Does this mean that, when the query happens, the index in the replica has not synced from its leader, so that querying different nodes of the shard at the same time shows different results?
RE: SolrCloud FunctionQuery inconsistency
Hi All, Sorry to ask, but is it possible to create multiple collections in Solr standalone mode? I mean with only one Solr instance. I am able to create multiple collections in a SolrCloud environment, but when creating in Solr standalone, it says Solr is not in cloud mode. Any suggestions would be a great help. Regards, Raju Shikha -----Original Message----- From: sling [mailto:sling...@gmail.com] Sent: 04 December 2013 08:33 To: solr-user@lucene.apache.org Subject: Re: SolrCloud FunctionQuery inconsistency [snip: quoted message]