Re: Replication and soft commits for NRT searches
Hello, thank you for the detailed answer. If a timeout between shard leader and replica can lead to a smaller rf value (because replication has timed out), is it possible to increase this timeout in the configuration? Best Regards, Martin Mois Comments inline: On Mon, Oct 12, 2015 at 1:31 PM, MOIS Martin (MORPHO) wrote: > Hello, > > I am running Solr 5.2.1 in a cluster with 6 nodes. My collections have been > created with replicationFactor=2, i.e. I have one replica for each shard. Beyond that I am using autoCommit/maxDocs=1 and autoSoftCommits/maxDocs=1 in order to achieve near realtime search behavior. > > As far as I understand from section "Write Side Fault Tolerance" in the > documentation (https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance), I cannot enforce that an update gets replicated to all replicas, but I can only get the achieved replication factor by requesting the return value rf. > > My question is now, what exactly does rf=2 mean? Does it only mean that the > replica has written the update to its transaction log? Or has the replica also performed the soft commit as configured with autoSoftCommits/maxDocs=1? The answer is important for me, because if the update were only written to the transaction log, I could not search for it reliably, as the replica may not have added it to the searchable index. rf=2 means that the update was successfully replicated to and acknowledged by two replicas (including the leader). The rf only deals with the durability of the update and has no relation to visibility of the update to searchers. The auto(soft)commit settings are applied asynchronously and do not block an update request. > > My second question is, does rf=1 mean that the update was definitely not > successful on the replica or could it also represent a timeout of the replication request from the shard leader? 
If it could also represent a timeout, then there would be a small chance that the replication was successful despite the timeout. Well, rf=1 implies that the update was only applied on the leader's index + tlog and either replicas weren't available or returned an error or the request timed out. So yes, you are right that it can represent a timeout and as such there is a chance that the replication was indeed successful despite the timeout. > > Is there a way to retrieve the replication factor for a specific document > after the update in order to check if replication was successful in the meantime? > No, there is no way to do that. > Thanks in advance. > > Best Regards, > Martin Mois > # > " This e-mail and any attached documents may contain confidential or > proprietary information. If you are not the intended recipient, you are notified that any dissemination, copying of this e-mail and any attachments thereto or use of their contents by any means whatsoever is strictly prohibited. If you have received this e-mail in error, please advise the sender immediately and delete this e-mail and all attached documents from your computer system." > # -- Regards, Shalin Shekhar Mangar.
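Shalin's point above (rf reports durability, not search visibility) suggests handling rf on the client side: ask for a minimum replication factor and resend the update until the achieved rf meets it. Below is a minimal sketch of that retry loop, in Python for brevity; `send_update` is a hypothetical stand-in for a real update call (SolrJ or HTTP) that returns the achieved rf, the value Solr reports when `min_rf` is requested.

```python
import time

def ensure_replication(send_update, doc, min_rf=2, retries=3, backoff_s=0.0):
    """Resend an update until the cluster reports an achieved
    replication factor of at least min_rf, or retries run out.

    send_update(doc) is a stand-in for a real update request; it must
    return the achieved rf for the update.
    """
    achieved = 0
    for attempt in range(retries):
        achieved = send_update(doc)
        if achieved >= min_rf:
            return achieved  # durably replicated to min_rf replicas
        if backoff_s:
            time.sleep(backoff_s * (attempt + 1))  # back off before resending
    raise RuntimeError("only reached rf=%d after %d attempts" % (achieved, retries))

# Simulated cluster: the replica times out on the first attempt (rf=1),
# then acknowledges on the second (rf=2).
responses = iter([1, 2])
rf = ensure_replication(lambda doc: next(responses), {"id": "42"}, min_rf=2)
```

Resending is safe because updates with the same uniqueKey are idempotent; as noted above, a timed-out replication may in fact have succeeded, and the retry simply overwrites the same document.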
Re: Request for Wiki edit right
Thank you very much Erick. Arcadius. On 13 October 2015 at 22:04, Erick Erickson wrote: > Just added you to the Solr Wiki contributors group, if you need to > access the Lucene Wiki let us know. > > Best, > Erick > > On Tue, Oct 13, 2015 at 1:57 PM, Arcadius Ahouansou > wrote: > > Hello Erick. > > Thank you for the detailed info. > > My username is arcadius. > > > > Thanks. > > > > > > On 13 October 2015 at 16:58, Erick Erickson > wrote: > > > >> Create a user on the Wiki (anyone can), then tell us the user name > >> you've created and we'll add you to the auth lists. There are separate > >> lists for Solr and Lucene. We had to lock these down because we were > >> getting a lot of spam pages created. > >> > >> The reference guide (CWiki) is restricted to committers though. > >> > >> Best, > >> Erick > >> > >> On Tue, Oct 13, 2015 at 6:30 AM, Arcadius Ahouansou > >> wrote: > >> > Hello. > >> > > >> > Please, can I have the right to edit the Wiki? > >> > > >> > Thanks. > >> > > >> > Arcadius. > >> > > > > > > > > -- > > Arcadius Ahouansou > > Menelic Ltd | Information is Power > > M: 07908761999 > > W: www.menelic.com > > --- > -- Arcadius Ahouansou Menelic Ltd | Information is Power M: 07908761999 W: www.menelic.com ---
Re: AutoComplete Feature in Solr
We want to use suggester but also want to show those results closest to my lat,long... Kinda combine suggester and bq=geodist()

On Mon, Oct 12, 2015 at 2:24 PM, Salman Ansari wrote:
> Hi,
>
> I have been trying to get the autocomplete feature in Solr working with no
> luck up to now. First I read that "suggest component" is the recommended
> way as in the below article (and this is the exact functionality I am
> looking for, which is to autocomplete multiple words)
>
> http://blog.trifork.com/2012/02/15/different-ways-to-make-auto-suggestions-with-solr/
>
> Then I tried implementing suggest as described in the following articles, in
> this order:
> 1) https://wiki.apache.org/solr/Suggester#SearchHandler_configuration
> 2) http://solr.pl/en/2010/11/15/solr-and-autocomplete-part-2/ (I
> implemented suggesting phrases)
> 3) http://stackoverflow.com/questions/18132819/how-to-have-solr-autocomplete-on-whole-phrase-when-query-contains-multiple-terms
>
> With no luck; after implementing each article, when I run my query as
> http://[MySolr]:8983/solr/entityStore114/suggest?spellcheck.q=Barack
>
> I get an empty response:
>
> <response>
>   <lst name="responseHeader">
>     <int name="status">0</int>
>     <int name="QTime">0</int>
>   </lst>
> </response>
>
> although I have an entry for Barack Obama in my index. I am posting my
> Solr configuration as well:
>
> <searchComponent name="suggest" class="solr.SpellCheckComponent">
>   <lst name="spellchecker">
>     <str name="name">suggest</str>
>     <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
>     <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
>     <str name="field">entity_autocomplete</str>
>     <str name="buildOnCommit">true</str>
>   </lst>
> </searchComponent>
>
> <requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
>   <lst name="defaults">
>     <str name="spellcheck">true</str>
>     <str name="spellcheck.dictionary">suggest</str>
>     <str name="spellcheck.count">10</str>
>     <str name="spellcheck.collate">true</str>
>     <str name="spellcheck.onlyMorePopular">false</str>
>   </lst>
>   <arr name="components">
>     <str>suggest</str>
>   </arr>
> </requestHandler>
>
> It looks like a very simple job, but even after following so many articles,
> I could not get it right. Any comment will be appreciated!
>
> Regards,
> Salman

-- Bill Bell billnb...@gmail.com cell 720-256-8076
Re: Indexing Solr in production
Thank you Alessandro and Erick. Will try out the SolrJ method. Regards, Edwin On 14 October 2015 at 00:00, Erick Erickson wrote: > Here's a sample: > https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/ > > On Tue, Oct 13, 2015 at 4:18 AM, Alessandro Benedetti > wrote: > > The most robust and simple way to go is building your own Indexer. > > You can decide the platform you want, Solr has plenty of client API > > libraries. > > > > For example if you want to write your Indexer app in Java, you can use > > SolrJ. > > Each client library will give you all the flexibility you need to index > > Solr in a robust way. > > > > [1] https://cwiki.apache.org/confluence/display/solr/Client+APIs > > Cheers > > > > On 13 October 2015 at 09:35, Zheng Lin Edwin Yeo > > wrote: > > > >> Hi, > >> > >> What is the best practice to do indexing in Solr for a production > >> system? I'm using Solr 5.3.0. > >> > >> I understand that post.jar does not have things like robustness checks > >> and retries, which are important in production, as sometimes certain records > >> might fail during indexing, and we need to re-try the indexing for > >> those records that fail. > >> > >> Normally, do we need to write a new custom handler in order to achieve all > >> these? > >> Want to find out what most people did before I decide on a method and > >> proceed on to the next step. > >> > >> Thank you. > >> > >> Regards, > >> Edwin > >> > > > > > > > > -- > > -- > > > > Benedetti Alessandro > > Visiting card - http://about.me/alessandro_benedetti > > Blog - http://alexbenedetti.blogspot.co.uk > > > > "Tyger, tyger burning bright > > In the forests of the night, > > What immortal hand or eye > > Could frame thy fearful symmetry?" > > > > William Blake - Songs of Experience -1794 England >
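Alessandro's advice above (build your own indexer with robustness checks and retries) comes down to batching documents and retrying failed batches. Below is a language-agnostic sketch of that control flow, in Python for brevity; `send_batch` is a hypothetical stand-in for a real client call (e.g. SolrJ's `SolrClient.add`) that raises on failure.

```python
def index_all(docs, send_batch, batch_size=100, max_retries=3):
    """Index docs in batches; retry each failed batch up to max_retries
    times, and collect documents that still fail for later re-queuing."""
    failed = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        for attempt in range(1, max_retries + 1):
            try:
                send_batch(batch)
                break  # batch accepted
            except Exception:
                if attempt == max_retries:
                    failed.extend(batch)  # give up; caller logs/re-queues these
    return failed

# Simulated transport: the very first submission times out, after
# which every call succeeds.
calls = {"n": 0}
def flaky_send(batch):
    calls["n"] += 1
    if calls["n"] == 1:
        raise IOError("connection timed out")

failed = index_all(list(range(250)), flaky_send, batch_size=100)
```

With 250 documents and a batch size of 100, the first batch fails once and is retried, so all three batches end up indexed and `failed` is empty.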
Re: solr cloud recovery and search
Great. Thanks Erick. On 10/13/15 5:39 PM, Erick Erickson wrote: More than expected, guaranteed. As long as at least one replica in a shard is active, all queries should succeed. Maybe more slowly, but they should succeed. Best, Erick On Tue, Oct 13, 2015 at 4:25 PM, Rallavagu wrote: It appears that when a node that is in "recovery" mode is queried, it defers the query to the leader instead of serving it locally. Is this the expected behavior? Thanks.
Re: solr cloud recovery and search
More than expected, guaranteed. As long as at least one replica in a shard is active, all queries should succeed. Maybe more slowly, but they should succeed. Best, Erick On Tue, Oct 13, 2015 at 4:25 PM, Rallavagu wrote: > It appears that when a node that is in "recovery" mode is queried, it > defers the query to the leader instead of serving it locally. Is this the > expected behavior? Thanks.
solr cloud recovery and search
It appears that when a node that is in "recovery" mode is queried, it defers the query to the leader instead of serving it locally. Is this the expected behavior? Thanks.
Re: Solr cross core join special condition
On Wed, Oct 7, 2015 at 9:42 AM, Ryan Josal wrote: > I developed a join transformer plugin that did that (although it didn't > flatten the results like that). The one thing that was painful about it is > that the TextResponseWriter has references to both the IndexSchema and > SolrReturnFields objects for the primary core. So when you add a > SolrDocument from another core it returned the wrong fields. We've made some progress on this front in trunk: * SOLR-7957: internal/expert - ResultContext was significantly changed and expanded to allow for multiple full query results (DocLists) per Solr request. TransformContext was rendered redundant and was removed. (yonik) So ResultContext now has its own searcher, ReturnFields, etc. -Yonik
Re: Request for Wiki edit right
Just added you to the Solr Wiki contributors group, if you need to access the Lucene Wiki let us know. Best, Erick On Tue, Oct 13, 2015 at 1:57 PM, Arcadius Ahouansou wrote: > Hello Erick. > Thank you for the detailed info. > My username is arcadius. > > Thanks. > > > On 13 October 2015 at 16:58, Erick Erickson wrote: > >> Create a user on the Wiki (anyone can), then tell us the user name >> you've created and we'll add you to the auth lists. There are separate >> lists for Solr and Lucene. We had to lock these down because we were >> getting a lot of spam pages created. >> >> The reference guide (CWiki) is restricted to committers though. >> >> Best, >> Erick >> >> On Tue, Oct 13, 2015 at 6:30 AM, Arcadius Ahouansou >> wrote: >> > Hello. >> > >> > Please, can I have the right to edit the Wiki? >> > >> > Thanks. >> > >> > Arcadius. >> > > > > -- > Arcadius Ahouansou > Menelic Ltd | Information is Power > M: 07908761999 > W: www.menelic.com > ---
Re: Request for Wiki edit right
Hello Erick. Thank you for the detailed info. My username is arcadius. Thanks. On 13 October 2015 at 16:58, Erick Erickson wrote: > Create a user on the Wiki (anyone can), then tell us the user name > you've created and we'll add you to the auth lists. There are separate > lists for Solr and Lucene. We had to lock these down because we were > getting a lot of spam pages created. > > The reference guide (CWiki) is restricted to committers though. > > Best, > Erick > > On Tue, Oct 13, 2015 at 6:30 AM, Arcadius Ahouansou > wrote: > > Hello. > > > > Please, can I have the right to edit the Wiki? > > > > Thanks. > > > > Arcadius. > -- Arcadius Ahouansou Menelic Ltd | Information is Power M: 07908761999 W: www.menelic.com ---
Re: Grouping facets: Possible to get facet results for each Group?
Hi, Thanks for your response. I did have a look at pivots, and they could work in a way. We're still on Solr 4.3, so I'll have to wait for sub-facets - but they sure look pretty cool! Peter On Tue, Oct 13, 2015 at 12:30 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > Can you model your business domain with Solr nested Docs ? In the case you > can use Yonik article about nested facets. > > Cheers > > On 13 October 2015 at 05:05, Alexandre Rafalovitch > wrote: > > > Could you use the new nested facets syntax? > > http://yonik.com/solr-subfacets/ > > > > Regards, > >Alex. > > > > Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: > > http://www.solr-start.com/ > > > > On 11 October 2015 at 09:51, Peter Sturge > wrote: > > > Been trying to coerce Group faceting to give some faceting back for > each > > > group, but maybe this use case isn't catered for in Grouping? : > > > > > > So the Use Case is this: > > > Let's say I do a grouped search that returns say, 9 distinct groups, > and > > in > > > these groups are various numbers of unique field values that need > > faceting > > > - but the faceting needs to be within each group: > > > > > > -- > -- > > Benedetti Alessandro > Visiting card - http://about.me/alessandro_benedetti > Blog - http://alexbenedetti.blogspot.co.uk > > "Tyger, tyger burning bright > In the forests of the night, > What immortal hand or eye > Could frame thy fearful symmetry?" > > William Blake - Songs of Experience -1794 England >
Re: Help me read Thread
The main reason is that the updates are coming from some client applications and it is not a controlled indexing process. The controlled indexing process works fine (after spending some time to tune it). Will definitely look into throttling incoming update requests and reducing the number of connections per host. Thanks for the insight. On 10/13/15 9:17 AM, Erick Erickson wrote: How heavy is heavy? The proverbial smoking gun here will be messages in any logs referring to "leader initiated recovery". (note, that's the message I remember seeing, it may not be exact). There's no particular work-around here except to back off the indexing load. Certainly increasing the thread pool size allowed this to surface. Also 5.2 has some significant improvements in this area, see: https://lucidworks.com/blog/2015/06/10/indexing-performance-solr-5-2-now-twice-fast/ And a lot depends on how you're indexing, batching up updates is a good thing. If you go to a multi-shard setup, using SolrJ and CloudSolrServer (CloudSolrClient in 5.x) would help. More shards would help as well, but I'd first take a look at the indexing process and be sure you're batching up updates. It's also possible if indexing is a once-a-day process and it fits with your SLAs to shut off the replicas, index to the leader, then turn the replicas back on. That's not all that satisfactory, but I've seen it used. But with a single shard setup, I really have to ask why indexing at such a furious rate is required that you're hitting this. Are you unable to reduce the indexing rate? Best, Erick On Tue, Oct 13, 2015 at 9:08 AM, Rallavagu wrote: Also, we have increased number of connections per host from default (20) to 100 for http thread pool to communicate with other nodes. Could this have caused the issues as it can now spin many threads to send updates? On 10/13/15 8:56 AM, Erick Erickson wrote: Is this under a very heavy indexing load? 
There were some inefficiencies that caused followers to work a lot harder than the leader, but the leader had to spin off a bunch of threads to send update to followers. That's fixed int he 5.2 release. Best, Erick On Tue, Oct 13, 2015 at 8:40 AM, Rallavagu wrote: Please help me understand what is going on with this thread. Solr 4.6.1, single shard, 4 node cluster, 3 node zk. Running on tomcat with 500 threads. There are 47 threads overall and designated leader becomes unresponsive though shows "green" from cloud perspective. This is causing issues. particularly, " at org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock]" "http-bio-8080-exec-2878" id=5899 idx=0x30c tid=17132 prio=5 alive, native_blocked, daemon at __lll_lock_wait+34(:0)@0x382ba0e262 at safepointSyncOnPollAccess+167(safepoint.c:83)@0x7f83ae266138 at trapiNormalHandler+484(traps_posix.c:220)@0x7f83ae29a745 at _L_unlock_16+44(:0)@0x382ba0f710 at java/util/LinkedList.peek(LinkedList.java:447)[optimized] at org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.blockUntilFinished(ConcurrentUpdateSolrServer.java:384)[inlined] at org/apache/solr/update/StreamingSolrServers.blockUntilFinished(StreamingSolrServers.java:98)[inlined] at org/apache/solr/update/SolrCmdDistributor.finish(SolrCmdDistributor.java:61)[inlined] at org/apache/solr/update/processor/DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:501)[inlined] at org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] ^-- 
Holding lock: org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock] at org/apache/solr/handler/ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)[optimized] at org/apache/solr/handler/RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)[optimized] at org/apache/solr/core/SolrCore.execute(SolrCore.java:1859)[optimized] at org/apache/solr/servlet/SolrDispatchFilter.execute(SolrDispatchFilter.java:721)[inlined] at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)[inlined] at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)[optimized] at org/apache/catalina/core/ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)[inlined] at org/apache/catalina/core/ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)[optimized] at org/apache/catalina
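The throttling idea above (updates arrive from uncontrolled client applications) is usually implemented in front of the indexing path, for example with a token bucket. Here is a minimal sketch of the mechanism; this is generic rate-limiting logic, not a Solr API, and the names are illustrative only.

```python
import time

class TokenBucket:
    """Token-bucket throttle: `rate` tokens are added per second up to
    `capacity`; each request consumes one token and is rejected (or
    queued by the caller) when the bucket is empty."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.now = now            # injectable clock, eases testing
        self.last = now()

    def allow(self):
        t = self.now()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# With a fake clock: a burst of 2 requests passes, the 3rd is rejected,
# and after one simulated second another token is available.
clock = {"t": 0.0}
bucket = TokenBucket(rate=1, capacity=2, now=lambda: clock["t"])
results = [bucket.allow(), bucket.allow(), bucket.allow()]
clock["t"] = 1.0
results.append(bucket.allow())
```

A gate like this in the client applications (or a front-end proxy) smooths the bursts that otherwise force the leader to spin up many replication threads at once.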
Re: Help me read Thread
How heavy is heavy? The proverbial smoking gun here will be messages in any logs referring to "leader initiated recovery". (note, that's the message I remember seeing, it may not be exact). There's no particular work-around here except to back off the indexing load. Certainly increasing the thread pool size allowed this to surface. Also 5.2 has some significant improvements in this area, see: https://lucidworks.com/blog/2015/06/10/indexing-performance-solr-5-2-now-twice-fast/ And a lot depends on how you're indexing, batching up updates is a good thing. If you go to a multi-shard setup, using SolrJ and CloudSolrServer (CloudSolrClient in 5.x) would help. More shards would help as well, but I'd first take a look at the indexing process and be sure you're batching up updates. It's also possible if indexing is a once-a-day process and it fits with your SLAs to shut off the replicas, index to the leader, then turn the replicas back on. That's not all that satisfactory, but I've seen it used. But with a single shard setup, I really have to ask why indexing at such a furious rate is required that you're hitting this. Are you unable to reduce the indexing rate? Best, Erick On Tue, Oct 13, 2015 at 9:08 AM, Rallavagu wrote: > Also, we have increased number of connections per host from default (20) to > 100 for http thread pool to communicate with other nodes. Could this have > caused the issues as it can now spin many threads to send updates? > > > On 10/13/15 8:56 AM, Erick Erickson wrote: >> >> Is this under a very heavy indexing load? There were some >> inefficiencies that caused followers to work a lot harder than the >> leader, but the leader had to spin off a bunch of threads to send >> update to followers. That's fixed int he 5.2 release. >> >> Best, >> Erick >> >> On Tue, Oct 13, 2015 at 8:40 AM, Rallavagu wrote: >>> >>> Please help me understand what is going on with this thread. >>> >>> Solr 4.6.1, single shard, 4 node cluster, 3 node zk. 
Running on tomcat >>> with >>> 500 threads. >>> >>> >>> There are 47 threads overall and designated leader becomes unresponsive >>> though shows "green" from cloud perspective. This is causing issues. >>> >>> particularly, >>> >>> " at >>> >>> org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] >>> ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] >>> ^-- Holding lock: >>> org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] >>> ^-- Holding lock: >>> org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock]" >>> >>> >>> >>> "http-bio-8080-exec-2878" id=5899 idx=0x30c tid=17132 prio=5 alive, >>> native_blocked, daemon >>> at __lll_lock_wait+34(:0)@0x382ba0e262 >>> at safepointSyncOnPollAccess+167(safepoint.c:83)@0x7f83ae266138 >>> at trapiNormalHandler+484(traps_posix.c:220)@0x7f83ae29a745 >>> at _L_unlock_16+44(:0)@0x382ba0f710 >>> at java/util/LinkedList.peek(LinkedList.java:447)[optimized] >>> at >>> >>> org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.blockUntilFinished(ConcurrentUpdateSolrServer.java:384)[inlined] >>> at >>> >>> org/apache/solr/update/StreamingSolrServers.blockUntilFinished(StreamingSolrServers.java:98)[inlined] >>> at >>> >>> org/apache/solr/update/SolrCmdDistributor.finish(SolrCmdDistributor.java:61)[inlined] >>> at >>> >>> org/apache/solr/update/processor/DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:501)[inlined] >>> at >>> >>> org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] >>> ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] >>> ^-- Holding lock: >>> org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] >>> ^-- Holding lock: >>> org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock] >>> at >>> >>> 
org/apache/solr/handler/ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)[optimized] >>> at >>> >>> org/apache/solr/handler/RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)[optimized] >>> at >>> org/apache/solr/core/SolrCore.execute(SolrCore.java:1859)[optimized] >>> at >>> >>> org/apache/solr/servlet/SolrDispatchFilter.execute(SolrDispatchFilter.java:721)[inlined] >>> at >>> >>> org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)[inlined] >>> at >>> >>> org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)[optimized] >>> at >>> >>> org/apache/catalina/core/ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)[inlined] >>> at >>> >>> org/apache/catalina/core/ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)[optimized] >>> at >>> >>> org/apache/catalina/core/StandardWrapperValve.invoke(StandardWrapperValve.java:222)[optimized] >>> at >>> >>
Re: Help me read Thread
Also, we have increased number of connections per host from default (20) to 100 for http thread pool to communicate with other nodes. Could this have caused the issues as it can now spin many threads to send updates? On 10/13/15 8:56 AM, Erick Erickson wrote: Is this under a very heavy indexing load? There were some inefficiencies that caused followers to work a lot harder than the leader, but the leader had to spin off a bunch of threads to send update to followers. That's fixed int he 5.2 release. Best, Erick On Tue, Oct 13, 2015 at 8:40 AM, Rallavagu wrote: Please help me understand what is going on with this thread. Solr 4.6.1, single shard, 4 node cluster, 3 node zk. Running on tomcat with 500 threads. There are 47 threads overall and designated leader becomes unresponsive though shows "green" from cloud perspective. This is causing issues. particularly, " at org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock]" "http-bio-8080-exec-2878" id=5899 idx=0x30c tid=17132 prio=5 alive, native_blocked, daemon at __lll_lock_wait+34(:0)@0x382ba0e262 at safepointSyncOnPollAccess+167(safepoint.c:83)@0x7f83ae266138 at trapiNormalHandler+484(traps_posix.c:220)@0x7f83ae29a745 at _L_unlock_16+44(:0)@0x382ba0f710 at java/util/LinkedList.peek(LinkedList.java:447)[optimized] at org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.blockUntilFinished(ConcurrentUpdateSolrServer.java:384)[inlined] at org/apache/solr/update/StreamingSolrServers.blockUntilFinished(StreamingSolrServers.java:98)[inlined] at org/apache/solr/update/SolrCmdDistributor.finish(SolrCmdDistributor.java:61)[inlined] at 
org/apache/solr/update/processor/DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:501)[inlined] at org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock] at org/apache/solr/handler/ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)[optimized] at org/apache/solr/handler/RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)[optimized] at org/apache/solr/core/SolrCore.execute(SolrCore.java:1859)[optimized] at org/apache/solr/servlet/SolrDispatchFilter.execute(SolrDispatchFilter.java:721)[inlined] at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)[inlined] at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)[optimized] at org/apache/catalina/core/ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)[inlined] at org/apache/catalina/core/ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)[optimized] at org/apache/catalina/core/StandardWrapperValve.invoke(StandardWrapperValve.java:222)[optimized] at org/apache/catalina/core/StandardContextValve.invoke(StandardContextValve.java:123)[optimized] at org/apache/catalina/core/StandardHostValve.invoke(StandardHostValve.java:171)[optimized] at org/apache/catalina/valves/ErrorReportValve.invoke(ErrorReportValve.java:99)[optimized] at org/apache/catalina/valves/AccessLogValve.invoke(AccessLogValve.java:953)[optimized] at org/apache/catalina/core/StandardEngineValve.invoke(StandardEngineValve.java:118)[optimized] at org/apache/catalina/connector/CoyoteAdapter.service(CoyoteAdapter.java:408)[optimized] at 
org/apache/coyote/http11/AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)[optimized] at org/apache/coyote/AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)[optimized] at org/apache/tomcat/util/net/JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)[optimized] ^-- Holding lock: org/apache/tomcat/util/net/SocketWrapper@0x2ee6e4aa8[thin lock] at java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)[inlined] at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)[optimized] at java/lang/Thread.run(Thread.java:682)[optimized] at jrockit/vm/RNI.c2java(J)V(Native Method)
Re: Help me read Thread
The heavy load of indexing is true. During this time, all other nodes are under "recovery" mode and search queries are referred to leader and it times out. Is there a temporary work around for this? Thanks. On 10/13/15 8:56 AM, Erick Erickson wrote: Is this under a very heavy indexing load? There were some inefficiencies that caused followers to work a lot harder than the leader, but the leader had to spin off a bunch of threads to send update to followers. That's fixed int he 5.2 release. Best, Erick On Tue, Oct 13, 2015 at 8:40 AM, Rallavagu wrote: Please help me understand what is going on with this thread. Solr 4.6.1, single shard, 4 node cluster, 3 node zk. Running on tomcat with 500 threads. There are 47 threads overall and designated leader becomes unresponsive though shows "green" from cloud perspective. This is causing issues. particularly, " at org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock]" "http-bio-8080-exec-2878" id=5899 idx=0x30c tid=17132 prio=5 alive, native_blocked, daemon at __lll_lock_wait+34(:0)@0x382ba0e262 at safepointSyncOnPollAccess+167(safepoint.c:83)@0x7f83ae266138 at trapiNormalHandler+484(traps_posix.c:220)@0x7f83ae29a745 at _L_unlock_16+44(:0)@0x382ba0f710 at java/util/LinkedList.peek(LinkedList.java:447)[optimized] at org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.blockUntilFinished(ConcurrentUpdateSolrServer.java:384)[inlined] at org/apache/solr/update/StreamingSolrServers.blockUntilFinished(StreamingSolrServers.java:98)[inlined] at org/apache/solr/update/SolrCmdDistributor.finish(SolrCmdDistributor.java:61)[inlined] at 
org/apache/solr/update/processor/DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:501)[inlined] at org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock] at org/apache/solr/handler/ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)[optimized] at org/apache/solr/handler/RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)[optimized] at org/apache/solr/core/SolrCore.execute(SolrCore.java:1859)[optimized] at org/apache/solr/servlet/SolrDispatchFilter.execute(SolrDispatchFilter.java:721)[inlined] at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)[inlined] at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)[optimized] at org/apache/catalina/core/ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)[inlined] at org/apache/catalina/core/ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)[optimized] at org/apache/catalina/core/StandardWrapperValve.invoke(StandardWrapperValve.java:222)[optimized] at org/apache/catalina/core/StandardContextValve.invoke(StandardContextValve.java:123)[optimized] at org/apache/catalina/core/StandardHostValve.invoke(StandardHostValve.java:171)[optimized] at org/apache/catalina/valves/ErrorReportValve.invoke(ErrorReportValve.java:99)[optimized] at org/apache/catalina/valves/AccessLogValve.invoke(AccessLogValve.java:953)[optimized] at org/apache/catalina/core/StandardEngineValve.invoke(StandardEngineValve.java:118)[optimized] at org/apache/catalina/connector/CoyoteAdapter.service(CoyoteAdapter.java:408)[optimized] at 
org/apache/coyote/http11/AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)[optimized] at org/apache/coyote/AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)[optimized] at org/apache/tomcat/util/net/JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)[optimized] ^-- Holding lock: org/apache/tomcat/util/net/SocketWrapper@0x2ee6e4aa8[thin lock] at java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)[inlined] at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)[optimized] at java/lang/Thread.run(Thread.java:682)[optimized] at jrockit/vm/RNI.c2java(J)V(Native Method)
Re: Help me read Thread
Is this under a very heavy indexing load? There were some inefficiencies that caused followers to work a lot harder than the leader, and the leader had to spin off a bunch of threads to send updates to followers. That's fixed in the 5.2 release. Best, Erick On Tue, Oct 13, 2015 at 8:40 AM, Rallavagu wrote: > Please help me understand what is going on with this thread. > > Solr 4.6.1, single shard, 4 node cluster, 3 node zk. Running on tomcat with > 500 threads. > > > There are 47 threads overall and designated leader becomes unresponsive > though shows "green" from cloud perspective. This is causing issues. > > particularly, > > " at > org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] > ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] > ^-- Holding lock: > org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] > ^-- Holding lock: > org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock]" > > > > "http-bio-8080-exec-2878" id=5899 idx=0x30c tid=17132 prio=5 alive, > native_blocked, daemon > at __lll_lock_wait+34(:0)@0x382ba0e262 > at safepointSyncOnPollAccess+167(safepoint.c:83)@0x7f83ae266138 > at trapiNormalHandler+484(traps_posix.c:220)@0x7f83ae29a745 > at _L_unlock_16+44(:0)@0x382ba0f710 > at java/util/LinkedList.peek(LinkedList.java:447)[optimized] > at > org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.blockUntilFinished(ConcurrentUpdateSolrServer.java:384)[inlined] > at > org/apache/solr/update/StreamingSolrServers.blockUntilFinished(StreamingSolrServers.java:98)[inlined] > at > org/apache/solr/update/SolrCmdDistributor.finish(SolrCmdDistributor.java:61)[inlined] > at > org/apache/solr/update/processor/DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:501)[inlined] > at > org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] > ^-- Holding lock: 
java/util/LinkedList@0x2ee24e958[thin lock] > ^-- Holding lock: > org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] > ^-- Holding lock: > org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock] > at > org/apache/solr/handler/ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)[optimized] > at > org/apache/solr/handler/RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)[optimized] > at org/apache/solr/core/SolrCore.execute(SolrCore.java:1859)[optimized] > at > org/apache/solr/servlet/SolrDispatchFilter.execute(SolrDispatchFilter.java:721)[inlined] > at > org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)[inlined] > at > org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)[optimized] > at > org/apache/catalina/core/ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)[inlined] > at > org/apache/catalina/core/ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)[optimized] > at > org/apache/catalina/core/StandardWrapperValve.invoke(StandardWrapperValve.java:222)[optimized] > at > org/apache/catalina/core/StandardContextValve.invoke(StandardContextValve.java:123)[optimized] > at > org/apache/catalina/core/StandardHostValve.invoke(StandardHostValve.java:171)[optimized] > at > org/apache/catalina/valves/ErrorReportValve.invoke(ErrorReportValve.java:99)[optimized] > at > org/apache/catalina/valves/AccessLogValve.invoke(AccessLogValve.java:953)[optimized] > at > org/apache/catalina/core/StandardEngineValve.invoke(StandardEngineValve.java:118)[optimized] > at > org/apache/catalina/connector/CoyoteAdapter.service(CoyoteAdapter.java:408)[optimized] > at > org/apache/coyote/http11/AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)[optimized] > at > org/apache/coyote/AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)[optimized] > at > 
org/apache/tomcat/util/net/JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)[optimized] > ^-- Holding lock: > org/apache/tomcat/util/net/SocketWrapper@0x2ee6e4aa8[thin lock] > at > java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)[inlined] > at > java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)[optimized] > at java/lang/Thread.run(Thread.java:682)[optimized] > at jrockit/vm/RNI.c2java(J)V(Native Method)
Re: Indexing Solr in production
Here's a sample: https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/ On Tue, Oct 13, 2015 at 4:18 AM, Alessandro Benedetti wrote: > The most robust and simple way to go is building your own Indexer. > You can decide the platform you want, Solr has plenty of client API > libraries. > > For example if you want to write your Indexer app in Java, you can use > SolrJ. > Each client library will give you all the flexibility you need to index > Solr in a robust way. > > [1] https://cwiki.apache.org/confluence/display/solr/Client+APIs > Cheers > > On 13 October 2015 at 09:35, Zheng Lin Edwin Yeo > wrote: > >> Hi, >> >> What is the best practice to do indexing in Solr for a production system? I'm >> using Solr 5.3.0. >> >> I understand that post.jar does not have things like robustness checks and >> retries, which are important in production, as sometimes certain records >> might fail during the indexing, and we need to re-try the indexing for >> those records that fail. >> >> Normally, do we need to write a new custom handler in order to achieve all >> these? >> Want to find out what most people did before I decide on a method and >> proceed on to the next step. >> >> Thank you. >> >> Regards, >> Edwin >> > > > > -- > -- > > Benedetti Alessandro > Visiting card - http://about.me/alessandro_benedetti > Blog - http://alexbenedetti.blogspot.co.uk > > "Tyger, tyger burning bright > In the forests of the night, > What immortal hand or eye > Could frame thy fearful symmetry?" > > William Blake - Songs of Experience -1794 England
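Much of the robustness Edwin asks about comes down to wrapping each batch of adds in a retry loop before giving up on it. A minimal sketch of such a loop in plain Java — the batch callback is a stand-in for whatever actually sends documents (in a real SolrJ indexer it would wrap a `SolrClient.add(...)` call); names and backoff values are illustrative, not from any particular indexer:

```java
import java.util.concurrent.Callable;

public class RetryingIndexer {
    /** Runs the given batch operation, retrying up to maxRetries times
     *  with exponential backoff before rethrowing the last failure. */
    public static <T> T withRetry(Callable<T> batch, int maxRetries, long initialBackoffMs)
            throws Exception {
        Exception last = null;
        long backoff = initialBackoffMs;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return batch.call();
            } catch (Exception e) {
                last = e;               // remember the failure and retry
                Thread.sleep(backoff);
                backoff *= 2;           // exponential backoff between attempts
            }
        }
        throw last;                     // exhausted all retries
    }
}
```

Failed records can then be collected and re-submitted in a later pass rather than aborting the whole run.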
Re: Request for Wiki edit right
Create a user on the Wiki (anyone can), then tell us the user name you've created and we'll add you to the auth lists. There are separate lists for Solr and Lucene. We had to lock these down because we were getting a lot of spam pages created. The reference guide (CWiki) is restricted to committers though. Best, Erick On Tue, Oct 13, 2015 at 6:30 AM, Arcadius Ahouansou wrote: > Hello. > > Please, can I have the right to edit the Wiki? > > Thanks. > > Arcadius.
Help me read Thread
Please help me understand what is going on with this thread. Solr 4.6.1, single shard, 4 node cluster, 3 node zk. Running on tomcat with 500 threads. There are 47 threads overall and designated leader becomes unresponsive though shows "green" from cloud perspective. This is causing issues. particularly, " at org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock]" "http-bio-8080-exec-2878" id=5899 idx=0x30c tid=17132 prio=5 alive, native_blocked, daemon at __lll_lock_wait+34(:0)@0x382ba0e262 at safepointSyncOnPollAccess+167(safepoint.c:83)@0x7f83ae266138 at trapiNormalHandler+484(traps_posix.c:220)@0x7f83ae29a745 at _L_unlock_16+44(:0)@0x382ba0f710 at java/util/LinkedList.peek(LinkedList.java:447)[optimized] at org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.blockUntilFinished(ConcurrentUpdateSolrServer.java:384)[inlined] at org/apache/solr/update/StreamingSolrServers.blockUntilFinished(StreamingSolrServers.java:98)[inlined] at org/apache/solr/update/SolrCmdDistributor.finish(SolrCmdDistributor.java:61)[inlined] at org/apache/solr/update/processor/DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:501)[inlined] at org/apache/solr/update/processor/DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1278)[optimized] ^-- Holding lock: java/util/LinkedList@0x2ee24e958[thin lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers$1@0x2ee24e9c0[biased lock] ^-- Holding lock: org/apache/solr/update/StreamingSolrServers@0x2ee24ea90[biased lock] at org/apache/solr/handler/ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)[optimized] at 
org/apache/solr/handler/RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)[optimized] at org/apache/solr/core/SolrCore.execute(SolrCore.java:1859)[optimized] at org/apache/solr/servlet/SolrDispatchFilter.execute(SolrDispatchFilter.java:721)[inlined] at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)[inlined] at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)[optimized] at org/apache/catalina/core/ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)[inlined] at org/apache/catalina/core/ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)[optimized] at org/apache/catalina/core/StandardWrapperValve.invoke(StandardWrapperValve.java:222)[optimized] at org/apache/catalina/core/StandardContextValve.invoke(StandardContextValve.java:123)[optimized] at org/apache/catalina/core/StandardHostValve.invoke(StandardHostValve.java:171)[optimized] at org/apache/catalina/valves/ErrorReportValve.invoke(ErrorReportValve.java:99)[optimized] at org/apache/catalina/valves/AccessLogValve.invoke(AccessLogValve.java:953)[optimized] at org/apache/catalina/core/StandardEngineValve.invoke(StandardEngineValve.java:118)[optimized] at org/apache/catalina/connector/CoyoteAdapter.service(CoyoteAdapter.java:408)[optimized] at org/apache/coyote/http11/AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)[optimized] at org/apache/coyote/AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)[optimized] at org/apache/tomcat/util/net/JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)[optimized] ^-- Holding lock: org/apache/tomcat/util/net/SocketWrapper@0x2ee6e4aa8[thin lock] at java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)[inlined] at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)[optimized] at java/lang/Thread.run(Thread.java:682)[optimized] at jrockit/vm/RNI.c2java(J)V(Native Method)
Re: catchall fields or multiple fields
Performing a sequence of queries can help too. For example, if users commonly search for a product name, you could do an initial query on just the product name field which should be much faster than searching the text of all product descriptions, and highlighting would be less problematic. If that initial query comes up empty, then you could move on to the next highest most likely field, maybe product title (short one line description), and query voluminous fields like detailed product descriptions, specifications, and user comments/reviews only as a last resort. -- Jack Krupansky On Tue, Oct 13, 2015 at 6:17 AM, elisabeth benoit wrote: > Thanks to you all for those informed advices. > > Thanks Trey for your very detailed point of view. This is now very clear to > me how a search on multiple fields can grow slower than a search on a > catchall field. > > Our actual search model is problematic: we search on a catchall field, but > need to know which fields match, so we do highlighting on multi fields (not > indexed, but stored). To improve performance, we want to get rid of > highlighting and use the solr explain output. To get the explain output on > those fields, we need to do a search on those fields. > > So I guess we have to test if removing highlighting and adding multi fields > search will improve performances or not. > > Best regards, > Elisabeth > > > > 2015-10-12 17:55 GMT+02:00 Jack Krupansky : > > > I think it may all depend on the nature of your application and how much > > commonality there is between fields. > > > > One interesting area is auto-suggest, where you can certainly suggest > from > > the union of all fields, you may want to give priority to suggestions > from > > preferred fields. For example, for actual product names or important > > keywords rather than random words from the English language that happen > to > > occur in descriptions, all of which would occur in a catchall. 
> > > > -- Jack Krupansky > > > > On Mon, Oct 12, 2015 at 8:39 AM, elisabeth benoit < > > elisaelisael...@gmail.com > > > wrote: > > > > > Hello, > > > > > > We're using solr 4.10 and storing all data in a catchall field. It > seems > > to > > > me that one good reason for using a catchall field is when using > scoring > > > with idf (with idf, a word might not have same score in all fields). We > > got > > > rid of idf and are now considering using multiple fields. I remember > > > reading somewhere that using a catchall field might speed up searching > > > time. I was wondering if some of you have any opinion (or experience) > > > related to this subject. > > > > > > Best regards, > > > Elisabeth > > > > > >
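Jack's query-sequencing idea can be sketched as a loop over fields ordered from cheapest/most specific to most expensive — the `searchFn` below is a stand-in for issuing a per-field Solr query, not a real client call:

```java
import java.util.*;
import java.util.function.BiFunction;

public class TieredSearch {
    /** Tries each field in priority order and returns the first non-empty
     *  result set. searchFn stands in for a real per-field Solr query. */
    public static List<String> firstHit(List<String> fieldsByPriority,
                                        String query,
                                        BiFunction<String, String, List<String>> searchFn) {
        for (String field : fieldsByPriority) {
            List<String> hits = searchFn.apply(field, query);
            if (!hits.isEmpty()) {
                return hits;   // a cheap field matched; skip the expensive ones
            }
        }
        return Collections.emptyList();
    }
}
```

The voluminous fields (descriptions, reviews) are only ever queried when everything cheaper has come up empty.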
Re: Selective field query
Thanks Alessandro, Certainly the use of the Analysis tool, along with debug query supplies a lot of useful information. I've found that a combination of using the ngram field, (as detailed previously), along with the qf param of the edismax parser seems to be working well. From there I can dynamically create the relevant search queries as required. Certainly there's a lot more to get on board here to properly maximise the value of using Solr. So further suggestions, advice, etc. remain very welcome. Appreciated Colin On Tue, Oct 13, 2015 at 12:00 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > The first thing I would suggest you is the use of the Analysis tool, to > explore your analysis at query and index time. > This will be the first step to understand if you are actually tokenising > and token filtering as expected. > > Then you should play with different fields ( in the case the original field > is single value, you are not going to lose the relation) . > Then you can provide the search you expect , for example : > > Service Name : Ngram token filtered ( or whatever you need) > Service id: keywordTokenizer ( to keep only one token) . > > Can you give additional details ? > > Cheers > > On 13 October 2015 at 10:36, Colin Hunter wrote: > > > Thanks Scott. > > That is definitely moving things in the right direction > > > > I have another question that relates to this. It is also requested to > > implement a partial word search on the service name field. > > However, each service also has a unique identifier (string). This field > > requires exact string matching. > > I have attempted making a copy field for Service Name using the > > NGramTokenizerFactory, as below. > > > > > > > positionIncrementGap="100"> > > > > > minGramSize="3" maxGramSize="7"/> > > > > > > > > > > > > > > > > > > While the debugQuery info showed the _ngram results, I was having issue > > building the query that would return these results along with regular > > search. 
(Your previous response may well clarify this). > > When I set this to return on all fields, then the full string match > > required for the service UI no longer works. > > > > I certainly have to explore further re the eDisMax parser. > > However, any advice that can be offered, regarding meeting these > different > > requirements in a single query would be very helpful. > > > > Many Thanks > > Colin > > > > On Tue, Oct 13, 2015 at 5:49 AM, Scott Stults < > > sstu...@opensourceconnections.com> wrote: > > > > > Colin, > > > > > > The other thing you'll want to keep in mind (and you'll find this out > > with > > > debugQuery) is that the query parser is going to take your > > > ServiceName:(Search Service) and turn it into two queries -- > > > ServiceName:(Search) ServiceName:(Service). That's because the query > > parser > > > breaks on whitespace. My bet is you have a lot of entries with a name > of > > "X > > > Service" and the second part of your query is hitting them. Phrase > Field > > > might be your friend here: > > > > > > https://wiki.apache.org/solr/ExtendedDisMax#pf_.28Phrase_Fields.29 > > > > > > > > > -Scott > > > > > > On Mon, Oct 12, 2015 at 4:15 AM, Colin Hunter > > > wrote: > > > > > > > Thanks Erick, I'm sure this will be valuable in implementing ngram > > filter > > > > factory > > > > > > > > On Fri, Oct 9, 2015 at 4:38 PM, Erick Erickson < > > erickerick...@gmail.com> > > > > wrote: > > > > > > > > > Colin: > > > > > > > > > > Adding &debug=all to your query is your friend here, the > > > > > parsed_query.toString will show you exactly what > > > > > is searched against. > > > > > > > > > > Best, > > > > > Erick > > > > > > > > > > On Fri, Oct 9, 2015 at 2:09 AM, Colin Hunter > > > > > wrote: > > > > > > Ah ha... the copy field... makes sense. > > > > > > Thank You. 
> > > > > > > > > > > > On Fri, Oct 9, 2015 at 10:04 AM, Upayavira > wrote: > > > > > > > > > > > >> > > > > > >> > > > > > >> On Fri, Oct 9, 2015, at 09:54 AM, Colin Hunter wrote: > > > > > >> > Hi > > > > > >> > > > > > > >> > I am working on a complex search utility with an index created > > via > > > > > data > > > > > >> > import from an extensive MySQL database. > > > > > >> > There are many ways in which the index is searched. One of the > > > > utility > > > > > >> > input fields searches only on a Service Name. However, if I > > target > > > > the > > > > > >> > query as q=ServiceName:"Searched service", this only returns > an > > > > exact > > > > > >> > string match. If q=Searched Service, the query still returns > > > results > > > > > from > > > > > >> > all indexed data. > > > > > >> > > > > > > >> > Is there a way to construct a query to only return results > from > > > one > > > > > field > > > > > >> > of a doc ? > > > > > >> > I have tried setting index=false, stored=true on unwanted > > fields, > > > > but > > > > > >> > these > > > > > >> > appear to have still
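The fieldType XML in Colin's message was mangled in transit (only `positionIncrementGap="100"` and `minGramSize="3" maxGramSize="7"` survived). For reference, a typical partial-match type built on `NGramTokenizerFactory` looks roughly like the following — the type and field names here are illustrative reconstructions, not Colin's originals:

```xml
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="7"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="ServiceName_ngram" type="text_ngram" indexed="true" stored="false"/>
<copyField source="ServiceName" dest="ServiceName_ngram"/>
```

The exact-match requirement on the identifier field is then handled by a separate string (or KeywordTokenizer) field, with edismax `qf` deciding which fields each query targets.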
Request for Wiki edit right
Hello. Please, can I have the right to edit the Wiki? Thanks. Arcadius.
RE: File-based Spelling
Mark, The older spellcheck implementations create an n-gram sidecar index, which is why you're seeing your name split into 2-grams like this. See the IR Book by Manning et al, section 3.3.4 for more information. Based on the results you're getting, I think it is loading your file correctly. You should now try a query against this spelling index, using words *not* in the file you loaded that are within 1 or 2 edits from something that is in the dictionary. If it doesn't yield suggestions, then post the relevant sections of the solrconfig.xml, schema.xml and also the query string you are trying. James Dyer Ingram Content Group -Original Message- From: Mark Fenbers [mailto:mark.fenb...@noaa.gov] Sent: Monday, October 12, 2015 2:38 PM To: Solr User Group Subject: File-based Spelling Greetings! I'm attempting to use a file-based spell checker. My sourceLocation is /usr/share/dict/linux.words, and my spellcheckIndexDir is set to ./data/spFile. BuildOnStartup is set to true, and I see nothing to suggest any sort of problem/error in solr.log. However, in my ./data/spFile/ directory, there are only two files: segments_2 with only 71 bytes in it, and a zero-byte write.lock file. For a source dictionary having 480,000 words in it, I was expecting a bit more substance in the ./data/spFile directory. Something doesn't seem right with this. Moreover, I ran a query on the word Fenbers, which isn't listed in the linux.words file, but there are several similar words. The results I got back were odd, and suggestions included the following: fenber f en be r f e nb er f en b er f e n be r f en b e r f e nb e r f e n b er f e n b e r But I expected suggestions like fenders, embers, and fenberry, etc. I also ran a query on Mark (which IS listed in linux.words) and got back two suggestions in a similar format. 
I played with configurables like changing the fieldType from text_en to string and the characterEncoding from UTF-8 to ASCII, etc., but nothing seemed to yield any different results. Can anyone offer suggestions as to what I'm doing wrong? I've been struggling with this for more than 40 hours now! I'm surprised my persistence has lasted this long! Thanks, Mark
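James's point about the n-gram sidecar index is easier to see with a concrete decomposition: the index stores overlapping character n-grams of each dictionary word and matches fuzzily over those fragments, which is why "Fenbers" comes back shredded into pieces. A quick sketch of the decomposition itself (plain Java, not Solr's actual implementation):

```java
import java.util.*;

public class NGrams {
    /** Splits a word into overlapping character n-grams — the shape of
     *  entry an n-gram spellcheck sidecar index stores per dictionary word. */
    public static List<String> charNGrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));   // each window of n chars
        }
        return grams;
    }
}
```

Candidate corrections are found by comparing a query word's grams against the stored grams, which is why querying with a word one or two edits away from a dictionary entry is the right way to test the index.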
Re: are there any SolrCloud supervisors?
I would be interested in seeing it in action. Do you have any documentation available on what it does and how? Thanks From: r b Sent: Friday, October 2, 2015 3:09 PM To: solr-user@lucene.apache.org Subject: are there any SolrCloud supervisors? I've been working on something that just monitors ZooKeeper to add and remove nodes from collections. the use case being I put SolrCloud in an autoscaling group on EC2 and as instances go up and down, I need them added to the collection. It's something I've built for work and could clean up to share on GitHub if there is much interest. I asked in the IRC about a SolrCloud supervisor utility but wanted to extend that question to this list. are there any more "full featured" supervisors out there? -renning
Re: are there any SolrCloud supervisors?
Sounds interesting... On Tue, Oct 13, 2015 at 12:58 AM, Trey Grainger wrote: > I'd be very interested in taking a look if you post the code. > > Trey Grainger > Co-Author, Solr in Action > Director of Engineering, Search & Recommendations @ CareerBuilder > > On Fri, Oct 2, 2015 at 3:09 PM, r b wrote: > > > I've been working on something that just monitors ZooKeeper to add and > > remove nodes from collections. the use case being I put SolrCloud in > > an autoscaling group on EC2 and as instances go up and down, I need > > them added to the collection. It's something I've built for work and > > could clean up to share on GitHub if there is much interest. > > > > I asked in the IRC about a SolrCloud supervisor utility but wanted to > > extend that question to this list. are there any more "full featured" > > supervisors out there? > > > > > > -renning > > >
Re: Replication and soft commits for NRT searches
Comments inline: On Mon, Oct 12, 2015 at 1:31 PM, MOIS Martin (MORPHO) wrote: > Hello, > > I am running Solr 5.2.1 in a cluster with 6 nodes. My collections have been > created with replicationFactor=2, i.e. I have one replica for each shard. > Beyond that I am using autoCommit/maxDocs=1 and autoSoftCommits/maxDocs=1 > in order to achieve near realtime search behavior. > > As far as I understand from section "Write Side Fault Tolerance" in the > documentation > (https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance), > I cannot enforce that an update gets replicated to all replicas, but I can > only get the achieved replication factor by requesting the return value rf. > > My question is now, what exactly does rf=2 mean? Does it only mean that the > replica has written the update to its transaction log? Or has the replica > also performed the soft commit as configured with autoSoftCommits/maxDocs=1? > The answer is important for me, as if the update would only get written to > the transaction log, I could not search for it reliable, as the replica may > not have added it to the searchable index. rf=2 means that the update was successfully replicated to and acknowledged by two replicas (including the leader). The rf only deals with the durability of the update and has no relation to visibility of the update to searchers. The auto(soft)commit settings are applied asynchronously and do not block an update request. > > My second question is, does rf=1 mean that the update was definitely not > successful on the replica or could it also represent a timeout of the > replication request from the shard leader? If it could also represent a > timeout, then there would be a small chance that the replication was > successfully despite of the timeout. Well, rf=1 implies that the update was only applied on the leader's index + tlog and either replicas weren't available or returned an error or the request timed out. 
So yes, you are right that it can represent a timeout, and as such there is a chance that the replication was indeed successful despite the timeout. > > Is there a way to retrieve the replication factor for a specific document > after the update in order to check if replication was successful in the > meantime? > No, there is no way to do that. > Thanks in advance. > > Best Regards, > Martin Mois -- Regards, Shalin Shekhar Mangar.
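The practical upshot of Shalin's answers: since rf only reports achieved durability after the fact, a client that cares about it has to inspect the update response and decide what to do when rf falls short. A sketch of that decision, using a plain Map as a stand-in for the parsed response header (the helper name is mine, not a Solr API):

```java
import java.util.*;

public class RfCheck {
    /** Returns true when the update should be retried or flagged because the
     *  achieved replication factor fell short of the desired one. Note that
     *  rf covers durability (leader + replica acknowledgement) only, never
     *  visibility to searchers — commits happen asynchronously. */
    public static boolean needsAttention(Map<String, Object> responseHeader, int desiredRf) {
        Object rf = responseHeader.get("rf");
        if (rf == null) {
            return true;  // no rf reported at all; treat as suspect
        }
        return ((Number) rf).intValue() < desiredRf;
    }
}
```

Because a low rf can also mean a timed-out replication that actually succeeded, "needs attention" should trigger an idempotent re-send or an alert, not an assumption of data loss.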
Re: AutoComplete Feature in Solr
Thanks guys, I was able to make it work using your articles. The key point was mentioned in one of the articles, which was that the suggestion component is preconfigured in the techproducts sample. I started my work from there and tweaked it to suit my needs. Thanks a lot! One thing still remaining: I don't find support for "suggest" in Solr.NET. What I found is that we should use spell check, but that is not the recommended option as per the articles. The Spell Check component in Solr.NET will use the /spell component while I have configured suggestions using the /suggest component. It is easy to handle it myself as well, but I was just wondering if Solr.NET supports the suggest component somehow. Regards, Salman On Tue, Oct 13, 2015 at 2:39 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > As Erick suggested you are reading a really old way to provide the > autocomplete feature ! > Please take a read to the docs Erick linked and to my blog as well. > It will definitely give you more insight about the Autocomplete world ! > > Cheers > > [1] http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html > > On 12 October 2015 at 21:24, Salman Ansari > wrote: > > > Hi, > > > > I have been trying to get the autocomplete feature in Solr working with > no > > luck up to now. 
First I read that "suggest component" is the recommended > > way as in the below article (and this is the exact functionality I am > > looking for, which is to autocomplete multiple words) > > > > > http://blog.trifork.com/2012/02/15/different-ways-to-make-auto-suggestions-with-solr/ > > > > Then I tried implementing suggest as described in the following articles > in > > this order > > 1) https://wiki.apache.org/solr/Suggester#SearchHandler_configuration > > 2) http://solr.pl/en/2010/11/15/solr-and-autocomplete-part-2/ (I > > implemented suggesting phrases) > > 3) > > > > > http://stackoverflow.com/questions/18132819/how-to-have-solr-autocomplete-on-whole-phrase-when-query-contains-multiple-terms > > > > With no luck, after implementing each article when I run my query as > > http://[MySolr]:8983/solr/entityStore114/suggest?spellcheck.q=Barack > > > > > > > > I get > > > > > > 0 > > 0 > > > > > > > > Although I have an entry for Barack Obama in my index. I am posting my > > Solr configuration as well > > > > > > > > suggest > > org.apache.solr.spelling.suggest.Suggester > >> name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup > > entity_autocomplete > > true > > > > > > > > > class="org.apache.solr.handler.component.SearchHandler"> > > > > true > > suggest > > 10 > > true > > false > > > > > > suggest > > > > > > > > It looks like a very simple job, but even after following so many > articles, > > I could not get it right. Any comment will be appreciated! > > > > Regards, > > Salman > > > > > > -- > -- > > Benedetti Alessandro > Visiting card - http://about.me/alessandro_benedetti > Blog - http://alexbenedetti.blogspot.co.uk > > "Tyger, tyger burning bright > In the forests of the night, > What immortal hand or eye > Could frame thy fearful symmetry?" > > William Blake - Songs of Experience -1794 England >
Re: Spell Check and Privacy
We had the very same issue and solved it as James suggested :) To answer Susheel: the requirement is to provide users with only the suggestions they should see. It can seem a paranoid request, but it can happen that we don't want to show any of the indexed data to certain users. In enterprise search you are able to see only the documents you expect to see, and the same should hold for autocompletion and spellchecking. Some time ago I was thinking of providing a filter query approach for spellchecking and autocomplete; maybe I will return to that idea later. Cheers On 12 October 2015 at 15:36, Susheel Kumar wrote: > Hi Arnon, > > I couldn't fully understood your use case regarding Privacy. Are you > concerned that SpellCheck may reveal user names part of suggestions which > could have belonged to different organizations / ACLS OR after providing > suggestions you are concerned that user may be able to click and view other > organization users? > > Please provide some details on your concern for Privacy with Spell Checker. > > Thanks, > Susheel > > On Mon, Oct 12, 2015 at 9:45 AM, Dyer, James > > wrote: > > > Arnon, > > > > Use "spellcheck.collate=true" with "spellcheck.maxCollationTries" set to > a > > non-zero value. This will give you re-written queries that are > guaranteed > > to return hits, given the original query and filters. If you are using > an > > "mm" value other than 100%, you also will want specify " > > spellcheck.collateParam.mm=100%". (or if using "q.op=OR", then use > > "spellcheck.collateParam.q.op=AND") > > > > Of course, the first section of the spellcheck result will still show > > every possible suggestion, so your client needs to discard these and not > > divulge them to the user. If you need to know word-by-word how the > > collations were constructed, then specify > > "spellcheck.collateExtendedResults=true". Use the extended collation > > results for this information and not the first section of the spellcheck > > results. 
> > > > This is all fairly well-documented on the old solr wiki: > > https://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate > > > > James Dyer > > Ingram Content Group > > > > -Original Message- > > From: Arnon Yogev [mailto:arn...@il.ibm.com] > > Sent: Monday, October 12, 2015 2:33 AM > > To: solr-user@lucene.apache.org > > Subject: Spell Check and Privacy > > > > Hi, > > > > Our system supports many users from different organizations and with > > different ACLs. > > We consider adding a spell check ("did you mean") functionality using > > DirectSolrSpellChecker. However, a privacy concern was raised, as this > > might lead to private information being revealed between users via the > > suggested terms. Using the FileBasedSpellChecker is another option, but > > naturally a static list of terms is not optimal. > > > > Is there a best practice or a suggested method for these kind of cases? > > > > Thanks, > > Arnon > > > > > -- -- Benedetti Alessandro Visiting card - http://about.me/alessandro_benedetti Blog - http://alexbenedetti.blogspot.co.uk "Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry?" William Blake - Songs of Experience -1794 England
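James's warning that the client "needs to discard these" can be made concrete: surface only the hit-verified collations and drop the raw per-term suggestions before anything reaches the user, since the raw terms may leak indexed data the user is not allowed to see. A sketch, with a plain Map standing in for the parsed spellcheck section of the response (the helper is illustrative, not a Solr API):

```java
import java.util.*;

public class SafeSuggestions {
    /** Keeps only the collated, hit-verified query rewrites and drops the
     *  raw term suggestions, which may reveal private indexed terms. */
    public static List<String> collationsOnly(Map<String, Object> spellcheck) {
        Object collations = spellcheck.get("collations");
        if (!(collations instanceof List)) {
            return Collections.emptyList();   // nothing safe to show
        }
        List<String> safe = new ArrayList<>();
        for (Object c : (List<?>) collations) {
            safe.add(String.valueOf(c));
        }
        return safe;
    }
}
```

Because collations are re-validated against the original query and filters (including any ACL filter queries), they only contain terms that lead to documents the user can actually retrieve.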
Re: AutoComplete Feature in Solr
As Erick suggested you are reading a really old way to provide the autocomplete feature ! Please take a read to the docs Erick linked and to my blog as well. It will definitely give you more insight about the Autocomplete world ! Cheers [1] http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html On 12 October 2015 at 21:24, Salman Ansari wrote: > Hi, > > I have been trying to get the autocomplete feature in Solr working with no > luck up to now. First I read that "suggest component" is the recommended > way as in the below article (and this is the exact functionality I am > looking for, which is to autocomplete multiple words) > > http://blog.trifork.com/2012/02/15/different-ways-to-make-auto-suggestions-with-solr/ > > Then I tried implementing suggest as described in the following articles in > this order > 1) https://wiki.apache.org/solr/Suggester#SearchHandler_configuration > 2) http://solr.pl/en/2010/11/15/solr-and-autocomplete-part-2/ (I > implemented suggesting phrases) > 3) > > http://stackoverflow.com/questions/18132819/how-to-have-solr-autocomplete-on-whole-phrase-when-query-contains-multiple-terms > > With no luck, after implementing each article when I run my query as > http://[MySolr]:8983/solr/entityStore114/suggest?spellcheck.q=Barack > > > > I get > > > 0 > 0 > > > > Although I have an entry for Barack Obama in my index. I am posting my > Solr configuration as well > > > > suggest > org.apache.solr.spelling.suggest.Suggester >name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup > entity_autocomplete > true > > > > class="org.apache.solr.handler.component.SearchHandler"> > > true > suggest > 10 > true > false > > > suggest > > > > It looks like a very simple job, but even after following so many articles, > I could not get it right. Any comment will be appreciated! 
> > Regards, > Salman >
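For reference, the newer approach the linked docs and blog post describe is the dedicated SuggestComponent rather than the spellcheck-based Suggester. A minimal sketch, assuming the entity_autocomplete field from the thread and an analysed field type named text_general (the suggester name and field type are assumptions):

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">entitySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">entity_autocomplete</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">entitySuggester</str>
    <str name="suggest.count">10</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

With this component the query parameter is suggest.q, not spellcheck.q, e.g. /suggest?suggest=true&suggest.q=Barack, and the AnalyzingInfixLookupFactory matches inside multi-word entries such as "Barack Obama".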
Re: Grouping facets: Possible to get facet results for each Group?
Can you model your business domain with Solr nested docs? In that case you can use Yonik's article about nested facets. Cheers On 13 October 2015 at 05:05, Alexandre Rafalovitch wrote: > Could you use the new nested facets syntax? > http://yonik.com/solr-subfacets/ > > Regards, >Alex. > > Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: > http://www.solr-start.com/ > > On 11 October 2015 at 09:51, Peter Sturge wrote: > > Been trying to coerce Group faceting to give some faceting back for each > > group, but maybe this use case isn't catered for in Grouping? : > > > > So the Use Case is this: > > Let's say I do a grouped search that returns say, 9 distinct groups, and > in > > these groups are various numbers of unique field values that need > faceting > > - but the faceting needs to be within each group:
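The nested facet syntax from Yonik's article would cover the use case above roughly as follows; group_field and facet_field are placeholders for the field being grouped on and the field to facet within each group:

```json
{
  "query": "*:*",
  "facet": {
    "groups": {
      "type": "terms",
      "field": "group_field",
      "limit": 9,
      "facet": {
        "per_group": { "type": "terms", "field": "facet_field" }
      }
    }
  }
}
```

Sent as the json.facet request parameter (or in a JSON request body), this returns one bucket per group value, each carrying its own per_group facet counts, which is the "facets within each group" behaviour Peter is after.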
Re: Indexing Solr in production
The most robust and simple way to go is building your own indexer. You can choose the platform you want; Solr has plenty of client API libraries. For example, if you want to write your indexer app in Java, you can use SolrJ. Each client library will give you all the flexibility you need to index Solr in a robust way. [1] https://cwiki.apache.org/confluence/display/solr/Client+APIs Cheers On 13 October 2015 at 09:35, Zheng Lin Edwin Yeo wrote: > Hi, > > What is the best practice to do indexing in Solr for a production system? I'm > using Solr 5.3.0. > > I understand that post.jar does not have things like robustness checks and > retries, which is important in production, as sometimes certain records > might fail during the indexing, and we need to re-try the indexing for > those records that fail. > > Normally, do we need to write a new custom handler in order to achieve all > these? > Want to find out what most people did before I decide on a method and > proceed on to the next step. > > Thank you. > > Regards, > Edwin >
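The robustness-and-retries part Edwin asks about is independent of any particular client library. A minimal sketch of a retry-with-exponential-backoff wrapper a custom indexer might use, in plain Java; the stand-in action in main is where a real SolrJ client.add()/client.commit() call would go, and all names here are illustrative, not SolrJ API:

```java
import java.util.concurrent.Callable;

public class RetryingIndexer {

    /** Run `action`, retrying up to maxAttempts times with exponential backoff. */
    static <T> T withRetry(Callable<T> action, int maxAttempts, long baseDelayMs) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;  // remember the failure; rethrow only if all attempts fail
                if (attempt < maxAttempts) {
                    Thread.sleep(baseDelayMs * (1L << (attempt - 1)));  // 10, 20, 40 ms...
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a batch indexing call (e.g. solrClient.add(docs) via SolrJ)
        // that fails transiently twice before succeeding.
        final int[] calls = {0};
        String result = withRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) throw new RuntimeException("transient indexing failure");
            return "indexed";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Failed batches can additionally be logged to a side file so that only the failing records are replayed, which is the re-try-specific-records behaviour the question describes.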
Re: NullPointerException
Generally it is highly discouraged to build the spellcheck index on startup. In the case of a big suggestions file, you are going to build the suggester data structures (basically an FST, in memory and then on disk) on every startup, and that takes a long time. You should rebuild your spellchecker only when you change the file that is the source of the suggestions. Checking the snippet, first I see you add a field to the FileBasedSpellChecker config, which is useless for a file-based dictionary; anyway, that should not be the problem. Can you give us the source of the suggestions? A snippet of the file? Cheers On 13 October 2015 at 10:02, Duck Geraint (ext) GBJH < geraint.d...@syngenta.com> wrote: > How odd, though I'm afraid this is reaching the limit of my knowledge at > this point (and I still can't find where that box is within the Admin UI!). > > The only thing I'd say is to check that "logtext" is a defined named field > within your schema, and to double check how its field type is defined. > > Also, try without the "text_en" > definition (I believe this should be implicit as the field type of > "logtext" above). > > Geraint > > Geraint Duck > Data Scientist > Toxicology and Health Sciences > Syngenta UK > Email: geraint.d...@syngenta.com > > > -Original Message- > From: Mark Fenbers [mailto:mark.fenb...@noaa.gov] > Sent: 12 October 2015 12:14 > To: solr-user@lucene.apache.org > Subject: Re: NullPointerException > > On 10/12/2015 5:38 AM, Duck Geraint (ext) GBJH wrote: > > "When I use the Admin UI (v5.3.0), and check the spellcheck.build box" > > Out of interest, where is this option within the Admin UI? I can't find > anything like it in mine... > This is in the expanded options that open up once I put a checkmark in the > "spellcheck" box. > > Do you get the same issue by submitting the build command directly with > something like this instead: > > http://localhost:8983/solr//ELspell?spellcheck.build=true > > ? > Yes, I do.
> > It'll be reasonably obvious if the dictionary has actually built or not
> > by the file size of your speller store:
> > /localapps/dev/EventLog/solr/EventLog2/data/spFile
> >
> > Otherwise, (temporarily) try adding...
> > true
> > ...to your spellchecker search component config, you might find it'll
> log a more useful error message that way.
> Interesting! The index builds successfully using this method and I get no
> stacktrace error. Hurray! But why??
>
> So now, I tried running a query, so I typed Fenbers into the spellcheck.q
> box, and I get the following 9 suggestions:
> fenber
> f en be r
> f e nb er
> f en b er
> f e n be r
> f en b e r
> f e nb e r
> f e n b er
> f e n b e r
>
> I find this very odd because I commented out all references to the
> wordbreak checker in solrconfig.xml. What do I configure so that Solr will
> give me sensible suggestions like:
> fenders
> embers
> fenberry
> and so on?
>
> Mark
>
> Syngenta Limited, Registered in England No 2710846;Registered Office :
> Syngenta Limited, European Regional Centre, Priestley Road, Surrey Research
> Park, Guildford, Surrey, GU2 7YH, United Kingdom
>
> This message may contain confidential information. If you are not the
> designated recipient, please notify the sender immediately, and delete the
> original and any copies. Any use of the message by you is prohibited.
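For the archives, a minimal FileBasedSpellChecker configuration along the lines discussed here might look like the sketch below; the dictionary name, source file, and index directory are assumptions based on the thread, and note that there is no field parameter, since the dictionary comes from the file:

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">file</str>
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spFile</str>
  </lst>
</searchComponent>
```

The index is then built once with spellcheck.build=true whenever spellings.txt changes, rather than on startup, per the advice above. Separately, the "f en be r" style suggestions in Mark's output are characteristic of the WordBreakSolrSpellChecker, so it is worth double-checking that no wordbreak dictionary remains listed in the request handler's spellcheck.dictionary defaults.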
Re: Selective field query
The first thing I would suggest is the use of the Analysis tool, to explore your analysis at query and index time. This will be the first step to understand whether you are actually tokenising and token filtering as expected. Then you should play with different fields (in the case the original field is single-valued, you are not going to lose the relation). Then you can provide the search you expect, for example: Service Name: ngram token filtered (or whatever you need); Service Id: KeywordTokenizer (to keep only one token). Can you give additional details? Cheers On 13 October 2015 at 10:36, Colin Hunter wrote: > Thanks Scott. > That is definitely moving things in the right direction > > I have another question that relates to this. It is also requested to > implement a partial word search on the service name field. > However, each service also has a unique identifier (string). This field > requires exact string matching. > I have attempted making a copy field for Service Name using the > NGramTokenizerFactory, as below. > > > positionIncrementGap="100"> > > minGramSize="3" maxGramSize="7"/> > > > > > > > > > While the debugQuery info showed the _ngram results, I was having issues > building the query that would return these results along with a regular > search. (Your previous response may well clarify this). > When I set this to return on all fields, then the full string match > required for the service UI no longer works. > > I certainly have to explore further re the eDisMax parser. > However, any advice that can be offered, regarding meeting these different > requirements in a single query would be very helpful.
> > Many Thanks > Colin > > On Tue, Oct 13, 2015 at 5:49 AM, Scott Stults < > sstu...@opensourceconnections.com> wrote: > > > Colin, > > > > The other thing you'll want to keep in mind (and you'll find this out > with > > debugQuery) is that the query parser is going to take your > > ServiceName:(Search Service) and turn it into two queries -- > > ServiceName:(Search) ServiceName:(Service). That's because the query > parser > > breaks on whitespace. My bet is you have a lot of entries with a name of > "X > > Service" and the second part of your query is hitting them. Phrase Field > > might be your friend here: > > > > https://wiki.apache.org/solr/ExtendedDisMax#pf_.28Phrase_Fields.29 > > > > > > -Scott > > > > On Mon, Oct 12, 2015 at 4:15 AM, Colin Hunter > > wrote: > > > > > Thanks Erick, I'm sure this will be valuable in implementing ngram > filter > > > factory > > > > > > On Fri, Oct 9, 2015 at 4:38 PM, Erick Erickson < > erickerick...@gmail.com> > > > wrote: > > > > > > > Colin: > > > > > > > > Adding &debug=all to your query is your friend here, the > > > > parsed_query.toString will show you exactly what > > > > is searched against. > > > > > > > > Best, > > > > Erick > > > > > > > > On Fri, Oct 9, 2015 at 2:09 AM, Colin Hunter > > > wrote: > > > > > Ah ha... the copy field... makes sense. > > > > > Thank You. > > > > > > > > > > On Fri, Oct 9, 2015 at 10:04 AM, Upayavira wrote: > > > > > > > > > >> > > > > >> > > > > >> On Fri, Oct 9, 2015, at 09:54 AM, Colin Hunter wrote: > > > > >> > Hi > > > > >> > > > > > >> > I am working on a complex search utility with an index created > via > > > > data > > > > >> > import from an extensive MySQL database. > > > > >> > There are many ways in which the index is searched. One of the > > > utility > > > > >> > input fields searches only on a Service Name. However, if I > target > > > the > > > > >> > query as q=ServiceName:"Searched service", this only returns an > > > exact > > > > >> > string match. 
If q=Searched Service, the query still returns > > results > > > > from > > > > >> > all indexed data. > > > > >> > > > > > >> > Is there a way to construct a query to only return results from > > one > > > > field > > > > >> > of a doc ? > > > > >> > I have tried setting index=false, stored=true on unwanted > fields, > > > but > > > > >> > these > > > > >> > appear to have still been returned in results. > > > > >> > > > > >> q=ServiceName:(Searched Service) > > > > >> > > > > >> That'll look in just one field. > > > > >> > > > > >> Remember changing indexed to false doesn't impact the stuff > already > > in > > > > >> your index. And the reason you are likely getting all that stuff > is > > > > >> because you have a copyField that copies it over into the 'text' > > > field. > > > > >> If you'll never want to search on some fields, switch them to > > > > >> index=false, make sure you aren't doing a copyField on them, and > > then > > > > >> reindex. > > > > >> > > > > >> Upayavira > > > > >> > > > > > > > > > > > > > > > > > > > > -- > > > > > www.gfc.uk.net > > > > > > > > > > > > > > > > -- > > > www.gfc.uk.net > > > > > > > > > > > -- > > Scott Stults | Founder & Solutions Architect | OpenSource Connections, > LLC > > | 434.409.2780 > > http://www.opens
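Pulling the pieces of this thread together, a hedged sketch of what the schema could look like; the field and type names are assumptions, minGramSize/maxGramSize are the values quoted above, and the ngram analysis is applied at index time only so query terms match as typed:

```xml
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="7"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="ServiceName"       type="text_general" indexed="true" stored="true"/>
<field name="ServiceName_ngram" type="text_ngram"   indexed="true" stored="false"/>
<field name="ServiceId"         type="string"       indexed="true" stored="true"/>
<copyField source="ServiceName" dest="ServiceName_ngram"/>
```

A partial-word search could then be something like q=Searc&defType=edismax&qf=ServiceName ServiceName_ngram&pf=ServiceName, scoping the query to those two fields only, while the exact-match identifier lookup stays separate against the string field, e.g. fq=ServiceId:"ABC-123" (value hypothetical). Keeping ServiceId out of qf preserves its exact-string behaviour.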
Re: catchall fields or multiple fields
Thanks to you all for that informed advice. Thanks Trey for your very detailed point of view. It is now very clear to me how a search on multiple fields can grow slower than a search on a catchall field. Our actual search model is problematic: we search on a catchall field, but need to know which fields match, so we do highlighting on multiple fields (not indexed, but stored). To improve performance, we want to get rid of highlighting and use the Solr explain output. To get the explain output on those fields, we need to search on those fields. So I guess we have to test whether removing highlighting and adding multi-field search will improve performance or not. Best regards, Elisabeth 2015-10-12 17:55 GMT+02:00 Jack Krupansky : > I think it may all depend on the nature of your application and how much > commonality there is between fields. > > One interesting area is auto-suggest, where you can certainly suggest from > the union of all fields, you may want to give priority to suggestions from > preferred fields. For example, for actual product names or important > keywords rather than random words from the English language that happen to > occur in descriptions, all of which would occur in a catchall. > > -- Jack Krupansky > > On Mon, Oct 12, 2015 at 8:39 AM, elisabeth benoit < > elisaelisael...@gmail.com > > wrote: > > > Hello, > > > > We're using solr 4.10 and storing all data in a catchall field. It seems > to > > me that one good reason for using a catchall field is when using scoring > > with idf (with idf, a word might not have same score in all fields). We > got > > rid of idf and are now considering using multiple fields. I remember > > reading somewhere that using a catchall field might speed up searching > > time. I was wondering if some of you have any opinion (or experience) > > related to this subject. > > > > Best regards, > > Elisabeth > > >
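For reference, the catchall setup under discussion is just a copyField target; a minimal sketch, with field names assumed:

```xml
<field name="catchall" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="title"       dest="catchall"/>
<copyField source="description" dest="catchall"/>
```

Searching q=catchall:(some terms) touches a single field, which is where the speed-up comes from; recovering per-field match information then requires either querying the individual fields and reading the explain output (debug=true&debug.explain.structured=true) or highlighting on the stored source fields, which is exactly the trade-off Elisabeth describes.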
Highlighting content field problem when using JiebaTokenizerFactory
Hi, I'm trying to use the JiebaTokenizerFactory to index Chinese characters in Solr. It works fine with the segmentation when I'm using the Analysis function on the Solr Admin UI. However, when I try to do highlighting in Solr, it is not highlighting in the correct place. For example, when I search for 自然环境与企业本身, it highlights 认为自然环境与企业本身的. Even when I search for an English word like responsibility, it highlights *responsibilit*y. Basically, the highlighting goes off by 1 character/space consistently. This problem only happens in the content field, and not in any other fields. Does anyone know what could be causing the issue? I'm using jieba-analysis-1.0.0, Solr 5.3.0 and Lucene 5.3.0. Regards, Edwin