Re: Difference in queryString and Parsed query
Thank you, Walter Underwood, for a complete and honest review. I will start simple by using the sample.

Regards,
Lavanya

On Tuesday, 22 January 2019, 12:31:55 pm AEDT, Walter Underwood wrote:

There are many, many problems with this analyzer chain definition.

This is a summary of the indexing chain:

* WhitespaceTokenizerFilter
* LowerCaseFilter
* SynonymFilter (with ignoreCase=true after lower-casing everything)
* StopFilter (we should have stopped using stopwords 20 years ago)
* WordDelimiterFilter (with all the transformation options set to 0, does nothing)
* RemoveDuplicates (this must always be last)
* KStemFilter (good choice)
* EdgeNGramFilter (!!! are you doing prefix matching? doing that with stemming makes bizarre matches)
* ReverseStringFilter (Yowza! Only do this on unmodified tokens; what does this mean on word stems? Even more bizarre)

Reversed stemmed edge ngrams should cause some really exciting matches.

Summary of the query chain:

* WhitespaceTokenizerFilter
* LowerCaseFilter
* PorterStemFilter (different stemmer from indexing, guarantees missed matches)
* SynonymFilter (after the stemmer? never do this, all tokens need to be stemmed)
* StopFilter (bad, but extra bad after a Porter stemmer that doesn't generate dictionary words)
* WordDelimiterFilter (again, doing nothing; also the results should have been stemmed)
* KStemFilter (two stemmers in a chain! never do that! plus the Porter stemmer doesn't produce dictionary words, so KStem won't do much)

Short version: I'm astonished that this configuration works at all. Delete the whole thing, use one from the sample file (without stop words), and reindex. There is no way to fix this. Not to be mean, but this is the worst field type definition I have ever seen.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Jan 21, 2019, at 4:24 AM, Lavanya Thirumalaisami wrote:
>
> Thank you Aman Deep
> I tried removing the KStemFilterFactory and still get the same issue, but
> when I comment out the PorterStemFilterFactory the character y does not get
> replaced.
>
> On Monday, 21 January 2019, 11:16:23 pm AEDT, Aman deep singh wrote:
>
> Hi Lavanya,
> This is probably due to the KStemFilterFactory; it is removing the y
> character, since the stemmer has a rule for words ending with y.
>
> Regards,
> Aman Deep Singh
>
>> On 21-Jan-2019, at 5:43 PM, Mikhail Khludnev wrote:
>>
>> querystring is what goes into the QParser, parsedquery is
>> LuceneQuery.toString()
>>
>> On Mon, Jan 21, 2019 at 3:04 PM Lavanya Thirumalaisami wrote:
>>
>>> Hi,
>>> Our Solr search is not returning expected results for keywords ending with
>>> the character 'y', for example keywords like battery, way, accessory, etc.
>>> I tried debugging the query in the Solr admin console and I find there is
>>> a difference between the query string and the parsed query:
>>> "querystring":"battery","parsedquery":"batteri"
>>> Also I find that if I search omitting the character y I am getting all the
>>> results. This happens only for keywords ending with y; for most others we do not
>>> have this issue.
>>> Could anyone please help me understand why the keyword gets changed,
>>> especially the last character? Is there any issue in my field type
>>> definition?
>>> While indexing the data we use the text data type and we have defined it as
>>> follows:
>>>
>>> <fieldType ... positionIncrementGap="100">
>>>   <analyzer type="index">
>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
>>>     <filter class="solr.WordDelimiterFilterFactory" catenateWords="1" generateNumberParts="0" generateWordParts="0" preserveOriginal="1" splitOnCaseChange="0" splitOnNumerics="0"/>
>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>>     <filter class="solr.KStemFilterFactory"/>
>>>     <filter class="solr.EdgeNGramFilterFactory" maxGramSize="255" minGramSize="1"/>
>>>   </analyzer>
>>>   <analyzer type="query">
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>     <filter class="solr.PorterStemFilterFactory"/>
>>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
>>>     <filter class="solr.WordDelimiterFilterFactory" catenateWords="0" generateNumberParts="0" generateWordParts="0" preserveOriginal="1" splitOnCaseChange="0" splitOnNumerics="0"/>
>>>     <filter class="solr.KStemFilterFactory"/>
>>>   </analyzer>
>>> </fieldType>
>>>
>>> Regards, Lavanya
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
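For reference, a field type closer to the sample schema — as Walter recommends — might look like the sketch below: one tokenizer, lower-casing, query-time synonyms, and a single stemmer shared by both chains, with no stop words and no ngrams. The type name here is a placeholder, not the poster's actual field type.

```xml
<!-- A minimal sketch of a sane replacement, adapted from the sample schema.
     Both chains use the same single stemmer (KStem), so "battery" is analyzed
     identically at index and query time; synonyms are applied before stemming. -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>
```

With a single stemmer in both chains, "battery" stems the same way at index and query time, which is exactly the mismatch the original question was about.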
Re: Large Number of Collections takes down Solr 7.3
Do you mind if I ask why so many collections, rather than a field in one collection with a filter query per customer to restrict the result set, assuming you're the one controlling the middleware?

> On Jan 22, 2019, at 4:43 PM, Monica Skidmore wrote:
>
> We have been running Solr 5.4 in master-slave mode with ~4500 cores for a
> couple of years very successfully. The cores represent individual customer
> data, so they can vary greatly in size, and some of them have gotten too
> large to be manageable.
>
> We are trying to upgrade to Solr 7.3 in cloud mode, with ~4500 collections,
> 2 NRT replicas total per collection. We have experimented with additional
> servers and ZK nodes as a part of this move. We can create up to ~4000
> collections, with a slow-down to ~20s per collection to create, but if we go
> much beyond that, the time to create collections shoots up, some collections
> fail to be created, and we see some of the nodes crash. Autoscaling brings
> nodes back into the cluster, but they don't have all the replicas created on
> them that they should – we're pretty sure this is related to the challenge of
> adding the large number of collections on those nodes as they come up.
>
> There are some approaches we could take that don't separate our customers
> into collections, but we get some benefits from this approach that we'd like
> to keep. We'd also like to add the benefits of cloud, like balancing where
> collections are placed and the ability to split large collections.
>
> Is anyone successfully running Solr 7x in cloud mode with thousands or more
> collections? Are there some configurations we should be taking a closer
> look at to make this feasible? Should we try a different replica type? (We
> do want NRT-like query latency, but we also index heavily – this cluster
> will have 10's of millions of documents.)
>
> I should note that the problems are not due to the number of documents – the
> problems occur on a new cluster while we're creating the collections we know
> we'll need.
>
> Monica Skidmore
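The single-collection alternative suggested above would model the customer as a field and pin every request to it with a filter query; a hypothetical sketch (the field name and customer value are made up):

```
# All customers share one collection; the middleware appends a per-customer filter
q=battery+charger&fq=customer_id:acme-1234
```

Because filter queries are cached separately from the main query, the per-customer restriction is cheap after the first request, and a single large collection can still be split and rebalanced with the normal SolrCloud tooling.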
Re: Large Number of Collections takes down Solr 7.3
On 1/22/2019 2:43 PM, Monica Skidmore wrote:
> Is anyone successfully running Solr 7x in cloud mode with thousands or more
> collections? Are there some configurations we should be taking a closer
> look at to make this feasible? Should we try a different replica type? (We
> do want NRT-like query latency, but we also index heavily – this cluster
> will have 10's of millions of documents.)

That many collections will overwhelm SolrCloud. This issue is marked as fixed, but it's actually not fixed:

https://issues.apache.org/jira/browse/SOLR-7191

SolrCloud simply will not scale to that many collections. I wish I had better news for you. I would like to be able to solve the problem, but I am not familiar with that particular code. Getting familiar with the code is a major undertaking.

Thanks,
Shawn
Large Number of Collections takes down Solr 7.3
We have been running Solr 5.4 in master-slave mode with ~4500 cores for a couple of years very successfully. The cores represent individual customer data, so they can vary greatly in size, and some of them have gotten too large to be manageable.

We are trying to upgrade to Solr 7.3 in cloud mode, with ~4500 collections, 2 NRT replicas total per collection. We have experimented with additional servers and ZK nodes as a part of this move. We can create up to ~4000 collections, with a slow-down to ~20s per collection to create, but if we go much beyond that, the time to create collections shoots up, some collections fail to be created, and we see some of the nodes crash. Autoscaling brings nodes back into the cluster, but they don't have all the replicas created on them that they should – we're pretty sure this is related to the challenge of adding the large number of collections on those nodes as they come up.

There are some approaches we could take that don't separate our customers into collections, but we get some benefits from this approach that we'd like to keep. We'd also like to add the benefits of cloud, like balancing where collections are placed and the ability to split large collections.

Is anyone successfully running Solr 7x in cloud mode with thousands or more collections? Are there some configurations we should be taking a closer look at to make this feasible? Should we try a different replica type? (We do want NRT-like query latency, but we also index heavily – this cluster will have 10's of millions of documents.)

I should note that the problems are not due to the number of documents – the problems occur on a new cluster while we're creating the collections we know we'll need.

Monica Skidmore
Re: _version_ field missing in schema?
What do you mean, schema.xml from managed-schema? schema.xml is the old non-managed approach. If you have both, schema.xml will be ignored. I suspect you are not running with the schema you think you are. You can check that with the API, or in the Admin UI if you get that far.

Regards,
Alex

On Tue, Jan 22, 2019, 11:39 AM Aleksandar Dimitrov <a.dimit...@seidemann-web.com> wrote:

> Hi,
>
> I'm using Solr 7.5. In my schema.xml I have this, which I took from the
> managed-schema:
>
> stored="false" />
> docValues="true" />
>
> However, on startup, Solr complains:
>
> Caused by: org.apache.solr.common.SolrException: _version_ field
> must exist in schema and be searchable (indexed or docValues) and
> retrievable (stored or docValues) and not multiValued (_version_
> does not exist)
>   at org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:69) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:95) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.update.UpdateLog.init(UpdateLog.java:404) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:161) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:116) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:119) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
>   at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:?]
>   at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:488) ~[?:?]
>   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:799) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:861) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1114) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:984) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:869) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
>   ... 7 more
>
> Anyone know what I'm doing wrong?
> I've tried having the _version_ field be string, and indexed and stored,
> but that didn't help.
>
> Thanks!
>
> Aleks
_version_ field missing in schema?
Hi,

I'm using Solr 7.5. In my schema.xml I have this, which I took from the managed-schema:

stored="false" />
docValues="true" />

However, on startup, Solr complains:

Caused by: org.apache.solr.common.SolrException: _version_ field must exist in schema and be searchable (indexed or docValues) and retrievable (stored or docValues) and not multiValued (_version_ does not exist)
  at org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:69) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:95) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateLog.init(UpdateLog.java:404) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:161) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:116) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:119) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
  at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:?]
  at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
  at java.lang.reflect.Constructor.newInstance(Constructor.java:488) ~[?:?]
  at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:799) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:861) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1114) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:984) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:869) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  ... 7 more

Anyone know what I'm doing wrong? I've tried having the _version_ field be string, and indexed and stored, but that didn't help.

Thanks!

Aleks
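For comparison, the stock managed-schema in Solr 7.x declares _version_ along the lines of the sketch below (the exact type name can differ between config sets). The key constraints from the error message are: not multiValued, and either indexed or docValues for searchability, plus either stored or docValues for retrievability.

```xml
<!-- Sketch of the default 7.x declaration; "plong" is the usual point-long type -->
<fieldType name="plong" class="solr.LongPointField" docValues="true"/>
<field name="_version_" type="plong" indexed="false" stored="false"/>
```

With docValues enabled on the type, the field satisfies both the "searchable" and "retrievable" requirements without being indexed or stored. If the schema that is actually loaded (managed-schema, per Alex's point) lacks such a declaration, the startup error above is exactly what you get.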
Best practice to deploy Solr to production
Hi all,

I have a Solr index which has been evolving since Solr 1.4 and is now on SolrCloud 6.6. This cluster is composed of 4 servers, a few collections, and shards. Since the first time I deployed to production in 2009 I have been using the same approach to deploy, and I think it's probably time to review and improve it. I am looking for best practices and different approaches.

Here is what I have. Everything is done using ant.

1. I have the Solr package together with my own code. I have some plugins written.
2. 8 collections with several files per collection.
3. Ant steps:
3.1 Update config files with deployment properties per environment.
3.2 Package and copy to servers.
3.3 Run remote sh commands for installation of Solr.
3.4 Run remote sh to push config files to ZooKeeper.
3.5 Run web requests to create collections.

I have the feeling this is not the proper way to go, but I'm not sure what the best practice is either. Can anyone point me to a nicer way to do this?

Thanks a lot.

Sergio

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
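Steps 3.4 and 3.5 above can be scripted with the tools Solr ships with rather than hand-rolled sh; a rough sketch, where the ZooKeeper hosts, config set name, and collection parameters are all placeholders:

```
# Push a config set to ZooKeeper (replaces manually copied config files)
bin/solr zk upconfig -z zk1:2181,zk2:2181,zk3:2181 -n myconf -d ./conf

# Create a collection against that config set via the Collections API
curl 'http://solr1:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=2&replicationFactor=2&collection.configName=myconf'
```

Keeping the config sets in version control and driving these two commands from the build (ant, or anything else) gives a repeatable deploy without bespoke remote shell steps.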
difference in behavior of term boosting between Solr 6 and Solr 7
We're preparing to upgrade from Solr 6.4.2 to Solr 7.6.0, and found an inconsistency in scoring. It appears that term boosts in the query are not applied in Solr 7. The query itself against both versions is identical (un-important params removed):

("one"^1) OR ("two"^2) OR ("three"^3)
edismax
max_term
AND
dictionary_id:"WKUS-TAL-DEPLURALIZATION-THESAURUS"
100
xml
on

3 documents are returned, but in Solr 6 the docs are returned in order of the boosts (three, two, one), as the boost accounts for the entirety of the score, while in Solr 7 they are returned randomly, as all the scores are 1.0.

Looking at the debug and explains, in Solr 6 the boost is multiplied into the rest of the score:

(+(DisjunctionMaxQuery((max_term:" one "))^1.0 DisjunctionMaxQuery((max_term:" two "))^2.0 DisjunctionMaxQuery((max_term:" three "))^3.0))/no_coord
+(((max_term:" one "))^1.0 ((max_term:" two "))^2.0 ((max_term:" three "))^3.0)

3.0 = sum of:
  3.0 = weight(max_term:" three " in 658) [WKSimilarity], result of:
    3.0 = score(doc=658,freq=1.0 = phraseFreq=1.0), product of:
      3.0 = boost
      1.0 = idf(), for phrases, always set to 1
      1.0 = tfNorm, computed as (freq * (k1a + 1)) / (freq + k1b) [WKSimilarity] from:
        1.0 = phraseFreq=1.0
        1.2 = k1a
        1.2 = k1b
        0.0 = b (norms omitted for field)

But in Solr 7, the boost is not there at all:

+((+DisjunctionMaxQuery((max_term:" one "))^1.0) (+DisjunctionMaxQuery((max_term:" two "))^2.0) (+DisjunctionMaxQuery((max_term:" three "))^3.0))
+((+((max_term:" one "))^1.0) (+((max_term:" two "))^2.0) (+((max_term:" three "))^3.0))

1.0 = sum of:
  1.0 = weight(max_term:" two " in 436) [WKSimilarity], result of:
    1.0 = score(doc=436,freq=1.0 = phraseFreq=1.0), product of:
      1.0 = idf(), for phrases, always set to 1
      1.0 = tfNorm, computed as (freq * (k1a + 1)) / (freq + k1b) [WKSimilarity] from:
        1.0 = phraseFreq=1.0
        1.2 = k1a
        1.2 = k1b
        0.0 = b (norms omitted for field)

I noted a subtle difference in the parsedquery between the two versions as well; I'm not sure if that is causing the boost to drop out in Solr 7:

Solr 6: +(((max_term:" one "))^1.0 ((max_term:" two "))^2.0 ((max_term:" three "))^3.0)
Solr 7: +((+((max_term:" one "))^1.0) (+((max_term:" two "))^2.0) (+((max_term:" three "))^3.0))

For our use case, I think we can work around it using a constant score query, but it would be good to know if this is a bug or expected behavior, or whether we're missing something in the query to get the boost to work again. Thanks!
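The constant-score workaround mentioned at the end can be written with the ^= operator, which the standard and edismax parsers support; it wraps each clause in a ConstantScoreQuery, so the assigned score is used directly instead of being multiplied into a similarity score. A sketch of the query above rewritten that way:

```
("one"^=1) OR ("two"^=2) OR ("three"^=3)
```

Unlike ^ (a multiplicative boost), ^= sets the clause's score to exactly that value for every matching document, which makes the result order independent of similarity changes between Solr 6 and 7.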
Re: Get MLT Interesting Terms for a set of documents corresponding to the query specified
I will certainly try it out. Thanks!

On Mon, Jan 21, 2019 at 8:48 PM Joel Bernstein wrote:

> You may find the significantTerms streaming expression useful:
>
> https://lucene.apache.org/solr/guide/7_6/stream-source-reference.html#significantterms
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, Jan 21, 2019 at 3:02 PM Pratik Patel wrote:
>
>> Aman,
>>
>> Thanks for the reply!
>>
>> I have tried the corrected query but it doesn't solve the problem. Also,
>> my tags filter matches multiple documents; however, the interestingTerms
>> seem to correspond to just the first document.
>> Here is an example of a query which matches 1900 documents:
>>
>> http://localhost:8081/solr/collection1/mlt?debugQuery=on=tags:voltage=true=my_field=details=1=2=3=*:*=100=0
>>
>> Thanks,
>> Pratik
>>
>> On Mon, Jan 21, 2019 at 2:52 PM Aman Tandon wrote:
>>
>>> I see two rows params; it looks like the first will be overwritten by
>>> rows=2, and then your tags filter is returning only one document. Please
>>> remove the extra rows and try.
>>>
>>> On Mon, Jan 21, 2019, 08:44 Pratik Patel wrote:
>>>
>>>> Hi Everyone!
>>>>
>>>> I am trying to use the MLT request handler. My query matches more than
>>>> one document, but the response always seems to pick up the first document,
>>>> and interestingTerms also seems to correspond to that single document only.
>>>>
>>>> What I am expecting is that if my query matches multiple documents, then
>>>> the interestingTerms result also corresponds to that set of documents,
>>>> not just the first document.
>>>>
>>>> Following is my query:
>>>>
>>>> http://localhost:8081/solr/collection1/mlt?debugQuery=on=tags:test=true=mlt.fl=textpropertymlt=details=1=2=3=*:*=100=2=0
>>>>
>>>> Ultimately, my goal is to get interesting terms corresponding to this
>>>> whole set of documents. I don't need similar documents as such. If not
>>>> with mlt, is there any other way I can achieve this? That is, given a
>>>> query matching a set of documents, find interestingTerms for that set of
>>>> documents based on tf-idf?
>>>>
>>>> Thanks!
>>>> Pratik
Re: Iterative graph/nodes query
Perhaps you're looking for the traversalFilter parameter of the graph query?

https://lucene.apache.org/solr/guide/7_6/other-parsers.html#graph-query-parser

Dan Meehl
Meehl Technology Solutions Inc

On Tue, Jan 22, 2019 at 7:13 AM Magnus Karlsson wrote:

> Hi,
>
> anyone using any of the functionality of graphs, either in a single
> collection (shortest path) or streaming expressions (nodes)?
>
> Experiences?
>
> / Magnus
>
> From: Magnus Karlsson
> Sent: 17 December 2018 14:51:17
> To: solr-user@lucene.apache.org
> Subject: Iterative graph/nodes query
>
> Hi,
>
> looking at the graph traversal capabilities of Solr. Is there a
> function/feature that traverses until certain prerequisites are met?
>
> For instance, in a hierarchical use case, "traverse all children until a
> child has a certain name or type"?
>
> Using the current nodes streaming expression I need to know beforehand the
> number of levels I need to traverse.
>
> Is there a feature supporting this use case, or is it planned to be
> implemented?
>
> Thanks in advance.
>
> / Magnus
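With traversalFilter, the graph query parser only walks through documents that match the filter, so the traversal stops at nodes that fail it rather than at a fixed depth. A hypothetical example (the field names and values are made up):

```
fq={!graph from=parent_id to=id traversalFilter='type:folder'}id:rootDoc
```

Here the walk starts from id:rootDoc and follows parent_id → id edges, but only documents with type:folder are traversed further, which approximates the "traverse until a child has a certain type" use case from the original question.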
Re: The parent shard will never be delete/clean?
Hi,

You might want to check out the documentation, which goes over split-shard in a bit more detail:
https://lucene.apache.org/solr/guide/7_6/collections-api.html#CollectionsAPI-splitshard

To answer your question directly though: no. Split-shard creates two new subshards, but it doesn't do anything to remove or clean up the original shard. The original shard remains with its data and will delegate future requests to the result shards.

Hope that helps,
Jason

On Tue, Jan 22, 2019 at 4:17 AM zhenyuan wei wrote:
>
> Hi,
> If I split shard1 into shard1_0 and shard1_1, will the parent shard1
> never be cleaned up?
>
> Best,
> Tinswzy
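Concretely, the cleanup after a split is a separate, manual Collections API call; a sketch with placeholder collection and host names (the parent shard can only be deleted once it is inactive and its subshards are active):

```
# Split shard1 of collection "mycoll" into shard1_0 and shard1_1
curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycoll&shard=shard1'

# Later, remove the now-inactive parent shard and its index data
curl 'http://localhost:8983/solr/admin/collections?action=DELETESHARD&collection=mycoll&shard=shard1'
```

DELETESHARD refuses to delete an active shard, which is why it is safe to run against the parent after the split has completed.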
Re: modifying the export request handler
I use the following definition (reconstructed from the surviving values; the element tags were lost):

<requestHandler name="my_export" class="solr.exportHandler" useParams="_EXPORT">
  <lst name="invariants">
    <str name="wt">json</str>
    <str name="distrib">false</str>
    <str name="rq">{!xport}</str>
  </lst>
  <arr name="components">
    <str>myComponent</str>
    <str>query</str>
  </arr>
</requestHandler>

and receive a NullPointerException when I'm loading the core. The exception is at org.apache.solr.common.params.SolrParams.toMultiMap(SolrParams.java:414).

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
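One thing worth checking in the definition above is the handler class name: in recent Solr versions the built-in /export handler is registered as solr.ExportHandler (capital E), so solr.exportHandler may not resolve. A corrected sketch keeping the same names (whether this is what triggers the NPE in SolrParams.toMultiMap is a guess, not a confirmed diagnosis):

```xml
<requestHandler name="my_export" class="solr.ExportHandler" useParams="_EXPORT">
  <lst name="invariants">
    <str name="rq">{!xport}</str>
    <str name="distrib">false</str>
  </lst>
  <arr name="components">
    <str>myComponent</str>
    <str>query</str>
  </arr>
</requestHandler>
```

It may also help to compare against the implicit /export definition that ships with the Solr version in use, since export requires docValues on all requested fields and a sort parameter.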
SV: Iterative graph/nodes query
Hi,

anyone using any of the functionality of graphs, either in a single collection (shortest path) or streaming expressions (nodes)?

Experiences?

/ Magnus

From: Magnus Karlsson
Sent: 17 December 2018 14:51:17
To: solr-user@lucene.apache.org
Subject: Iterative graph/nodes query

Hi,

looking at the graph traversal capabilities of Solr. Is there a function/feature that traverses until certain prerequisites are met?

For instance, in a hierarchical use case, "traverse all children until a child has a certain name or type"?

Using the current nodes streaming expression I need to know beforehand the number of levels I need to traverse.

Is there a feature supporting this use case, or is it planned to be implemented?

Thanks in advance.

/ Magnus
The parent shard will never be delete/clean?
Hi,

If I split shard1 into shard1_0 and shard1_1, will the parent shard1 never be cleaned up?

Best,
Tinswzy
Multiplicative Boosts broken since 7.3 (LUCENE-8099)
Hello,

As described in https://issues.apache.org/jira/browse/SOLR-13126, multiplicative boosts (under certain conditions) seem to be broken since 7.3. The error seems to have been introduced in https://issues.apache.org/jira/browse/LUCENE-8099. Reverting the Solr parts to the now-deprecated BoostingQuery again fixes the issue. The filed issue contains a test case and a patch with the revert (for testing purposes, not really a clean fix). We sadly couldn't find the actual cause, which seems to lie with the use of "FunctionScoreQuery" for boosting.

We were able to patch our 7.5 installation with the patch. As others might be affected as well, we hope this can be helpful in resolving this bug.

To all Solr/Lucene developers, thank you for your work. Looking through the code base gave me a new appreciation of it.

Best Regards,
Tobias

PS: This issue was already posted by a colleague as "Inconsistent debugQuery score with multiplicative boost", but I wanted to create a new post with a clearer title.
Re: Single query to get the count for all individual collections
+1 for the most elegant solution so far :)

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 22. jan. 2019 kl. 03:15 skrev Joel Bernstein:
>
> Streaming Expressions can do this:
>
> plist(stats(collection1, q="*:*", count(*)),
>       stats(collection2, q="*:*", count(*)),
>       stats(collection3, q="*:*", count(*)))
>
> The plist function is a parallel list of expressions. It will spin each
> expression off in its own thread and concatenate the results of each
> expression into a single result set.
> Here are the docs:
> https://lucene.apache.org/solr/guide/7_6/stream-source-reference.html#stats
> https://lucene.apache.org/solr/guide/7_6/stream-decorator-reference.html#plist
>
> plist is quite new, but "list" has been around for a while if you have an
> older version of Solr:
> https://lucene.apache.org/solr/guide/7_6/stream-decorator-reference.html#list_expression
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, Jan 21, 2019 at 12:53 PM Jens Brandt wrote:
>
>> Hi,
>>
>> maybe adding =true might help. In case of SolrCloud this
>> gives you numFound for each shard.
>>
>> Regards,
>> Jens
>>
>>> Am 10.01.2019 um 04:40 schrieb Zheng Lin Edwin Yeo:
>>>
>>> Hi,
>>>
>>> I would like to find out, is there any way that I can send a single query
>>> to retrieve the numFound for all the individual collections?
>>>
>>> I have tried with this query:
>>> http://localhost:8983/solr/collection1/select?q=*:*=collection1,collection2
>>>
>>> However, this query is doing the sum of all the collections, instead of
>>> showing the count for each collection.
>>>
>>> I am using Solr 7.5.0.
>>>
>>> Regards,
>>> Edwin