Re: timeAllowed flag in the response
On 08.06.2012 11:55, Laurent Vaills wrote:
> Hi Michael, thanks for the details; they helped me take a deeper look into the source code. I noticed that each time a TimeExceededException is caught, the method setPartialResults(true) is called... which seems to be what I'm looking for. I have to investigate, since this partialResults flag does not seem to be set for sharded queries.

Ah, I simply was too blind! ;) The partial results flag is indeed set in the response header. Then I think it's a bug that it is not filled in a sharded response, or it simply is not there when sharding.

Greetings, Kuli
Re: what's better for in memory searching?
Set the swappiness to 0 to avoid memory pages being swapped to disk too early. http://en.wikipedia.org/wiki/Swappiness

-Kuli

On 11.06.2012 10:38, Li Li wrote:
> I have roughly read the code of RAMDirectory. It uses a list of 1024-byte arrays and has many overheads. But as far as I know, using MMapDirectory I can't prevent the page faults; the OS will swap less frequently used pages out. Even if I allocate enough memory for the JVM, I can guarantee all the files in the directory are in memory. Am I understanding right? If so, then some less frequent queries will be slow. How can I keep them always in memory?
>
> On Fri, Jun 8, 2012 at 5:53 PM, Lance Norskog <goks...@gmail.com> wrote:
>> Yes, use MMapDirectory. It is faster and uses memory more efficiently than RAMDirectory. This sounds wrong, but it is true. With RAMDirectory, Java has to work harder doing garbage collection.
>>
>> On Fri, Jun 8, 2012 at 1:30 AM, Li Li <fancye...@gmail.com> wrote:
>>> hi all, I want to use Lucene 3.6 to provide a searching service. My data is not very large (the raw data is less than 1 GB), and I want to load all indexes into memory, but I also need to save all indexes to disk persistently. I originally wanted to use RAMDirectory, but then I read its javadoc:
>>>
>>> Warning: This class is not intended to work with huge indexes. Everything beyond several hundred megabytes will waste resources (GC cycles), because it uses an internal buffer size of 1024 bytes, producing millions of byte[1024] arrays. This class is optimized for small memory-resident indexes. It also has bad concurrency on multithreaded environments. It is recommended to materialize large indexes on disk and use MMapDirectory, which is a high-performance directory implementation working directly on the file system cache of the operating system, so copying data to Java heap space is not useful.
>>>
>>> Should I use MMapDirectory? It seems it's instantiated in another contrib. Has anyone tested it against RAMDirectory?

--
Lance Norskog
goks...@gmail.com
Re: what's better for in memory searching?
You cannot guarantee this when you're running out of RAM; you'd have a problem then anyway. Why do you care that much? Have you had performance issues yet? 1 GB should load really fast, and both auto-warming and the OS cache should help a lot as well. With such an index, you usually don't need to fine-tune performance that much.

Did you think about using an SSD? Since you want to persist your index, you'll need to live with disk IO anyway.

Greetings, Kuli

On 11.06.2012 11:20, Li Li wrote:
> I am sorry, I made a mistake. Even using RAMDirectory, I cannot guarantee the pages are not swapped out.
>
> [...]
Re: Starts with Query
It's not necessary to do this. You can simply take advantage of the fact that all digits are strictly ordered in Unicode, so you can use a range query (as either q or fq):

q={!frange l=0 u=\: incl=true incu=false}title

This finds all documents where any token of the title field starts with a digit. So if you want to find only documents where the whole title starts with a digit, you need a second field with a string or untokenized text type; use the copyField directive then, as Jack Krupansky already suggested in a previous reply.

Greetings, Kuli

On 15.06.2012 08:38, Afroz Ahmad wrote:
> If you are not searching for a specific digit and want to match all documents that start with any digit, you could, as part of the indexing process, have another field, say startsWithDigit, and set it to true if the title begins with a digit. All you need to do at query time then is query for startsWithDigit:true.
>
> Thanks, Afroz
>
> From: nutchsolruser
> Sent: 6/14/2012 11:03 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Starts with Query
>
> Thanks Jack for the valuable response. Actually I am trying to match *any* numeric pattern at the start of each document. I don't know the documents in the index; I just want documents whose title starts with any digit.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Starts-with-Query-tp3989627p3989761.html
> Sent from the Solr - User mailing list archive at Nabble.com.
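The range trick works because the digits '0' through '9' occupy the code points U+0030 to U+0039, directly below ':' (U+003A), so "from '0' inclusive to ':' exclusive" covers exactly the digit-initial strings. A minimal Python sketch of the comparison this range query relies on:

```python
# Digits '0'-'9' sit at code points 0x30-0x39, immediately below ':' (0x3A).
# A range with lower bound "0" (inclusive) and upper bound ":" (exclusive)
# therefore matches exactly the strings that start with a digit.
def starts_with_digit(token: str) -> bool:
    return "0" <= token < ":"

print(starts_with_digit("9th symphony"))          # starts with a digit
print(starts_with_digit("2001: A Space Odyssey")) # also starts with a digit
print(starts_with_digit("abc"))                   # does not
```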
Re: Urgent: Facetable but not Searchable Field
On 01.08.2012 13:58, jayakeerthi s wrote:
> We have a requirement where we need to implement two fields as facetable, but the values of the fields should not be searchable.

Simply don't search on these fields; then they're not searchable. Or do I misunderstand your question?

As long as Dismax doesn't have the field in its qf parameter, it's not getting searched. But if the user has direct access to Solr, she can search on the field anyway; and she can delete the index, or crash the server, if she likes.

So the short answer is: no, facetable fields must be searchable. But usually this is not a problem.

-Kuli
Re: Urgent: Facetable but not Searchable Field
On 01.08.2012 15:40, Jack Krupansky wrote:
> The indexed and stored field attributes are independent, so you can define a facet field as stored but not indexed (stored="true" indexed="false"), so that the field can be faceted but not indexed.

? A field must be indexed to be used for faceting.

-Kuli
Re: SOLR 3.4 GeoSpatial Query Returning distance
On 02.08.2012 01:52, Anand Henry wrote:
> Hi, in Solr 3.4, while doing a geo-spatial search, is there a way to retrieve the distance of each document from the specified location?

Not that I know of. What we did was to read and parse the location field on the client side and calculate the distance on our own using this library: http://code.google.com/p/simplelatlng/

However, it's not as nice as getting the distance from Solr, and sometimes the distances seem to differ slightly; e.g. when you filter up to a distance of 100 km, there are cases where the client library still computes 100.8 km or so. But at least it's working.

-Kuli
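If you don't want to pull in a library for the client-side calculation, the great-circle distance is only a few lines of haversine math. A hedged sketch in plain Python (this is not the simplelatlng API; 6371 km is a commonly used mean Earth radius):

```python
import math

EARTH_RADIUS_KM = 6371.0

# Great-circle (haversine) distance between two points given in decimal
# degrees; the result is in kilometers.
def haversine_km(lat1, lon1, lat2, lon2):
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude on the equator is roughly 111 km:
print(haversine_km(0.0, 0.0, 0.0, 1.0))
```

Small differences to Solr's own results (like the 100 km vs. 100.8 km case above) are expected, since different libraries may use slightly different Earth radii or formulas.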
Re: Connect to SOLR over socket file
On 07.08.2012 21:43, Jason Axelson wrote:
> Hi, is it possible to connect to Solr over a socket file, as is possible with MySQL? I've looked around, and I get the feeling that I may be misunderstanding part of Solr's architecture. Any pointers are welcome. Thanks, Jason

Hi Jason,

not that I know of. This has nothing to do with Solr; it depends on the web server you are using. Tomcat, Jetty and the others use TCP/IP directly through java.io or java.nio classes, and Solr is just one web app that is handled by them. Java web servers typically run on a separate host, and in contrast to MySQL, a local deployment is rather the exception than the rule.

If you don't want the network overhead, then use an embedded Solr server: http://wiki.apache.org/solr/EmbeddedSolr

Greetings, Kuli
Re: Does Solr support 'Value Search'?
On 08.08.2012 20:56, Bing Hua wrote:
> Not quite understood, but I'll explain the problem I had. The response would contain only fields and a list of field values that match the query. Essentially it's querying for field values rather than documents. The underlying use case would be: when typing in a quick search box, the drill-down menu may contain matches on authors, on doctitles, and potentially on other fields. Still, thanks for your response, and hopefully I'm making it clearer. Bing

Hi Bing,

hmh, I implemented an autosuggest component myself that does exactly this. You could specify which fields you wanted to query, give an optional weight to each, and the component returned a list of all fields and values beginning with the queried string, either combined or per field, depending on your configuration.

However, that was with Solr 1.4.0, when there was no genuine suggest component available. Since then, the Suggester component has been implemented: http://wiki.apache.org/solr/Suggester/ It relies on the spell check dictionary and works better than a simple term dictionary approach, which is why I didn't bother with my old code any more.

So maybe you're simply looking for the Suggester component? If not, I can try to make my old-style component work with a current Solr version and spread it around. Just tell me.

Greetings, Kuli
Re: how to sort search results by count matches
Hi syegorius,

are you sure that there's no synonym "planet,world" defined?

-Michael

On 02.08.2016 15:57, syegorius wrote:
> I have 4 records indexed by Solr:
>
> 1 hello planet dear friends
> 2 hello world dear friends
> 3 nothing
> 4 just friends
>
> I'm searching with this query:
>
> select?q=world+dear+friends&wt=json&indent=true
>
> The result is:
>
> 1 hello planet dear friends
> 2 hello world dear friends
> 4 just friends
>
> But as you can see, the first record has 2 matches, the second 3, and the fourth 1, and I need the results in this order:
>
> 2 hello world dear friends  // 3 matches
> 1 hello planet dear friends // 2 matches
> 4 just friends              // 1 match
>
> How can I do that?
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/how-to-sort-search-results-by-count-matches-tp4290022.html
> Sent from the Solr - User mailing list archive at Nabble.com.
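To make the desired ordering concrete: the poster wants documents ranked by how many distinct query terms they contain. A small client-side illustration (hypothetical Python, not a Solr feature; with default scoring, matching more terms already yields a higher score, which is why a synonym like "planet,world" is the likely culprit here):

```python
# Rank documents by the number of distinct query terms they contain.
docs = {
    1: "hello planet dear friends",
    2: "hello world dear friends",
    4: "just friends",
}
query = {"world", "dear", "friends"}

def match_count(text: str) -> int:
    return len(query & set(text.split()))

ranked = sorted(docs, key=lambda d: match_count(docs[d]), reverse=True)
print(ranked)  # doc 2 (3 matches) before doc 1 (2) before doc 4 (1)
```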
Re: Sorl 6 with jetty issues
This may be related to SOLR-10130.

On 20.02.2017 14:06, ~$alpha` wrote:
> Issues with Solr settings while migrating from Solr 4.0 to Solr 6.0.
>
> Issue faced: my CPU consumption goes to unacceptable levels, i.e. the load on Solr 4.0 is between 6 and 10, while the load on Solr 6 reaches 100, and since it's production I rolled back quickly.
>
> My Solr 4 settings:
>
> - Running on Tomcat
> - JVM memory: 16GB
> - 24-core CPU
> - JVM settings:
>   - JVM Runtime: Java HotSpot(TM) 64-Bit Server VM (24.45-b08)
>   - Processors: 24
>   - Args: paths mentioned here
>
> My Solr 6 settings:
>
> - Running on Jetty
> - JVM memory: 20GB
> - 32-core CPU
> - JVM settings:
>   - Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 1.8.0_45 25.45-b02
>   - Processors: 32
>   - Args:
>     - -DSTOP.KEY=solrrocks
>     - -DSTOP.PORT=7983
>     - -Djetty.home=/usr/local/solr-6.4.1/server
>     - -Djetty.port=8983
>     - -Dlog4j.configuration=file:/usr/local/solr-6.4.1/example/resources/log4j.properties
>     - -Dsolr.install.dir=/usr/local/solr-6.4.1
>     - -Dsolr.log.dir=/usr/local/solr-6.4.1/example/techproducts/solr/../logs
>     - -Dsolr.log.muteconsole
>     - -Dsolr.solr.home=/usr/local/solr-6.4.1/example/techproducts/solr
>     - -Duser.timezone=US/Eastern
>     - -XX:+AggressiveOpts
>     - -XX:+CMSParallelRemarkEnabled
>     - -XX:+CMSScavengeBeforeRemark
>     - -XX:+ParallelRefProcEnabled
>     - -XX:+PrintGCApplicationStoppedTime
>     - -XX:+PrintGCDateStamps
>     - -XX:+PrintGCDetails
>     - -XX:+PrintGCTimeStamps
>     - -XX:+PrintHeapAtGC
>     - -XX:+PrintTenuringDistribution
>     - -XX:+UseCMSInitiatingOccupancyOnly
>     - -XX:+UseConcMarkSweepGC
>     - -XX:+UseGCLogFileRotation
>     - -XX:-UseSuperWord
>     - -XX:CMSFullGCsBeforeCompaction=1
>     - -XX:CMSInitiatingOccupancyFraction=70
>     - -XX:CMSMaxAbortablePrecleanTime=6000
>     - -XX:CMSTriggerPermRatio=80
>     - -XX:GCLogFileSize=20M
>     - -XX:MaxTenuringThreshold=8
>     - -XX:NewRatio=2
>     - -XX:NumberOfGCLogFiles=9
>     - -XX:OnOutOfMemoryError=/usr/local/solr-6.4.1/bin/oom_solr.sh 8983 /usr/local/solr-6.4.1/example/techproducts/solr/../logs
>     - -XX:PretenureSizeThreshold=64m
>     - -XX:SurvivorRatio=15
>     - -XX:TargetSurvivorRatio=90
>     - -Xloggc:/usr/local/solr-6.4.1/example/techproducts/solr/../logs/solr_gc.log
>     - -Xms21g
>     - -Xmx21g
>     - -Xss256k
>     - -verbose:gc
>
> What I'm looking for: my guess is that it's related to the GC settings under Jetty, as I am no expert in Jetty (Java 8). Please help with how to tune these settings. Also, how should I choose these values, and how do I debug this issue?
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Sorl-6-with-jetty-issues-tp4321291.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: Select TOP 10 items from Solr Query
So basically you want faceting only on the returned result set?

I doubt that this is possible without additional queries. The issue is that faceting and result collecting are done within one iteration, so when some document (actually, the document's internal id) is fetched as a possible result item, you can't determine whether it will make it into the top x elements or not, since more will come.

-Michael

On 17.02.2017 05:00, Zheng Lin Edwin Yeo wrote:
> Hi,
>
> Would like to check: is it possible to do a select of, say, the TOP 10 items from a Solr query, and use the list of the items to do another query (e.g. JSON Facet)?
>
> Currently, I'm using a normal facet to retrieve the list of the TOP 10 items from the normal faceting. After that, I have to list out all the 10 items as a filter when I do the JSON Facet, like this:
> q=itemNo:(001 002 003 004 005 006 007 008 009 010)
>
> It would help if I could combine both of these into a single query.
>
> I'm using Solr 6.4.1
>
> Regards,
> Edwin
Re: Continual garbage collection loop
The number of cores is not *that* important compared to the index size, but each core has its own memory overhead. For instance, caches are per core, so you have 36 individual caches per cache type.

Best,
Michael

On 14.02.2017 16:39, Leon STRINGER wrote:
>> On 14 February 2017 at 14:44, Michael Kuhlmann <k...@solr.info> wrote:
>>
>> Wow, running 36 cores with only half a gigabyte of heap memory is *really* optimistic!
>>
>> I'd raise the heap size to some gigabytes at least and see how it's working then.
>>
> I'll try increasing the heap size and see if I get the problem again.
>
> Is core quantity a big issue? As opposed to the size of the cores? Yes, there's 36, but some relate to largely inactive web sites, so the average size (assuming my "Master (Searching)" way of calculating this is correct) is less than 4 MB. I naively assumed a heap size-related issue would result from larger data sets.
>
> Thanks for your recommendation,
>
> Leon Stringer
Re: Continual garbage collection loop
Wow, running 36 cores with only half a gigabyte of heap memory is *really* optimistic!

I'd raise the heap size to some gigabytes at least and see how it's working then.

-Michael

On 14.02.2017 15:23, Leon STRINGER wrote:
> Further background on the environment:
>
> There are 36 cores, with a total size of 131 MB (based on the size reported by "Master (Searching)" in the web console).
>
> The Java memory parameters in use are: -Xms512m -Xmx512m.
>
>> On 14 February 2017 at 05:45, Erick Erickson wrote:
>>
>> GCViewer is a nifty tool for visualizing the GC activity BTW.
>>
> I don't know what I'm looking for, but for a log covering a 3-hour period today the "Summary" tab says (typed manually, apologies for any mistakes):
>
> Total heap (usage / alloc. max): 490.7M (100.0%) / 490.7M
> Max heap after conc GC: 488.7M (99.6%)
> Max tenured after conc GC: 382M (99.5% / 77.9%)
> Max heap after full GC: 490M (99.9%)
> Freed Memory: 141,811.4M
> Freed Mem/Min: 748.554M/min
> Total Time: 3h9m26s
> Accumulated pauses: 883.6s
> Throughput: 92.23%
> Number of full gc pauses: 476
> Full GC Performance: 101.4M/s
> Number of gc pauses: 15153
> GC Performance: 245.5M/s
>
> "Memory" tab:
>
> Total heap (usage / alloc. max): 490.7M (100.0%) / 490.7M
> Tenured heap (usage / alloc. max): 384M (100.0%) / 384M
> Young heap (usage / alloc. max): 106.7M (100.0%) / 106.7M
> Perm heap (usage / alloc. max): 205.6M (17.0%) / 1,212M
> Max tenured after conc GC: 382M (99.5% / 77.9%)
> Avg tenured after conc GC: 247.5M (delta=17.612M)
> Max heap after conc GC: 488.7M (99.6%)
> Avg heap after conc GC: 252.6M (delta=35.751M)
> Max heap after full GC: 490M (99.9%)
> Avg heap after full GC: 379M (delta=72.917M)
> Avg after GC: 359.9M (delta=40.965M)
> Freed by full GC: 47,692.8M (33.6%)
> Freed by GC: 94,118.7M (66.4%)
> Avg freed full GC: 100.2M/coll (delta=68.015M) [greyed]
> Avg freed GC: 6,360.3K/coll (delta=19.963M) [greyed]
> Avg rel inc after FGC: -199,298B/coll
> Avg rel inc after GC: 6,360.3K/coll (delta=19.963M)
> Slope full GC: -126,380B/s
> Slope GC: 14.317M/s
> InitiatingOccFraction (avg / max): 65.9% / 100.0%
> Avg promotion: 2,215.324K/coll (delta=6,904.174K) [greyed]
> Total promotion: 12,504.467M
>
> Can anyone shed any light on this? Is it a problem, or is this all normal?
>
> Thanks,
>
> Leon Stringer
Re: Select TOP 10 items from Solr Query
Since you already have the top x items then, wouldn't it be much easier to collect the "facet" data from the result list on your own?

On 17.02.2017 10:18, Zheng Lin Edwin Yeo wrote:
> Hi Michael,
>
> Yes, I only want the JSON Facet to query based on the returned result set of the itemNo from the 1st query.
>
> There are definitely more than 10, but we just need the top 10 in this case. As the top 10 itemNo may change, we have to get the returned result set of the itemNo each time we want to do the JSON Facet.
>
> Regards,
> Edwin
>
> [...]
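"Collecting the facet data on your own" can be as simple as tallying field values over the documents already returned for the top items. A sketch with Python's Counter (the field names here are made up for illustration):

```python
from collections import Counter

# Documents returned for the top items from the first query; in a real
# client these would come from Solr's response JSON.
top_docs = [
    {"itemNo": "001", "category": "A"},
    {"itemNo": "002", "category": "B"},
    {"itemNo": "003", "category": "A"},
]

# Client-side "facet": count the category values over just these docs.
facet_counts = Counter(doc["category"] for doc in top_docs)
print(facet_counts.most_common())
```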
Re: Select TOP 10 items from Solr Query
It's not possible to do such a thing in one request with faceting only. The problem is that you'd need a fixed filter on every item when the facet algorithm iterates over it; you can't look into future elements to find out which ones the top 10 will be.

So either you stick with two queries (which may be fast enough anyway when you only have ca. 100 items in your collection), or you fetch the data for the top 10 items and do the calculation on your own.

-Michael

On 17.02.2017 11:35, Zheng Lin Edwin Yeo wrote:
> I'm looking at JSON facets of both type:terms and type:range.
>
> For example, I may have 100 items in my collection, and each item can have many transactions. But I'm only interested in the top 10 items which have the highest transaction rate (i.e. the highest count).
>
> I'm doing a calculation of the total amount and average amount. However, I only want the total amount and average amount to be calculated based on the top 10 items with the highest transaction rate, not all 100 items.
>
> For now, I need the additional query to get the top 10 items first, before I run the JSON Facet to get the total amount and average amount for those 10 items.
>
> Regards,
> Edwin
>
> On 17 February 2017 at 18:02, alessandro.benedetti wrote:
>> I think we are missing something here ...
>> You want to fetch the top 10 results for your query, and allow the user to navigate only those 10 results through facets?
>>
>> Which facets are you interested in? Field facets? Whatever facet you want, calculating it in your client on 10 results shouldn't be that problematic. Are we missing something? Why would you need an additional query?
>>
>> Cheers
>>
>> -----
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> View this message in context: http://lucene.472066.n3.nabble.com/Select-TOP-10-items-from-Solr-Query-tp4320863p4320910.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
Re: Multi word synonyms
It's not working out of the box, sorry.

We're using this plugin: https://github.com/healthonnet/hon-lucene-synonyms#getting-started

It's working nicely, but can lead to OOMEs when you add many synonyms with multiple terms. And I'm not sure whether it's still working with Solr 6.0.

-Michael

On 15.11.2016 10:29, Midas A wrote:
> I have to use multi-word synonyms at query time.
>
> Please suggest how I can do it, and let me know whether it would be visible in the debug query or not.
Re: Again : Query formulation help
Hi Prasanna,

there's no such filter out of the box. It's similar to the mm parameter of the (e)dismax parser, but that only works for full-text searches on the same fields.

So you have to build the query on your own, using all possible permutations:

fq=(code1:... AND code2:...) OR (code1:... AND code3:...) OR ...

Of course, such a query can become huge when there are more than four constraints.

Best,
Michael

On 24.11.2016 11:40, Prasanna S. Dhakephalkar wrote:
> Hi,
>
> I need to formulate a distinctive-field-values query on 4 fields with a minimum match on 2 fields.
>
> I have 4 fields in my core:
>
> Code 1: values between 1001 and ...
> Code 2: values between 1001 and ...
> Code 3: values between 1001 and ...
> Code 4: values between 1001 and ...
>
> I want to formulate a query, given values for Code 1 through Code 4, such that the result contains documents where at least 2 of the above match.
>
> Thanks and Regards,
>
> Prasanna
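Generating the OR-of-ANDs filter mechanically keeps it manageable; for n fields with a minimum match of m, you need C(n, m) clauses. A hedged sketch (the field names and values are placeholders for the poster's code1..code4 constraints):

```python
from itertools import combinations

# "At least 2 of these 4 constraints must match": OR together every
# 2-element combination, each combination ANDed internally.
values = {"code1": 1001, "code2": 1002, "code3": 1003, "code4": 1004}

clauses = [
    "(" + " AND ".join(f"{field}:{values[field]}" for field in pair) + ")"
    for pair in combinations(sorted(values), 2)
]
fq = " OR ".join(clauses)

print(len(clauses))  # 4 choose 2 = 6 clauses
print(fq)
```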
Re: Multi word synonyms
This is a nice read, though that solution depends on the precondition that you already know your synonyms at index time. While having synonyms in the index is mostly the better solution anyway, it's sometimes not feasible.

-Michael

On 15.11.2016 12:14, Vincenzo D'Amore wrote:
> Hi Midas,
>
> I suggest this interesting reading:
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>
> On Tue, Nov 15, 2016 at 11:00 AM, Michael Kuhlmann <k...@solr.info> wrote:
>> It's not working out of the box, sorry.
>>
>> We're using this plugin: https://github.com/healthonnet/hon-lucene-synonyms#getting-started
>>
>> It's working nicely, but can lead to OOMEs when you add many synonyms with multiple terms. And I'm not sure whether it's still working with Solr 6.0.
>>
>> -Michael
>>
>> [...]
Re: Multi word synonyms
Wow, that's great news! I didn't notice that.

On 15.11.2016 13:05, Vincenzo D'Amore wrote:
> Hi Michael,
>
> an update: reading the article, I double-checked whether at least one of the issues has been fixed. The good news is that https://issues.apache.org/jira/browse/LUCENE-2605 has been closed and is available in 6.2.
>
> On Tue, Nov 15, 2016 at 12:32 PM, Michael Kuhlmann <k...@solr.info> wrote:
>> This is a nice read, though that solution depends on the precondition that you already know your synonyms at index time.
>>
>> While having synonyms in the index is mostly the better solution anyway, it's sometimes not feasible.
>>
>> -Michael
>>
>> [...]
Re: File system choices?
Yes, and we're doing such things at my company. However, we often do things you shouldn't do; this is one of them.

Solr needs to load data quite fast, otherwise you'll have a performance killer. It's often recommended to use an SSD instead of a normal hard disk; a network share would be quite the contrary of that.

It might make sense when you update very seldom and all of your index fits into memory.

-Michael

On 15.12.2016 16:37, Michael Joyner (NewsRx) wrote:
> Hello all,
>
> Can the Solr indexes be safely stored and used via mounted NFS shares?
>
> -Mike
Re: FacetField-Result on String-Field contains value with count 0?
Then I don't understand your problem. Solr already does exactly what you want.

Maybe the problem is different: I assume that there never was a value of "1" in the index, leading to your confusion. Solr returns all values in the facet result for which there was some value at some time, as long as the documents are somewhere in the index, even when they're marked as deleted. So there must have been a document with m_mediaType_s=1. Even if all these documents are deleted already, their values still appear in the facet result.

This holds true until segments get merged so that all deleted documents are pruned. So if you send a forceMerge request, chances are good that "1" won't come up any more.

-Michael

On 13.01.2017 15:36, Sebastian Riemer wrote:
> Hi Bill,
>
> Thanks, that's actually where I come from. But I don't want to exclude values leading to a count of zero.
>
> Background to this: A user searched for mediaType "book", which gave him 10 results. Now some other task/routine changes all those 10 books to be, say, 10 ebooks, because the type had been incorrect. The user makes a refresh, still looking for "book", and gets 0 results (which is expected). Because we rule out facet fields having count 0, I don't get back the selected mediaType "book", and thus I cannot select this value in the select-dropdown filter for the mediaType. This leads to confusion for the user: he has no results, but doesn't see that it's because he still has that mediaType filter set to a value "book" which now leads to 0 results.
>
> -----Original Message-----
> From: billnb...@gmail.com
> Sent: Friday, January 13, 2017 15:23
> To: solr-user@lucene.apache.org
> Subject: Re: FacetField-Result on String-Field contains value with count 0?
>
> Set mincount to 1
>
> Bill Bell
> Sent from mobile
>
>> On Jan 13, 2017, at 7:19 AM, Sebastian Riemer wrote:
>>
>> Pardon me, the second search should have been this:
>> http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%221%22&indent=on&q=*:*&rows=0&start=0&wt=json
>> (or in other words: give me all documents having value "1" for field "m_mediaType_s")
>>
>> Since this search gives zero results, why is the value included in the facet.field result-count list?
>>
>> Hi,
>>
>> Please help me understand:
>> http://localhost:8983/solr/wemi/select?facet.field=m_mediaType_s&facet=on&indent=on&q=*:*&wt=json
>> returns:
>>
>> "facet_counts":{
>>   "facet_queries":{},
>>   "facet_fields":{
>>     "m_mediaType_s":[
>>       "2",25561,
>>       "3",19027,
>>       "10",1966,
>>       "11",1705,
>>       "12",1067,
>>       "4",1056,
>>       "5",291,
>>       "8",68,
>>       "13",2,
>>       "6",2,
>>       "7",1,
>>       "9",1,
>>       "1",0]},
>>   "facet_ranges":{},
>>   "facet_intervals":{},
>>   "facet_heatmaps":{}}}
>>
>> http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%222%22&indent=on&q=*:*&rows=0&start=0&wt=json
>> returns: "response":{"numFound":25561,"start":0,"docs":[]
>>
>> http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%220%22&indent=on&q=*:*&rows=0&start=0&wt=json
>> returns: "response":{"numFound":0,"start":0,"docs":[]
>>
>> So why does the facet.field result even contain the value "1", if it does not exist?
>>
>> And why does it, e.g., not contain "SomeReallyCrazyOtherValueWhichLikeValue"1"DoesNotExistButLetsIncludeItInTheFacetFieldsResultListAnywaysWithCountZero": 0
>>
>> Best regards,
>> Sebastian
>>
>> Additional info: field m_mediaType_s is a string.
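For reference, the two behaviors discussed above map to a single request parameter: facet.mincount=1 hides facet values that currently match no documents, while the default of 0 keeps them visible (which is what lets a still-selected drop-down value like "book" appear at count 0). A sketch of the parameters as a query string (built in Python purely for illustration; the parameter names are standard Solr faceting parameters):

```python
# Standard Solr faceting parameters; facet.mincount=1 suppresses facet
# values with zero matching documents.
params = {
    "q": "*:*",
    "facet": "on",
    "facet.field": "m_mediaType_s",
    "facet.mincount": "1",
}
query_string = "&".join(f"{k}={v}" for k, v in params.items())
print(query_string)
```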
Re: Solr Suggester
For the suggester, the field must be indexed. It's not necessary to have it stored.

Best,
Michael

On 22.12.2016 11:24, Furkan KAMACI wrote:
> Hi Emir,
>
> As far as I know, it should be enough for a suggestion field to be stored=true? Should it be both indexed and stored?
>
> Kind Regards,
> Furkan KAMACI
>
> On Thu, Dec 22, 2016 at 11:31 AM, Emir Arnautovic <emir.arnauto...@sematext.com> wrote:
>> That is because my_field_2 is not indexed.
>>
>> Regards,
>> Emir
>>
>> On 21.12.2016 18:04, Furkan KAMACI wrote:
>>> Hi All,
>>>
>>> I have two fields: my_field_1 (indexed, multiValued="false") and my_field_2 (indexed="false", stored="true", multiValued="false").
>>>
>>> When I run a suggester on my_field_1 it returns a response; however, my_field_2 doesn't. I've defined the suggester with the name "suggester", lookup FuzzyLookupFactory and dictionary DocumentDictionaryFactory.
>>>
>>> What can be the reason?
>>>
>>> Kind Regards,
>>> Furkan KAMACI
>>>
>> --
>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>> Solr & Elasticsearch Support * http://sematext.com/
Re: fq performance
First of all, from what I can see, this won't do what you're expecting. Multiple fq conditions are always combined using AND, so if a user is a member of 100 groups, but the document is accessible to only 99 of them, then the user won't find it. Or in other words, if you add a user to some group, then she would get *less* results than before. But coming back to your performance question: Just try it. Having 100 fq conditions will of course slow down your query a bit, but not that much. I rather see the problem with the filter cache: It will only be fast enough if all of your fq filters fit into the cache. Each possible fq filter will take 1 million/8 == 125k bytes, so having hundreds of possible access group conditions might blow up your filter cache (which must fit into RAM). -Michael Am 16.03.2017 um 13:02 schrieb Ganesh M: Hi, We have 1 million of documents and would like to query with multiple fq values. We have kept the access_control ( multi value field ) which holds information about for which group that document is accessible. Now to get the list of all the documents of an user, we would like to pass multiple fq values ( one for each group user belongs to ) q:somefiled:value& fq:access_control:g1:access_control:g2:access_control:g3:access_control:g4:access_control:g5... Like this, there could be 100 groups for an user. If we fire query with 100 values in the fq, whats the penalty on the performance ? Can we get the result in less than one second for 1 million of documents. Let us know your valuable inputs on this. Regards,
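The back-of-the-envelope math above (one bit per document, per cached filter) can be sketched in a few lines. This is only an illustration of the sizing argument, not Solr code; the class and method names are made up:

```java
// Rough sketch of the filterCache sizing argument: each cached fq entry is a
// bitset with one bit per document in the index, so an entry costs about
// numDocs / 8 bytes. Names here are illustrative only.
public class FilterCacheMath {
    static long bytesPerEntry(long numDocs) {
        return numDocs / 8; // one bit per document
    }

    public static void main(String[] args) {
        long perEntry = bytesPerEntry(1_000_000L);
        System.out.println("Bytes per cached filter: " + perEntry);           // 125000
        System.out.println("Bytes for 100 such filters: " + perEntry * 100);  // 12500000
    }
}
```

So with 1 million documents, hundreds of distinct per-group filters quickly add up to tens of megabytes of heap just for the filter cache, which is the concern raised above.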
Re: fq performance
Hi Ganesh, you might want to use something like this: fq=access_control:(g1 g2 g5 g99 ...) Then it's only one fq filter per request. Internally it's like an OR condition, but in a more condensed form. I have already used this with up to 500 values without major performance degradation (but in that case it was the unique id field). You should think a minute about your filter cache here. Since you only have one fq filter per request, you won't blow your cache that fast. But it depends on your use case whether you should cache these filters at all. When it's common that a single user will send several requests within one commit interval, or when it's likely that several users will be in the same groups, then just use it like that. But when it's more likely that each request belongs to a different user with different security settings, then you should consider disabling the cache for this fq filter so that your filter cache (for other filters you probably have) won't be polluted: fq={!cache=false}access_control:(g1 g2 g5 g99 ...). See http://yonik.com/advanced-filter-caching-in-solr/ for information on that. -Michael Am 17.03.2017 um 07:46 schrieb Ganesh M: Hi Shawn / Michael, Thanks for your replies and I guess you have got my scenarios exactly right. Initially my document contains information about who have access to the documents, like field as (U1_s:true). if 100 users can access a document, we will have 100 such fields for each user. So when U1 wants to see all this documents..i will query like get all documents where U1_s:true. If user U5 added to group G1, then I have to take all the documents of group G1 and have to set the information of user U5 in the document like U5_s:true in the document. For this, I have re-index all the documents in that group. To avoid this, I was trying to keep group information instead of user information like G1_s:true, G2_s:true in the document. 
And for querying user documents, I will first get all the groups of User U1, and then query get all documents where G1_s:true OR G2_s:true or G3_s:true By this we don't need to re-index all the documents. But while querying I need to query with OR of all the groups user belongs to. For how many ORs solr can give the results in less than one second. Can I pass 100's of OR condtion in the solr query? will that affects the performance ? Pls share your valuable inputs. On Thu, Mar 16, 2017 at 6:04 PM Shawn Heisey wrote: On 3/16/2017 6:02 AM, Ganesh M wrote: We have 1 million of documents and would like to query with multiple fq values. We have kept the access_control ( multi value field ) which holds information about for which group that document is accessible. Now to get the list of all the documents of an user, we would like to pass multiple fq values ( one for each group user belongs to ) q:somefiled:value:access_control:g1:access_control:g2:access_control:g3:access_control:g4:access_control:g5... Like this, there could be 100 groups for an user. The correct syntax is fq=field:value -- what you have there is not going to work. This might not do what you expect. Filter queries are ANDed together -- *every* filter must match, which means that if a document that you want has only one of those values in access_control, or has 98 of them but not all 100, then the query isn't going to match that document. The solution is one filter query that can match ANY of them, which also might run faster. I can't say whether this is a problem for you or not. Your data might be completely correct for matching 100 filters. Also keep in mind that there is a limit to the size of a URL that you can send into any webserver, including the container that runs Solr. That default limit is 8192 bytes, and includes the "GET " or "POST " at the beginning and the " HTTP/1.1" at the end (note the spaces). 
The filter query information for 100 of the filters you mentioned is going to be over 2K, which will fit in the default, but if your query has more complexity than you have mentioned here, the total URL might not fit. There's a workaround to this -- use a POST request and put the parameters in the request body. If we fire query with 100 values in the fq, whats the penalty on the performance ? Can we get the result in less than one second for 1 million of documents. With one million documents, each internal filter query result is 125,000 bytes -- the number of documents divided by eight. That's 12.5 megabytes for 100 of them. In addition, every time a filter is run, it must examine every document in the index to create that 125,000 byte structure, which means that filters which *aren't* found in the filterCache are relatively slow. If they are found in the cache, they're lightning fast, because the cache will contain the entire 125,000 byte bitset. If you make your filterCache large enough, it's going to consume a LOT of java heap memory, particularly if the index gets bigger. The
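The condensed single-fq form suggested earlier in this thread, fq=access_control:(g1 g2 ...), is easy to build client-side. A minimal sketch, assuming the field name access_control from this thread; the helper class is made up and not part of SolrJ:

```java
import java.util.List;

// Hypothetical helper that builds the condensed "one fq per request" filter
// discussed in this thread. The {!cache=false} local param disables caching
// of this particular filter, as suggested for per-user security filters.
public class GroupFilterBuilder {
    static String buildFq(List<String> groups, boolean cacheable) {
        String joined = String.join(" ", groups);
        String prefix = cacheable ? "" : "{!cache=false}";
        return prefix + "access_control:(" + joined + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildFq(List.of("g1", "g2", "g5", "g99"), false));
        // {!cache=false}access_control:(g1 g2 g5 g99)
    }
}
```

With many groups per user, remember the URL length limit mentioned above and send the parameters in a POST body instead.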
Re: Mixing AND OR conditions with query parameters
Make sure to have whitespace around the OR operator. The parentheses should be around the OR query, not including the "fq:" -- this should be outside the parentheses (which are not necessary at all). What exactly are you expecting? -Michael Am 24.04.2017 um 12:59 schrieb VJ: > Hi All, > > I am facing issues with OR/AND conditions with query parameters: > > fq=cioname:"XYZ" & (fq=attr1:trueORattr2:true) > > The queries are not returning expected results. > > I have tried various permutation and combinations but couldn't get it > working. Any pointers on this? > > > > Regards, > VJ >
Re: Error after moving index
Hi Moritz, did you stop your local Solr server before? Copying data from a running instance may cause headaches. If yes, what happens if you copy everything again? It seems that your copy operation wasn't successful. Best, Michael Am 22.06.2017 um 14:37 schrieb Moritz Munte: > Hello, > > > > I created an index on my local machine (Windows 10) and it works fine there. > > After uploading the index to the production server (Linux), the server shows > an error: .
Re: Solr NLS custom query parser
Hi Arun, your question is too generic. What do you mean by NLP search? What do you expect to happen? The short answer is: No, there is no such parser because the individual requirements will vary a lot. -Michael Am 14.06.2017 um 16:32 schrieb aruninfo100: > Hi, > > I am trying to configure NLP search with Solr. I am using OpenNLP for the > same.I am able to index the documents and extract named entities and POS > using OpenNLP-UIMA support and also by using a UIMA Update request processor > chain.But I am not able to write a query parser for the same.Is there a > query parser already written to satisfy the above features(nlp search). > > Thanks and Regards, > Arun > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Solr-NLS-custom-query-parser-tp4340511.html > Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr Facet.contains
What is the field type? Which Analyzers are configured? How do you split at "~"? (You have to do it yourself, or configure some tokenizer for that.) What do you get when you don't filter your facets? What do you mean by "it is not working"? What is your result now? -Michael Am 15.09.2017 um 13:43 schrieb vobium: > Hello, > > I want to limit my facet data by using substring (only that contain > specified substring). My solr version is 4.8.0 > > e.g if doc with such type of string (field with such type of data is > multivalued and splited with "~") > > India/maha/mumbai~India/gujarat/badoda > India/goa/xyz > India/raj/jaypur > 1236/maha/890~India/maha/kolhapur > India/maha/mumbai > India/maha/nashik > Uk/Abc/Cde > > > Expected facet Data that contain maha as substring > o/p > India/maha/mumbai (2) > India/maha/kolhapur(1) > India/maha/nashik(1) > 1236/maha/890(1) > > I tried it by using facet.contains but it is not working > so plz give solution for this issue > > > > -- > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Solr nodes crashing (OOM) after 6.6 upgrade
Hi Shamik, funny enough, we had a similar issue with our old legacy application that still used plain Lucene code in a JBoss container. Likewise, there were no specific queries or updates causing it; the performance just broke down completely without any unusual usage. GC was rising to 99% or so. Sometimes it came back after a while, but most often we had to completely restart JBoss. I never figured out what the root cause was, but my suspicion still is that Lucene was innocent. I rather suspect Rackspace's hypervisor was to blame. So maybe you can give it a try and have a look at the Amazon cloud settings? Best, Michael Am 22.09.2017 um 12:00 schrieb shamik: > All the tuning and scaling down of memory seemed to be stable for a couple of > days but then came down due to a huge spike in CPU usage, contributed by G1 > Old Generation GC. I'm really puzzled why the instances are suddenly > behaving like this. It's not that a sudden surge of load contributed to > this, query and indexing load seemed to be comparable with the previous time > frame. Just wondering if the hardware itself is not adequate enough for 6.6. > The instances are all running on 8 CPU / 30gb m3.2xlarge EC2 instances. > > Does anyone ever face issues similar to this? > > > > -- > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html >
Re: Newbie question about why represent timestamps as "float" values
While you're generally right, in this case it might make sense to stick to a primitive type. I see "unixtime" as technical information, probably from System.currentTimeMillis(). As long as it's not used as a "real world" date but only for sorting based on latest updates, or choosing which document is more recent, it's totally okay to index it as a long value. But definitely not as a float. -Michael Am 10.10.2017 um 10:55 schrieb alessandro.benedetti: > There was time ago a Solr installation which had the same problem, and the > author explained me that the choice was made for performance reasons. > Apparently he was sure that handling everything as primitive types would > give a boost to the Solr searching/faceting performance. > I never agreed ( and one of the reasons is that you need to transform back > from float to dates to actually render them in a readable format). > > Furthermore I tend to rely on standing on the shoulders of giants, so if a > community ( not just a single developer) spent time implementing a date type > ( with the different available implementations) to manage specifically date > information, I tend to trust them and believe that the best approach to > manage dates is to use that ad hoc date type ( in its variants, depending on > the use cases). > > As a plus, using the right data type gives you immense power in debugging > and understanding better your data. > For proper maintenance , it is another good reason to stick with standards. > > > > - > --- > Alessandro Benedetti > Search Consultant, R&D Software Engineer, Director > Sease Ltd. - www.sease.io > -- > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html >
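To see why a float in particular is a bad fit: epoch milliseconds in 2017 need about 41 bits of precision, while a 32-bit float mantissa only has 24, so the stored value rounds to a multiple of 2^17 ms (roughly two minutes). A small self-contained demo; the timestamp value is an arbitrary example:

```java
// Demonstrates the precision loss when storing epoch-millis timestamps in a
// 32-bit float. A long (or double, with its 53-bit mantissa) keeps the value
// exactly; a float does not.
public class FloatTimestampDemo {
    static boolean roundTripsExactly(long millis) {
        return (long) (float) millis == millis;
    }

    public static void main(String[] args) {
        long t = 1_507_600_000_000L; // an October 2017 timestamp in millis
        System.out.println(t + " -> " + (long) (float) t);
        System.out.println("exact? " + roundTripsExactly(t)); // false
    }
}
```

So two updates minutes apart can collapse to the same float value, which breaks exactly the "which document is more recent" use case described above.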
Re: Solr Issue
Hi Patrick, can you attach the query you're sending to Solr and one example result? Or more specific, what are your hl.* parameters? -Michael Am 07.09.2017 um 09:36 schrieb Patrick Fallert: > > Hey Guys, > i´ve got a problem with my Solr Highlighter.. > When I search for a word, i get some results. For every result i want > to display the highlighted text and here is my problem. Some of the > returned documents have a highlighted text the other ones doesnt. I > don´t know why it is but i need to fix this problem. Below is the > configuration of my managed-schema. The configuration of the > highlighter in solrconfig.xml is default. > I hope someone can help me. If you need more details you can ask me > for sure. > > managed-schema: > > > > > id > > > > > > > > > > > sortMissingLast="true" multiValued="true"/> > currencyConfig="currency.xml" defaultCurrency="USD" precisionStep="8"/> > positionIncrementGap="0" docValues="true" precisionStep="0"/> > positionIncrementGap="0" docValues="true" multiValued="true" > precisionStep="0"/> > indexed="true" stored="false"> > > > > > > indexed="true" stored="false"> > > > encoder="integer"/> > > > indexed="true" stored="false"> > > > encoder="identity"/> > > > > > > > > > > > positionIncrementGap="0" docValues="true" precisionStep="0"/> > positionIncrementGap="0" docValues="true" multiValued="true" > precisionStep="0"/> > positionIncrementGap="0" docValues="true" precisionStep="0"/> > positionIncrementGap="0" docValues="true" multiValued="true" > precisionStep="0"/> > stored="false" docValues="false" multiValued="true"/> > positionIncrementGap="0" docValues="true" precisionStep="0"/> > positionIncrementGap="0" docValues="true" multiValued="true" > precisionStep="0"/> > docValues="true"/> > class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" > maxDistErr="0.001" distErrPct="0.025" distanceUnits="kilometers"/> > positionIncrementGap="0" docValues="true" precisionStep="0"/> > positionIncrementGap="0" docValues="true" 
multiValued="true" > precisionStep="0"/> > positionIncrementGap="100"> > > > > > > > multiValued="true"/> > > docValues="true" multiValued="true"/> > > docValues="true" multiValued="true"/> > stored="false"> > > > > > > > multiValued="true"/> > > multiValued="true"/> > dimension="2"/> > > docValues="true"/> > docValues="true" multiValued="true"/> > positionIncrementGap="0" docValues="true" precisionStep="6"/> > positionIncrementGap="0" docValues="true" multiValued="true" > precisionStep="6"/> > positionIncrementGap="0" docValues="true" precisionStep="8"/> > positionIncrementGap="0" docValues="true" multiValued="true" > precisionStep="8"/> > positionIncrementGap="100"> > > > > ignoreCase="true"/> > > > > > positionIncrementGap="100"> > > > > ignoreCase="true"/> > > > > positionIncrementGap="100"> > > > articles="lang/contractions_ca.txt" ignoreCase="true"/> > > ignoreCase="true"/> > > > > positionIncrementGap="100"> > > > > > > > > positionIncrementGap="100"> > > > > ignoreCase="true"/> > > > > positionIncrementGap="100"> > > > > words="lang/stopwords_da.txt" ignoreCase="true"/> > > > > positionIncrementGap="100"> > > > > words="lang/stopwords_de.txt" ignoreCase="true"/> > > > > > positionIncrementGap="100"> > > > > ignoreCase="false"/> > > > > positionIncrementGap="100"> > > > ignoreCase="true"/> > > > protected="protwords.txt"/> > > > > > ignoreCase="true" synonyms="synonyms.txt"/> > ignoreCase="true"/> > > > protected="protwords.txt"/> > > > > autoGeneratePhraseQueries="true" positionIncrementGap="100"> > > > ignoreCase="true"/> > catenateNumbers="1" generateNumberParts="1" splitOnCaseChange="1" > generateWordParts="1" catenateAll="0" catenateWords="1"/> > > protected="protwords.txt"/> > > > > > > ignoreCase="true" synonyms="synonyms.txt"/> > ignoreCase="true"/> > catenateNumbers="0" generateNumberParts="1" splitOnCaseChange="1" > generateWordParts="1" catenateAll="0" catenateWords="0"/> > > protected="protwords.txt"/> > > > > autoGeneratePhraseQueries="true" 
positionIncrementGap="100"> > > > ignoreCase="true" synonyms="synonyms.txt"/> > ignoreCase="true"/> > catenateNumbers="1" generateNumberParts="0" generateWordParts="0" > catenateAll="0" catenateWords="1"/> > > protected="protwords.txt"/> > > > > > > > ignoreCase="true" synonyms="synonyms.txt"/> > ignoreCase="true"/> > catenateNumbers="1" generateNumberParts="0" generateWordParts="0" > catenateAll="0" catenateWords="1"/> > > protected="protwords.txt"/> > > > > > positionIncrementGap="100"> > > > > words="lang/stopwords_es.txt" ignoreCase="true"/> > > > > positionIncrementGap="100"> > > > > ignoreCase="true"/> > > > > positionIncrementGap="100"> > > > > > > > ignoreCase="true"/> > > > positionIncrementGap="100"> > > > > words="lang/stopwords_fi.txt"
Re: ways to check if document is in a huge search result set
Maybe I don't understand your problem, but why don't you just filter by "supplier information"? -Michael Am 11.09.2017 um 04:12 schrieb Derek Poh: > Hi > > I have a collection of product document. > Each product document has supplier information in it. > > I need to check if a supplier's products is returned in a search > result containing over 100,000 products and in which page (assuming > pagination is 20 products per page). > It is time-consuming and "labour-intensive" to go through each page to > look for the product of the supplier. > > Would like to know if you guys have any better and easier ways to do this? > > Derek > > -- > CONFIDENTIALITY NOTICE > This e-mail (including any attachments) may contain confidential > and/or privileged information. If you are not the intended recipient > or have received this e-mail in error, please inform the sender > immediately and delete this e-mail (including any attachments) from > your computer, and you must not use, disclose to anyone else or copy > this e-mail (including any attachments), whether in whole or in part. > This e-mail and any reply to it may be monitored for security, legal, > regulatory compliance and/or other appropriate reasons.
Re: Solr6.6 Issue/Bug
Why would you need to start Solr as root? You should definitely not do this, there's no reason for that. And even if you *really* want this: What's so bad about the -force option? -Michael Am 06.09.2017 um 07:26 schrieb Kasim Jinwala: > Dear team, > I am using solr 5.0 last 1 year, now we are planning to upgrade > solr 6.6. > While trying to start solr using root user, we need to pass -force > parameter to start solr forcefully, > please help to start solr using root user without -force command. > > Regards > Kasim J. >
Re: ways to check if document is in a huge search result set
So you're looking for a solution to validate the result output. You have two ways: 1. Assuming you're sorting by the default "score" sort option: Find the result you're looking for by setting the fq filter clause accordingly, and add "score" to the fl field list. Then do the normal unfiltered search, still including "score", and start with page, let's say, 50,000. Then continue using binary search depending on the returned score values. 2. Set fl to return only the supplier id, then you'll probably be able to return several tens of thousands of results at once. But be warned, the result position of these elements can vary with every single commit, esp. when there're lots of documents with the same score value. -Michael Am 12.09.2017 um 03:21 schrieb Derek Poh: > Some additional information. > > I have a query from user that a supplier's product(s) is not in the > search result. > I debugged by adding a fq on the supplier id to the query to verify > the supplier's product is in the search result. The products do exist in > the search result. > I want to tell user in which page of the search result the supplier's > product appear in. To do this I go through each page of the search > result to find the supplier's product. > It is still fine if the search result has a few hundreds products but > it will be a chore if the result have thousands. In this case there > are more than 100,000 products in the result. > > Any advice on easier ways to check which page the supplier's product > or document appear in a search result? > > On 9/11/2017 2:44 PM, Mikhail Khludnev wrote: >> You can request facet field, query facet, filter or even explainOther. >> >> On Mon, Sep 11, 2017 at 5:12 AM, Derek Poh>> wrote: >> >>> Hi >>> >>> I have a collection of productdocument. >>> Each productdocument has supplier information in it. 
>>> >>> I need to check if a supplier's products is return in a search >>> resultcontaining over 100,000 products and in which page (assuming >>> pagination is 20 products per page). >>> Itis time-consuming and "labour-intensive" to go through each page >>> to look >>> for the product of the supplier. >>> >>> Would like to know if you guys have any better and easier waysto do >>> this? >>> >>> Derek >>> >>> -- >>> CONFIDENTIALITY NOTICE >>> This e-mail (including any attachments) may contain confidential and/or >>> privileged information. If you are not the intended recipient or have >>> received this e-mail in error, please inform the sender immediately and >>> delete this e-mail (including any attachments) from your computer, >>> and you >>> must not use, disclose to anyone else or copy this e-mail (including >>> any >>> attachments), whether in whole or in part. >>> This e-mail and any reply to it may be monitored for security, legal, >>> regulatory compliance and/or other appropriate reasons. >> >> >> > > > -- > CONFIDENTIALITY NOTICE > This e-mail (including any attachments) may contain confidential > and/or privileged information. If you are not the intended recipient > or have received this e-mail in error, please inform the sender > immediately and delete this e-mail (including any attachments) from > your computer, and you must not use, disclose to anyone else or copy > this e-mail (including any attachments), whether in whole or in part. > This e-mail and any reply to it may be monitored for security, legal, > regulatory compliance and/or other appropriate reasons.
Re: ways to check if document is in a huge search result set
Am 13.09.2017 um 04:04 schrieb Derek Poh: > Hi Michael > > "Then continue using binary search depending on the returned score > values." > > May I know what do you mean by using binary search? An example algorithm is in Java method java.util.Arrays::binarySearch. Or more detailed: https://en.wikipedia.org/wiki/Binary_search_algorithm Best, Michael
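The binary search over result positions described earlier in this thread could be sketched like this. Here scoreAt(pos) stands in for fetching the score of the document at an absolute position (e.g. one Solr request with start=pos, rows=1, fl=score), and the sketch assumes the default descending sort by score:

```java
import java.util.function.IntToDoubleFunction;

// Sketch of binary search over result positions sorted by descending score.
// Each probe would be one Solr request; with ~100,000 results this takes
// about 17 probes instead of walking 5,000 pages of 20 results each.
public class ResultPositionSearch {
    static int findPosition(IntToDoubleFunction scoreAt, int numFound, double targetScore) {
        int lo = 0, hi = numFound - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            double s = scoreAt.applyAsDouble(mid);
            if (s > targetScore) lo = mid + 1;       // target lies further down the list
            else if (s < targetScore) hi = mid - 1;  // target lies further up
            else return mid;                         // a hit; ties may occupy a range
        }
        return -1; // score not present
    }

    public static void main(String[] args) {
        double[] scores = {9.0, 7.5, 7.5, 3.2, 1.0};
        int pos = findPosition(i -> scores[i], scores.length, 3.2);
        System.out.println("position " + pos + ", page " + (pos / 20 + 1));
    }
}
```

As warned in the thread, this is approximate: positions shift with every commit, and documents sharing the same score make the exact position ambiguous.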
Re: Modifing create_core's instanceDir attribute
I'd rather say you didn't quote the URL when sending it using curl. Bash accepts the ampersand as a request to execute curl including the URL up to CREATE in background - that's why the error is included within the next output, followed by "Exit" - and then tries to execute the following part of the URL as additional commands, which of course fails. Just put the URL in quotes, and it will work much better. -Michael Am 27.09.2017 um 23:14 schrieb Miller, William K - Norman, OK - Contractor: > I understand that this has to be done on the command line, but I don't know > where to put this structure or what it should look like. Can you please be > more specific in this answer? I have only been working with Solr for about > six months. > > > > > ~~~ > William Kevin Miller > > ECS Federal, Inc. > USPS/MTSC > (405) 573-2158 > > > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Wednesday, September 27, 2017 3:57 PM > To: solr-user > Subject: Re: Modifing create_core's instanceDir attribute > > Standard command-line. You're doing this on the box itself, not through a > REST API. > > Erick > > On Wed, Sep 27, 2017 at 10:26 AM, Miller, William K - Norman, OK - Contractor >wrote: >> This is my first time to try using the core admin API. How do I go about >> creating the directory structure? >> >> >> >> >> ~~~ >> William Kevin Miller >> >> ECS Federal, Inc. >> USPS/MTSC >> (405) 573-2158 >> >> >> -Original Message- >> From: Erick Erickson [mailto:erickerick...@gmail.com] >> Sent: Wednesday, September 27, 2017 11:45 AM >> To: solr-user >> Subject: Re: Modifing create_core's instanceDir attribute >> >> Right, the core admin API is pretty low-level, it expects the base directory >> exists, you have to create the directory structure by hand. >> >> Best, >> Erick >> >> On Wed, Sep 27, 2017 at 9:24 AM, Miller, William K - Norman, OK - Contractor >> wrote: >>> Thanks Erick for pointing me in this direction. 
Unfortunately when I try >>> to use this I get an error. Here is the command that I am using and the >>> response I get: >>> >>> https://solrserver:8983/solr/admin/cores?action=CREATE&name=mycore&instanceDir=/var/solr/data/mycore&dataDir=data&configSet=custom_configs >>> >>> >>> [1] 32023 >>> [2] 32024 >>> [3] 32025 >>> -bash: https://solrserver:8983/solr/admin/cores?action=CREATE: No >>> such file or directory [4] 32026 >>> [1] Exit 127 >>> https://solrserver:8983/solr/adkmin/cores?action=CREATE >>> [2] Donename=mycore >>> [3]-DoneinstanceDir=/var/solr/data/mycore >>> [4]+DonedataDir=data >>> >>> >>> I even tried to use the UNLOAD action to remove a core and got the same >>> type of error as the -bash line above. >>> >>> I have tried searching online for an answer and have found nothing so far. >>> Any ideas why this error is occurring. >>> >>> >>> >>> ~~~ >>> William Kevin Miller >>> >>> ECS Federal, Inc. >>> USPS/MTSC >>> (405) 573-2158 >>> >>> -Original Message- >>> From: Erick Erickson [mailto:erickerick...@gmail.com] >>> Sent: Tuesday, September 26, 2017 3:33 PM >>> To: solr-user >>> Subject: Re: Modifing create_core's instanceDir attribute >>> >>> I don't think you can. You can, however, use the core admin API to do >>> that, >>> see: >>> https://lucene.apache.org/solr/guide/6_6/coreadmin-api.html#coreadmin-api >>> >>> Best, >>> Erick >>> >>> On Tue, Sep 26, 2017 at 1:14 PM, Miller, William K - Norman, OK - >>> Contractor wrote: I know that when the create_core command is used that it sets the core to the name of the parameter supplied with the “-c” option and the instanceDir attribute in the http is also set to the name of the core. What I want is to tell the create_core to use a different instanceDir parameter. How can I go about doing this? I am using Solr 6.5.1 and it is running on a linux server using the apache tomcat webserver. ~~~ William Kevin Miller [image: ecsLogo] ECS Federal, Inc. USPS/MTSC (405) 573-2158
Re: Where the uploaded configset from SOLR into zookeeper ensemble resides?
Do you find your configs in the Solr admin panel, in the Cloud --> Tree folder? -Michael Am 28.09.2017 um 04:50 schrieb Gunalan V: > Hello, > > Could you please let me know where can I find the uploaded configset from > SOLR into zookeeper ensemble ? > > In docs it says they will "/configs/" but I'm not able to see > the configs directory in zookeeper. Please let me know if I need to check > somewhere else. > > > Thanks! >
Re: Moving to Point, trouble with IntPoint.newRangeQuery()
Hi Markus, I don't know why there aren't any results. But just out of curiosity, why don't you use the better choice IntPoint.newExactQuery(String,int)? What happens if you use that? -Michael Am 26.09.2017 um 13:22 schrieb Markus Jelsma: > Hello, > > I have a QParser impl. that transforms text input to one or more integers, it > makes a BooleanQuery one a field with all integers in OR-more. It used to > work by transforming the integer using LegacyNumericUtils.intToPrefixCoded, > getting a BytesRef. > > I have now moved it to use IntPoint.newRangeQuery(field, integer, integer), i > read (think javadocs) this is the way to go, but i get no matches! > > Iterator i = digests.iterator(); > while (i.hasNext()) { > Integer digest = i.next(); > queryBuilder.add(IntPoint.newRangeQuery(field, digest, digest), > Occur.SHOULD); > } > return queryBuilder.build(); > > To be sure i didn't mess up elsewhere i also tried building a string for > LuceneQParser and cheat: > > Iterator i = digests.iterator(); > while (i.hasNext()) { > Integer digest = i.next(); > str.append(ClientUtils.escapeQueryChars(digest.toString())); > if (i.hasNext()) { > str.append(" OR "); > } > } > QParser luceneQParser = new LuceneQParser(str.append(")").toString(), > localParams, params, req); > return luceneQParser.parse(); > > Well, this works! This is their respective debug output: > > Using the IntPoint range query: > > > > > {!q f=d1}value > {!q f=d1}value > (d1:[-1820898630 TO -1820898630]) > d1:[-1820898630 TO -1820898630] > > LuceneQParser cheat, it does find! > > > > 1 > -1820898630 > > > {!qd f=d1}value > {!qd f=d1}value > d1:-1820898630 > > There is not much difference in output, it looks fine, using LuceneQParser > you can also match using a range query, so what am i doing wrong? > > Many thanks! > Markus >
Re: Moving to Point, trouble with IntPoint.newRangeQuery()
Arrgh, forget my question. I just see that newExactQuery() simply triggers newRangeQuery() like you already do. -Michael Am 26.09.2017 um 13:29 schrieb Michael Kuhlmann: > Hi Markus, > > I don't know why there aren't any results. But just out of curiosity, > why don't you use the better choice IntPoint.newExectQuery(String,int)? > > What happens if you use that? > > -Michael > > Am 26.09.2017 um 13:22 schrieb Markus Jelsma: >> Hello, >> >> I have a QParser impl. that transforms text input to one or more integers, >> it makes a BooleanQuery one a field with all integers in OR-more. It used to >> work by transforming the integer using LegacyNumericUtils.intToPrefixCoded, >> getting a BytesRef. >> >> I have now moved it to use IntPoint.newRangeQuery(field, integer, integer), >> i read (think javadocs) this is the way to go, but i get no matches! >> >> Iterator i = digests.iterator(); >> while (i.hasNext()) { >> Integer digest = i.next(); >> queryBuilder.add(IntPoint.newRangeQuery(field, digest, digest), >> Occur.SHOULD); >> } >> return queryBuilder.build(); >> >> To be sure i didn't mess up elsewhere i also tried building a string for >> LuceneQParser and cheat: >> >> Iterator i = digests.iterator(); >> while (i.hasNext()) { >> Integer digest = i.next(); >> str.append(ClientUtils.escapeQueryChars(digest.toString())); >> if (i.hasNext()) { >> str.append(" OR "); >> } >> } >> QParser luceneQParser = new LuceneQParser(str.append(")").toString(), >> localParams, params, req); >> return luceneQParser.parse(); >> >> Well, this works! This is their respective debug output: >> >> Using the IntPoint range query: >> >> >> >> >> {!q f=d1}value >> {!q f=d1}value >> (d1:[-1820898630 TO -1820898630]) >> d1:[-1820898630 TO -1820898630] >> >> LuceneQParser cheat, it does find! 
>> >> >> >> 1 >> -1820898630 >> >> >> {!qd f=d1}value >> {!qd f=d1}value >> d1:-1820898630 >> >> There is not much difference in output, it looks fine, using LuceneQParser >> you can also match using a range query, so what am i doing wrong? >> >> Many thanks! >> Markus >> >
Re: How to sort on dates?
Am 16.12.2017 um 19:39 schrieb Georgios Petasis: > Even if the DateRangeField field can store a range of dates, doesn't > Solr understand that I have used single timestamps? No. It could theoretically, but sorting just isn't implemented in DateRangeField. > I have even stored the dates. > My problem is that I need to use the query formating stated in the > documentation: > https://lucene.apache.org/solr/guide/7_1/working-with-dates.html#date-range-formatting > > For example, if "financialYear" is a date range, I can do > q=financialYear:2014 and it will return everything that has a date > within 2014. If the field is date point, will it work? Yes, just query with the plain old range syntax: q=financialYear:[2014-01-01T00:00:00.000Z TO 2015-01-01T00:00:00.000Z} DateRangeField might be slightly faster for such queries, but that doesn't really matter much. I have only used normal date fields so far; usually they're fast enough. As a rule of thumb, only use DateRangeField if you really need to index date ranges. -Michael
Re: Wildcard searches with special character gives zero result
Solr does not analyze queries that contain wildcards. So with ch*p-seq, it will search for terms that start with ch and end with p-seq. Since your analyzer split all tokens at index time, only chip and seq are in the index. See https://solr.pl/en/2010/12/20/wildcard-queries-and-how-solr-handles-them/ for an example.

If you really need results for such queries, I suggest adding a copyField that is unstemmed and only tokenized on whitespace. If you then detect a wildcard character in your query string, search on that field instead of the others.

-Michael

On 15.12.2017 at 11:59, Selvam Raman wrote:
> I am using the edismax query parser.
>
> On Fri, Dec 15, 2017 at 10:37 AM, Selvam Raman wrote:
>
>> Solr version - 6.4.0
>>
>> "title_en":["Chip-seq"]
>>
>> When I fired a query like below:
>>
>> 1) chip-seq
>> 2) chi*
>>
>> it gave the expected result, in this case one document.
>>
>> But when I search with the wildcard ch*p-seq it produces zero results.
>>
>> If I escape the '-' character, it creates two terms rather than a single
>> term.
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
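Why ch*p-seq finds nothing can be reproduced in a few lines of plain Java. This is a simplified model, not Solr's actual analyzer: index-time analysis lowercases and splits on non-alphanumerics (so "Chip-seq" yields the terms chip and seq), while the wildcard pattern is matched unanalyzed against each whole indexed term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class WildcardMismatch {
    // Index-time analysis as a typical text field might do it:
    // lowercase, then split on runs of non-alphanumeric characters.
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Wildcard query terms are NOT analyzed: the whole pattern is matched
    // against each whole indexed term.
    public static boolean anyTermMatches(List<String> indexedTerms, String wildcard) {
        // Turn "ch*p-seq" into a regex: literal text with '*' meaning "anything".
        String regex = Pattern.quote(wildcard).replace("*", "\\E.*\\Q");
        return indexedTerms.stream().anyMatch(t -> t.matches(regex));
    }

    public static void main(String[] args) {
        List<String> terms = analyze("Chip-seq");
        System.out.println(terms);                              // [chip, seq]
        System.out.println(anyTermMatches(terms, "chi*"));      // true: "chip" starts with chi
        System.out.println(anyTermMatches(terms, "ch*p-seq"));  // false: no single term fits
    }
}
```

The hyphen never survives analysis, so no indexed term can match a pattern that still contains it; an unanalyzed whitespace-tokenized copyField keeps "chip-seq" as one term and makes the wildcard match again.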
Re: How to sort on dates?
Hi Georgios,

DateRangeField is a kind of SpatialField, which is not sortable at all. For sorting, use a DatePointField instead. It's not deprecated; the deprecated class is TrieDateField.

Best,
Michael

On 15.12.2017 at 10:53, Georgios Petasis wrote:
> Hi all,
>
> I have a field of type "date_range" defined as:
>
> multiValued="false" indexed="true" stored="true"/>
>
> The problem is that sorting on this field does not work (despite the
> fact that I put dates in there). Instead I get an error prompting me to
> perform sorting through a query.
>
> How can I do that? There is no documentation that I could find that
> shows an alternative.
>
> Also, I think I saw a warning somewhere that DateRangeField is
> deprecated. But no alternative is suggested:
>
> https://lucene.apache.org/solr/guide/7_1/working-with-dates.html
>
> I am using Solr 7.1.
>
> George
Re: SolrException undefined field *
To correct myself: querying "*" is allowed in the sense that asking for all fields is done by assigning "*" to the fl parameter. So the problem is possibly not that "*" is requested, but that the star is used somewhere else, probably in the q parameter. We can help you better when you pass the full query string (if you're able to fetch it).

-Michael

On 09.01.2018 at 16:38, Michael Kuhlmann wrote:
> First, you might want to index, but what Solr is executing here is a
> search request.
>
> Second, you're querying for a dynamic field "*" which is not defined in
> your schema. This is quite obvious; the exception says exactly this.
>
> So whatever is sending the query (some client, it seems) is doing the
> wrong thing. Or your schema definition is not matching what the client
> expects.
>
> Since we don't know what client code you're using, we can't tell more.
>
> -Michael
>
>
> On 09.01.2018 at 16:31, padmanabhan wrote:
>> I get the below error whenever an indexing is executed. I didn't find enough
>> clue as to where this field is coming from and how I could debug it;
>> any help would be appreciated >> >> 2018-01-09 16:03:11.705 INFO >> (searcherExecutor-51-thread-1-processing-x:master_backoffice_backoffice_product_default) >> [ x:master_backoffice_backoffice_product_default] >> o.a.s.c.QuerySenderListener QuerySenderListener sending requests to >> Searcher@232ae42b[master_backoffice_backoffice_product_default] >> main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_1p(6.4.1):C56)))} >> 2018-01-09 16:03:11.705 ERROR >> (searcherExecutor-51-thread-1-processing-x:master_backoffice_backoffice_product_default) >> [ x:master_backoffice_backoffice_product_default] >> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: undefined >> field * >> at >> org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1308) >> at >> org.apache.solr.schema.IndexSchema.getFieldType(IndexSchema.java:1260) >> at >> org.apache.solr.parser.SolrQueryParserBase.getWildcardQuery(SolrQueryParserBase.java:932) >> at >> org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:616) >> at org.apache.solr.parser.QueryParser.Term(QueryParser.java:312) >> at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:182) >> at org.apache.solr.parser.QueryParser.Query(QueryParser.java:102) >> at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:91) >> at >> org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:194) >> at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:50) >> at org.apache.solr.search.QParser.getQuery(QParser.java:168) >> at >> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:160) >> at >> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269) >> at >> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166) >> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2306) >> at >> 
org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:72) >> at >> org.apache.solr.core.SolrCore.lambda$getSearcher$4(SolrCore.java:2094) >> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >> at >> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:748) >> >> 2018-01-09 16:03:11.705 INFO >> (searcherExecutor-51-thread-1-processing-x:master_backoffice_backoffice_product_default) >> [ x:master_backoffice_backoffice_product_default] o.a.s.c.S.Request >> [master_backoffice_backoffice_product_default] webapp=null path=null >> params={q=*:*%26facet%3Dtrue%26facet.field%3DcatalogVersion%26facet.field%3DcatalogId%26facet.field%3DapprovalStatus_string%26facet.field%3Dcategory_string_mv=false=newSearcher} >> status=400 QTime=0 >> >> >> >> >> -- >> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html >> >
Re: SolrException undefined field *
First, you might want to index, but what Solr is executing here is a search request. Second, you're querying for a dynamic field "*" which is not defined in your schema. This is quite obvious, the exception says right this. So whatever is sending the query (some client, it seems) is doing the wrong thing. Or your schema definition is not matching what the client expects. Since we don't know what client code you're using, we can't tell more. -Michael Am 09.01.2018 um 16:31 schrieb padmanabhan: > I get the below error whenever an indexing is executed.. I didn't find enough > clue on to where this field is coming from and how could i debug on to it.. > any help would be appreciated > > 2018-01-09 16:03:11.705 INFO > (searcherExecutor-51-thread-1-processing-x:master_backoffice_backoffice_product_default) > [ x:master_backoffice_backoffice_product_default] > o.a.s.c.QuerySenderListener QuerySenderListener sending requests to > Searcher@232ae42b[master_backoffice_backoffice_product_default] > main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_1p(6.4.1):C56)))} > 2018-01-09 16:03:11.705 ERROR > (searcherExecutor-51-thread-1-processing-x:master_backoffice_backoffice_product_default) > [ x:master_backoffice_backoffice_product_default] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: undefined > field * > at > org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1308) > at > org.apache.solr.schema.IndexSchema.getFieldType(IndexSchema.java:1260) > at > org.apache.solr.parser.SolrQueryParserBase.getWildcardQuery(SolrQueryParserBase.java:932) > at > org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:616) > at org.apache.solr.parser.QueryParser.Term(QueryParser.java:312) > at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:182) > at org.apache.solr.parser.QueryParser.Query(QueryParser.java:102) > at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:91) > at > 
org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:194) > at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:50) > at org.apache.solr.search.QParser.getQuery(QParser.java:168) > at > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:160) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2306) > at > org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:72) > at > org.apache.solr.core.SolrCore.lambda$getSearcher$4(SolrCore.java:2094) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > > 2018-01-09 16:03:11.705 INFO > (searcherExecutor-51-thread-1-processing-x:master_backoffice_backoffice_product_default) > [ x:master_backoffice_backoffice_product_default] o.a.s.c.S.Request > [master_backoffice_backoffice_product_default] webapp=null path=null > params={q=*:*%26facet%3Dtrue%26facet.field%3DcatalogVersion%26facet.field%3DcatalogId%26facet.field%3DapprovalStatus_string%26facet.field%3Dcategory_string_mv=false=newSearcher} > status=400 QTime=0 > > > > > -- > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html >
Re: Edismax leading wildcard search
On 22.12.2017 at 11:57, Selvam Raman wrote:
> 1) how can I disable leading wildcard search?

Do it on the client side: just don't allow leading asterisks or question marks in your query term.

> 2) why does a leading wildcard search take so much time to respond?

Because indexed terms are stored in sorted order, Lucene can jump directly to all terms that begin with a given prefix; a leading wildcard gets no help from that ordering, so Lucene has to scan every term in the field instead. There's a ReversedWildcardFilterFactory in Solr to address this issue.

-Michael
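The idea behind ReversedWildcardFilterFactory can be sketched in plain Java (a simplified illustration of the principle, not Solr's implementation): index each term a second time in reversed form, so that a leading-wildcard query becomes a cheap prefix scan over the reversed term dictionary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class ReversedWildcardSketch {
    public static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // For a query like "*seq", scan the reversed term dictionary for the prefix "qes".
    public static List<String> leadingWildcard(NavigableSet<String> reversedIndex, String query) {
        String prefix = reverse(query.substring(1)); // drop the leading '*', reverse the rest
        List<String> hits = new ArrayList<>();
        // A sorted term dictionary supports cheap prefix scans; this is exactly
        // why trailing wildcards are fast and leading ones normally are not.
        for (String t : reversedIndex.tailSet(prefix)) {
            if (!t.startsWith(prefix)) break;
            hits.add(reverse(t)); // map back to the original term
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableSet<String> reversedIndex = new TreeSet<>();
        for (String term : List.of("chipseq", "dataset", "seq", "sequence")) {
            reversedIndex.add(reverse(term));
        }
        System.out.println(leadingWildcard(reversedIndex, "*seq")); // [seq, chipseq]
    }
}
```

The price is roughly doubling the term dictionary for that field, which is the usual trade-off when enabling the filter in Solr.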
Re: How to split index more than 2GB in size
Hi Sushant,

while this is true in general, it won't hold here. If you split your index, searching on each individual shard might be a bit faster, but you'll lose more than you gain because Solr needs to send your search queries to all shards and then combine the results. So instead of one medium-fast search request, you'll have several fast requests plus the aggregation step.

Erick is totally right: splitting an index of that size has no performance benefit. Sharding is not a technique to improve performance; it's a technique for handling indexes of hundreds of gigabytes that won't fit on an individual machine.

Best,
Michael

On 20.06.2018 at 19:58, Sushant Vengurlekar wrote:
> Thank you for the detailed response, Erick. Very much appreciated. The reason
> I am looking into splitting the index into two is that it's much faster
> to search across a smaller index than a larger one.
>
> On Wed, Jun 20, 2018 at 10:46 AM Erick Erickson wrote:
>
>> You still haven't answered _why_ you think splitting even a 20G index
>> is desirable. We regularly see 200G+ indexes per replica in the field,
>> so what's the point? Have you measured different setups to see if it's
>> a good idea? A 200G index needs some beefy hardware, admittedly.
>>
>> If you have adequate response times with a 20G index and need to
>> increase the QPS rate, just add more replicas. Having more than one
>> shard inevitably adds overhead which may (or may not) be made up for
>> by parallelizing some of the work. It's nearly always better to use
>> only one shard if it meets your response time requirements.
>>
>> Best,
>> Erick
>>
>> On Wed, Jun 20, 2018 at 10:39 AM, Sushant Vengurlekar wrote:
>>> The index size is small because this is my local development copy. The
>>> production index is more than 20GB. So I am working on getting the index
>>> split and replicated on different nodes.
Our current instance on prod is >>> single instance solr 6 which we are working on moving towards solrcloud 7 >>> >>> On Wed, Jun 20, 2018 at 10:30 AM Erick Erickson >> >>> wrote: >>> Use the indexupgrader tool or optimize your index before using >> splitshard. Since this is a small index (< 5G), optimizing will not create an overly-large segment, so that pitfall is avoided. You haven't yet explained why you think splitting the index would be beneficial. Splitting an index this small is unlikely to improve query performance appreciably. This feels a lot like an "XY" problem, you're asking how to do X thinking it will solve Y but not telling us what Y is. Best, Erick On Wed, Jun 20, 2018 at 9:40 AM, Sushant Vengurlekar wrote: > How can I resolve this error? > > On Wed, Jun 20, 2018 at 9:11 AM, Alexandre Rafalovitch < arafa...@gmail.com> > wrote: > >> This seems more related to an old index upgraded to latest Solr >> rather than >> the split itself. >> >> Regards, >> Alex >> >> On Wed, Jun 20, 2018, 12:07 PM Sushant Vengurlekar, < >> svengurle...@curvolabs.com> wrote: >> >>> Thanks for the reply Alessandro! Appreciate it. 
>>> >>> Below is the full request and the error received >>> >>> curl ' >>> >>> http://localhost:8081/solr/admin/collections?action= >> SPLITSHARD=dev-transactions=shard1 >>> ' >>> >>> { >>> >>> "responseHeader":{ >>> >>> "status":500, >>> >>> "QTime":7920}, >>> >>> "success":{ >>> >>> "solr-1:8081_solr":{ >>> >>> "responseHeader":{ >>> >>> "status":0, >>> >>> "QTime":1190}, >>> >>> "core":"dev-transactions_shard1_0_replica_n3"}, >>> >>> "solr-1:8081_solr":{ >>> >>> "responseHeader":{ >>> >>> "status":0, >>> >>> "QTime":1047}, >>> >>> "core":"dev-transactions_shard1_1_replica_n4"}, >>> >>> "solr-1:8081_solr":{ >>> >>> "responseHeader":{ >>> >>> "status":0, >>> >>> "QTime":6}}, >>> >>> "solr-1:8081_solr":{ >>> >>> "responseHeader":{ >>> >>> "status":0, >>> >>> "QTime":1009}}}, >>> >>> "failure":{ >>> >>> >>> >> "solr-1:8081_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$ >> RemoteSolrException:Error >>> from server at http://solr-1:8081/solr: >>> java.lang.IllegalArgumentException: >>> Cannot merge a segment that has been created with major version 6 >> into >> this >>> index which has been created by major version 7"}, >>> >>> "Operation splitshard caused >>> >>> exception:":"org.apache.solr.common.SolrException:org. >> apache.solr.common.SolrException: >>> SPLITSHARD failed to invoke SPLIT core admin command", >>> >>>
Re: Query with exact number of tokens
Hi Sergio,

alas, that's not possible that way. If you search for CENTURY BANCORP, INC., Solr will be totally happy to find all these terms in "NEW CENTURY BANCORP, INC." and return it with a high score.

But you can prepare your data at index time. Make it a multivalued field of type string (or text without any tokenization) and permute the company names in all reasonable combinations. Since company names seldom have more than half a dozen words, that might be practicable. You then search with an exact match on that field. Make sure to quote your query parameter correctly, otherwise NEW CENTURY BANCORP, INC. would again match CENTURY BANCORP, INC.

-Michael

On 21.09.2018 at 15:00, marotosg wrote:
> Hi,
>
> I have to search for company names where my first requirement is to find
> only exact matches on the company name.
>
> For instance, if I search for "CENTURY BANCORP, INC." I shouldn't find "NEW
> CENTURY BANCORP, INC."
> because the result company has the extra keyword "NEW".
>
> I can't use an exact phrase match because the sequence of tokens may differ.
> Basically I need to find results where the tokens are the same in any order
> and the number of tokens matches.
>
> I have no idea if it's possible to include the number of tokens in the
> query, or whether a Solr field keeps that info to match against.
>
> Thanks for your help
> Sergio
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
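The index-time permutation idea can be sketched in plain Java. This is an illustration of the approach, not Solr code; the class name and the normalization rule (lowercase, strip punctuation, split on whitespace) are assumptions for the example. Each permutation is joined back into one value, to be stored in an untokenized multivalued string field and queried with an exact match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class NamePermutations {
    // Normalize roughly the way a keyword field with a lowercase filter might.
    public static List<String> tokens(String name) {
        return Arrays.asList(
            name.toLowerCase().replaceAll("[^a-z0-9 ]", "").trim().split("\\s+"));
    }

    // All orderings of the token list, each joined into a single string value.
    public static Set<String> permutations(List<String> toks) {
        Set<String> out = new TreeSet<>();
        permute(new ArrayList<>(toks), 0, out);
        return out;
    }

    private static void permute(List<String> toks, int i, Set<String> out) {
        if (i == toks.size()) {
            out.add(String.join(" ", toks));
            return;
        }
        for (int j = i; j < toks.size(); j++) {
            Collections.swap(toks, i, j);
            permute(toks, i + 1, out);
            Collections.swap(toks, i, j);
        }
    }

    public static void main(String[] args) {
        Set<String> indexed = permutations(tokens("Century Bancorp, Inc."));
        System.out.println(indexed); // 6 values for the 3 tokens
        // An exact query with the same tokens in any order matches...
        System.out.println(indexed.contains("inc bancorp century"));    // true
        // ...but "NEW CENTURY BANCORP, INC." has an extra token, so nothing matches.
        System.out.println(indexed.contains("new century bancorp inc")); // false
    }
}
```

Token counts match implicitly: a 4-token query string can never equal any permutation of a 3-token name, which is exactly the "same tokens, same count" requirement. The value count grows factorially, so this only stays practicable for short names.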