Re: ExecutorService support in SolrIndexSearcher

2019-08-30 Thread David Smiley
It'd take some work to do that. Years ago I recall Etsy did a POC and shared their experience at Lucene/Solr Revolution in Washington DC; I attended the presentation with great interest. One of the major obstacles, if I recall, was that the Collector needs to support this mode of operation, and in

Re: Solution for long time highlighting

2019-08-30 Thread David Smiley
Ah, multi-threaded highlighting. I implemented that once as a precursor to other, ultimately better things -- the UnifiedHighlighter. Your ExecutorService ought to be a field on the handler. In inform() you can call SolrCore.addCloseHook to ensure this executor is shut down. I suggest looking

Re: Multi-lingual Search & Accent Marks

2019-08-30 Thread Walter Underwood
The right transliteration for accents is language-dependent. In English, a diaeresis can be stripped because it is only used to mark neighboring vowels as independently pronounced. In German, the “typewriter umlaut” adds an “e”. English: coöperate -> cooperate German: Glück -> Glueck Some
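A minimal sketch of the language-dependent folding described above, using only the Python standard library. The function names, the `lang` codes, and the German mapping table are assumptions made for illustration; this is not how Solr's filters are implemented, and a real deployment would more likely configure this in the analysis chain.

```python
import unicodedata

# Hypothetical German "typewriter umlaut" table, for illustration only.
GERMAN_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def strip_marks(text):
    """Remove combining marks (accents, diaereses) from text."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def fold(text, lang):
    """Fold accents using a language-specific rule where one is defined."""
    # Compose first so that umlauts typed as base letter + combining mark
    # are also caught by the German table.
    text = unicodedata.normalize("NFC", text)
    if lang == "de":
        text = text.translate(GERMAN_MAP)
    # Anything not handled by a language rule falls back to plain stripping.
    return strip_marks(text)


print(fold("coöperate", "en"))  # cooperate
print(fold("Glück", "de"))      # Glueck
```

Applying the English rule to German text gives `Gluck` rather than `Glueck`, which is the mismatch the message warns about.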

Re: Re: Multi-lingual Search & Accent Marks

2019-08-30 Thread Erick Erickson
It Depends (tm). In this case on how sophisticated/precise your users are. If your users are exclusively extremely conversant in the language and are expected to have keyboards that allow easy access to all the accents… then I might leave them in. In some cases removing them can change the

Re: Re: Multi-lingual Search & Accent Marks

2019-08-30 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Atita, Thanks for that insight! As the conversation has progressed, we are now leaning towards not having the ASCII-folding filter in our pipelines in order to keep marks like umlauts and tildes. Instead, we might add acute and grave accents to a file pointed at by the

Re: Clustering error - Solr 8.2

2019-08-30 Thread Joe Obernberger
Mystery solved.  I added 'features' to the schema, next error was name, then manu, sku, and cat.  These are defined in solrconfig.xml under browse: text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0 text^0.5

Re: Multi-lingual Search & Accent Marks

2019-08-30 Thread Atita Arora
We work on a German index; we neutralize accents before indexing, i.e. umlauts to 'ae', 'ue', etc., and we do the same at query time for an appropriate match. On Fri, Aug 30, 2019, 4:22 PM Audrey Lorberfeld - audrey.lorberf...@ibm.com wrote: > Hi All, > > Just wanting to test the waters

Multi-lingual Search & Accent Marks

2019-08-30 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Hi All, Just wanting to test the waters here – for those of you with search engines that index multiple languages, do you use ASCII-folding in your schema? We are onboarding Spanish documents into our index right now and keep going back and forth on whether we should preserve accent marks.

Re: solr-8.1.1 -> solr-8.2.0, "lucene... cannot be cast"

2019-08-30 Thread Mikhail Khludnev
Hello, Unified highlighter seems to feed GraphQuery with Lucene's index searcher, which the query can't deal with. It's a bug. On Fri, Aug 30, 2019 at 4:08 PM Jochen Barth wrote: > Here is the complete error message for the query below: > > perhaps I should file a bug report. Just tested with 8.1.1:

solr-8.1.1 -> solr-8.2.0, "lucene... cannot be cast"

2019-08-30 Thread Jochen Barth
Here is the complete error message for the query below: perhaps I should file a bug report. Just tested with 8.1.1: works. 2019-08-30 12:40:40.476 ERROR (qtp2116511124-65) [   x:Suchindex] o.a.s.h.RequestHandlerBase java.lang.ClassCastException: class org.apache.lucene.search.IndexSearcher cannot

Re: Question: Solr perform well with thousands of replicas?

2019-08-30 Thread Erick Erickson
“no registered leader” is usually the effect of some problem, not the root cause. In this case, for instance, you could be running out of file handles and see other errors like “too many open files”. That’s just one example. One common problem is that Solr needs a lot of file handles and the
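As a rough way to check the file-handle limit mentioned above, the sketch below reads the open-file limit on a Unix-like host with Python's standard `resource` module. The function name is hypothetical, and the values describe the Python process itself; they match Solr's limits only if the script runs as the same user under the same limits as the Solr process, which is an assumption here.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


soft, hard = open_file_limits()
# resource.RLIM_INFINITY means no limit is set for that value.
print("soft limit:", soft)
print("hard limit:", hard)
```

A low soft limit on a node hosting many replicas would be consistent with “too many open files” errors.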

Re: Question: Solr perform well with thousands of replicas?

2019-08-30 Thread Jörn Franke
What is the reason for this number of replicas? Solr should work fine, but maybe it is worth consolidating some collections to also avoid administrative overhead. > On 29.08.2019 at 05:27, Hongxu Ma wrote: > > Hi > I have a solr-cloud cluster, but it's unstable when collection number is

Re: Question: Solr perform well with thousands of replicas?

2019-08-30 Thread Hongxu Ma
Hi guys, thanks for your help! More details about my env. Cluster: 4 GCP (Google Cloud) hosts, each host: 16-core CPU, 60G mem, 2TB HDD. I set up 2 Solr nodes on each host and there are 1000+ replicas on each Solr node. (Sorry for forgetting this before: 2 solr node on each host,