Re: Solr Streaming Expression failures

2020-03-26 Thread Aroop Ganguly
I have personally not used streaming expressions to commit data to a collection 
(I have used them a lot for querying), and would not recommend them for bulk 
indexing unless Joel recommends it :) 

On the other hand, we have had decent success indexing at scale, and 12 
million is not a big number.
You would need a decently sized cluster with a commensurate number of 
shards. Indexing speed correlates with the number of shards, and inversely 
with the number of replicas and maxShardsPerNode.
You can use traditional SolrJ APIs to index in parallel, using multiple 
threads concurrently.
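
As a rough illustration, here is a minimal sketch of that approach with SolrJ's
ConcurrentUpdateSolrClient, which buffers documents and streams them over
several threads (the URL, field names, queue size, and thread count are
assumptions you would tune for your own cluster):

import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ParallelIndexer {
    public static void main(String[] args) throws Exception {
        // Hypothetical target URL; queue size and thread count need tuning.
        try (ConcurrentUpdateSolrClient client =
                 new ConcurrentUpdateSolrClient.Builder("http://solrTarget:8983/solr/collection1")
                     .withQueueSize(10000)  // documents buffered before add() blocks
                     .withThreadCount(8)    // concurrent update streams to Solr
                     .build()) {
            for (int i = 0; i < 12_000_000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                doc.addField("timestamp_tdt", "2020-03-26T00:00:00Z");
                client.add(doc);  // returns quickly; batching happens internally
            }
            client.commit();      // one commit at the end, not per batch
        }
    }
}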


> On Mar 26, 2020, at 2:59 PM, Mohamed Sirajudeen Mayitti Ahamed Pillai 
>  wrote:
> 
> Hi Everyone,
> 
> We are using Solr 7.4 with 3 external ZKs and 7 Solr nodes in a cloud setup. 
> We are using a streaming expression to pull 12 million records from a different 
> Solr Cloud, using the expression below.
> 
> http://solrTarget:8983/solr/collection1/stream?expr=commit(collection1,batchSize=1,update(collection1,batchSize=1,search(collection1,zkHost="zkhost_source:9983",sort="timestamp_tdt
>  asc, id asc", rows=12114606, q=" aggr_type_s:click@doc_id,filters* AND 
> timestamp_tdt:[2020-03-25T18:58:33.337Z TO 2020-03-26T18:58:33.336Z]", 
> fl="id,timestamp_tdt,*",TZ="CST"))).
> 
> Collection 1 in solrTarget has 2 shards and 2 replicas. Collection 1 in 
> solrSource has 1 shard and 2 replicas.
> 
> The streaming expression reads documents from collection1 in 
> zkhost_source:9983 and indexes into collection1 in solrTarget environment.
> A similar streaming expression with a smaller number of documents (fewer than 
> 5 million) works without any failures.
> This streaming expression has not been successful as the data set has grown 
> bigger and bigger; we have been noticing that it gets a failed response with 
> different kinds of errors.
> 
> A few of the error messages are below:
> 
> 
>  1.  Error trying to proxy request for url: 
> http://solrTarget:8983/solr/collection1/stream, metadata=[error-class, 
> org.apache.solr.common.SolrException, root-error-class, 
> java.net.SocketTimeoutException], trace=org.apache.solr.common.SolrException: 
> Error trying to proxy request for url: 
> http://solrTarget:8983/solr/collection1/stream
>  2.  {result-set={docs=[{EXCEPTION=java.util.concurrent.ExecutionException: 
> java.io.IOException: params 
> sort=timestamp_tdt+asc,+id+asc&rows=12114606&q=aggr_type_s:click@doc_id,filters*+AND+timestamp_tdt:[2020-03-25T18:58:33.337Z+TO+2020-03-26T18:58:33.336Z]&fl=id,timestamp_tdt,*&TZ=CST&distrib=false, 
> RESPONSE_TIME=121125, EOF=true}]}}
>  3.  {result-set={docs=[{EXCEPTION=org.apache.solr.common.SolrException: 
> Could not load collection from ZK: collection10, RESPONSE_TIME=139300, 
> EOF=true}]}}
> 
> 
> Is this a known issue with streaming expressions when it comes to bulk indexing 
> using the update and commit expressions? Is there any workaround for it?
> 
> Is there a better option available in Solr to index 12 million records (with 
> only 12 fields per document) faster?
> 
> Thanks,
> Mohamed



Solr Streaming Expression failures

2020-03-26 Thread Mohamed Sirajudeen Mayitti Ahamed Pillai
Hi Everyone,

We are using Solr 7.4 with 3 external ZKs and 7 Solr nodes in a cloud setup. We 
are using a streaming expression to pull 12 million records from a different Solr 
Cloud, using the expression below.

http://solrTarget:8983/solr/collection1/stream?expr=commit(collection1,batchSize=1,update(collection1,batchSize=1,search(collection1,zkHost="zkhost_source:9983",sort="timestamp_tdt
 asc, id asc", rows=12114606, q=" aggr_type_s:click@doc_id,filters* AND 
timestamp_tdt:[2020-03-25T18:58:33.337Z TO 2020-03-26T18:58:33.336Z]", 
fl="id,timestamp_tdt,*",TZ="CST"))).

Collection 1 in solrTarget has 2 shards and 2 replicas. Collection 1 in 
solrSource has 1 shard and 2 replicas.

The streaming expression reads documents from collection1 in zkhost_source:9983 
and indexes into collection1 in solrTarget environment.
A similar streaming expression with a smaller number of documents (fewer than 
5 million) works without any failures.
This streaming expression has not been successful as the data set has grown 
bigger and bigger; we have been noticing that it gets a failed response with 
different kinds of errors.

A few of the error messages are below:


  1.  Error trying to proxy request for url: 
http://solrTarget:8983/solr/collection1/stream, metadata=[error-class, 
org.apache.solr.common.SolrException, root-error-class, 
java.net.SocketTimeoutException], trace=org.apache.solr.common.SolrException: 
Error trying to proxy request for url: 
http://solrTarget:8983/solr/collection1/stream
  2.  {result-set={docs=[{EXCEPTION=java.util.concurrent.ExecutionException: 
java.io.IOException: params 
sort=timestamp_tdt+asc,+id+asc&rows=12114606&q=aggr_type_s:click@doc_id,filters*+AND+timestamp_tdt:[2020-03-25T18:58:33.337Z+TO+2020-03-26T18:58:33.336Z]&fl=id,timestamp_tdt,*&TZ=CST&distrib=false, 
 RESPONSE_TIME=121125, EOF=true}]}}
  3.  {result-set={docs=[{EXCEPTION=org.apache.solr.common.SolrException: Could 
not load collection from ZK: collection10, RESPONSE_TIME=139300, EOF=true}]}}


Is this a known issue with streaming expressions when it comes to bulk indexing 
using the update and commit expressions? Is there any workaround for it?

Is there a better option available in Solr to index 12 million records (with 
only 12 fields per document) faster?

Thanks,
Mohamed


Ingest data from multiple databases into a single solr collection

2020-03-26 Thread Guillermo Lopez Mackinnon
Hi,
I've posted a question within stackoverflow regarding the subject of this
email.

https://stackoverflow.com/questions/60876128/ingest-data-from-multiple-databases-into-a-single-solr-collection

If someone from this list could provide some help it'll be highly
appreciated!

Thanks in advance!

Guillermo


Re: Apache Solr 8.4.1 Basic Authentication

2020-03-26 Thread lstusr 5u93n4
Hey Emmanuel,

If you're using Java, I'd highly suggest using SolrJ; it'll do the work
that you need it to do:

// e.g. a QueryRequest; any SolrRequest subclass works the same way
SolrRequest req = new QueryRequest(new SolrQuery("*:*"));
req.setBasicAuthCredentials(userName, password);
solrClient.request(req);


If that doesn't work for you for some reason, you need to base64-encode the
username:password combo for basic HTTP auth:

String auth =
Base64.getEncoder().encodeToString("solr:SolrRocks".getBytes());

headers.add("Authorization", "Basic " +  auth );

Also, I'm not sure if java.net.HttpClient has basic auth built in, but
apache HttpClient sure does...
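
For the java.net.http route (Java 11+), setting the header yourself works; a
minimal sketch, where the host and collection are placeholders and
solr:SolrRocks is just the example credential from above:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class SolrBasicAuthQuery {
    public static void main(String[] args) throws Exception {
        String url = "http://localhost:8983/solr/collection1/select?q=*:*";
        String auth = Base64.getEncoder().encodeToString("solr:SolrRocks".getBytes());

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Basic " + auth)  // send credentials preemptively
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + "\n" + response.body());
    }
}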

Kyle

On Thu, 26 Mar 2020 at 15:27, Altamirano, Emmanuel <
emmanuel.altamir...@transunion.com> wrote:

> Hello everyone,
>
>
>
> We recently enabled Solr Basic Authentication in our Dev environment and we
> are testing Solr security. We followed the instructions provided on the
> Apache Solr website and it is working using the curl command.
>
>
>
> Could you provide any advice on how we need to send the
> credentials in the HTTP headers from a Java program? It would be very much appreciated.
>
>
>
> HttpHeaders headers = new HttpHeaders();
>
> headers.setAccept(Arrays.asList(MediaType.APPLICATION_JSON));
>
> headers.setContentType(MediaType.APPLICATION_JSON);
>
> headers.add("Authorization", "Basic " + "solr:SolrRocks");
>
>
>
> Thanks,
>
>
>
> Emmanuel Altamirano,
>
> Consultant - Global Technology
>
> International Operations
>
>
>
> Telephone: 312-985-3149
>
> Mobile: 312-860-3774
>
>
>
>
>
> 555 W. Adams 5th Floor
>
> Chicago, IL 60661
>
> transunion.com
>
>
>


Apache Solr 8.4.1 Basic Authentication

2020-03-26 Thread Altamirano, Emmanuel
Hello everyone,

We recently enabled Solr Basic Authentication in our Dev environment and we are 
testing Solr security. We followed the instructions provided on the Apache Solr 
website and it is working using the curl command.

Could you provide any advice on how we need to send the credentials in the 
HTTP headers from a Java program? It would be very much appreciated.

HttpHeaders headers = new HttpHeaders();
headers.setAccept(Arrays.asList(MediaType.APPLICATION_JSON));
headers.setContentType(MediaType.APPLICATION_JSON);
headers.add("Authorization", "Basic " + "solr:SolrRocks");

Thanks,

Emmanuel Altamirano,
Consultant - Global Technology
International Operations

Telephone: 312-985-3149
Mobile: 312-860-3774

555 W. Adams 5th Floor
Chicago, IL 60661
transunion.com




Re: deduplication of suggester results are not enough

2020-03-26 Thread Michal Hlavac
Hi Roland,

I wrote an AnalyzingInfixSuggester that deduplicates data on several levels at 
index time.
I will publish it on GitHub in a few days. I'll write to this thread when it's done.

m.

On Thursday, 26 March 2020 at 16:01:57 CET, Szűcs Roland wrote:
> Hi All,
> 
> I have been following the suggester-related discussions for quite a while.
> Everybody agrees that it is not the expected behaviour for a suggester,
> where the terms are the entities rather than the documents, to return
> the same string representation several times.
> 
> One suggestion was to do the deduplication on the client side of Solr. It is
> very easy in most client solutions, as any set-based data structure solves
> this.
> 
> *But deduplication does not solve one important problem: suggest.count.*
> 
> If I have 15 matches from the suggester and suggest.count=10, and the first
> 9 matches are the same, I will get back only 2 after deduplication, and the
> remaining 5 unique terms will never be shown.
> 
> What is the solution for this?
> 
> Cheers,
> Roland
> 


deduplication of suggester results are not enough

2020-03-26 Thread Szűcs Roland
Hi All,

I have been following the suggester-related discussions for quite a while.
Everybody agrees that it is not the expected behaviour for a suggester, where
the terms are the entities rather than the documents, to return the same
string representation several times.

One suggestion was to do the deduplication on the client side of Solr. It is
very easy in most client solutions, as any set-based data structure solves
this.

*But deduplication does not solve one important problem: suggest.count.*

If I have 15 matches from the suggester and suggest.count=10, and the first
9 matches are the same, I will get back only 2 after deduplication, and the
remaining 5 unique terms will never be shown.

What is the solution for this?

Cheers,
Roland
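
One common mitigation, sketched below with SolrJ, is to over-fetch from the
suggester and deduplicate client-side while preserving rank order. The host,
suggester name, and over-fetch factor are assumptions, and over-fetching only
reduces (it does not eliminate) the chance of losing unique terms:

import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.Suggestion;

public class DedupSuggest {
    public static void main(String[] args) throws Exception {
        try (SolrClient solr =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
            int wanted = 10;
            SolrQuery q = new SolrQuery();
            q.setRequestHandler("/suggest");
            q.set("suggest.q", "lov");
            q.set("suggest.dictionary", "mySuggester"); // hypothetical suggester name
            q.set("suggest.count", wanted * 5);         // over-fetch to survive dedup

            QueryResponse rsp = solr.query(q);
            List<Suggestion> raw =
                rsp.getSuggesterResponse().getSuggestions().get("mySuggester");

            Set<String> unique = new LinkedHashSet<>();  // keeps ranking order
            for (Suggestion s : raw) {
                unique.add(s.getTerm());
                if (unique.size() == wanted) break;
            }
            unique.forEach(System.out::println);
        }
    }
}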


Re: edge ngram/find as you type sorting

2020-03-26 Thread Erick Erickson
From other mails, it looks like you’re inheriting something you had
no input in building. My sympathies ;)

Unless you’ve explicitly changed the memory by specifying -Xmx and -Xms
at startup, you’re operating with 512M of memory, which is far too small
for most Solr installations. The -m parameter at startup will modify this.

The admin UI will also show you how much memory Solr is running with.

Best,
Erick

> On Mar 26, 2020, at 8:52 AM, matthew sporleder  wrote:
> 
> That explains the OOMs I've been getting in the initial test cycle.
> I'm working with about 50M (small) documents.
> 
> On Thu, Mar 26, 2020 at 7:58 AM Erick Erickson  
> wrote:
>> 
>> the ngramming is a time/space tradeoff. Typically,
>> if you restrict the wildcards to have three or more
>> “real” characters performance is fine. One real
>> character (i.e. a*) will be your worst-case. I’ve
>> seen requiring two characters in the prefix work well
>> too. It Depends (tm).
>> 
>> Conceptually what happens here is that Lucene has
>> to enumerate all of the terms that start with the prefix
>> and create a ginormous OR clause. The term
>> enumeration will take longer the more terms there are.
>> Things are more efficient than that, but still...
>> 
>> So make sure you’re testing with a real corpus. Having
>> a test index with just a few terms will be misleading.
>> 
>> Best,
>> Erick
>> 
>>> On Mar 25, 2020, at 9:37 PM, matthew sporleder  wrote:
>>> 
>>> Okay confirmed-
>>> I am getting a more predictable result set after adding an additional 
>>> field:
>>> <fieldType name="alphaOnlySort" class="solr.TextField"
>>>  sortMissingLast="true" omitNorms="true">
>>>   <analyzer>
>>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>     <filter class="solr.PatternReplaceFilterFactory"
>>>      pattern="\p{Punct}" replacement=""/>
>>>   </analyzer>
>>> </fieldType>
>>> q=slug:what_is_lo*&fl=slug&rows=1000&wt=csv&sort=slug_alpha%20asc
>>> 
>>> So it appears I can skip edge ngram entirely using this method as
>>> slug:foo* appears to be the exact same results as fayt:foo, but I have
>>> the cost of the alphaOnly field :)
>>> 
>>> I will try to figure out some benchmarks or something to decide how to go.
>>> 
>>> Thanks again for the help so far.
>>> 
>>> 
>>> On Wed, Mar 25, 2020 at 2:39 PM Erick Erickson  
>>> wrote:
 
 You’re getting the correct sorted order… The underscore character is 
 confusing you.
 
The ASCII code for underscore is 0x5F, which sorts after uppercase but 
before lowercase letters.
 
 See the alphaOnlySort type for a way to remove this, although the output 
 there can also
 be confusing.
 
 Best,
 Erick
 
> On Mar 25, 2020, at 1:30 PM, matthew sporleder  
> wrote:
> 
> What_is_Lov_Holtz_known_for
> What_is_lova_after_it_harddens
> What_is_Lova_Moor's_birthday
> What_is_lovable_in_Spanish
> What_is_lovage
> What_is_Lovagny's_population
> What_is_lovan_for
> What_is_lovanox
> What_is_lovarstan_for
> What_is_Lovasatin
 
>> 



Re: Solr Instance Migration - Server Access

2020-03-26 Thread matthew sporleder
If it's SolrCloud + ZooKeeper you can get most of the configs from the
"tree" browser on the console: /solr/#/~cloud?view=tree

You can otherwise derive a lot of the configs/schema/data-import
properties from the web console and api, neither of which require
server access.

It is also possible to get into servers where you do not have the
passwords, assuming you have physical access/cloud console access/etc.,
but that is not a Solr question.
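
If server access truly cannot be recovered, much of the live schema can also
be pulled over HTTP with SolrJ's Schema API; a minimal sketch (the host and
collection name are assumptions):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.schema.SchemaRequest;
import org.apache.solr.client.solrj.response.schema.SchemaRepresentation;

public class DumpSchema {
    public static void main(String[] args) throws Exception {
        try (SolrClient solr =
                 new HttpSolrClient.Builder("http://oldhost:8983/solr").build()) {
            SchemaRepresentation schema = new SchemaRequest()
                    .process(solr, "collection1")
                    .getSchemaRepresentation();
            System.out.println("schema: " + schema.getName());
            // Print each field's name and type as a starting point for rebuilding.
            schema.getFields().forEach(f ->
                System.out.println(f.get("name") + " -> " + f.get("type")));
        }
    }
}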

On Thu, Mar 26, 2020 at 3:24 AM Landon Cowan  wrote:
>
> Hello!  I’m working on a website for a client that was migrated from another 
> website development company.  The previous company used Solr to build out the 
> site search – but they did not send us the server credentials.  The 
> developers who built the tool are no longer with the company – is there a 
> process we should follow to secure the credentials?  I worry we may need to 
> rebuild the feature from the ground up.
>
>


Re: edge ngram/find as you type sorting

2020-03-26 Thread matthew sporleder
That explains the OOMs I've been getting in the initial test cycle.
I'm working with about 50M (small) documents.

On Thu, Mar 26, 2020 at 7:58 AM Erick Erickson  wrote:
>
> the ngramming is a time/space tradeoff. Typically,
> if you restrict the wildcards to have three or more
> “real” characters performance is fine. One real
> character (i.e. a*) will be your worst-case. I’ve
> seen requiring two characters in the prefix work well
> too. It Depends (tm).
>
> Conceptually what happens here is that Lucene has
> to enumerate all of the terms that start with the prefix
> and create a ginormous OR clause. The term
> enumeration will take longer the more terms there are.
> Things are more efficient than that, but still...
>
> So make sure you’re testing with a real corpus. Having
> a test index with just a few terms will be misleading.
>
> Best,
> Erick
>
> > On Mar 25, 2020, at 9:37 PM, matthew sporleder  wrote:
> >
> > Okay confirmed-
> > I am getting a more predictable result set after adding an additional 
> > field:
> > <fieldType name="alphaOnlySort" class="solr.TextField"
> >  sortMissingLast="true" omitNorms="true">
> >   <analyzer>
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.PatternReplaceFilterFactory"
> >      pattern="\p{Punct}" replacement=""/>
> >   </analyzer>
> > </fieldType>
> > q=slug:what_is_lo*&fl=slug&rows=1000&wt=csv&sort=slug_alpha%20asc
> >
> > So it appears I can skip edge ngram entirely using this method as
> > slug:foo* appears to be the exact same results as fayt:foo, but I have
> > the cost of the alphaOnly field :)
> >
> > I will try to figure out some benchmarks or something to decide how to go.
> >
> > Thanks again for the help so far.
> >
> >
> > On Wed, Mar 25, 2020 at 2:39 PM Erick Erickson  
> > wrote:
> >>
> >> You’re getting the correct sorted order… The underscore character is 
> >> confusing you.
> >>
> >> The ASCII code for underscore is 0x5F, which sorts after uppercase but 
> >> before lowercase letters.
> >>
> >> See the alphaOnlySort type for a way to remove this, although the output 
> >> there can also
> >> be confusing.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Mar 25, 2020, at 1:30 PM, matthew sporleder  
> >>> wrote:
> >>>
> >>> What_is_Lov_Holtz_known_for
> >>> What_is_lova_after_it_harddens
> >>> What_is_Lova_Moor's_birthday
> >>> What_is_lovable_in_Spanish
> >>> What_is_lovage
> >>> What_is_Lovagny's_population
> >>> What_is_lovan_for
> >>> What_is_lovanox
> >>> What_is_lovarstan_for
> >>> What_is_Lovasatin
> >>
>


Autoscaling question

2020-03-26 Thread Kudrettin Güleryüz
Hi,

I'd like to balance freedisk and cores across eight nodes. Here are my
cluster-preferences and cluster-policy:

{
  "responseHeader":{
"status":0,
"QTime":0},
  "cluster-preferences":[{
  "precision":10,
  "maximize":"freedisk"}
,{
  "minimize":"cores",
  "precision":10}
,{
  "minimize":"sysLoadAvg",
  "precision":3}],
  "cluster-policy":[{
  "freedisk":"<10",
  "replica":"0",
  "strict":"true"}],
  "triggers":{".auto_add_replicas":{
  "name":".auto_add_replicas",
  "event":"nodeLost",
  "waitFor":120,
  "actions":[{
  "name":"auto_add_replicas_plan",
  "class":"solr.AutoAddReplicasPlanAction"},
{
  "name":"execute_plan",
  "class":"solr.ExecutePlanAction"}],
  "enabled":true}},
  "listeners":{".auto_add_replicas.system":{
  "trigger":".auto_add_replicas",
  "afterAction":[],
  "stage":["STARTED",
"ABORTED",
"SUCCEEDED",
"FAILED",
"BEFORE_ACTION",
"AFTER_ACTION",
"IGNORED"],
  "class":"org.apache.solr.cloud.autoscaling.SystemLogListener",
  "beforeAction":[]}},
  "properties":{},
  "WARNING":"This response format is experimental.  It is likely to
change in the future."}

Can you help me understand why the least-loaded node is test-54 in this case?
{
  "responseHeader":{
"status":0,
"QTime":1294},
  "diagnostics":{
"sortedNodes":[{
"node":"test-52:8983_solr",
"cores":99,
"freedisk":1136.8754272460938,
"sysLoadAvg":0.0},
  {
"node":"test-56:8983_solr",
"cores":99,
"freedisk":1045.345874786377,
"sysLoadAvg":6.0},
  {
"node":"test-51:8983_solr",
"cores":94,
"freedisk":1029.996826171875,
"sysLoadAvg":17.0},
  {
"node":"test-55:8983_solr",
"cores":98,
"freedisk":876.639045715332,
"sysLoadAvg":2.0},
  {
"node":"test-53:8983_solr",
"cores":91,
"freedisk":715.8955001831055,
"sysLoadAvg":17.0},
  {
"node":"test-58:8983_solr",
"cores":104,
"freedisk":927.1832389831543,
"sysLoadAvg":0.0},
  {
"node":"test-57:8983_solr",
"cores":120,
"freedisk":934.3348655700684,
"sysLoadAvg":0.0},
  {
"node":"test-54:8983_solr",
"cores":165,
"freedisk":580.5822525024414,
"sysLoadAvg":0.0}],
"violations":[]},
  "WARNING":"This response format is experimental.  It is likely to
change in the future."}

Solr 7.3.1 is running.

Thank you


Re: edge ngram/find as you type sorting

2020-03-26 Thread Erick Erickson
the ngramming is a time/space tradeoff. Typically,
if you restrict the wildcards to have three or more
“real” characters performance is fine. One real
character (i.e. a*) will be your worst-case. I’ve
seen requiring two characters in the prefix work well
too. It Depends (tm).

Conceptually what happens here is that Lucene has
to enumerate all of the terms that start with the prefix
and create a ginormous OR clause. The term
enumeration will take longer the more terms there are.
Things are more efficient than that, but still...

So make sure you’re testing with a real corpus. Having
a test index with just a few terms will be misleading.
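
A minimal client-side guard for that worst case might look like this sketch
(the slug field and the three-character minimum are assumptions):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.util.ClientUtils;

public class PrefixGuard {
    // Build a wildcard query only when the prefix has >= 3 "real" characters;
    // return null so the caller can skip the round trip to Solr otherwise.
    static SolrQuery buildQuery(String userInput) {
        String real = userInput.replaceAll("\\p{Punct}", "");  // ignore punctuation
        if (real.length() < 3) {
            return null;  // too short: term enumeration would be worst-case
        }
        return new SolrQuery("slug:" + ClientUtils.escapeQueryChars(userInput) + "*");
    }
}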

Best,
Erick

> On Mar 25, 2020, at 9:37 PM, matthew sporleder  wrote:
> 
> Okay confirmed-
> I am getting a more predictable result set after adding an additional field:
> <fieldType name="alphaOnlySort" class="solr.TextField"
>  sortMissingLast="true" omitNorms="true">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>      pattern="\p{Punct}" replacement=""/>
>   </analyzer>
> </fieldType>
> 
> q=slug:what_is_lo*&fl=slug&rows=1000&wt=csv&sort=slug_alpha%20asc
> 
> So it appears I can skip edge ngram entirely using this method as
> slug:foo* appears to be the exact same results as fayt:foo, but I have
> the cost of the alphaOnly field :)
> 
> I will try to figure out some benchmarks or something to decide how to go.
> 
> Thanks again for the help so far.
> 
> 
> On Wed, Mar 25, 2020 at 2:39 PM Erick Erickson  
> wrote:
>> 
>> You’re getting the correct sorted order… The underscore character is 
>> confusing you.
>> 
>> The ASCII code for underscore is 0x5F, which sorts after uppercase but 
>> before lowercase letters.
>> 
>> See the alphaOnlySort type for a way to remove this, although the output 
>> there can also
>> be confusing.
>> 
>> Best,
>> Erick
>> 
>>> On Mar 25, 2020, at 1:30 PM, matthew sporleder  wrote:
>>> 
>>> What_is_Lov_Holtz_known_for
>>> What_is_lova_after_it_harddens
>>> What_is_Lova_Moor's_birthday
>>> What_is_lovable_in_Spanish
>>> What_is_lovage
>>> What_is_Lovagny's_population
>>> What_is_lovan_for
>>> What_is_lovanox
>>> What_is_lovarstan_for
>>> What_is_Lovasatin
>> 



Re: Cross DC CloudSolr Client

2020-03-26 Thread Erick Erickson
I’ve never even heard of someone trying to put
different ensembles in the same connection
string for a single client.

Create N CloudSolrClients, one for each DC.

And why do you want to try to contact individual nodes?
CloudSolrClient will do that for you.

Best,
Erick
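
For concreteness, a minimal sketch of the one-client-per-DC approach (the
ZooKeeper hosts and collection name are assumptions):

import java.util.Arrays;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class PerDcClients {
    public static void main(String[] args) {
        // One CloudSolrClient per data center, each given only its own ZK ensemble.
        CloudSolrClient dc1 = new CloudSolrClient.Builder(
                Arrays.asList("zk1-dc1:2181", "zk2-dc1:2181", "zk3-dc1:2181"),
                Optional.empty())  // no ZK chroot
                .build();
        CloudSolrClient dc2 = new CloudSolrClient.Builder(
                Arrays.asList("zk1-dc2:2181", "zk2-dc2:2181", "zk3-dc2:2181"),
                Optional.empty())
                .build();
        dc1.setDefaultCollection("collection1");
        dc2.setDefaultCollection("collection1");
        // Failover between DCs is application logic: try dc1, fall back to dc2.
    }
}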

> On Mar 26, 2020, at 2:38 AM, Lucky Sharma  wrote:
> 
> Hi all,
> Just wishing to confirm the cross-DC connection situation with
> CloudSolrClient.
> Scenario:
> We have multiple DCs with the same collection data. Can we add the ZooKeeper
> connect strings of all the DCs to the CloudSolrClient?
> 
> Will it work like this:
> The client will utilise this connection string to fetch the Solr config,
> from ZK.
> Reading of the connection string will be in sequence, i.e. if the first
> node is available, it will be used to fetch the ClusterState;
> if not, the next node will be used.
> 
> If we put two ZK clusters in one connection string, how will the ZK client
> embedded inside SolrClient behave with two/multiple leaders?
> -- 
> Warm Regards,
> 
> Lucky Sharma
> Contact No :+91 9821559918



Re: Solr Instance Migration - Server Access

2020-03-26 Thread Charlie Hull
If you can get the server login details, you should be able to copy the 
Solr installation and its configuration. If not, then Solr itself 
doesn't provide any way to get them - it's just a search engine; it's 
not responsible for securing a server in any way.


Charlie

On 26/03/2020 02:13, Landon Cowan wrote:

Hello!  I’m working on a website for a client that was migrated from another 
website development company.  The previous company used Solr to build out the 
site search – but they did not send us the server credentials.  The developers 
who built the tool are no longer with the company – is there a process we 
should follow to secure the credentials?  I worry we may need to rebuild the 
feature from the ground up.





--
Charlie Hull
OpenSource Connections, previously Flax

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com



suggestion with multiple context field

2020-03-26 Thread Szűcs Roland
Hi All,

Is there any way to define multiple context fields with the suggester?

It is a typical use case in an ecommerce environment that the facets are
listed in the sidebar and act as filter queries when the user
selects them. I am looking for similar functionality for the suggester. Do
you know how to solve this?

A potential workaround could be using normal queries with the fq parameter and
an N-gram based index analysis chain. Can it be fast enough to follow the
speed of typing?
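
For reference, a minimal SolrJ sketch of that fq-based workaround (the ngram
field and the facet filters are hypothetical):

import org.apache.solr.client.solrj.SolrQuery;

public class ContextFilteredSuggest {
    // Prefix search over an edge-ngram field, filtered by the selected facets.
    static SolrQuery build(String prefix) {
        SolrQuery q = new SolrQuery("title_ngram:" + prefix); // hypothetical ngram field
        q.addFilterQuery("category:ebook", "language:hu");    // facets applied as fq
        q.setRows(10);
        q.setFields("title");
        return q;
    }
}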

Thanks,
Roland


Solr Instance Migration - Server Access

2020-03-26 Thread Landon Cowan
Hello!  I’m working on a website for a client that was migrated from another 
website development company.  The previous company used Solr to build out the 
site search – but they did not send us the server credentials.  The developers 
who built the tool are no longer with the company – is there a process we 
should follow to secure the credentials?  I worry we may need to rebuild the 
feature from the ground up.




Cross DC CloudSolr Client

2020-03-26 Thread Lucky Sharma
Hi all,
Just wishing to confirm the cross-DC connection situation with
CloudSolrClient.
Scenario:
We have multiple DCs with the same collection data. Can we add the ZooKeeper
connect strings of all the DCs to the CloudSolrClient?

Will it work like this:
The client will utilise this connection string to fetch the Solr config,
from ZK.
Reading of the connection string will be in sequence, i.e. if the first
node is available, it will be used to fetch the ClusterState;
if not, the next node will be used.

If we put two ZK clusters in one connection string, how will the ZK client
embedded inside SolrClient behave with two/multiple leaders?
-- 
Warm Regards,

Lucky Sharma
Contact No :+91 9821559918