Re: Solr cloud issuse: Async exception during distributed update

2020-12-09 Thread Ritvik Sharma
Hi Houston,
Thanks for the reply.

We don't have a field like this. It's a field value, and the error occurs
randomly, not every time.
We are indexing using CloudSolrClient + Spring Data. It can occur on any
value.

I am trying to index ~30 million records, and the error occurs only in
SolrCloud mode, not on a standalone VM.

Here x.x.x.x and x.x.x.y are TLOG replicas.

Remote error message: ERROR: [doc=33140886###Track] unknown field 'https://a10.ga'
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:549)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1037)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:897)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:829)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
at org.springframework.data.solr.core.SolrTemplate.lambda$saveBeans$3(SolrTemplate.java:227)
at org.springframework.data.solr.core.SolrTemplate.execute(SolrTemplate.java:167)
... 29 common frames omitted
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://x.x.x.x:8983/solr/searchcollection_shard1_replica_t101: Async exception during distributed update: Error from server at http://x.x.x.y:8983/solr/searchcollection_shard2_replica_t103/: null

request: http://x.x.x.x:8983/solr/searchcollection_shard2_replica_t103/
Remote error message: ERROR: [doc=33140886###Track] unknown field 'https://a10.ga'
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:665)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:265)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:368)
at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:296)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.lambda$directUpdate$0(BaseCloudSolrClient.java:525)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
... 3 common frames omitted


On Thu, 10 Dec 2020 at 11:38, Houston Putman 
wrote:

> Do you have a field named "314257s_seourls" in your schema?
>
> Is there a dynamic field you are trying to match with that name?
>
> - Houston
>
> On Thu, Dec 10, 2020 at 2:53 PM ritvik  wrote:
>
> > Hi ,
> >  Please suggest, why it is happening.
> >
> >
> >
> > --
> > Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> >
>


Re: Solr cloud issuse: Async exception during distributed update

2020-12-09 Thread Houston Putman
Do you have a field named "314257s_seourls" in your schema?

Is there a dynamic field you are trying to match with that name?

- Houston

On Thu, Dec 10, 2020 at 2:53 PM ritvik  wrote:

> Hi ,
>  Please suggest, why it is happening.


Re: Solr cloud issuse: Async exception during distributed update

2020-12-09 Thread ritvik
Hi ,
 Please suggest, why it is happening.





Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-12-09 Thread Ritvik Sharma
This code is there, but it does not show up in the running Solr command.

On Wed, 9 Dec 2020 at 23:28, rkrish84  wrote:

> Commented out the solr_ssl_client_key_store related code section in solr.sh
> file to resolve the issue and enable ssl.


Re: solrcloud with EKS kubernetes

2020-12-09 Thread Houston Putman
Hello Abhishek,

It's really hard to provide any advice without knowing any information
about your setup/usage.

Are you giving your Solr pods enough resources on EKS?
Have you run Solr in the same configuration outside of Kubernetes in the
past without timeouts?
What type of storage volumes are you using to store your data?
Are you using headless services to connect your Solr Nodes, or ingresses?

If this is the first time that you are using this data + Solr
configuration, maybe it's just that your data within Solr isn't optimized
for the type of queries that you are doing.
If you have run it successfully in the past outside of Kubernetes, then I
would look at the resources that you are giving your pods and the storage
volumes that you are using.
If you are using Ingresses, that might be causing slow connections between
nodes, or between your client and Solr.

- Houston

On Wed, Dec 9, 2020 at 3:24 PM Abhishek Mishra  wrote:

> Hello guys,
> We are kind of facing some of the issues(Like timeout etc.) which are very
> inconsistent. By any chance can it be related to EKS? We are using solr 7.7
> and zookeeper 3.4.13. Should we move to ECS?
>
> Regards,
> Abhishek
>


Collection reload task is taking long time

2020-12-09 Thread Moulay Hicham
Hi,

We have a solr cluster of 30 nodes with a Replication Factor =3.
Each index size is about 80GB.
Solr version is 8.1
The cluster has high TPS both in read and write.

We recently made a schema change and uploaded it using the ZKCLI script.
Then we issued an async collection reload request:
admin/collections?action=RELOAD==1000'

When we check on the status of this request, it shows that it's still
running:

admin/collections?action=REQUESTSTATUS=1000'

{
  "responseHeader":{
    "status":0,
    "QTime":1},
  "status":{
    "state":"running",
    "msg":"found [1000] in running tasks"}}

This task has been in a running state for *about 5 hours* so far. I am not
sure whether this is expected, or whether the task failed or completed but
never reported its status back to ZooKeeper.

Also, if it really is running for that long, is it because the index is being
actively updated (with high TPS)? We have a soft commit of 10s and a hard
commit of 60s.

Please help me understand what's going on.

Thanks,
Moulay
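
For checking on an async task like the one above without re-issuing the
request by hand, a minimal polling sketch in Python follows. It does not
explain why a task stays in "running"; it only waits for one of the terminal
states that REQUESTSTATUS reports ("completed", "failed", "notfound").
The fetch_status callable is hypothetical: it is assumed to perform the
REQUESTSTATUS call and return the parsed JSON in the shape shown in the
message. The function name, poll interval and poll limit are likewise
assumptions.

```python
import time

# States after which the reported status of an async task no longer changes.
TERMINAL_STATES = {"completed", "failed", "notfound"}


def wait_for_async_task(fetch_status, poll_seconds=5.0, max_polls=120,
                        sleep=time.sleep):
    """Poll an async Collections API task until it reaches a terminal state.

    fetch_status is a hypothetical caller-supplied callable that issues the
    REQUESTSTATUS request and returns the parsed JSON response (a dict).
    Returns the last state seen, which is still "running" or "submitted"
    if max_polls is exhausted first.
    """
    state = None
    for _ in range(max_polls):
        state = fetch_status()["status"]["state"]
        if state in TERMINAL_STATES:
            break
        sleep(poll_seconds)
    return state
```

If the function returns "running" after the poll limit, the task is still
listed as running, and the Overseer logs are the next place to look.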


Problem with HttpSolrClient.Builder on Windows with tomcat

2020-12-09 Thread Raivo Rebane

Hello

I have a problem with HttpSolrClient.Builder on Windows with Tomcat.

I have the following lines of code in my program:
System.out.println("Trying HttpSolrClient = " + url );
LibraryServlet.solrClientClient = new HttpSolrClient.Builder(url)
        .withConnectionTimeout(5000)
        .withSocketTimeout(3000)
        .build();
System.out.println("Trying HttpSolrClient sucessfull ");

and the project's WEB-INF/lib contains the following jars:
commons-codec-1.15.jar*
commons-httpclient-3.1.jar*
commons-io-1.4.jar*
httpclient5-5.0.3.jar
slf4j-api-1.8.0-beta4.jar
slf4j-log4j12.jar*
solr-solrj-8.6.3.jar*

On Ubuntu I get the following Tomcat output:
Trying HttpSolrClient = http://localhost:8983/solr/clients
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#noProviders for further details.
SLF4J: Class path contains SLF4J bindings targeting slf4j-api versions prior to 1.8.
SLF4J: Ignoring binding found at [jar:file:/home/hydra/Librarian/target/Librarian/WEB-INF/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Ignoring binding found at [jar:file:/opt/tomcat/apache-tomcat-9.0.38/lib/nlog4j-1.2.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Ignoring binding found at [jar:file:/opt/tomcat/apache-tomcat-9.0.38/lib/logback-classic-1.1.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#ignoredBindings for an explanation.

Trying HttpSolrClient sucessfull

But on Tomcat on Windows I get only one line:
Trying HttpSolrClient = http://localhost:8983/solr/clients

and then it hangs!

What is the reason?

Regards
Raivo



Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-12-09 Thread rkrish84
Commented out the solr_ssl_client_key_store related code section in solr.sh
file to resolve the issue and enable ssl.





How can i poll Solrcloud via API to get the sum of index size of all shards and replicas?

2020-12-09 Thread Roman Ivanov
Hello! We have a SolrCloud (7.4) cluster consisting of 90+ hosts (each of them
running multiple Solr nodes, e.g. on ports 8983, 8984, 8985), numerous
shards (each having several replicas) and numerous collections.

I was given the task of calculating the total on-disk index size of a certain
collection. First I calculated it manually from the web interface (via
copy-paste from the Cloud - Nodes tab of the HTTP interface on port 8983),
and there were thousands of lines. It took several hours. Now I think this
task needs some automation. I read the API documentation and googled, but
still no luck... And any possible solution could help somebody else in the
future.

What I tried:
   1) If I poll one of the Solr cores via

"
http://solrhost1.somecorporatesite.org:8983/solr/admin/metrics?wt=JSON=INDEX
"

I get output like (**cores.json**):

"responseHeader":{
  "status":0,
  "QTime":2004},
"metrics":{
  "solr.core.collectionname1-2020-12-05.shard12.replica_n240":{
    "INDEX.size":"456 bytes",
    "INDEX.sizeInBytes":456},
  "solr.core.collectionname2-2020-12-04.shard74.replica_n650":{
    "INDEX.size":"2.88 GB",
    "INDEX.sizeInBytes":3088933801},

... and so on, which is what I need, BUT only for the one (local) node that
I query. And there are more than 200 of them.

   2) I can get a list of all collections, shards and replicas via:


http://localhost:8983/solr/admin/collections?action=clusterstatus=json

and it looks like (**collections.json**)

"responseHeader":{
  "status":0,
  "QTime":184},
"cluster":{
  "collections":{
    "collectionname1":{
      "pullReplicas":"0",
      "replicationFactor":"1",
      "shards":{
        "shard1":{
          "range":"8-80e0",
          "state":"active",
          "replicas":{
            "core_node67":{
              "core":"collectionname123-2020-11-30_shard1_replica_n54",
              "node_name":"solrhost99.somecorporatesite.org:8985/solr",
              "state":"active",
              "type":"NRT",
              "force_set_state":"false",
              "leader":"true"},
            "core_node548":{
              "core":"collectionname223-2020-11-29_shard1_replica_n448",
              "node_name":"solrhost77.somecorporatesite.org:8984/solr",
              "state":"active",
              "type":"NRT",
              "force_set_state":"false"}}},
        "shard2":{
          "range":

... and so on, 117 156 lines

The question is: how can I insert the INDEX.size fields into the second
output (clusterstatus) to calculate the total disk space used by the indices?

In other words, I need the corresponding INDEX.size fields in the replicas
sections of **collections.json**.

Currently the whole Solr system consumes 100TB+ and is still growing; we
need to know its rate of growth. Many thanks in advance!
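
One possible way to join the two outputs described above is sketched below in
Python. It is a sketch under assumptions, not a tested tool. It assumes that
a metrics registry name can be derived from a core name as
"solr.core.<collection>.<shard>.<replica>" (this is inferred from the samples
in the message), that shards are named "shardN", and that the caller has
already fetched the "metrics" object from every node into a dict keyed by the
same node_name strings that clusterstatus reports. The names registry_name,
collection_index_bytes and metrics_by_node are hypothetical.

```python
import re

# Assumed core naming: <collection>_<shardN>_<replica_xNN>, as in the samples.
CORE_NAME = re.compile(
    r"^(?P<coll>.+)_(?P<shard>shard\d+)_(?P<replica>replica_[a-z]?\d+)$"
)


def registry_name(core):
    """Map a core name to the metrics registry key it is assumed to use."""
    match = CORE_NAME.match(core)
    if match is None:
        raise ValueError("unexpected core name: " + core)
    return "solr.core.{coll}.{shard}.{replica}".format(**match.groupdict())


def collection_index_bytes(cluster_status, metrics_by_node, collection):
    """Sum INDEX.sizeInBytes over every replica of one collection.

    cluster_status  -- parsed JSON of the clusterstatus call
    metrics_by_node -- hypothetical dict: node_name -> "metrics" object
                       returned by that node's admin/metrics call
    Returns (total_bytes, missing_cores); missing_cores lists replicas for
    which no size was found, so gaps are reported rather than counted as 0.
    """
    total = 0
    missing = []
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    for shard in shards.values():
        for replica in shard["replicas"].values():
            node_metrics = metrics_by_node.get(replica["node_name"], {})
            core_metrics = node_metrics.get(registry_name(replica["core"]), {})
            size = core_metrics.get("INDEX.sizeInBytes")
            if size is None:
                missing.append(replica["core"])
            else:
                total += size
    return total, missing
```

Running this once a day and storing the total would give the growth rate
asked about; the list of missing cores shows which nodes did not answer.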


Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-09 Thread raj.yadav
Hi All,

I tried debugging but was unable to find a solution. Do let me know if the
details/logs I shared are not sufficient or clear.

Regards,
Raj





Re: Solr cloud issuse: Async exception during distributed update

2020-12-09 Thread Ritvik Sharma
Hi Furkan

I have added the mail. Please check.




On Wed, 9 Dec 2020 at 12:52, Furkan KAMACI  wrote:

> Hi Ritwik,
>
> Could you send your e-mail to solr user list?
>
> Kind Regards,
> Furkan KAMACI
>
> On 9 Dec 2020 Wed at 10:18 Ritvik Sharma  wrote:
>
>>
>> Hi Solr Owner ,
>>
>> Please check the below mail, I am facing the issue. Do you have any
>> solution.
>> -- Forwarded message -
>> From: Ritvik Sharma 
>> Date: Thu, 3 Dec 2020 at 16:21
>> Subject: Solr cloud issuse: Async exception during distributed update
>> To: 
>>
>>
>> Hello Guys,
>>
>> I am facing an issue , while indexing on solr cloud, on Zookeeper,
>>
>> Solr version: 8.3,
>> Spring solr connector
>>
>>
>> request: http://X:8983/solr/searchcollection_shard2_replica_t103/
>> Remote error message: ERROR: [doc=] *unknown field* '314257s_seourls'
>> at
>> org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
>> at
>> org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
>> at
>> org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:549)
>> at
>> org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1037)
>> at
>> org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:897)
>> at
>> org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:829)
>> at
>> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
>> at
>> org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
>> at org.*springframework*
>> .data.solr.core.SolrTemplate.lambda$saveBeans$3(SolrTemplate.java:227)
>> at
>> org.springframework.data.solr.core.SolrTemplate.execute(SolrTemplate.java:167)
>> ... 29 common frames omitted
>> Caused by:
>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
>> from server at
>> http://x:8983/solr/searchcollection_shard1_replica_t105: *Async
>> exception during distributed update: Error from server at*
>> http://yy:8983/solr/searchcollection_shard2_replica_t103/: *null*
>>
>>
>> It seems it reads value as a field.
>>
>> Please check this.
>>
>>


London Information Retrieval Meetup (10th December)

2020-12-09 Thread Lisa Biella
The London Information Retrieval Meetup has moved online!
It is a free evening meetup aimed at Information Retrieval enthusiasts and
professionals who are curious to explore and discuss the latest trends in
the field.
It is technology agnostic, but you'll find many talks on Apache Solr and
related technologies.

Next London Information Retrieval Meetup:
10th DECEMBER starting from 06:00 PM (GMT)

This time we will have one talk and a Q&A session:

Talk 1
"Enriching postal addresses with Elastic stack"
from David Pilato, Developer and Evangelist at Elastic

Q&A session
We are building up an interesting Q&A session with questions chosen from the
audience! If you have any questions about Information Retrieval,
Apache Solr, Elasticsearch or about our new Apache Solr Learning To Rank
Interleaving (see our latest blogpost), just leave a comment on the event
page.

Feel free to register here:
https://www.meetup.com/London-Information-Retrieval-Meetup-Group/events/274765296/

Cheers!





Re: Collection deleted still in zookeeper

2020-12-09 Thread Marisol Redondo
Yes, maybe it's more complicated than I was thinking. But it's good to know
that newer versions of Solr still work the same way.

Thanks again


On Mon, 7 Dec 2020 at 13:08, Erick Erickson  wrote:

> What should happen when you delete a collection and _only_ that
> collection references the configset has been discussed several
> times, and… whatever is chosen is wrong ;)
>
> 1> if we delete the configset, then if you want to delete a collection
> to insure that you’re starting all over for whatever reason, your
> configset is gone and you need to find it again.
>
> 2> If we _don’t_ delete the configset, then you can wind up with
> obsolete configsets polluting Zookeeper…
>
> 3> If we make a copy of the configset every time we make a collection,
> then there can be a bazillion of them in a large installation.
>
> Best,
> Erick
>
> > On Dec 7, 2020, at 6:52 AM, Marisol Redondo <
> marisol.redondo.gar...@gmail.com> wrote:
> >
> > Thanks Erick for the answer, you gave me the clue to find the issue.
> >
> > The real problem is that when I removed the collection using the solr API
> > (http://solrintance:port
> /solr/admin/collections?action=DELETE=collectionname)
> > the config files are not deleted. I don't know if this is the normal
> > behavior in every version of solr (I'm using version 6), but I think when
> > deleting the collection, the config files for this collection should be
> > removed.
> >
> > Anyway, I found that the config where still in the UI/cloud/tree/configs
> > and they can be removed using the solr zk -r configs/myconfig and this
> > solve the issue.
> >
> > Thanks
> >
> >
> >
> >
> >
> >
> > On Fri, 4 Dec 2020 at 15:46, Erick Erickson 
> wrote:
> >
> >> This almost always a result of one of two things:
> >>
> >> 1> you didn’t upload the config to the correct place or the ZK that Solr
> >> uses.
> >> or
> >> 2> you still have a syntax problem in the config.
> >>
> >> The solr.log file on the node that’s failing may have a more useful
> >> error message about what’s wrong. Also, you can try validating the XML
> >> with one of the online tools.
> >>
> >> Are you totally and absolutely sure that, for instance, you’re uploading
> >> to the correct Zookeeper? You should be able to look at the admin UI
> >> screen and see the ZK address. I’ve seen this happen when people
> >> inadvertently use the embedded ZK for one operation but not for the
> >> other. Of have the ZK_HOST environment variable pointing to some
> >> ZK ensemble that’s used when you start Solr but not when you upload
> >> files. Or…
> >>
> >> Use the admin UI>>cloud>>tree>>configs>>your_config_name
> >> to see if the solrconfig has the correct changes. I’ll often add some
> >> bogus comment in the early part of the file that I can use to make
> >> sure I’ve uploaded the correct file to the correct place.
> >>
> >> I use the "bin/solr zk upconfig” command to move files back and forth
> >> FWIW, that
> >> avoids, say putting the individual file a in the wrong directory...
> >>
> >> Best,
> >> Erick
> >>
> >>> On Dec 4, 2020, at 9:18 AM, Marisol Redondo <
> >> marisol.redondo.gar...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> When trying to modify the config.xml file for a collection I made a
> >> mistake
> >>> and the config was wrong. So I removed the collection to create it
> again
> >>> from a backend.
> >>> But, although I'm sure I'm using a correct config.xml, solr is still
> >>> complaining about the error in the older solrconfig.xml
> >>>
> >>> I have tried to removed the collection more than once, I have stopped
> >> solr
> >>> and zookeeper and still having the same error. It's like zookeeper is
> >> still
> >>> storing the older solrconfig.xml and don't upload the configuration
> file
> >>> from the new collection.
> >>>
> >>> I have tried to
> >>> - upload the files
> >>> - remove the collection and create it again, but empty
> >>> - restore the collection from the backup
> >>> And I get always the same error:
> >>>  collection_name_shard1_replica1:
> >>>
> >>
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> >>> Could not load conf for core collection_name_shard1_replica1: Error
> >> loading
> >>> solr config from solrconfig.xml
> >>>
> >>> Thanks for your help
> >>
> >>
>
>