I understand your concern. I'll add more information to help clarify how the transactions are done:
- The whole indexing process (the client) runs in a Spark cluster. For any given document, the removal and the insertion are performed by the same Spark executor.
- The pipes that perform removal and insertion use different Solr client instances (instances of CloudSolrClient). However, the client instances share the same HttpClient. I am not sure whether (or how) this affects Solr's autocommit feature.

________________________________
From: Deepak Goel <[email protected]>
Sent: 13 October 2021 12:01
To: [email protected] <[email protected]>
Subject: Re: Concurrent transactions and autocommit in Solr 8

Hello

If the insert and the delete are done by two different threads (and with no synchronization), it is possible that the new records are getting deleted. We might have to dig a bit into the Solr code, I guess.

Deepak

On Wed, Oct 13, 2021 at 3:16 PM Saur, Alexandre (ELS-AMS) <[email protected]> wrote:

> Hi,
>
> I have a (newbie) question about Solr 8 autocommit behaviour. This is my
> scenario:
>
> - Autocommit configured in solrconfig
> - ETL job that indexes thousands of documents whenever it runs
>
> The ETL job updates the collection in the following manner: first it
> deletes a series of documents based on a key and then it adds new ones
> using the same key, with updated values (the transactions are always in
> this order). The job (client) does not commit its transactions (neither
> delete nor insert).
>
> Given this scenario, is it possible that the delete/insert order is played
> differently when Solr autocommits? In other words, is it possible that the
> insert gets deleted?
>
> ________________________________
>
> Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The
> Netherlands, Registration No. 33158992, Registered in The Netherlands.
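
To make the concern raised in this thread concrete, here is a minimal toy model in plain Python. It is not Solr and not SolrJ: the in-memory `index`, the `apply_in_order` helper, and the sample key `k1` are all hypothetical names made up for illustration. It only shows why the order in which a delete-by-key and an add for the same key are applied matters; it says nothing about whether Solr's autocommit can actually cause such a reordering.

```python
from collections import defaultdict


def apply_in_order(index, ops):
    """Apply ("delete", key) and ("add", key, doc) operations one by one.

    `index` maps a key to the list of documents stored under that key.
    This is a toy stand-in for a collection, not a model of Solr internals.
    """
    for op in ops:
        kind, key = op[0], op[1]
        if kind == "delete":
            # Delete-by-key: drop every document currently stored under the key.
            index.pop(key, None)
        elif kind == "add":
            index[key].append(op[2])
        else:
            raise ValueError(f"unknown operation: {kind!r}")
    return index


def new_index():
    """Build a fresh toy index holding one stale document under key 'k1'."""
    index = defaultdict(list)
    index["k1"].append({"id": "old-1", "value": 1})
    return index


delete_op = ("delete", "k1")
add_op = ("add", "k1", {"id": "new-1", "value": 2})

# Intended order from the ETL job: delete first, then add.
intended = apply_in_order(new_index(), [delete_op, add_op])

# Hypothetical swapped order, e.g. two unsynchronized senders racing.
swapped = apply_in_order(new_index(), [add_op, delete_op])
```

In the intended order only the new document remains under `k1`; in the swapped order the key ends up empty, which is the "insert gets deleted" outcome asked about in the original question. Whether the real setup can produce the swapped order depends on how the two CloudSolrClient instances send their requests, which this sketch does not address.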
