ChildDocTransformer and export handler
Dear Community,

we are using the /export handler with Solr 7.7 to fetch a large number of documents from Solr. We recently extended our schema with child documents, and we are now wondering whether and how parent documents can be exported together with their corresponding child documents.

With the /select handler this can be done with the ChildDocTransformer (https://lucene.apache.org/solr/guide/7_7/transforming-result-documents.html#child-childdoctransformerfactory). With the /export handler, however, we get an error from Solr.

Our request:

    {
      "query" : "*:*",
      "sort" : "id asc",
      "fields" : "id,[child parentFilter='-child_type:* *:*']"
    }

The response from Solr:

    {
      "responseHeader":{"status":400},
      "response":{
        "numFound":0,
        "docs":[{"EXCEPTION":"org.apache.solr.common.SolrException: undefined field: \"[child parentFilter=\"-child_type:* *:*\"]\""}]}
    }

Is it possible to get parent documents together with their corresponding child documents?
If it is possible: What is the correct query?
If it is not possible: Can Streaming Expressions be used together with child documents? As far as I understand, they use the export handler internally.

Thanks in advance for your help
Ludger

--
"Beste Arbeitgeber ITK 2020" - 1st place for QAware
awarded by Great Place to Work <https://www.qaware.de/news/great-place-to-work-deutschlands-beste-arbeitgeber-2020/>
--
Ludger Steens
Software Architect
QAware GmbH
Aschauer Straße 32
81549 München, Germany
Mobile +49 175 7973969
ludger.ste...@qaware.de
www.qaware.de
--
Managing Directors: Christian Kamm, Johannes Weigend, Dr. Josef Adersberger
Register Court: München
Commercial Register Number: HRB 163761
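For comparison, the /select request that the message reports as working with the child transformer could be built roughly as follows. This is only a sketch: the base URL, the collection name, the `rows` value and the helper name `build_select_url` are assumptions made for illustration, and the code only assembles the URL without contacting Solr.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical base URL and collection name; adjust to the actual deployment.
SOLR_BASE = "http://localhost:8983/solr/mycollection"


def build_select_url(parent_filter, rows=100):
    """Build a /select URL whose field list includes the [child] transformer."""
    params = {
        "q": "*:*",
        "sort": "id asc",
        # Same field list as in the /export request above.
        "fl": f"id,[child parentFilter='{parent_filter}']",
        "rows": rows,  # assumed page size, not taken from the message
    }
    return f"{SOLR_BASE}/select?{urlencode(params)}"


url = build_select_url("-child_type:* *:*")
# Decode the query string again to show the field list that Solr would receive.
print(parse_qs(urlsplit(url).query)["fl"][0])
```

The field list is passed as `fl` here because /select uses that parameter name; whether this is fast enough for large result sets is not something the thread answers.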
AW: Atomic updates with nested documents
Hi Adi,

thank you for your reply! Although I have to admit that this is not the response I was hoping for.

Upgrading to Solr 8 is currently not possible for us because we found multiple issues when doing so (see http://mail-archives.apache.org/mod_mbox/lucene-solr-user/202005.mbox/%3Ce7dc73d4be2ac35404db0f6cfb75f905%40mail.gmail.com%3E).

We have now implemented a workaround: instead of doing an atomic update, we send the whole document with its child documents to Solr. This works as expected but is significantly slower.

Regards
Ludger

-----Original Message-----
From: Kaminski, Adi
Sent: Sunday, June 7, 2020 08:45
To: solr-user@lucene.apache.org
Subject: RE: Atomic updates with nested documents

Hi Ludger,

We had the same issue with Solr 7.6. After discussing it with the community, we found out that a partial update of the parent document that does not harm the parent-child association works only on Solr 8.1 or higher, and it also requires some prerequisites. See the item below and its last comments for details:
https://issues.apache.org/jira/browse/SOLR-12638?focusedCommentId=16894628&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16894628

Eventually we moved to Solr 8.3, and it works there as expected with the above-mentioned changes.

Regards,
Adi

-----Original Message-----
From: Ludger Steens
Sent: Friday, June 5, 2020 3:24 PM
To: solr-user@lucene.apache.org
Subject: Atomic updates with nested documents

Dear Community,

I am using Solr 7.7 and I am wondering how to do a partial update on nested documents / child documents.
Suppose I have committed the following documents to the index:

    [
      {
        "id": "1",
        "testString": "1",
        "testInt": "1",
        "_childDocuments_": [
          { "id": "1.1", "child_type": "child_a", "testString": "1.1", "testInt": "1" },
          { "id": "1.2", "child_type": "child_a", "testString": "1.1", "testInt": "1" }
        ]
      }
    ]

The uniqueKey is id; all fields are indexed.

Now I want to update testInt to 2 on the parent document without losing the parent-child relation (ChildDocTransformerFactory should still produce correct results). I tried the following variants; neither was successful:

*Variant 1:* Sending the following update document to the update endpoint:

    [
      { "id": "1", "testInt": { "set": "2" } }
    ]

The parent document is updated, but the ChildDocTransformerFactory does not return any child documents.

*Variant 2:* Sending the following update document to the update endpoint:

    [
      {
        "id": "1",
        "testInt": { "set": "2" },
        "_childDocuments_": [
          { "id": { "set": "1.1" } },
          { "id": { "set": "1.2" } }
        ]
      }
    ]

Same result: the parent is updated, but the ChildDocTransformerFactory does not return any child documents.

Is there any other way of doing a partial update without losing the parent-child relation? Resending the complete document with all attributes and children would work but is inefficient for us (we would have to load all documents from Solr before resending them).

Thanks in advance for your help
Ludger
Atomic updates with nested documents
Dear Community,

I am using Solr 7.7 and I am wondering how to do a partial update on nested documents / child documents.

Suppose I have committed the following documents to the index:

    [
      {
        "id": "1",
        "testString": "1",
        "testInt": "1",
        "_childDocuments_": [
          { "id": "1.1", "child_type": "child_a", "testString": "1.1", "testInt": "1" },
          { "id": "1.2", "child_type": "child_a", "testString": "1.1", "testInt": "1" }
        ]
      }
    ]

The uniqueKey is id; all fields are indexed.

Now I want to update testInt to 2 on the parent document without losing the parent-child relation (ChildDocTransformerFactory should still produce correct results). I tried the following variants; neither was successful:

*Variant 1:* Sending the following update document to the update endpoint:

    [
      { "id": "1", "testInt": { "set": "2" } }
    ]

The parent document is updated, but the ChildDocTransformerFactory does not return any child documents.

*Variant 2:* Sending the following update document to the update endpoint:

    [
      {
        "id": "1",
        "testInt": { "set": "2" },
        "_childDocuments_": [
          { "id": { "set": "1.1" } },
          { "id": { "set": "1.2" } }
        ]
      }
    ]

Same result: the parent is updated, but the ChildDocTransformerFactory does not return any child documents.

Is there any other way of doing a partial update without losing the parent-child relation? Resending the complete document with all attributes and children would work but is inefficient for us (we would have to load all documents from Solr before resending them).

Thanks in advance for your help
Ludger
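The difference between the atomic update of Variant 1 and the full resend mentioned at the end of the message can be sketched as two payload builders. The function names are hypothetical and chosen only for illustration; the code builds the JSON bodies and does not send anything to Solr. According to the message, on Solr 7.7 the first payload breaks the parent-child relation, while the second keeps it at the cost of needing the complete document.

```python
import copy
import json


def atomic_update_payload(doc_id, field, value):
    """Variant 1 from the message: an atomic 'set' on the parent only."""
    return [{"id": doc_id, field: {"set": value}}]


def full_resend_payload(parent_doc, field, value):
    """Fallback: resend the whole parent, children included, with one field changed."""
    updated = copy.deepcopy(parent_doc)  # leave the caller's document untouched
    updated[field] = value
    return [updated]


# The example document from the message.
parent = {
    "id": "1",
    "testString": "1",
    "testInt": "1",
    "_childDocuments_": [
        {"id": "1.1", "child_type": "child_a", "testString": "1.1", "testInt": "1"},
        {"id": "1.2", "child_type": "child_a", "testString": "1.1", "testInt": "1"},
    ],
}

print(json.dumps(atomic_update_payload("1", "testInt", "2")))
print(json.dumps(full_resend_payload(parent, "testInt", "2"), indent=2))
```

The full resend needs the current parent and all of its children as input, which is why the message calls it inefficient: the documents first have to be loaded from Solr.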
Problems when Upgrading from Solr 7.7.1 to 8.5.0
Hi all,

we recently upgraded our SolrCloud cluster from version 7.7.1 to version 8.5.0 and ran into multiple problems. In the end we had to revert the upgrade and went back to Solr 7.7.1.

Our company has been using Solr since version 4, and so far upgrading Solr to a newer version had always been possible without any problems. We are curious whether others are experiencing the same kind of problems and whether these are known issues. Or maybe we did something wrong or missed something when upgrading?

1. Network issues when indexing documents
=========================================

Our collection contains roughly 150 million documents. When we re-created the collection and re-indexed all documents, we regularly experienced network problems that caused our loader application to fail. The Solr log always contains an IOException:

    ERROR (updateExecutor-5-thread-1338-processing-x:PSMG_CI_2020_04_15_10_07_04_shard6_replica_n22 r:core_node25 null n:solr2:8983_solr c:PSMG_CI_2020_04_15_10_07_04 s:shard6) [c:PSMG_CI_2020_04_15_10_07_04 s:shard6 r:core_node25 x:PSMG_CI_2020_04_15_10_07_04_shard6_replica_n22] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode: http://solr1:8983/solr/PSMG_CI_2020_04_15_10_07_04_shard6_replica_n20/ to http://solr1:8983/solr/PSMG_CI_2020_04_15_10_07_04_shard6_replica_n20/ => java.io.IOException: java.io.IOException: cancel_stream_error
        at org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredContentProvider.java:197)
    java.io.IOException: java.io.IOException: cancel_stream_error
        at org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredContentProvider.java:197) ~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
        at org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputStream.flush(OutputStreamContentProvider.java:151) ~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
        at org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputStream.write(OutputStreamContentProvider.java:145) ~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
        at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:216) ~[solr-solrj-8.5.0.jar:8.5.0 7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42 - romseygeek - 2020-03-13 09:38:26]
        at org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:209) ~[solr-solrj-8.5.0.jar:8.5.0 7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42 - romseygeek - 2020-03-13 09:38:26]
        at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:172) ~[solr-solrj-8.5.0.jar:8.5.0 7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42 - romseygeek - 2020-03-13 09:38:26]

After the exception, the collection was usually in a degraded state for some time while shards tried to recover and sync with the leader.

In the Solr changelog we saw that one major change from 7.x to 8.x was that Solr now uses HTTP/2 instead of HTTP/1.1. So we tried to disable HTTP/2 by setting the system property solr.http1=true. That made the indexing process a LOT more stable, but we still saw IOExceptions from time to time. Disabling HTTP/2 did not completely fix the problem.

    ERROR (updateExecutor-5-thread-9310-processing-x:PSMG_BOM_2020_04_28_05_00_11_shard7_replica_n24 r:core_node27 null n:solr3:8983_solr c:PSMG_BOM_2020_04_28_05_00_11 s:shard7) [c:PSMG_BOM_2020_04_28_05_00_11 s:shard7 r:core_node27 x:PSMG_BOM_2020_04_28_05_00_11_shard7_replica_n24] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling SolrCmdDistributor$Req: cmd=add{,id=5141653a-e33a-4b60-856d-7aa2ce73dee7}; node=ForwardNode: http://solr2:8983/solr/PSMG_BOM_2020_04_28_05_00_11_shard6_replica_n22/ to http://solr2:8983/solr/PSMG_BOM_2020_04_28_05_00_11_shard6_replica_n22/ => java.io.IOException: java.io.EOFException: HttpConnectionOverHTTP@9dc7ad1::SocketChannelEndPoint@2d20213b{solr2/10.0.0.216:8983<->/10.0.0.193:38728,ISHUT,fill=-,flush=-,to=5/60}{io=0/0,kio=0,kro=1}->HttpConnectionOverHTTP@9dc7ad1(l:/10.0.0.193:38728 <-> r:solr2/10.0.0.216:8983,closed=false)=>HttpChannelOverHTTP@47a242c3(exchange=HttpExchange@6ffd260f req=PENDING/null@null res=PENDING/null@null)[send=HttpSenderOverHTTP@17e056f9(req=CONTENT,snd=IDLE,failure=null)[HttpGenerator@3b6594c7{s=COMMITTED}],recv=HttpReceiverOverHTTP@6e847d32(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0 of -1}]]
        at org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredContentProvider.java:197)
    java.io.IOException: java.io.EOFException: HttpConnectionOverHTTP@9dc7ad1::SocketChannelEndPoint@2d20213b{solr2/10.0.0.216:8983<->/10.0.0.193:38728,ISHUT,fill=-,flush=-,to=5/60}{io=0/0,kio=0,kro=1}->HttpConnectionOverHTTP@9dc7ad1(l:/10.0.0.193:38728 <-> r:solr2/10.0.0.216:8983,closed=false)=>HttpChannelOverHTTP@47a242c3(exchange=HttpExchange@6ffd260f req=PENDING/null@null res=PENDING/null@null)[send=HttpSenderOverHTTP@17e056f9(req=CONTENT,snd=ID