[ https://issues.apache.org/jira/browse/SOLR-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536574#comment-15536574 ]
Alexandre Rafalovitch commented on SOLR-9493:
---------------------------------------------

Can this case be closed now? It is not a bug and there is no next action on it.

> uniqueKey generation fails if content POSTed as "application/javabin" and
> uniqueKey field comes as NULL (as opposed to not coming at all).
> ------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9493
>                 URL: https://issues.apache.org/jira/browse/SOLR-9493
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Yury Kartsev
>         Attachments: 200.png, 400.png, Screen Shot 2016-09-11 at 16.29.50
> .png, SolrInputDoc_contents.png, SolrInputDoc_headers.png
>
>
> I have run into a weird issue where the same application code (using SolrJ)
> fails to index a document without a unique key (which should be
> auto-generated by SOLR) in SolrCloud, but succeeds in a standalone SOLR
> instance (or even in cloud mode, when done from the web interface of one of
> the replicas). The only differences are the client (CloudSolrClient vs
> HttpSolrClient) and the SOLR URL (ZooKeeper hostname and port vs standalone
> SOLR instance hostname and port).
> The failure is reported as "org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Document is missing mandatory uniqueKey field: id".
> I am using SOLR 5.1. In cloud mode I have 1 shard and 3 replicas.
> After a lot of debugging and investigation (see below, as well as my
> [StackOverflow
> post|http://stackoverflow.com/questions/39401792/uniquekey-generation-does-not-work-in-solrcloud-but-works-if-standalone]),
> I came to the conclusion that the only difference between the failing and
> succeeding calls is the content type of the POST requests.
> A local proxy clearly shows that the request fails if the content is sent as
> "application/javabin" (see attached screenshot with sensitive data removed)
> and succeeds if the content is sent as "application/xml; charset=UTF-8" (see
> attached screenshot with sensitive data removed).
> Would you be able to assist, please?
> Thank you very much in advance!
> ------------------------
> Copying the whole description and investigation here as well:
> ------------------------
> [Documentation|https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements]
> states:{quote}Schema defaults and copyFields cannot be used to populate the
> uniqueKey field. You can use UUIDUpdateProcessorFactory to have uniqueKey
> values generated automatically.{quote}
> Therefore I have added my uniqueKey field to the schema:
> {code}<fieldType name="uuid" class="solr.UUIDField" indexed="true" />
> ...
> <field name="id" type="uuid" indexed="true" stored="true" required="true" />
> ...
> <uniqueKey>id</uniqueKey>{code}
> Then I added an updateRequestProcessorChain to my solrconfig:
> {code}<updateRequestProcessorChain name="uuid">
>   <processor class="solr.UUIDUpdateProcessorFactory">
>     <str name="fieldName">id</str>
>   </processor>
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>{code}
> And made it the default for the UpdateRequestHandler:
> {code}<initParams path="/update/**">
>   <lst name="defaults">
>     <str name="update.chain">uuid</str>
>   </lst>
> </initParams>{code}
> Adding new documents with a null/absent id works fine both from the web
> interface of one of the replicas and when using SOLR in standalone mode
> (non-cloud) from my application.
> However, when I use SolrCloud and add a document from my application (using
> CloudSolrClient from SolrJ), it fails with
> "org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Document is missing mandatory uniqueKey field: id"
> All other operations, like ping or searching for documents, work fine in
> either mode (standalone or cloud).
> INVESTIGATION (i.e. more details):
> In standalone mode the update request is:
> {code}POST standalone_host:port/solr/collection_name/update?wt=json{code}
> In SOLR cloud mode, when adding a document from one replica's web interface,
> the update request is (found by inspecting the call made by the web
> interface):
> {code}POST replica_host:port/solr/collection_name_shard1_replica_1/update?wt=json{code}
> In both these cases the payload is something like:
> {code}{
>   "add": {
>     "doc": {
>       .....
>     },
>     "boost": 1.0,
>     "overwrite": true,
>     "commitWithin": 1000
>   }
> }{code}
> When CloudSolrClient is used, the following happens (found through
> debugging):
> Using ZK and some logic, a URL list of replicas is constructed that looks
> like this:
> {code}[http://replica_1_host:port/solr/collection_name/,
> http://replica_2_host:port/solr/collection_name/,
> http://replica_3_host:port/solr/collection_name/]{code}
> This code is called:
> {code}LBHttpSolrClient.Req req = new LBHttpSolrClient.Req(request, theUrlList);
> LBHttpSolrClient.Rsp rsp = lbClient.request(req);
> return rsp.getResponse();{code}
> The second line fails with the exception.
> Debugging the second line further shows that it ends up calling
> HttpClient.execute (from HttpSolrClient.executeMethod) for:
> {code}POST http://replica_1_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1
> POST http://replica_2_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1
> POST http://replica_3_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1{code}
> The very first request returns 400 Bad Request, with replica 1 logging
> "Document is missing mandatory uniqueKey field: id".
> The funny thing is that when I execute the same request using POSTMAN (but
> with a JSON payload instead of a binary one), it works! Am I doing something
> wrong here? I assume it's something in the way the request is made...
> UPDATE:
> I used a local proxy to compare the two requests sent by my application.
> It looks like the only difference is the content type. In cloud mode the
> payload for POSTing the document is sent as "application/javabin", while in
> standalone mode it's sent as "application/xml; charset=UTF-8". Everything
> else is the same. The first request results in 400 while the second results
> in 200.
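
The issue title distinguishes between an id that arrives as NULL and one that does not arrive at all. The difference can be sketched in plain Java. This is a minimal model, not Solr code: it assumes (the thread does not confirm it) that the UUID processor fills the field only when the key is absent from the document, so an id that is present with a null value is left alone and later rejected. The names `NullUniqueKeyDemo`, `fillIdIfAbsent`, `hasUsableId` and `dropNullId` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

public class NullUniqueKeyDemo {

    // Hypothetical stand-in for the UUID update processor. ASSUMPTION: the id
    // is generated only when the key is absent, not when it maps to null.
    public static void fillIdIfAbsent(Map<String, Object> doc) {
        if (!doc.containsKey("id")) {
            doc.put("id", UUID.randomUUID().toString());
        }
    }

    // Hypothetical stand-in for the "mandatory uniqueKey" check.
    public static boolean hasUsableId(Map<String, Object> doc) {
        return doc.get("id") != null;
    }

    // Client-side cleanup: turn "id present but null" into "id absent".
    public static void dropNullId(Map<String, Object> doc) {
        if (doc.containsKey("id") && doc.get("id") == null) {
            doc.remove("id");
        }
    }

    public static void main(String[] args) {
        // Case 1: the id field is not sent at all, so one is generated.
        Map<String, Object> absent = new LinkedHashMap<>();
        absent.put("title", "no id field");
        fillIdIfAbsent(absent);
        System.out.println("absent id accepted: " + hasUsableId(absent));

        // Case 2: the id field is sent with an explicit null value.
        Map<String, Object> explicitNull = new LinkedHashMap<>();
        explicitNull.put("id", null);
        explicitNull.put("title", "id sent as null");
        fillIdIfAbsent(explicitNull);
        System.out.println("null id accepted: " + hasUsableId(explicitNull));

        // Case 3: remove the null id before "sending", then process again.
        dropNullId(explicitNull);
        fillIdIfAbsent(explicitNull);
        System.out.println("after dropNullId accepted: " + hasUsableId(explicitNull));
    }
}
```

Under that assumption, one client-side way to avoid the failure would be to never put a null id on the document before sending it, for example by skipping the `addField` call on the `SolrInputDocument` when the value is null.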