[
https://issues.apache.org/jira/browse/SOLR-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15484595#comment-15484595
]
Alexandre Rafalovitch commented on SOLR-9493:
---------------------------------------------
I am not able to reproduce this using the following basic code against a
single-node, single-shard cloud example:
{noformat}
String zkHostString = "localhost:9983";
CloudSolrClient solr = new CloudSolrClient.Builder().withZkHost(zkHostString).build();
solr.setDefaultCollection("gettingstarted");
SolrInputDocument doc = new SolrInputDocument();
doc.addField("fielda", "valuec");
doc.addField("fieldb", "valued");
solr.add(doc);
solr.commit();
solr.close();
{noformat}
If I enable full TRACE logging (literally setting root to TRACE in the Admin UI
under Logging/Level), I see my javabin request coming in in *solr.log*
(the file, not the Admin UI, which has an INFO level limit).
However, my request seems to have different headers from yours. I get the
following:
{noformat}
DEBUG - 2016-09-12 16:31:37.644; [ ] org.eclipse.jetty.server.Server; REQUEST
on
HttpChannelOverHttp@43c0621b{r=1,c=false,a=DISPATCHED,uri=//192.168.50.128:8983/solr/gettingstarted/update?wt=javabin&version=2}
POST /solr/gettingstarted/update HTTP/1.1
User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0
Content-Length: 70
Content-Type: application/javabin
Host: 192.168.50.128:8983
Connection: keep-alive
{noformat}
Yours seems to be chunked (multiple entries? Try just one) and to carry an
Authorization: Basic header (are you doing anything with that?).
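As a rough way to check a captured request for those two differences, a small helper along these lines could parse the header lines copied from a proxy capture or from the Jetty DEBUG output. This is only a sketch: the class and method names are hypothetical, and it assumes chunking shows up as a "Transfer-Encoding: chunked" header and basic auth as an "Authorization" header whose value starts with "Basic".

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of SolrJ: inspects raw HTTP request headers.
final class RequestHeaderCheck {

    // Parses "Name: value" lines into a map keyed by lower-cased header name.
    // Lines whose "name" part is empty or contains a space (request line,
    // log prefixes, blank lines) are skipped.
    static Map<String, String> parseHeaders(String rawHeaders) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : rawHeaders.split("\\R")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue;
            }
            String name = line.substring(0, colon).trim();
            if (name.isEmpty() || name.contains(" ")) {
                continue;
            }
            headers.put(name.toLowerCase(Locale.ROOT), line.substring(colon + 1).trim());
        }
        return headers;
    }

    // True if the request declares chunked transfer encoding.
    static boolean isChunked(Map<String, String> headers) {
        return headers.getOrDefault("transfer-encoding", "")
                .toLowerCase(Locale.ROOT)
                .contains("chunked");
    }

    // True if the request carries an Authorization header using the Basic scheme.
    static boolean hasBasicAuth(Map<String, String> headers) {
        return headers.getOrDefault("authorization", "")
                .regionMatches(true, 0, "Basic ", 0, 6);
    }
}
```

Run against the headers shown above (Content-Length: 70, no Transfer-Encoding, no Authorization), both checks come back false, which is the difference being pointed out.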
Later in the log I see:
{noformat}
DEBUG - 2016-09-12 16:31:37.664; [c:gettingstarted s:shard1 r:core_node1
x:gettingstarted_shard1_replica1]
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor;
PRE_UPDATE add{,id=da8f101d-b4ac-44c1-932e-1b8c03852c6b}
{update.chain=add-unknown-fields-to-the-schema&df=_text_&wt=javabin&version=2}
{noformat}
This shows that the chain has triggered and the id has been assigned. Are you
seeing anything similar to that?
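If it helps to search a large solr.log for such entries, a sketch like the following could pull the assigned ids out of the PRE_UPDATE lines. The class and method names are hypothetical, and the pattern is only assumed from the single log line quoted above, so other Solr versions or log layouts may need a different expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: scans log text for PRE_UPDATE add entries.
final class PreUpdateIdScanner {

    // Matches the "PRE_UPDATE add{,id=<value>}" fragment seen in the log excerpt
    // and captures the id value up to the next comma, brace or whitespace.
    private static final Pattern PRE_UPDATE_ID =
            Pattern.compile("PRE_UPDATE add\\{[^}]*?\\bid=([^,}\\s]+)");

    // Returns every id found in PRE_UPDATE add entries, in order of appearance.
    static List<String> extractIds(String logText) {
        List<String> ids = new ArrayList<>();
        Matcher m = PRE_UPDATE_ID.matcher(logText);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    // Lenient check that the value parses as a UUID (UUID.fromString accepts
    // some non-canonical forms, so this is a sanity check, not strict validation).
    static boolean isUuid(String value) {
        try {
            UUID.fromString(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

An empty result for a request that did reach the update handler would suggest the chain did not assign an id before the add was logged.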
> uniqueKey generation fails if content POSTed as "application/javabin".
> ----------------------------------------------------------------------
>
> Key: SOLR-9493
> URL: https://issues.apache.org/jira/browse/SOLR-9493
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Yury Kartsev
> Attachments: 200.png, 400.png, Screen Shot 2016-09-11 at 16.29.50 .png
>
>
> I have hit a weird issue where the same application code (using SolrJ) fails
> to index a document without a unique key (which should be auto-generated by
> SOLR) in SolrCloud, but succeeds in indexing it in a standalone SOLR instance
> (or even in cloud mode, from the web interface of one of the replicas). The
> only differences are the client (CloudSolrClient vs HttpSolrClient) and the
> SOLR URL (ZooKeeper hostname+port vs standalone SOLR instance hostname and
> port).
> Failure is seen as "org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Document is missing mandatory uniqueKey field: id".
> I am using SOLR 5.1. In cloud mode I have 1 shard and 3 replicas.
> After a lot of debugging and investigation (see below as well as my
> [StackOverflow
> post|http://stackoverflow.com/questions/39401792/uniquekey-generation-does-not-work-in-solrcloud-but-works-if-standalone])
> I came to the conclusion that the difference between the failing and
> succeeding calls is simply the content type of the POST requests. A local
> proxy clearly shows that the request fails if the content is sent as
> "application/javabin" (see attached screenshot with sensitive data removed)
> and succeeds if the content is sent as "application/xml; charset=UTF-8" (see
> attached screenshot with sensitive data removed).
> Would you be able to please assist?
> Thank you very much in advance!
> ------------------------
> Copying whole description and investigation here as well:
> ------------------------
> [Documentation|https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements]
> states:{quote}Schema defaults and copyFields cannot be used to populate the
> uniqueKey field. You can use UUIDUpdateProcessorFactory to have uniqueKey
> values generated automatically.{quote}
> Therefore I have added my uniqueKey field to the schema:{code}<fieldType
> name="uuid" class="solr.UUIDField" indexed="true" />
> ...
> <field name="id" type="uuid" indexed="true" stored="true" required="true" />
> ...
> <uniqueKey>id</uniqueKey>{code}Then I have added updateRequestProcessorChain
> to my solrconfig:{code}<updateRequestProcessorChain name="uuid">
> <processor class="solr.UUIDUpdateProcessorFactory">
> <str name="fieldName">id</str>
> </processor>
> <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>{code}And made it the default for the
> UpdateRequestHandler:{code}<initParams path="/update/**">
> <lst name="defaults">
> <str name="update.chain">uuid</str>
> </lst>
> </initParams>{code}
> Adding new documents with a null/absent id works fine both from the web
> interface of one of the replicas and from my application when using SOLR in
> standalone mode (non-cloud). However, when I use SolrCloud and add a document
> from my application (using CloudSolrClient from SolrJ), it fails with
> "org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Document is missing mandatory uniqueKey field: id"
> All other operations like ping or search for documents work fine in either
> mode (standalone or cloud).
> INVESTIGATION (i.e. more details):
> In standalone mode the update request is:{code}POST
> standalone_host:port/solr/collection_name/update?wt=json{code}
> In SOLR cloud mode, when adding a document from one replica's web interface,
> the update request is (found by inspecting the call made by the web interface):
> {code}POST
> replica_host:port/solr/collection_name_shard1_replica_1/update?wt=json{code}
> In both these cases payload is something like:{code}{
> "add": {
> "doc": {
> .....
> },
> "boost": 1.0,
> "overwrite": true,
> "commitWithin": 1000
> }
> }{code}
> When CloudSolrClient is used, the following happens (found through
> debugging):
> Using ZK and some logic, a list of replica URLs is constructed that looks like
> this:{code}[http://replica_1_host:port/solr/collection_name/,
> http://replica_2_host:port/solr/collection_name/,
> http://replica_3_host:port/solr/collection_name/]{code}
> This code is called:{code}LBHttpSolrClient.Req req = new LBHttpSolrClient.Req(request, theUrlList);
> LBHttpSolrClient.Rsp rsp = lbClient.request(req);
> return rsp.getResponse();{code}
> Where the second line fails with the exception.
> Debugging the second line further, it ends up calling HttpClient.execute
> (from HttpSolrClient.executeMethod) for:{code}POST
> http://replica_1_host:port/solr/collection_name/update?wt=javabin&version=2
> HTTP/1.1
> POST
> http://replica_2_host:port/solr/collection_name/update?wt=javabin&version=2
> HTTP/1.1
> POST
> http://replica_3_host:port/solr/collection_name/update?wt=javabin&version=2
> HTTP/1.1{code}
> And the very first request returns 400 Bad Request with replica 1 logging
> "Document is missing mandatory uniqueKey field: id" in the logs.
> The funny thing is that when I execute the same request using POSTMAN (but
> with JSON instead of binary payload), it works! Am I doing something wrong
> here? I assume it's something in the way the request is
> made...
> UPDATE:
> I used a local proxy to compare the 2 requests sent by my application. It
> looks like the only difference is the content type. In cloud mode the payload
> for POSTing the document is sent as "application/javabin", while in standalone
> mode it's sent as "application/xml; charset=UTF-8". Everything else is the
> same. The first request results in 400 while the second results in 200.