[jira] [Commented] (SOLR-9493) uniqueKey generation fails if content POSTed as "application/javabin".

Alexandre Rafalovitch (JIRA) Mon, 12 Sep 2016 09:48:57 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15484595#comment-15484595
 ]


Alexandre Rafalovitch commented on SOLR-9493:
---------------------------------------------

I am not able to reproduce this using the following basic code against a 
single-node, single-shard cloud example:
{noformat}
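import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AddDocWithoutId {
    public static void main(String[] args) throws Exception {
        // ZooKeeper address of the local cloud example
        String zkHostString = "localhost:9983";
        CloudSolrClient solr = new CloudSolrClient.Builder().withZkHost(zkHostString).build();
        solr.setDefaultCollection("gettingstarted");

        // Note: no "id" field is set; the update chain is expected to generate it
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("fielda", "valuec");
        doc.addField("fieldb", "valued");

        solr.add(doc);
        solr.commit();
        solr.close();
    }
}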
{noformat}

If I enable full TRACE logging (literally setting root to TRACE in the Admin UI 
under Logging/Level), I see my javabin request coming in in *solr.log* (the 
file, not the Admin UI, which is limited to INFO level). 

However, my request seems to have different headers from yours. I get the 
following:
{noformat}
DEBUG - 2016-09-12 16:31:37.644; [   ] org.eclipse.jetty.server.Server; REQUEST on HttpChannelOverHttp@43c0621b{r=1,c=false,a=DISPATCHED,uri=//192.168.50.128:8983/solr/gettingstarted/update?wt=javabin&version=2}
POST /solr/gettingstarted/update HTTP/1.1
User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0
Content-Length: 70
Content-Type: application/javabin
Host: 192.168.50.128:8983
Connection: keep-alive
{noformat}

Yours seems to be chunked (multiple entries? Try just one) and to carry an 
authorization:basic header (are you doing anything with that?).
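
To make that header comparison less error-prone than eyeballing two captures, a small stdlib-only Java sketch along these lines could list the header names that appear in one request but not in the other. The `HeaderDiff` class and its method names are made up for illustration, and the `Transfer-Encoding` and `Authorization` lines in the usage example are assumed stand-ins for what the reporter's capture appears to contain, not copied from it.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Solr or SolrJ.
class HeaderDiff {
    /** Parses "Name: value" lines into a map keyed by lower-cased header name. */
    static Map<String, String> parse(String rawHeaders) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : rawHeaders.split("\\R")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // blank line or a request line without a colon
            }
            String name = line.substring(0, colon).trim();
            if (name.isEmpty() || name.contains(" ")) {
                continue; // e.g. "POST http://host/..." is a request line, not a header
            }
            headers.put(name.toLowerCase(Locale.ROOT), line.substring(colon + 1).trim());
        }
        return headers;
    }

    /** Returns, in sorted order, the header names present in a but absent from b. */
    static Set<String> onlyIn(Map<String, String> a, Map<String, String> b) {
        Set<String> names = new TreeSet<>(a.keySet());
        names.removeAll(b.keySet());
        return names;
    }

    public static void main(String[] args) {
        // Headers from the log excerpt above.
        String mine = "POST /solr/gettingstarted/update HTTP/1.1\n"
                + "User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0\n"
                + "Content-Length: 70\n"
                + "Content-Type: application/javabin\n"
                + "Host: 192.168.50.128:8983\n"
                + "Connection: keep-alive\n";
        // Assumed example of the other capture; replace with the real headers.
        String theirs = "Content-Type: application/javabin\n"
                + "Transfer-Encoding: chunked\n"
                + "Authorization: Basic (redacted)\n"
                + "Host: replica_1_host:8983\n";
        System.out.println(onlyIn(parse(theirs), parse(mine)));
    }
}
```

Swapping the two arguments of `onlyIn` gives the headers that only the working request carries, which is usually the quicker way to spot what the failing client adds or drops.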

Later in the log I see:
{noformat}
DEBUG - 2016-09-12 16:31:37.664; [c:gettingstarted s:shard1 r:core_node1 x:gettingstarted_shard1_replica1] org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor; PRE_UPDATE add{,id=da8f101d-b4ac-44c1-932e-1b8c03852c6b} {update.chain=add-unknown-fields-to-the-schema&df=_text_&wt=javabin&version=2}
{noformat}

This shows that the update chain was triggered and the id was assigned. Are 
you seeing anything similar?
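
If scanning a TRACE-level solr.log by hand is tedious, a rough stdlib-only Java sketch like the following could pull the generated id out of a PRE_UPDATE entry. The class name `PreUpdateIdFinder` is hypothetical, and the pattern assumes the `PRE_UPDATE add{,id=<uuid>}` layout seen in the excerpt above, which may differ in other Solr versions.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or SolrJ.
class PreUpdateIdFinder {
    // Assumes the "PRE_UPDATE add{,id=<uuid>}" layout from the log excerpt;
    // only a UUID-shaped value (8-4-4-4-12 hex digits) counts as an assigned id.
    private static final Pattern PRE_UPDATE_ADD = Pattern.compile(
            "PRE_UPDATE\\s+add\\{,id=([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\\}");

    /** Returns the first UUID-shaped id found in a PRE_UPDATE add entry, if any. */
    static Optional<String> findAssignedId(String logText) {
        Matcher m = PRE_UPDATE_ADD.matcher(logText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String entry = "LogUpdateProcessorFactory$LogUpdateProcessor; "
                + "PRE_UPDATE add{,id=da8f101d-b4ac-44c1-932e-1b8c03852c6b} "
                + "{update.chain=add-unknown-fields-to-the-schema&df=_text_&wt=javabin&version=2}";
        System.out.println(findAssignedId(entry).orElse("no generated id found"));
    }
}
```

An empty result for an update that reached the node would suggest the UUID processor did not run for that request, which is the case worth comparing against the failing setup.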

> uniqueKey generation fails if content POSTed as "application/javabin".
> ----------------------------------------------------------------------
>
>                 Key: SOLR-9493
>                 URL: https://issues.apache.org/jira/browse/SOLR-9493
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Yury Kartsev
>         Attachments: 200.png, 400.png, Screen Shot 2016-09-11 at 16.29.50 .png
>
>
> I have run into a weird issue where the same application code (using SolrJ) 
> fails to index a document without a unique key (which should be 
> auto-generated by SOLR) in SolrCloud, but succeeds in a standalone SOLR 
> instance (or even in cloud mode, when done from the web interface of one of 
> the replicas). The only differences are the clients (CloudSolrClient vs 
> HttpSolrClient) and the SOLR URLs (ZooKeeper hostname+port vs standalone SOLR 
> instance hostname and port). The failure is reported as 
> "org.apache.solr.client.solrj.SolrServerException: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
> Document is missing mandatory uniqueKey field: id".
> I am using SOLR 5.1. In cloud mode I have 1 shard and 3 replicas.
> After a lot of debugging and investigation (see below as well as my 
> [StackOverflow 
> post|http://stackoverflow.com/questions/39401792/uniquekey-generation-does-not-work-in-solrcloud-but-works-if-standalone])
>  I came to the conclusion that the difference between the failing and 
> succeeding calls is simply the content type of the POST requests. A local 
> proxy clearly shows that the request fails if the content is sent as 
> "application/javabin" (see attached screenshot with sensitive data removed) 
> and succeeds if the content is sent as "application/xml; charset=UTF-8" (see 
> attached screenshot with sensitive data removed).
> Would you be able to please assist?
> Thank you very much in advance!
> ------------------------
> Copying whole description and investigation here as well:
> ------------------------
> [Documentation|https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements]
>  states:{quote}Schema defaults and copyFields cannot be used to populate the 
> uniqueKey field. You can use UUIDUpdateProcessorFactory to have uniqueKey 
> values generated automatically.{quote}
> Therefore I have added my uniqueKey field to the schema:
> {code}
> <fieldType name="uuid" class="solr.UUIDField" indexed="true" />
> ...
> <field name="id" type="uuid" indexed="true" stored="true" required="true" />
> ...
> <uniqueKey>id</uniqueKey>
> {code}
> Then I have added updateRequestProcessorChain to my solrconfig:
> {code}
> <updateRequestProcessorChain name="uuid">
>     <processor class="solr.UUIDUpdateProcessorFactory">
>         <str name="fieldName">id</str>
>     </processor>
>     <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> {code}
> And made it the default for the UpdateRequestHandler:
> {code}
> <initParams path="/update/**">
>  <lst name="defaults">
>   <str name="update.chain">uuid</str>
>  </lst>
> </initParams>
> {code}
> Adding new documents with a null/absent id works fine both from the web 
> interface of one of the replicas and when using SOLR in standalone mode 
> (non-cloud) from my application. However, when I use SolrCloud and add a 
> document from my application (using CloudSolrClient from SolrJ), it fails with 
> "org.apache.solr.client.solrj.SolrServerException: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
> Document is missing mandatory uniqueKey field: id"
> All other operations like ping or search for documents work fine in either 
> mode (standalone or cloud).
> INVESTIGATION (i.e. more details):
> In standalone mode obviously update request is:{code}POST 
> standalone_host:port/solr/collection_name/update?wt=json{code}
> In SOLR cloud mode, when adding document from one replica's web interface, 
> update request is (found through inspecting the call made by web interface): 
> {code}POST 
> replica_host:port/solr/collection_name_shard1_replica_1/update?wt=json{code}
> In both these cases payload is something like:{code}{
>     "add": {
>         "doc": {
>                  .....
>         },
>         "boost": 1.0,
>         "overwrite": true,
>         "commitWithin": 1000
>     }
> }{code}
> When CloudSolrClient is used, the following happens (found through 
> debugging):
> Using ZK and some logic, a URL list of replicas is constructed that looks like 
> this:{code}[http://replica_1_host:port/solr/collection_name/,
>  http://replica_2_host:port/solr/collection_name/,
>  http://replica_3_host:port/solr/collection_name/]{code}
> This code is called:{code}LBHttpSolrClient.Req req = new 
> LBHttpSolrClient.Req(request, theUrlList);
> LBHttpSolrClient.Rsp rsp = lbClient.request(req);
> return rsp.getResponse();{code}
> Where the second line fails with the exception.
> Debugging the second line further, it ends up calling HttpClient.execute 
> (from HttpSolrClient.executeMethod) for:{code}POST 
> http://replica_1_host:port/solr/collection_name/update?wt=javabin&version=2 
> HTTP/1.1
> POST 
> http://replica_2_host:port/solr/collection_name/update?wt=javabin&version=2 
> HTTP/1.1
> POST 
> http://replica_3_host:port/solr/collection_name/update?wt=javabin&version=2 
> HTTP/1.1{code}
> And the very first request returns 400 Bad Request with replica 1 logging 
> "Document is missing mandatory uniqueKey field: id" in the logs.
> The funny thing is that when I execute the same request using POSTMAN (but 
> with JSON instead of binary payload), it works! Am I doing something wrong 
> here? I assume it's definitely something in the way of how the request is 
> made...
> UPDATE:
> I have used a local proxy to compare the two requests sent by my 
> application and understand what differs between them. It looks like the only 
> difference is the content type. In cloud mode the payload for POSTing the 
> document is sent as "application/javabin", while in standalone mode it is 
> sent as "application/xml; charset=UTF-8". Everything else is the same. The 
> first request results in a 400, while the second results in a 200.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
