Is useMultiPartPost=false being set in ManifoldCF, or in SolrJ?
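
For reference, this is how I read the branch in the HttpSolrServer excerpt quoted below, reduced to a standalone sketch. It is plain Java with no SolrJ dependency, the class and method names are made up for illustration, and it only models requests that carry at least one content stream, so treat it as an approximation of the quoted logic rather than the actual SolrJ code:

```java
// Hypothetical, simplified model of the branch quoted from HttpSolrServer.request().
// It does not call SolrJ; all names here are illustrative only.
final class PostPlacementSketch {

    enum Placement { MULTIPART_BODY, URL_QUERY_STRING }

    // Same boolean expression as the quoted excerpt, with the stream list
    // reduced to its size (0 stands in for a null list).
    static boolean isMultipart(boolean useMultiPartPost, int streamCount, boolean hasNullStreamName) {
        return (useMultiPartPost || streamCount > 1) && !hasNullStreamName;
    }

    // Where the request parameters end up when at least one content stream is attached.
    static Placement placementFor(boolean useMultiPartPost, int streamCount, boolean hasNullStreamName) {
        if (streamCount < 1) {
            throw new IllegalArgumentException("this sketch only models requests with a content stream");
        }
        return isMultipart(useMultiPartPost, streamCount, hasNullStreamName)
                ? Placement.MULTIPART_BODY
                : Placement.URL_QUERY_STRING;
    }

    public static void main(String[] args) {
        // The ManifoldCF case from this thread: one stream, useMultiPartPost=false.
        System.out.println(placementFor(false, 1, false));
        // Same request with the flag switched on.
        System.out.println(placementFor(true, 1, false));
    }
}
```

With a single content stream and the flag off, the parameters land on the URL, which matches the capture in the thread. With the flag on they would go into multipart body parts, provided no stream has a null name.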

On Mon, Dec 16, 2013 at 1:18 PM, Alessandro Benedetti <[email protected]> wrote:

> I have more details now, after a deep debugging :
>
> The CloudSolrServer triggers the LBHttpSolrServer
> lbServer.request(lbRequest).getResponse().
>
> The LBHttpSolrServer triggers the HttpSolrServer request(request).
>
> It's here that we build the httpPOST in this way :
>
>     boolean isMultipart = (this.useMultiPartPost || ( streams != null &&
>         streams.size() > 1 )) && !hasNullStreamName;
>
>     LinkedList<NameValuePair> postParams = new LinkedList<NameValuePair>();
>     ...
>     List<FormBodyPart> parts = new LinkedList<FormBodyPart>();
>     Iterator<String> iter = params.getParameterNamesIterator();
>     while (iter.hasNext()) {
>       String p = iter.next();
>       String[] vals = params.getParams(p);
>       if (vals != null) {
>         for (String v : vals) {
>           if (isMultipart) { // IMPORTANT
>             parts.add(new FormBodyPart(p, new StringBody(v,
>                 Charset.forName("UTF-8"))));
>           } else {
>             postParams.add(new BasicNameValuePair(p, v));
>           }
>         }
>       }
>     }
>     ...
>     }
>     // It is has one stream, it is the post body, put the params in the URL
>     else { // we finish in this case
>       String pstr = ClientUtils.toQueryString(params, false);
>       HttpPost post = new HttpPost(url + pstr);
>
> I checked that debugging Manifold the CloudSolrServer calls a
> LBHttpSolrServer that calls a HttpSolrServer with useMultiPartPost=false .
> Here we are with the problem.
> So at the moment we have evidence that the metadata field values are placed
> in the http header.
>
> Now, what's behind that ? A bug ? A decision to not use multiPartPost ?
> Any advice ?
>
> 2013/12/16 Raymond Wiker <[email protected]>
>
> > That looks distinctly odd: you have an HTTP POST request, but the
> > parameters are attached to the url, GET-style. It really makes no sense
> > to add parameters to the url when you have to use POST to carry the file
> > content --- but in the "simple post tool", that is exactly what they do.
> > My best guess is that they do it this way to avoid having to deal with
> > the complexities of multipart/form-data, and this might be acceptable in
> > a scenario where the number of parameters is so small that you run no
> > risk of overrunning the header size limit.
> >
> > It's possible that the SolrJ developers make the assumption that this is
> > safe; alternatively (and hopefully) there is a way of instructing SolrJ
> > to place all the parameters in the request body. If the first is the
> > case, you'll have to find a workaround (for example, increasing the
> > maximum header size in Jetty); In the second case, I guess that
> > ManifoldCF needs to setup SolrJ appropriately.
> >
> > On Mon, Dec 16, 2013 at 11:53 AM, Alessandro Benedetti <[email protected]> wrote:
> >
> > > There was an error in the previous mail, and some of the content is
> > > quoted and maybe not clear at a first glance, I report the most
> > > important part of the mail here :
> > >
> > > You can see that all the params are appended to the URL,so they will
> > > go in the Headers of the Http POST request, here you are :
> > >
> > > POST /solr/collection1/update/extract?literal.id=C+Movies%3A1025&literal.field2=value2&....&literal.fieldN=valueN&resource.name=Tom+Cruise&wt=javabin&version=2
> > >
> > > User-Agent Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
> > > Transfer-Encoding chunked
> > > Content-Type text/plain
> > > Host 10.0.1.16:8983
> > > Request Header Size : 5.99 KB (6133 bytes)
> > >
> > > Remember that is not my code, but Manifold 1.4.1 out of the box :
> > >
> > > org.apache.manifoldcf.agents.output.solr.HttpPoster
> > >
> > >     writeField(out,LITERAL+newFieldName,values);
> > >     // Write the commitWithin parameter
> > >     if (commitWithin != null)
> > >       writeField(out,COMMITWITHIN_METADATA,commitWithin);
> > >     contentStreamUpdateRequest.setParams(out);
> > >     contentStreamUpdateRequest.addContentStream(new
> > >         RepositoryDocumentStream(is,length,contentType,contentName));
> > >     contentStreamUpdateRequest.process(solrServer)
> > >
> > > Cheers
> > >
> > > 2013/12/16 Alessandro Benedetti <[email protected]>
> > >
> > > > 2013/12/16 Raymond Wiker <[email protected]>
> > > >
> > > >> On Mon, Dec 16, 2013 at 9:42 AM, Alessandro Benedetti <[email protected]> wrote:
> > > >>
> > > >> > > Do you have any means of capturing the entire http (POST)
> > > >> > > request? It could be that SolrJ is adding things to the header.
> > > >> >
> > > >> > I used Fiddler and Charles ( 2 softwares for monitoring http
> > > >> > requests). All the params added to the ContentStreamUpdateRequest
> > > >> > appear to be in the header.
> > > >> > Nothing else added by SolrJ.
> > > >>
> > > >> Ok. Would it be possible for you to generate a set of captures that
> > > >> could be shared? I'd be happy to take a look.
> > > >
> > > > Absolutely yes,you can see that all the params are appended to the
> > > > URL,so they will go in the Headers of the Http POST request, here
> > > > you are :
> > > >
> > > > POST /solr/collection1/update/extract?literal.id=C+Movies%3A1025&literal.field2=value2&....&literal.fieldN=valueN&resource.name=Tom+Cruise&wt=javabin&version=2
> > > >
> > > > User-Agent Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
> > > > Transfer-Encoding chunked
> > > > Content-Type text/plain
> > > > Host 10.0.1.16:8983
> > > > Request Header Size : 5.99 KB (6133 bytes)
> > > >
> > > > Remember that is not my code, but Manifold 1.4.1 out of the box :
> > > >
> > > > org.apache.manifoldcf.agents.output.solr.HttpPoster
> > > >
> > > >     writeField(out,LITERAL+newFieldName,values);
> > > >     // Write the commitWithin parameter
> > > >     if (commitWithin != null)
> > > >       writeField(out,COMMITWITHIN_METADATA,commitWithin);
> > > >     contentStreamUpdateRequest.setParams(out);
> > > >     contentStreamUpdateRequest.addContentStream(new
> > > >         RepositoryDocumentStream(is,length,contentType,contentName));
> > > >     contentStreamUpdateRequest.process(solrServer)
> > > >
> > > >> > > What container are you running Solr under? Are you accessing
> > > >> > > Solr directly, or via a proxy?
> > > >> >
> > > >> > Direct access through a SolrCloudServer configured on a zookeper
> > > >> > ensemble of 3 zk.
> > > >> > Solr are running on Jetty.
> > > >
> > > > --
> > > > --------------------------
> > > >
> > > > Benedetti Alessandro
> > > > Visiting card : http://about.me/alessandro_benedetti
> > > >
> > > > "Tyger, tyger burning bright
> > > > In the forests of the night,
> > > > What immortal hand or eye
> > > > Could frame thy fearful symmetry?"
> > > >
> > > > William Blake - Songs of Experience -1794 England
