Ok, let's continue there.

Cheers


2013/12/16 Karl Wright <[email protected]>

> Raymond: Right, it looks like SolrJ 4.4 and after includes the fix
> necessary for us to ditch my local hack.  But we still need a hacked
> version of CloudSolrServer and a new SOLR ticket to fix the SolrCloud
> version of this problem.
>
> I suggest we continue the discussion thread on the CONNECTORS-839 ticket.
>
> Karl
>
>
>
> On Mon, Dec 16, 2013 at 7:36 AM, Raymond Wiker <[email protected]> wrote:
>
> > See also:
> >
> > https://issues.apache.org/jira/browse/SOLR-4358
> >
> > https://issues.apache.org/jira/browse/CONNECTORS-674
> >
> >
> > On Mon, Dec 16, 2013 at 1:34 PM, Karl Wright <[email protected]> wrote:
> >
> > > Hi Alessandro,
> > >
> > > ManifoldCF wound up including a hacked version of HttpSolrServer because
> > > the Solr version's support for multipart post was broken.  I did send a
> > > patch to Solr/Lucene but I lost track of whether that got committed or not,
> > > and whether it has been released yet.  But that is immaterial; it appears
> > > that the SolrCloud implementation turns off multipart too - and that could
> > > well be because of the breakage I was describing earlier.
> > >
> > > ManifoldCF needs to use multipart post for more reasons than just that:
> > > Solr actually treats multipart post fields differently in some respects
> > > than URL fields.  So we need to find a solution to this problem.
> > >
> > > I've created a ticket: CONNECTORS-839.
> > >
> > > Karl
> > >
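To make the distinction between URL fields and multipart post fields concrete, here is a minimal sketch in plain Java with no SolrJ or HttpClient dependency. It renders the same parameters once as a URL query string and once as a multipart/form-data body. The class name, method names and boundary value are hypothetical and chosen only for illustration; this is not how ManifoldCF or SolrJ actually build their requests.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of ManifoldCF or SolrJ.
final class ParamPlacementSketch {

  // Renders the parameters GET-style, as they would be appended to the URL.
  static String asQueryString(Map<String, String> params) {
    StringBuilder sb = new StringBuilder("?");
    boolean first = true;
    for (Map.Entry<String, String> e : params.entrySet()) {
      if (!first) {
        sb.append('&');
      }
      sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
      first = false;
    }
    return sb.toString();
  }

  // Renders the parameters as multipart/form-data parts, which travel in the
  // request body and so do not count against the request header size.
  static String asMultipartBody(Map<String, String> params, String boundary) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : params.entrySet()) {
      sb.append("--").append(boundary).append("\r\n");
      sb.append("Content-Disposition: form-data; name=\"")
        .append(e.getKey()).append("\"\r\n");
      sb.append("Content-Type: text/plain; charset=UTF-8\r\n\r\n");
      sb.append(e.getValue()).append("\r\n");
    }
    sb.append("--").append(boundary).append("--\r\n");
    return sb.toString();
  }

  private static String encode(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException ex) {
      // UTF-8 is always available on a conforming JVM.
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    Map<String, String> params = new LinkedHashMap<String, String>();
    params.put("literal.id", "C Movies:1025");
    params.put("resource.name", "Tom Cruise");
    System.out.println(asQueryString(params));
    System.out.println(asMultipartBody(params, "example-boundary"));
  }
}
```

In the first form every additional metadata field lengthens the request line; in the second form the request line stays the same length no matter how many fields are sent.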
> > >
> > >
> > > On Mon, Dec 16, 2013 at 7:18 AM, Alessandro Benedetti <
> > > [email protected]> wrote:
> > >
> > > > I have more details now, after some deeper debugging:
> > > >
> > > > CloudSolrServer calls LBHttpSolrServer via
> > > > lbServer.request(lbRequest).getResponse().
> > > >
> > > > LBHttpSolrServer in turn calls HttpSolrServer's request(request).
> > > >
> > > > This is where the HTTP POST is built, as follows:
> > > >
> > > > boolean isMultipart = (this.useMultiPartPost || ( streams != null &&
> > > >     streams.size() > 1 )) && !hasNullStreamName;
> > > >
> > > > LinkedList<NameValuePair> postParams = new LinkedList<NameValuePair>();
> > > > ...
> > > >   List<FormBodyPart> parts = new LinkedList<FormBodyPart>();
> > > >   Iterator<String> iter = params.getParameterNamesIterator();
> > > >   while (iter.hasNext()) {
> > > >     String p = iter.next();
> > > >     String[] vals = params.getParams(p);
> > > >     if (vals != null) {
> > > >       for (String v : vals) {
> > > >         if (isMultipart) { // IMPORTANT
> > > >           parts.add(new FormBodyPart(p, new StringBody(v,
> > > >               Charset.forName("UTF-8"))));
> > > >         } else {
> > > >           postParams.add(new BasicNameValuePair(p, v));
> > > >         }
> > > >       }
> > > >     }
> > > >   }
> > > >   ...
> > > > }
> > > > // It is has one stream, it is the post body, put the params in the URL
> > > > else { // we finish in this case
> > > >   String pstr = ClientUtils.toQueryString(params, false);
> > > >   HttpPost post = new HttpPost(url + pstr);
> > > >
> > > > While debugging ManifoldCF I verified that CloudSolrServer calls an
> > > > LBHttpSolrServer, which calls an HttpSolrServer with
> > > > useMultiPartPost=false.
> > > > That is where the problem lies.
> > > > So at the moment we have evidence that the metadata field values are
> > > > placed in the URL, and therefore in the HTTP request header.
> > > >
> > > > Now, what's behind that? A bug? A decision not to use multiPartPost?
> > > > Any advice?
> > > >
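The decision described in the excerpt above can be isolated as a small predicate. This is a minimal sketch that mirrors only the boolean expression quoted from HttpSolrServer; the class name and the use of a nullable Integer to stand in for the "streams != null" check are assumptions made for illustration, and the elided parts of the original method are not reproduced.

```java
// Hypothetical illustration class; it mirrors only the quoted expression.
final class MultipartDecisionSketch {

  // streamCount == null models "streams == null" in the quoted code.
  static boolean isMultipart(boolean useMultiPartPost,
                             Integer streamCount,
                             boolean hasNullStreamName) {
    return (useMultiPartPost || (streamCount != null && streamCount > 1))
        && !hasNullStreamName;
  }

  public static void main(String[] args) {
    // The case reported in this thread: one content stream and
    // useMultiPartPost=false, so the request is not multipart.
    System.out.println(isMultipart(false, 1, false));
    // The same request with useMultiPartPost=true would be multipart.
    System.out.println(isMultipart(true, 1, false));
  }
}
```

With a single content stream the flag alone decides the outcome, which is consistent with the parameters ending up in the URL when useMultiPartPost is false.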
> > > >
> > > >
> > > > 2013/12/16 Raymond Wiker <[email protected]>
> > > >
> > > > > That looks distinctly odd: you have an HTTP POST request, but the
> > > > > parameters are attached to the URL, GET-style. It really makes no sense
> > > > > to add parameters to the URL when you have to use POST to carry the file
> > > > > content, yet in the "simple post tool" that is exactly what they do. My
> > > > > best guess is that they do it this way to avoid having to deal with the
> > > > > complexities of multipart/form-data, and this might be acceptable in a
> > > > > scenario where the number of parameters is so small that you run no risk
> > > > > of overrunning the header size limit.
> > > > >
> > > > > It's possible that the SolrJ developers make the assumption that this is
> > > > > safe; alternatively (and hopefully) there is a way of instructing SolrJ
> > > > > to place all the parameters in the request body. If the first is the
> > > > > case, you'll have to find a workaround (for example, increasing the
> > > > > maximum header size in Jetty); in the second case, I guess that
> > > > > ManifoldCF needs to set up SolrJ appropriately.
> > > > >
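The header size concern above comes down to simple arithmetic: the request line, which includes the URL and its query string, is counted together with the other headers against the container's limit. Here is a minimal sketch of that check in plain Java. The class and method names are hypothetical, and the limit is passed in as a parameter because the actual value depends on how the container is configured.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Jetty, Solr or ManifoldCF.
final class RequestLineBudgetSketch {

  // Size in bytes of "METHOD path?query HTTP/1.1" plus the trailing CRLF.
  static int requestLineBytes(String method, String pathAndQuery) {
    String line = method + " " + pathAndQuery + " HTTP/1.1\r\n";
    return line.getBytes(StandardCharsets.US_ASCII).length;
  }

  // True if the request line plus the remaining header bytes fit in the limit.
  static boolean fitsWithin(int requestLineBytes, int otherHeaderBytes, int limitBytes) {
    return requestLineBytes + otherHeaderBytes <= limitBytes;
  }

  public static void main(String[] args) {
    // Shortened form of the request seen in this thread, with one literal field.
    int size = requestLineBytes("POST",
        "/solr/collection1/update/extract?literal.id=C+Movies%3A1025");
    // Both the 200 bytes of other headers and the 4096-byte limit are
    // assumed example values, not measured ones.
    System.out.println(size + " bytes, fits: " + fitsWithin(size, 200, 4096));
  }
}
```

Each extra literal field adds its encoded name and value to the request line, so a document with many metadata fields can exceed the limit even though the file content itself travels in the body.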
> > > > >
> > > > >
> > > > > On Mon, Dec 16, 2013 at 11:53 AM, Alessandro Benedetti <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > There was an error in the previous mail, and some of the content is
> > > > > > quoted and may not be clear at first glance, so I repeat the most
> > > > > > important part of the mail here:
> > > > > >
> > > > > > You can see that all the params are appended to the URL, so they will
> > > > > > go in the headers of the HTTP POST request. Here it is:
> > > > > >
> > > > > > POST /solr/collection1/update/extract?literal.id=C+Movies%3A1025&literal.field2=value2&....&literal.fieldN=valueN&resource.name=Tom+Cruise&wt=javabin&version=2
> > > > > >
> > > > > > User-Agent Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
> > > > > > Transfer-Encoding chunked
> > > > > > Content-Type text/plain
> > > > > > Host 10.0.1.16:8983
> > > > > > Request Header Size : 5.99 KB (6133 bytes)
> > > > > >
> > > > > > Remember that this is not my code, but ManifoldCF 1.4.1 out of the box:
> > > > > >
> > > > > > org.apache.manifoldcf.agents.output.solr.HttpPoster
> > > > > >
> > > > > > writeField(out,LITERAL+newFieldName,values);
> > > > > > // Write the commitWithin parameter
> > > > > > if (commitWithin != null)
> > > > > >   writeField(out,COMMITWITHIN_METADATA,commitWithin);
> > > > > > contentStreamUpdateRequest.setParams(out);
> > > > > > contentStreamUpdateRequest.addContentStream(new
> > > > > >   RepositoryDocumentStream(is,length,contentType,contentName));
> > > > > > contentStreamUpdateRequest.process(solrServer);
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > >
> > > > > > 2013/12/16 Alessandro Benedetti <[email protected]>
> > > > > >
> > > > > > > 2013/12/16 Raymond Wiker <[email protected]>
> > > > > > >
> > > > > > >> On Mon, Dec 16, 2013 at 9:42 AM, Alessandro Benedetti <
> > > > > > >> [email protected]> wrote:
> > > > > > >>
> > > > > > >
> > > > > > >> > Do you have any means of capturing the entire http (POST)
> > > request?
> > > > > It
> > > > > > >> > could
> > > > > > >> > > be that SolrJ is adding things to the header.
> > > > > > >> >
> > > > > > > >> > I used Fiddler and Charles (two tools for monitoring HTTP
> > > > > > > >> > requests). All the params added to the ContentStreamUpdateRequest
> > > > > > > >> > appear to be in the header.
> > > > > > > >> > Nothing else is added by SolrJ.
> > > > > > >> >
> > > > > > >>
> > > > > > > >> Ok. Would it be possible for you to generate a set of captures that
> > > > > > > >> could be shared? I'd be happy to take a look.
> > > > > > >>
> > > > > > >
> > > > > > > > Absolutely yes, you can see that all the params are appended to the
> > > > > > > > URL, so they will go in the headers of the HTTP POST request. Here
> > > > > > > > it is:
> > > > > > >
> > > > > > > > POST /solr/collection1/update/extract?literal.id=C+Movies%3A1025&literal.field2=value2&....&literal.fieldN=valueN&resource.name=Tom+Cruise&wt=javabin&version=2
> > > > > > > >
> > > > > > > > User-Agent Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
> > > > > > > Transfer-Encoding chunked
> > > > > > > Content-Type text/plain
> > > > > > > Host 10.0.1.16:8983
> > > > > > > Request Header Size : 5.99 KB (6133 bytes)
> > > > > > >
> > > > > > > > Remember that this is not my code, but ManifoldCF 1.4.1 out of the
> > > > > > > > box:
> > > > > > > >
> > > > > > > > org.apache.manifoldcf.agents.output.solr.HttpPoster
> > > > > > > >
> > > > > > > > writeField(out,LITERAL+newFieldName,values);
> > > > > > > > // Write the commitWithin parameter
> > > > > > > > if (commitWithin != null)
> > > > > > > >   writeField(out,COMMITWITHIN_METADATA,commitWithin);
> > > > > > > > contentStreamUpdateRequest.setParams(out);
> > > > > > > > contentStreamUpdateRequest.addContentStream(new
> > > > > > > >   RepositoryDocumentStream(is,length,contentType,contentName));
> > > > > > > > contentStreamUpdateRequest.process(solrServer);
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >>
> > > > > > >> > >
> > > > > > > >> > > What container are you running Solr under? Are you accessing
> > > > > > > >> > > Solr directly, or via a proxy?
> > > > > > > >> >
> > > > > > > >> > Direct access through a CloudSolrServer configured on a ZooKeeper
> > > > > > > >> > ensemble of 3 nodes.
> > > > > > > >> > The Solr instances are running on Jetty.
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > --------------------------
> > > > > > >
> > > > > > > Benedetti Alessandro
> > > > > > > Visiting card : http://about.me/alessandro_benedetti
> > > > > > >
> > > > > > > "Tyger, tyger burning bright
> > > > > > > In the forests of the night,
> > > > > > > What immortal hand or eye
> > > > > > > Could frame thy fearful symmetry?"
> > > > > > >
> > > > > > > William Blake - Songs of Experience -1794 England
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > --------------------------
> > > > > > >
> > > > > > > Benedetti Alessandro
> > > > > > > Visiting card : http://about.me/alessandro_benedetti
> > > > > > >
> > > > > > > "Tyger, tyger burning bright
> > > > > > > In the forests of the night,
> > > > > > > What immortal hand or eye
> > > > > > > Could frame thy fearful symmetry?"
> > > > > > >
> > > > > > > William Blake - Songs of Experience -1794 England
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
