[
https://issues.apache.org/jira/browse/CONNECTORS-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968678#comment-15968678
]
Karl Wright edited comment on CONNECTORS-1408 at 4/14/17 6:55 AM:
------------------------------------------------------------------
Looking carefully at the code, here are some more thoughts:
- If you are using a debugger, be careful not to back up and re-run the post
for the same document; wait for the next document instead. Re-running won't
work because the streams involved are closed after the first run-through. I
suspect this is why you saw 'missing content stream'.
- It sounds to me like the POST is in fact happening just fine, but Solr is
kicking it out. That wouldn't happen if the POST was malformed or had too long
a URI, because HttpClient wouldn't allow it.
- I can see no reason why isMultipart would not be 'true' in most
situations; it seems to be gated on the variable useMultiPartPost. The reason
that ModifiedHttpSolrClient even exists is so that we can set useMultiPartPost
to "true". I am sure it stays "true" too; I declared it "final" here and it
still compiles.
- Since we've been using multi-part all along, I have to conclude that the
reason we're getting the URI error is simply because the URI is too big even
when the POST is multipart.
- We can easily try to include the request metadata in the multipart fields, as
long as I know where it comes from. request.getParams()? or
request.getQueryParams()? If the metadata is found in request.getParams(),
then we should already be sending those parameters in the multipart data, so
the problem would have to be on the Solr side.
- If I change this, though, it's still possible that Solr won't be happy with
it. We'll have to try it and see.
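To make the difference concrete, here is a rough sketch (plain JDK only; it is
not the ModifiedHttpSolrClient or SolrJ code) comparing the two places the
parameters can end up: appended to the request URI, or carried as multipart
form fields in the body. The class name, the parameter names, and the 10000
character value are made up for illustration; the 8192 limit is the figure
from the Solr log in this ticket.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the ManifoldCF or SolrJ implementation.
final class RequestUriSketch {

    // Limit reported in the Solr log for this issue ("URI is too large >8192").
    static final int URI_LIMIT = 8192;

    // Variant 1: every parameter is URL-encoded and appended to the request URI.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                sb.append(sb.length() == 0 ? '?' : '&');
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"));
                sb.append('=');
                sb.append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    // Variant 2: every parameter becomes a multipart/form-data field in the body,
    // so the request URI stays the same length no matter how large the values are.
    static String toMultipartBody(Map<String, String> params, String boundary) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append("--").append(boundary).append("\r\n");
            sb.append("Content-Disposition: form-data; name=\"")
              .append(e.getKey()).append("\"\r\n");
            sb.append("\r\n");
            sb.append(e.getValue()).append("\r\n");
        }
        // Closing delimiter that ends the multipart body.
        sb.append("--").append(boundary).append("--\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        String url = "http://localhost:8983/solr/mycore";

        // Made-up parameters: one small field and one large "email content" field.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("subject", "hello world");
        params.put("content", new String(new char[10000]).replace('\0', 'x'));

        String uriWithParams = url + toQueryString(params);
        String body = toMultipartBody(params, "sketch-boundary");

        System.out.println("URI length, params in URL:  " + uriWithParams.length()
                + " (over limit: " + (uriWithParams.length() > URI_LIMIT) + ")");
        System.out.println("URI length, params in body: " + url.length()
                + " (over limit: " + (url.length() > URI_LIMIT) + ")");
        System.out.println("Multipart body length:      " + body.length());
    }
}
```

If the metadata really does travel in the first form, a single large field is
enough to push the URI past the limit; in the second form the URI stays short
and only the body grows.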
It's also possible to see exactly what is going on by enabling HTTP wire
logging in HttpClient. Then we can see the data being sent, and the URI too.
In logging.ini, you simply add a couple of lines to make this happen. Have a
look at:
https://hc.apache.org/httpcomponents-client-ga/logging.html
You will want to add:
{code}
log4j.logger.org.apache.http=DEBUG
log4j.logger.org.apache.http.wire=DEBUG
{code}
Please let me know if you can confirm my understanding, and determine for sure
whether the problem is on the Solr side or the ManifoldCF side. If on the Solr
side, you'll want to create a Solr ticket. Thanks!!
> Request-URI Too Long
> --------------------
>
> Key: CONNECTORS-1408
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1408
> Project: ManifoldCF
> Issue Type: Bug
> Components: Email connector, Solr 6.x component
> Affects Versions: ManifoldCF 2.6
> Reporter: Cihad Guzel
> Assignee: Karl Wright
> Fix For: ManifoldCF 2.7
>
>
> I ran an email connector job and followed "Simple History" in the UI. I see
> the following error:
> {code}
> Error from server at http://localhost:8983/solr/mycore: non ok status: 414,
> message:Request-URI Too Long
> {code}
> It is sent by Solr.
> Solr logs say:
> {code}
> HttpParser - URI is too large >8192
> {code}
> and
> {code}
> HttpParser - bad HTTP parsed: 414 for
> HttpChannelOverHttp@2b6931dd{r=0,c=false,a=IDLE,uri=null}
>
> {code}
> ManifoldCF ModifiedHttpSolrClient.java has following code:
> {code}
> // It is has one stream, it is the post body, put the params in the URL
> else {
> String pstr = toQueryString(wparams, false);
> HttpEntityEnclosingRequestBase postOrPut = SolrRequest.METHOD.POST ==
> request.getMethod() ?
> new HttpPost(url + pstr) : new HttpPut(url + pstr);
> {code}
> The "pstr" string is appended to the URL. It holds all of the Solr params,
> including the email content, so we get the "URI is too large" error when an
> email has large content.