[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103678#comment-13103678 ]

Karl Wright commented on CONNECTORS-254:

The --data-binary switch to curl causes it to use POST. Here's a bit of the manpage:

--data-binary
       (HTTP) This posts data exactly as specified with no extra processing whatsoever.

So, what happens when you use curl to post your zero-length file? Do you get back a 400 response? Try using the curl -vvv switch to see what it is doing.

If it comes back with a 200 OK response and not a 400, then either you or I should try to do the same thing while Wireshark is capturing packets. If you are using a Linux system, you may instead want to use tcpdump to do the capture, and then examine the capture with Wireshark (on Windows).

If this is too confusing, now that I have a test case I can try to do this later today.

> Bad request when posting 0 byte file to Solr
>
> Key: CONNECTORS-254
> URL: https://issues.apache.org/jira/browse/CONNECTORS-254
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
> Reporter: Shinichiro Abe
> Priority: Minor
> Fix For: ManifoldCF 0.3
> Attachments: CONNECTORS-254-1.patch, sample0byte.zip
>
> It seems that httpposter causes a bad request when posting a 0 byte file.
> The Solr log says the below: "missing content stream". Status code is 400.
> On the other hand, when using the Solr request handler without MCF, this exception is not thrown and the 0 byte file is indexed normally.
>
> 2011/09/13 12:30:40 org.apache.solr.core.SolrCore execute
> ???: [] webapp=/solr path=/update/extract params={literal.id=file:/Users/abe/Desktop/1/no-content/no-content.txt&literal.uri=/Users/abe/Desktop/1/no-content/no-content.txt} status=400 QTime=367
> 2011/09/13 12:30:40 org.apache.solr.common.SolrException log
> ?v???I: org.apache.solr.common.SolrException: missing content stream
>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:62)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103670#comment-13103670 ]

Shinichiro Abe commented on CONNECTORS-254:

I'm sorry. I modified post.sh and posted a 0 byte file, but the exception isn't thrown in this case. Is this a GET?

---
# Post each file named on the command line to Solr's extract handler.
URL='http://localhost:8983/solr/update/extract?literal.id=1'

for f in "$@"; do
  # --data-binary sends the file bytes unmodified.
  curl "$URL" --data-binary @"$f" -H 'Content-type:application/octet-stream'
done
---
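To see which status such a post returns for an empty file, a minimal sketch along the lines of post.sh might look like the following. It assumes a Solr instance at the same local URL used in the thread; the function names make_empty_file and post_status are hypothetical helpers, not part of Solr or ManifoldCF.

```shell
#!/bin/sh
# Assumed endpoint, taken from the post.sh snippet in this thread.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/update/extract?literal.id=1}"

# Hypothetical helper: create a zero-byte file and print its path.
make_empty_file() {
  f=$(mktemp) || return 1
  : > "$f"
  printf '%s\n' "$f"
}

# Hypothetical helper: POST the file unmodified and print only the HTTP status.
# -s hides progress output, -o discards the response body,
# -w '%{http_code}' prints the status code returned by the server.
post_status() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H 'Content-type:application/octet-stream' \
    --data-binary @"$1" "$SOLR_URL"
}
```

With Solr running, `post_status "$(make_empty_file)"` prints the status code for a zero-byte POST; replacing `-s -o /dev/null` with `-v` shows the request line and headers curl actually sends.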
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103639#comment-13103639 ]

Karl Wright commented on CONNECTORS-254:

The curl command you have described is not even a POST; it's a GET. The document itself is not being sent, just the file name. The stream.file argument may well be used by Solr on the update handler side to open the file directly. That's not going to work with ManifoldCF, though, because there's no guarantee of a shared file system between ManifoldCF and the Solr instance.
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103632#comment-13103632 ]

Shinichiro Abe commented on CONNECTORS-254:

I agree. My approach was hasty. I'll examine it again (especially adding a space).

At least the following posts normally to Solr; the exception isn't thrown:

curl "http://localhost:8983/solr/update/extract?literal.id=1&stream.file=/path/to/0bytefile&commit=true"

I want httpposter to behave the same way. What do you think about this?
[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103622#comment-13103622 ]

Karl Wright commented on CONNECTORS-202:

r1170174 for the documentation update.

> SOLR connector suport for commitWithin
>
> Key: CONNECTORS-202
> URL: https://issues.apache.org/jira/browse/CONNECTORS-202
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 0.2, ManifoldCF 0.3
> Reporter: Jan Høydahl
> Assignee: Karl Wright
> Labels: commit
> Fix For: ManifoldCF 0.4
> Attachments: CONNECTORS-202.patch
>
> The output connection must support commitWithin (http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22) in addition to sending a commit() at the end of a job.
> This allows for efficient handling of commits on the Solr side.
> The parameter should ideally be configurable per job. In that way you could say that for "Important job" commitWithin=10s while for "Big crawl job", commitWithin=600s.
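As a rough illustration of the per-job values mentioned in the issue, the sketch below builds an update URL carrying a commitWithin request parameter. It assumes the value is given to Solr in milliseconds, so the 10s and 600s figures are converted; the function name commit_within_url and the base URL are hypothetical, and whether a given Solr version accepts commitWithin as a request parameter on its update handler should be checked against that version's documentation.

```shell
#!/bin/sh
# Hypothetical helper: print BASE_URL with a commitWithin parameter appended.
# $1 = base update URL (assumed to have no query string yet)
# $2 = desired commit window in whole seconds
# Assumption: Solr expects commitWithin in milliseconds.
commit_within_url() {
  base=$1
  seconds=$2
  printf '%s?commitWithin=%s\n' "$base" "$((seconds * 1000))"
}
```

For the two jobs in the description, `commit_within_url http://localhost:8983/solr/update 10` would serve the "Important job" and the same call with 600 the "Big crawl job".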
[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103589#comment-13103589 ]

Karl Wright commented on CONNECTORS-202:

Looks fine. I'll commit it and update the site this evening.
[jira] [Updated] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl updated CONNECTORS-202:

Attachment: CONNECTORS-202.patch

Proposed end user documentation update for the update parameters tab, including examples for commitWithin and update.chain.
[RESULT][VOTE] Release ManifoldCF 0.3-incubating, RC1
Three +1's. >72 hours. Vote passes!

Karl

On Mon, Sep 12, 2011 at 9:52 PM, Shinichiro Abe wrote:
> +1
>
> The JCIFS Connector and "ant test" work fine!
>
> Shinichiro Abe
>
> On 2011/09/13, at 10:05, Karl Wright wrote:
>
>> Thanks!
>>
>> We need one more binding +1. Shinichiro? Simon? Erlend? Tommaso?
>>
>> Karl
>>
>> On Mon, Sep 12, 2011 at 12:06 PM, Piergiorgio Lucidi wrote:
>>> +1
>>>
>>> The CMIS Connector works fine!
>>>
>>> Piergiorgio
>>>
>>> 2011/9/9 Karl Wright
>>>
>>>> You can download the release candidate from http://people.apache.org/~kwright, and there is also a tag in svn under https://svn.apache.org/repos/asf/incubator/lcf/tags.
>>>>
>>>> +1 to release this RC. -1 to not release it.
>>>>
>>>> After a successful release vote, please be aware that I will need to present the release candidate to the incubator for their vote as well, before the release is actually made.
>>>>
>>>> Karl
>>>
>>> --
>>> Piergiorgio Lucidi
>>> http://about.me/piergiorgiolucidi
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103465#comment-13103465 ]

Karl Wright commented on CONNECTORS-254:

I have some concerns about this approach.

First, I'm not sure it is the right thing to do. It would be good to know more about the case you are comparing against when you say "On the other hand when using Solr request handler without MCF, this exception is not thrown and the posting 0 byte files is indexed normally." How are you posting the content normally? When you post the content in that way, what does the entire http request look like? You can use Wireshark to capture it, or I can, so we can see exactly what happens in that case.

Second, if it turns out that adding a space is the correct thing to do, I'm concerned because this change is not being reflected in the content-length header. If the header reports a different length than what is posted, the posted data will be truncated. This may be the whole goal, though, in which case we should add a comment to the code noting that the content MUST be the last field posted for this reason.
[jira] [Updated] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichiro Abe updated CONNECTORS-254:

Attachment: CONNECTORS-254-1.patch

When the stream length is zero, the patch replaces the empty content with a space. If this procedure is okay, I'll fix the indentation around that.
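The concern raised in this thread about the content-length header can be illustrated with a small sketch. It only compares byte counts and does not reproduce httpposter; the function names are hypothetical, and the padding rule (an empty body becomes one space) is an assumption based on the description of the patch above.

```shell
#!/bin/sh
# Bytes in the original file, i.e. what a header computed from the
# unmodified stream length would declare.
declared_length() {
  wc -c < "$1" | tr -d ' '
}

# Bytes actually sent if an empty body is replaced by a single space
# (assumed behaviour of the proposed patch).
sent_length() {
  n=$(declared_length "$1")
  if [ "$n" -eq 0 ]; then echo 1; else echo "$n"; fi
}

# Report whether the declared and sent lengths agree.
check_lengths() {
  d=$(declared_length "$1")
  s=$(sent_length "$1")
  if [ "$d" -eq "$s" ]; then
    echo "consistent"
  else
    echo "mismatch: declared=$d sent=$s"
  fi
}
```

For a zero-byte file this reports a mismatch of 0 declared against 1 sent, which is the case where a receiver honouring the header would ignore the extra byte.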