[jira] [Updated] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichiro Abe updated CONNECTORS-254:

Attachment: CONNECTORS-254-1.patch

When the stream length is zero, the patch substitutes a single space for the empty content. If this approach is okay, I'll fix the indentation around that change.

Bad request when posting 0 byte file to Solr

Key: CONNECTORS-254
URL: https://issues.apache.org/jira/browse/CONNECTORS-254
Project: ManifoldCF
Issue Type: Bug
Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
Reporter: Shinichiro Abe
Priority: Minor
Fix For: ManifoldCF 0.3
Attachments: CONNECTORS-254-1.patch, sample0byte.zip

It seems that httpposter causes a bad request when posting a 0-byte file. The Solr log, shown below, reports "missing content stream" with status code 400. By contrast, when posting to the Solr request handler without MCF, this exception is not thrown and the 0-byte file is indexed normally.

2011/09/13 12:30:40 org.apache.solr.core.SolrCore execute
???: [] webapp=/solr path=/update/extract params={literal.id=file:/Users/abe/Desktop/1/no-content/no-content.txt&literal.uri=/Users/abe/Desktop/1/no-content/no-content.txt} status=400 QTime=367
2011/09/13 12:30:40 org.apache.solr.common.SolrException log
?v???I: org.apache.solr.common.SolrException: missing content stream
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:62)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103465#comment-13103465 ]

Karl Wright commented on CONNECTORS-254:

I have some concerns about this approach.

First, I'm not sure it is the right thing to do. It would be good to know more about the case you are comparing against when you say "On the other hand when using Solr request handler without MCF, this exception is not thrown and the posting 0 byte files is indexed normally." How are you posting the content in that case? When you post the content in that way, what does the entire HTTP request look like? You can use Wireshark to capture it, or I can, so we can see exactly what happens.

Second, if it turns out that adding a space is the correct thing to do, I'm concerned because this change is not being reflected in the Content-Length header. If the header reports a different length than what is posted, the posted data will be truncated. This may be the whole goal, though, in which case we should add a comment to the code noting that the content MUST be the last field posted for this reason.
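The Content-Length concern raised in this comment can be illustrated with a small sketch. It is not ManifoldCF or Solr code: `build_request` and `read_body` are hypothetical helpers, and the receiver is assumed to read exactly as many body bytes as the header declares, which is how a Content-Length-respecting HTTP server behaves.

```python
import io


def build_request(body: bytes, declared_length: int) -> bytes:
    """Build a raw HTTP/1.1 POST whose Content-Length may disagree with the body.

    Hypothetical helper for illustration; the path and host only mirror the
    ones quoted in this thread.
    """
    head = (
        "POST /solr/update/extract HTTP/1.1\r\n"
        "Host: localhost:8983\r\n"
        f"Content-Length: {declared_length}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def read_body(raw_request: bytes) -> bytes:
    """Return the body as a receiver that trusts Content-Length would see it."""
    stream = io.BytesIO(raw_request)
    stream.readline()  # skip the request line
    content_length = 0
    while True:
        line = stream.readline()
        if line in (b"\r\n", b""):  # blank line ends the headers
            break
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    # Only the declared number of bytes is consumed; the rest is ignored.
    return stream.read(content_length)


# A space is posted, but the header still claims zero bytes: the space is lost.
mismatched = read_body(build_request(b" ", declared_length=0))
# The header is updated to match: the space arrives.
consistent = read_body(build_request(b" ", declared_length=1))
print(repr(mismatched), repr(consistent))
```

Under these assumptions, a substituted space only reaches the receiver if the declared length is updated along with the content.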
[RESULT][VOTE] Release ManifoldCF 0.3-incubating, RC1
Three +1's. 72 hours. Vote passes!

Karl

On Mon, Sep 12, 2011 at 9:52 PM, Shinichiro Abe shinichiro.ab...@gmail.com wrote:

+1
The JCIFS Connector and ant test work fine!

Shinichiro Abe

On 2011/09/13, at 10:05, Karl Wright wrote:

Thanks! We need one more binding +1. Shinichiro? Simon? Erlend? Tommaso?

Karl

On Mon, Sep 12, 2011 at 12:06 PM, Piergiorgio Lucidi piergiorgioluc...@gmail.com wrote:

+1
The CMIS Connector works fine!

Piergiorgio

2011/9/9 Karl Wright daddy...@gmail.com

You can download the release candidate from http://people.apache.org/~kwright, and there is also a tag in svn under https://svn.apache.org/repos/asf/incubator/lcf/tags.

+1 to release this RC. -1 to not release it.

After a successful release vote, please be aware that I will need to present the release candidate to the incubator for their vote as well, before the release is actually made.

Karl

--
Piergiorgio Lucidi
http://about.me/piergiorgiolucidi
[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103589#comment-13103589 ]

Karl Wright commented on CONNECTORS-202:

Looks fine. I'll commit it and update the site this evening.

SOLR connector suport for commitWithin

Key: CONNECTORS-202
URL: https://issues.apache.org/jira/browse/CONNECTORS-202
Project: ManifoldCF
Issue Type: Improvement
Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 0.2, ManifoldCF 0.3
Reporter: Jan Høydahl
Assignee: Karl Wright
Labels: commit
Fix For: ManifoldCF 0.4
Attachments: CONNECTORS-202.patch

The output connection must support commitWithin (http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22) in addition to sending a commit() at the end of a job. This allows for efficient handling of commits on the Solr side. The parameter should ideally be configurable per job. That way you could set commitWithin=10s for an "Important" job and commitWithin=600s for a "Big crawl" job.
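As a rough illustration of the request above, the sketch below builds a Solr XML `add` message carrying a `commitWithin` attribute, which Solr interprets in milliseconds. It does not reflect the attached patch: `build_add_message` and the per-job table are hypothetical, with the two job names and intervals taken from the issue description.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-job settings, converted from the 10s and 600s in the issue
# description to the milliseconds that the commitWithin attribute expects.
JOB_COMMIT_WITHIN_MS = {
    "Important job": 10_000,
    "Big crawl job": 600_000,
}


def build_add_message(doc_fields: dict, commit_within_ms: int) -> str:
    """Serialize one document as a Solr XML <add> message with commitWithin."""
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    doc = ET.SubElement(add, "doc")
    for name, value in doc_fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value
    return ET.tostring(add, encoding="unicode")


message = build_add_message({"id": "doc-1"}, JOB_COMMIT_WITHIN_MS["Important job"])
print(message)
```

Keeping the interval in a per-job table mirrors the wish that the parameter be configurable per job rather than per output connection.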
[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103622#comment-13103622 ]

Karl Wright commented on CONNECTORS-202:

r1170174 for the documentation update.
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103632#comment-13103632 ]

Shinichiro Abe commented on CONNECTORS-254:

I agree. My approach was hasty. I'll examine it again (especially adding a space).

At least the following posts normally to Solr; the exception isn't thrown:

curl "http://localhost:8983/solr/update/extract?literal.id=1&stream.file=/path/to/0bytefile&commit=true"

I want httpposter to behave the same way. What do you think about this?
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103639#comment-13103639 ]

Karl Wright commented on CONNECTORS-254:

The curl command you have described is not even a POST; it's a GET. The document itself is not being sent, just the file name. The stream.file argument may well be used by Solr on the update handler side to open the file directly. That's not going to work with ManifoldCF, though, because there's no guarantee of a shared file system between ManifoldCF and the Solr instance.
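The GET-versus-POST distinction described in this comment can be shown without contacting a server. The sketch uses Python's `urllib.request.Request` as a stand-in for curl (an assumption made for illustration; no request is actually sent): a request with no body defaults to GET and carries only the file name in the query string, while a request with a body, even an empty one, defaults to POST.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:8983/solr/update/extract"

# stream.file style: only the file *name* travels, inside the query string.
by_name = Request(
    BASE + "?" + urlencode({
        "literal.id": "1",
        "stream.file": "/path/to/0bytefile",
        "commit": "true",
    })
)

# Body style: the file *content* travels in the request body. An empty bytes
# object stands in for a zero-byte file here.
by_body = Request(
    BASE + "?" + urlencode({"literal.id": "1"}),
    data=b"",
    headers={"Content-Type": "application/octet-stream"},
)

print(by_name.get_method(), by_name.data)
print(by_body.get_method(), by_body.data)
```

In the first form the server has to open the named path itself, which is why it depends on a file system shared between the client and Solr.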
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103670#comment-13103670 ]

Shinichiro Abe commented on CONNECTORS-254:

I'm sorry. I modified post.sh and posted a 0-byte file, but the exception isn't thrown in this case. Is this a GET?

---
FILES=$*
URL=http://localhost:8983/solr/update/extract?literal.id=1
for f in $FILES; do
  curl $URL --data-binary @$f -H 'Content-type:application/octet-stream'
done
---
[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr
[ https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103678#comment-13103678 ]

Karl Wright commented on CONNECTORS-254:

The --data-binary switch to curl causes it to use POST. Here's a bit of the manpage:

--data-binary <data>
(HTTP) This posts data exactly as specified with no extra processing whatsoever.

So, what happens when you use curl to post your zero-length file? Do you get back a 400 response? Try using the curl -vvv switch to see what it is doing.

If it comes back with a 200 OK response and not a 400, then either you or I should try to do the same thing while Wireshark is capturing packets. If you are using a Linux system, you may instead want to use tcpdump to do the capture, and then examine the capture with Wireshark (on Windows).

If this is too confusing, now that I have a test case I can try to do this later today.
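Where a packet capture is inconvenient, a local echo server can show roughly what a client puts on the wire for a zero-length POST. The sketch below is only a stand-in for the Wireshark or tcpdump capture suggested above, not a replacement for testing against Solr: `EchoHandler` and `capture_post` are hypothetical names, and the server simply reports the method and headers it received.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Report what the client sent instead of indexing anything."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        report = {
            "method": self.command,
            "path": self.path,
            "content_length_header": self.headers.get("Content-Length"),
            "content_type": self.headers.get("Content-Type"),
            "body_bytes": len(body),
        }
        payload = json.dumps(report).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def capture_post(body: bytes) -> dict:
    """POST `body` to a throwaway local server and return what it observed."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request(
            "POST",
            "/solr/update/extract?literal.id=1",
            body=body,
            headers={"Content-Type": "application/octet-stream"},
        )
        result = json.loads(conn.getresponse().read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


observed = capture_post(b"")
print(observed)
```

Comparing such a report for a raw-body POST with the equivalent one for the request ManifoldCF sends is one way to narrow down which difference triggers the 400.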