[jira] [Updated] (CONNECTORS-254) Bad request when posting 0 byte file to Solr

2011-09-13 Thread Shinichiro Abe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichiro Abe updated CONNECTORS-254:
--

Attachment: CONNECTORS-254-1.patch

When the stream length is zero, the patch substitutes a single space for the 
content. If this approach is okay, I'll fix the indentation around that change.

 Bad request when posting 0 byte file to Solr
 

 Key: CONNECTORS-254
 URL: https://issues.apache.org/jira/browse/CONNECTORS-254
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
Reporter: Shinichiro Abe
Priority: Minor
 Fix For: ManifoldCF 0.3

 Attachments: CONNECTORS-254-1.patch, sample0byte.zip


 It seems that httpposter causes a bad request when posting a 0-byte file.
 The Solr log below reports "missing content stream", with status code 400. 
 On the other hand, when using the Solr request handler without MCF, this 
 exception is not thrown and the 0-byte file is indexed normally.
  
 2011/09/13 12:30:40 org.apache.solr.core.SolrCore execute
 ???: [] webapp=/solr path=/update/extract params={literal.id=file:/Users/abe/Desktop/1/no-content/no-content.txt&literal.uri=/Users/abe/Desktop/1/no-content/no-content.txt} status=400 QTime=367 
 2011/09/13 12:30:40 org.apache.solr.common.SolrException log
 ?v???I: org.apache.solr.common.SolrException: missing content stream
   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:62)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr

2011-09-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103465#comment-13103465
 ] 

Karl Wright commented on CONNECTORS-254:


I have some concerns about this approach.

First, I'm not sure it is the right thing to do.  It would be good to know more 
about the case you are comparing against when you say "On the other hand, when 
using the Solr request handler without MCF, this exception is not thrown and 
the 0-byte file is indexed normally."  How are you posting the content 
normally?  When you post the content in that way, what does the entire HTTP 
request look like?  You can use Wireshark to capture it, or I can, so we can 
see exactly what happens in that case.

Second, if it turns out that adding a space is the correct thing to do, I'm 
concerned because this change is not being reflected in the content-length 
header.  If the header reports a different length than what is posted, the 
posted data will be truncated.  This may be the whole goal, though, in which 
case we should add a comment to the code noting that the content MUST be the 
last field posted for this reason.
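
The truncation concern can be illustrated with a simplified model of a receiver that trusts the Content-Length header. This is only a sketch: `read_body` and the sample request are hypothetical, and a real server (or the actual httpposter request, which may carry several fields) will be more involved.

```python
import io

def read_body(raw_request: bytes) -> bytes:
    """Return the body a receiver would read if it honours Content-Length."""
    stream = io.BytesIO(raw_request)
    content_length = 0
    # Read header lines until the blank line that ends the header block.
    while True:
        line = stream.readline().rstrip(b"\r\n")
        if not line:
            break
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip())
    # Exactly Content-Length bytes are consumed; anything after is not read.
    return stream.read(content_length)

# Hypothetical request: the header still says 0, but one space was written.
mismatched = b"POST /solr/update/extract HTTP/1.1\r\nContent-Length: 0\r\n\r\n "
print(repr(read_body(mismatched)))  # prints b''
```

In this model the added space never reaches the handler unless the header is updated to match, which is why the header and the posted bytes need to agree.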






[RESULT][VOTE] Release ManifoldCF 0.3-incubating, RC1

2011-09-13 Thread Karl Wright
Three +1's.  72 hours.  Vote passes!
Karl

On Mon, Sep 12, 2011 at 9:52 PM, Shinichiro Abe
shinichiro.ab...@gmail.com wrote:
  +1

  The JCIFS Connector and ant test work fine!

  Shinichiro Abe


 On 2011/09/13, at 10:05, Karl Wright wrote:

 Thanks!

 We need one more binding +1.  Shinichiro?  Simon?  Erlend?  Tommaso?

 Karl

 On Mon, Sep 12, 2011 at 12:06 PM, Piergiorgio Lucidi
 piergiorgioluc...@gmail.com wrote:
 +1

 The CMIS Connector works fine!

 Piergiorgio

 2011/9/9 Karl Wright daddy...@gmail.com

 You can download the release candidate from
 http://people.apache.org/~kwright, and there is also a tag in svn
 under https://svn.apache.org/repos/asf/incubator/lcf/tags.

 +1 to release this RC.
 -1 to not release it.

 After a successful release vote, please be aware that I will need to
 present the release candidate to the incubator for their vote as well,
 before the release is actually made.

 Karl




 --
 Piergiorgio Lucidi
 http://about.me/piergiorgiolucidi





[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin

2011-09-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103589#comment-13103589
 ] 

Karl Wright commented on CONNECTORS-202:


Looks fine.  I'll commit it and update the site this evening.


 SOLR connector suport for commitWithin
 --

 Key: CONNECTORS-202
 URL: https://issues.apache.org/jira/browse/CONNECTORS-202
 Project: ManifoldCF
  Issue Type: Improvement
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 0.2, ManifoldCF 0.3
Reporter: Jan Høydahl
Assignee: Karl Wright
  Labels: commit
 Fix For: ManifoldCF 0.4

 Attachments: CONNECTORS-202.patch


 The output connection must support commitWithin 
 (http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22)
  in addition to sending a commit() at the end of a job.
 This allows for efficient handling of commits on the Solr side.
 The parameter should ideally be configurable per job. That way you could 
 say that for an "Important" job commitWithin=10s, while for a "Big crawl" 
 job commitWithin=600s.
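
As a rough sketch of the per-job idea, the snippet below builds a Solr XML add message carrying a commitWithin attribute, as described on the wiki page linked above. The job names and the `COMMIT_WITHIN_MS` table are hypothetical; the values are expressed in milliseconds, the unit Solr uses for commitWithin. How the connector itself would pass the setting is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-job settings, in milliseconds.
COMMIT_WITHIN_MS = {"Important job": 10_000, "Big crawl job": 600_000}

def build_add_message(job_name: str, doc_id: str) -> str:
    """Build an <add> message whose commitWithin depends on the job."""
    add = ET.Element("add", {"commitWithin": str(COMMIT_WITHIN_MS[job_name])})
    doc = ET.SubElement(add, "doc")
    field = ET.SubElement(doc, "field", {"name": "id"})
    field.text = doc_id
    return ET.tostring(add, encoding="unicode")

print(build_add_message("Important job", "doc-1"))
# prints <add commitWithin="10000"><doc><field name="id">doc-1</field></doc></add>
```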





[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin

2011-09-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103622#comment-13103622
 ] 

Karl Wright commented on CONNECTORS-202:


r1170174 for the documentation update.






[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr

2011-09-13 Thread Shinichiro Abe (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103632#comment-13103632
 ] 

Shinichiro Abe commented on CONNECTORS-254:
---

I agree. My approach was hasty. I'll examine it again (especially adding a space).

At least the following request is handled by Solr normally; the exception 
isn't thrown. 

curl "http://localhost:8983/solr/update/extract?literal.id=1&stream.file=/path/to/0bytefile&commit=true"

I want httpposter to behave the same way. What do you think about this?





[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr

2011-09-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103639#comment-13103639
 ] 

Karl Wright commented on CONNECTORS-254:


The curl command you have described is not even a POST, it's a GET.  The 
document itself is not being sent, just the file name.  The stream.file 
argument may well be used by Solr on the update handler side to open the file 
directly.  That's not going to work with ManifoldCF though because there's no 
guarantee of a shared file system between ManifoldCF and the Solr instance.
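
The difference between the two kinds of request can be sketched without contacting a server. The snippet below only constructs the requests with Python's standard library; the URL and parameters are taken from the examples in this thread, and the variable names are illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

base = "http://localhost:8983/solr/update/extract"

# Variant 1: only the file *name* travels, as a query parameter.
params = {"literal.id": "1", "stream.file": "/path/to/0bytefile", "commit": "true"}
by_name = Request(base + "?" + urlencode(params))

# Variant 2: the file *content* travels in the request body (here: zero bytes).
by_content = Request(
    base + "?" + urlencode({"literal.id": "1"}),
    data=b"",
    headers={"Content-Type": "application/octet-stream"},
)

print(by_name.get_method(), by_name.data)        # prints GET None
print(by_content.get_method(), by_content.data)  # prints POST b''
```

Only the second variant carries the document itself, so only that one corresponds to what a crawler without a shared file system has to do.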







[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr

2011-09-13 Thread Shinichiro Abe (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103670#comment-13103670
 ] 

Shinichiro Abe commented on CONNECTORS-254:
---

I'm sorry. 
I modified post.sh and used it to post a 0-byte file. The exception isn't 
thrown in this case. Is this also a GET?
 
---
FILES=$*
URL="http://localhost:8983/solr/update/extract?literal.id=1"

# POST each file's raw bytes as the request body.
for f in $FILES; do
  curl "$URL" --data-binary "@$f" -H 'Content-type:application/octet-stream'
done
---





[jira] [Commented] (CONNECTORS-254) Bad request when posting 0 byte file to Solr

2011-09-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103678#comment-13103678
 ] 

Karl Wright commented on CONNECTORS-254:


The --data-binary switch to curl causes it to use POST.  Here's a bit of the 
manpage:

 --data-binary <data>
  (HTTP) This posts data exactly as specified with no extra processing
  whatsoever.

So, what happens when you use curl to post your zero-length file?  Do you get 
back a 400 response?  Try using the curl -vvv switch to see what it is doing.

If it comes back with a 200 OK response and not a 400, then either you or I 
should try to do the same thing while Wireshark is capturing packets.  If you 
are using a Linux system, you may instead want to use tcpdump to do the 
capture, and then examine the capture with Wireshark (on Windows).  If this is 
too confusing, now that I have a test case I can try to do this later today.
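
Where a packet capture is inconvenient, one possible alternative is to point the client at a throwaway local server that records what it receives. The sketch below is an assumption-laden stand-in, not Solr: `CaptureHandler` and `post_and_capture` are hypothetical names, the server always answers 200, and the client is Python's `http.client` rather than curl or httpposter.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = {}

class CaptureHandler(BaseHTTPRequestHandler):
    """Record the method, Content-Length header and body of one POST."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        captured["method"] = self.command
        captured["content_length"] = self.headers.get("Content-Length")
        captured["body"] = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet

def post_and_capture(body: bytes) -> int:
    """POST body to a one-shot local server and return the status code."""
    server = HTTPServer(("127.0.0.1", 0), CaptureHandler)  # port 0: any free port
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    conn.request("POST", "/solr/update/extract?literal.id=1", body=body,
                 headers={"Content-Type": "application/octet-stream"})
    response = conn.getresponse()
    response.read()
    conn.close()
    worker.join()
    server.server_close()
    return response.status

status = post_and_capture(b"")  # a zero-length "file"
print(status, captured["method"], captured["content_length"], captured["body"])
```

Comparing the recorded headers for a zero-length post from each client would show whether they differ in how they announce an empty body.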

