[ https://issues.apache.org/jira/browse/CONNECTORS-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998684#comment-13998684 ]
Karl Wright commented on CONNECTORS-936:
----------------------------------------

Hi Cetra,

I think your patch is actually the correct one. The current contract for RepositoryDocument requires that a non-null binary input stream be set. We may relax that in the future, but for now I will commit your fix.

Thanks!

> RepositoryDocuments with binaryFieldData = null causes issues with solr
> -----------------------------------------------------------------------
>
>                 Key: CONNECTORS-936
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-936
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: CMIS connector
>   Affects Versions: ManifoldCF 1.6
>            Reporter: Cetra Free
>            Priority: Minor
>             Fix For: ManifoldCF 1.7
>
>         Attachments: CmisRepositoryConnector.patch
>
>
> If a RepositoryDocument is ingested into an activity without an InputStream
> set using the setBinary method, it causes errors with the solr output
> connector:
> {code}
> java.lang.IllegalArgumentException: Input stream may not be null
>       at org.apache.http.util.Args.notNull(Args.java:48)
>       at org.apache.http.entity.mime.content.InputStreamBody.<init>(InputStreamBody.java:70)
>       at org.apache.http.entity.mime.content.InputStreamBody.<init>(InputStreamBody.java:58)
>       at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:201)
>       at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
>       at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>       at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:951)
> {code}
> This can be replicated by trying to ingest documents from a CMIS repository
> which contain no content.
> The dirty workaround I've come up with is just to provide a Null Input Stream
> In *CmisRepositoryConnector.java*:
> Import NullInputStream from commons:
> {code}
> import org.apache.commons.io.input.NullInputStream;
> {code}
> And Change:
> {code}
> if(fileLength>0 && document.getContentStream()!=null){
>   is = document.getContentStream().getStream();
>   rd.setBinary(is, fileLength);
> }
> {code}
> To:
> {code}
> if(fileLength>0 && document.getContentStream()!=null){
>   is = document.getContentStream().getStream();
>   rd.setBinary(is, fileLength);
> } else {
>   rd.setBinary(new NullInputStream(0),0);
> }
> {code}
> I'm not sure what the correct fix would be. Possibly change the
> *RepositoryDocument* class or handle the situation correctly in the Solr
> connector.
> It doesn't seem to be an issue with other repository connectors, such as
> FileConnector, as they always provide an InputStream.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
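
For readers without the ManifoldCF sources at hand, the fallback described in the issue can be sketched in isolation roughly as follows. This is only an illustration under stated assumptions, not the committed patch: `Doc` is a hypothetical stand-in for RepositoryDocument that models nothing but setBinary, `attachContent` is an illustrative helper name, and an empty `ByteArrayInputStream` stands in for commons-io's `NullInputStream(0)` so the example needs only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class NullStreamFallback {

    /** Hypothetical stand-in for RepositoryDocument; models only the binary stream. */
    static final class Doc {
        private InputStream binary;
        private long length;

        void setBinary(InputStream stream, long length) {
            this.binary = stream;
            this.length = length;
        }

        InputStream getBinaryStream() { return binary; }

        long getBinaryLength() { return length; }
    }

    /**
     * Leaves the document with a non-null stream in every case: the real
     * content when there is some, otherwise an empty stream of length 0.
     */
    static void attachContent(Doc rd, InputStream content, long fileLength) {
        if (fileLength > 0 && content != null) {
            rd.setBinary(content, fileLength);
        } else {
            // Assumption: an empty ByteArrayInputStream replaces NullInputStream(0);
            // both report end of stream on the first read.
            rd.setBinary(new ByteArrayInputStream(new byte[0]), 0);
        }
    }

    public static void main(String[] args) throws IOException {
        Doc empty = new Doc();
        attachContent(empty, null, 0);                      // document with no content
        System.out.println(empty.getBinaryStream() != null); // true
        System.out.println(empty.getBinaryStream().read());  // -1 (end of stream)
    }
}
```

With this shape, a consumer that rejects null streams (as the stack trace above shows for InputStreamBody) always receives a valid, possibly empty, stream.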