[ 
https://issues.apache.org/jira/browse/CONNECTORS-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655656#comment-13655656
 ] 

Karl Wright commented on CONNECTORS-688:
----------------------------------------

I've created a branch at 
https://svn.apache.org/repos/asf/manifoldcf/branches/CONNECTORS-688.  The 
Livelink connector on that branch has code that I *think* will make it skip 
documents in the Recycle Bin.  Please give it a try and let me know.

                
> Can we exclude the Recycle Bin from being crawled in the Livelink Connector?
> ----------------------------------------------------------------------------
>
>                 Key: CONNECTORS-688
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-688
>             Project: ManifoldCF
>          Issue Type: Improvement
>    Affects Versions: ManifoldCF 1.1.1, ManifoldCF 1.2, ManifoldCF 1.3
>         Environment: running solr 4.0 final with manifoldcf v1.1.1 on RHEL 6 
> 64 bit on tomcat v7.0.34
>            Reporter: David Morana
>            Assignee: Karl Wright
>            Priority: Minor
>             Fix For: ManifoldCF 1.3
>
>
> When a file in Livelink (Content Server 10 update 6) is moved to the 
> Recycle Bin (RC v10.0.0; this module is NOT part of the basic Content 
> Server install), the file is still crawled and indexed, and it appears in 
> search results (although the link is inaccessible to users).
> The Recycle Bin is a special folder on the Content Server; it holds documents 
> to be purged at a later date. LAPI still reports them as not deleted. 
> Can we add a filter to the UI and Livelink connector to exclude certain 
> owner IDs (i.e. the ID of the Recycle Bin) from the crawl?
> In LivelinkConnectors.java you check whether the version has been deleted; 
> an additional check would be needed to see whether the document was sent to 
> the Recycle Bin (for example, the Recycle Bin's object ID is 426023).
> Here's an example. After this call:
> {code}
> int status = LLDocs.GetVersionInfo(vol,id,revNumber,versioninfo);
> {code}
> Just check the OWNER value in the versioninfo object, like so:
> {code}
> int ownerID = versioninfo.toInteger("OWNER");
> {code}
> If the owner is the NEGATIVE value of the Recycle Bin ID (i.e. -426023), then 
> the document is marked for deletion and should be excluded from the index.
> I think this would be a great feature because it could be made a generic way 
> to exclude project workspaces or special folders from being crawled: supply 
> an object ID and compare it to the owner ID of the file. 
> Thanks,
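
For illustration only, the generic exclusion idea described in the issue might be sketched roughly as below. This is not the code on the branch. The class name OwnerExclusionFilter and its methods are hypothetical, and matching both the positive and the negated object ID is an assumption based on the reporter's observation that recycled items carry the negative Recycle Bin ID as their owner.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical helper, not part of ManifoldCF or LAPI.
 * Decides whether a document should be skipped based on its owner ID.
 */
final class OwnerExclusionFilter {
    // Owner IDs that cause a document to be skipped.
    private final Set<Integer> excludedOwnerIds = new HashSet<>();

    OwnerExclusionFilter(Collection<Integer> excludedObjectIds) {
        for (int id : excludedObjectIds) {
            // Assumption: a folder's direct children report the folder's ID as
            // OWNER, while recycled items report the negated Recycle Bin ID
            // (as described in the issue). Store both signs.
            excludedOwnerIds.add(id);
            excludedOwnerIds.add(-id);
        }
    }

    /** Returns true if a document with this owner ID should be excluded. */
    boolean isExcluded(int ownerId) {
        return excludedOwnerIds.contains(ownerId);
    }
}
```

In the connector, the value passed to isExcluded would presumably be the one read via versioninfo.toInteger("OWNER") after the GetVersionInfo call quoted above, with the configured IDs (e.g. 426023) coming from the job specification.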

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
