[ 
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569724#comment-16569724
 ] 

Karl Wright commented on CONNECTORS-1517:
-----------------------------------------

That's unfortunate, because I don't know DQL either.  And I especially don't 
know the DQL for finding documents that are either missing their mime type 
field altogether or have one that's not listed.

I can certainly say this though: the way the form posting works for this 
particular selection is a bit wonky, but seems to guarantee that it works 
consistently PROVIDED the job is edited in the UI at least once.  If nothing 
has been selected at all, then the form is displayed with all boxes checked.  
The very first time the form is reposted or saved, all the checked boxes are 
gathered and become part of the mime spec.  So you'd think that if you wanted 
to achieve the original behavior, you just uncheck everything -- but that 
doesn't work because that's explicitly blocked and won't clear out the old 
stuff.
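
To make the behavior described above concrete, here is a minimal sketch of it 
in Java.  This is NOT the connector's actual code: the class name, method 
names, and the three-entry type list are hypothetical stand-ins, and only the 
shape of the a_content_type clause is taken from the log lines quoted in the 
issue.  It just models the three rules as I understand them: a never-saved 
form shows everything checked, a post stores the checked boxes as the mime 
spec, and a post with nothing checked is blocked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the Content Types tab; not ManifoldCF source code.
class ContentTypeFormSketch {
    // Assumed stand-in for the full checklist shown on the tab.
    static final List<String> ALL_TYPES = Arrays.asList("123w", "acad", "zip_pub_html");

    // null means the tab has never been posted; otherwise the saved mime spec.
    private List<String> savedSpec = null;

    // Boxes displayed as checked: everything if nothing was ever saved.
    List<String> displayedChecked() {
        return savedSpec == null ? new ArrayList<>(ALL_TYPES) : new ArrayList<>(savedSpec);
    }

    // Models a repost or save of the form with the given checked boxes.
    void post(List<String> checked) {
        if (checked.isEmpty()) {
            return; // "uncheck everything" is blocked, so the old spec survives
        }
        savedSpec = new ArrayList<>(checked);
    }

    // Query fragment: empty when never edited, an IN list once a spec exists.
    String contentTypeClause() {
        if (savedSpec == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(" AND a_content_type IN (");
        for (int i = 0; i < savedSpec.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append('\'').append(savedSpec.get(i)).append('\'');
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        ContentTypeFormSketch form = new ContentTypeFormSketch();
        System.out.println("never edited: [" + form.contentTypeClause() + "]");
        form.post(form.displayedChecked());          // first save, all boxes checked
        System.out.println("after save:   [" + form.contentTypeClause() + "]");
        form.post(new ArrayList<String>());          // try to clear everything
        System.out.println("after clear:  [" + form.contentTypeClause() + "]");
    }
}
```

In this model the same visible state (all boxes checked) yields either no 
constraint or a full IN list, depending only on whether the form was ever 
posted, and there is no way back to the unconstrained state.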

I think the first order of business is making the form work properly.  Then we 
can look at making changes.


> Documentum Connector uses different "unconstrained" a_content_type filters 
> depending on whether the Content Types tab has been edited
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1517
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1517
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Documentum connector
>    Affects Versions: ManifoldCF 2.10
>            Reporter: James Thomas
>            Assignee: Karl Wright
>            Priority: Major
>             Fix For: ManifoldCF 2.11
>
>
> I am using ManifoldCF 2.10 patched for issue 
> https://issues.apache.org/jira/browse/CONNECTORS-1512
> I find that the "unconstrained" query submitted to Documentum differs 
> depending on whether the Content Types in the job have been edited or not. 
> This can dramatically affect which files are fetched. After editing, there 
> are likely to be fewer.
> For example, having simply created a job connecting to DM and setting only 
> the Paths value to Administrator/james the following request is generated. 
> (Taken from manifoldcf.log).
> Note that there are no a_content_type constraints (and my line break for 
> readability):
> {code:java}
> DEBUG 2018-07-26T05:52:56,422 (Startup thread) - DCTM: About to execute 
> query= (select for READ distinct i_chronicle_id from dm_document where 
> r_modify_date >= date('01/01/1970 01:00:00','mm/dd/yyyy hh:mi:ss') and 
> r_modify_date<=date('07/26/2018 05:52:56','mm/dd/yyyy hh:mi:ss') AND 
> (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND 
> r_content_size>0))
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> Once the Content Types tab has been edited (e.g. to remove the 123w type) it 
> looks like this, i.e. the search constrains to only the selected types (my 
> ellipsis for readability):
> {code:java}
> DEBUG 2018-07-26T05:58:36,755 (Startup thread) - DCTM: About to execute 
> query= (select for READ distinct i_chronicle_id from dm_document where 
> r_modify_date >= date('01/01/1970 01:00:00','mm/dd/yyyy hh:mi:ss') and 
> r_modify_date<=date('07/26/2018 05:58:36','mm/dd/yyyy hh:mi:ss') AND 
> (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND 
> r_content_size>0 
> AND a_content_type IN ('acad', ... 'zip_pub_html'))) 
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> If the 123w type is now reselected in the Content Types tab, the search adds 
> it to the list of a_content_type entries, but doesn't return to the 
> unconstrained initial search:
> {code:java}
> DEBUG 2018-07-26T05:59:16,863 (Startup thread) - DCTM: About to execute 
> query= (select for READ distinct i_chronicle_id from dm_document where 
> r_modify_date >= date('01/01/1970 01:00:00','mm/dd/yyyy hh:mi:ss') and 
> r_modify_date<=date('07/26/2018 05:59:16','mm/dd/yyyy hh:mi:ss') AND 
> (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND 
> r_content_size>0 
> AND a_content_type IN ('123w', ... 'zip_pub_html'))) 
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> This means that running what appears to be an equivalent job several times 
> may not fetch the same set of documents from Documentum.
> I expect that the same configuration in the UI produces the same search to 
> Documentum, regardless of how the configuration was arrived at.
> If the selected items in the Content Types list are treated as the only set 
> of files to fetch (i.e. the initial unconstrained search is considered 
> incorrect here) then I guess I might also like to have flexibility to fetch 
> file types not on the checklist in the Content Types tab.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
