Build failed in Jenkins: ManifoldCF-ant #669

2018-10-17 Thread Apache Jenkins Server
See 

--
[...truncated 13.46 KB...]
at org.tmatesoft.sqljet.core.table.SqlJetDb.runTransaction(SqlJetDb.java:238)
at org.tmatesoft.sqljet.core.table.SqlJetDb.runWriteTransaction(SqlJetDb.java:211)
at org.tmatesoft.svn.core.internal.wc17.db.statement.SVNWCDbCreateSchema.exec(SVNWCDbCreateSchema.java:225)
... 28 more
org.tmatesoft.sqljet.core.SqlJetException: BUSY: error code is BUSY
at org.tmatesoft.sqljet.core.internal.pager.SqlJetPager.begin(SqlJetPager.java:2778)
at org.tmatesoft.sqljet.core.internal.btree.SqlJetBtree.beginTrans(SqlJetBtree.java:931)
at org.tmatesoft.sqljet.core.table.engine.SqlJetEngine.doBeginTransaction(SqlJetEngine.java:561)
at org.tmatesoft.sqljet.core.table.engine.SqlJetEngine.access$100(SqlJetEngine.java:55)
at org.tmatesoft.sqljet.core.table.engine.SqlJetEngine$12.runSynchronized(SqlJetEngine.java:535)
at org.tmatesoft.sqljet.core.table.engine.SqlJetEngine.runSynchronized(SqlJetEngine.java:217)
at org.tmatesoft.sqljet.core.table.engine.SqlJetEngine.runEngineTransaction(SqlJetEngine.java:529)
at org.tmatesoft.sqljet.core.table.SqlJetDb.runTransaction(SqlJetDb.java:238)
at org.tmatesoft.sqljet.core.table.SqlJetDb.runWriteTransaction(SqlJetDb.java:211)
at org.tmatesoft.svn.core.internal.wc17.db.statement.SVNWCDbCreateSchema.exec(SVNWCDbCreateSchema.java:225)
Caused: org.tmatesoft.svn.core.SVNException: svn: E200030: BUSY
at org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:70)
at org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:57)
at org.tmatesoft.svn.core.internal.db.SVNSqlJetDb.createSqlJetError(SVNSqlJetDb.java:195)
at org.tmatesoft.svn.core.internal.wc17.db.statement.SVNWCDbCreateSchema.exec(SVNWCDbCreateSchema.java:303)
at org.tmatesoft.svn.core.internal.wc17.db.SVNWCDb.createDb(SVNWCDb.java:292)
at org.tmatesoft.svn.core.internal.wc17.db.SVNWCDb.init(SVNWCDb.java:239)
at org.tmatesoft.svn.core.internal.wc17.SVNWCContext.initWC(SVNWCContext.java:5017)
at org.tmatesoft.svn.core.internal.wc17.SVNWCContext.initializeWC(SVNWCContext.java:4966)
at org.tmatesoft.svn.core.internal.wc2.ng.SvnNgAbstractUpdate.checkout(SvnNgAbstractUpdate.java:871)
at org.tmatesoft.svn.core.internal.wc2.ng.SvnNgCheckout.run(SvnNgCheckout.java:26)
at org.tmatesoft.svn.core.internal.wc2.ng.SvnNgCheckout.run(SvnNgCheckout.java:11)
at org.tmatesoft.svn.core.internal.wc2.ng.SvnNgOperationRunner.run(SvnNgOperationRunner.java:20)
at org.tmatesoft.svn.core.internal.wc2.SvnOperationRunner.run(SvnOperationRunner.java:21)
at org.tmatesoft.svn.core.wc2.SvnOperationFactory.run(SvnOperationFactory.java:1239)
at org.tmatesoft.svn.core.wc2.SvnOperation.run(SvnOperation.java:294)
at hudson.scm.subversion.CheckoutUpdater$1.perform(CheckoutUpdater.java:121)
Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to H20
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1741)
at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
at hudson.remoting.Channel.call(Channel.java:955)
at hudson.FilePath.act(FilePath.java:1036)
at hudson.FilePath.act(FilePath.java:1025)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:937)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:864)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused: java.io.IOException: Failed to check out https://svn.apache.org/repos/asf/manifoldcf/trunk
at hudson.scm.subversion.CheckoutUpdater$1.perform(CheckoutUpdater.java:132)
at hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:168)
at hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:176)
at hudson.scm.subversion.UpdateUpdater$TaskImpl.perform(UpdateUpdater.java:187)
at 

[jira] [Updated] (CONNECTORS-1548) CMIS output connector test fails with versioning state error

2018-10-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1548:

Description: 
While working on the upgrade to Tika 1.19.1, I ran into CMIS output connector 
test failures.  Specifically, here's the trace:

{code}
[junit] org.apache.manifoldcf.core.interfaces.ManifoldCFException: The versioning state flag is imcompatible to the type definition.
[junit] at org.apache.manifoldcf.agents.output.cmisoutput.CmisOutputConnector.addOrReplaceDocumentWithException(CmisOutputConnector.java:994)
{code}

Nested exception is:

{code}
[junit] Caused by: org.apache.chemistry.opencmis.commons.exceptions.CmisConstraintException: The versioning state flag is imcompatible to the type definition.
[junit] at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AbstractAtomPubService.convertStatusCode(AbstractAtomPubService.java:514)
[junit] at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AbstractAtomPubService.post(AbstractAtomPubService.java:717)
[junit] at org.apache.chemistry.opencmis.client.bindings.spi.atompub.ObjectServiceImpl.createDocument(ObjectServiceImpl.java:122)
[junit] at org.apache.chemistry.opencmis.client.runtime.SessionImpl.createDocument(SessionImpl.java:1158)
{code}

This may (or may not) be related to the Tika code now using a different 
implementation of jaxb.  I've moved all of jaxb and its dependent classes into 
connector-common-lib accordingly, and have no specific inclusions of jaxb in 
any connector class that would need it to be in connector-lib.

The change has been committed to trunk as r1844137.  Please verify (or disprove) 
that the problem is the new jaxb implementation.  If it is, we'll need to figure 
out why CMIS cares which implementation is used.


  was:
While working on the upgrade to Tika 1.19.1, I ran into CMIS output connector 
test failures.  Specifically, here's the trace:

{code}
[junit] org.apache.manifoldcf.core.interfaces.ManifoldCFException: The versioning state flag is imcompatible to the type definition.
[junit] at org.apache.manifoldcf.agents.output.cmisoutput.CmisOutputConnector.addOrReplaceDocumentWithException(CmisOutputConnector.java:994)
{code}

This may (or may not) be related to the Tika code now using a different 
implementation of jaxb.  I've moved all of jaxb and its dependent classes into 
connector-common-lib accordingly, and have no specific inclusions of jaxb in 
any connector class that would need it to be in connector-lib.

The change has been committed to trunk as r1844137.  Please verify (or disprove) 
that the problem is the new jaxb implementation.  If it is, we'll need to figure 
out why CMIS cares which implementation is used.



> CMIS output connector test fails with versioning state error
> 
>
> Key: CONNECTORS-1548
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1548
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: CMIS Output Connector
>Reporter: Karl Wright
>Assignee: Piergiorgio Lucidi
>Priority: Major
> Fix For: ManifoldCF 2.12
>
>
[jira] [Created] (CONNECTORS-1548) CMIS output connector test fails with versioning state error

2018-10-17 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-1548:
---

 Summary: CMIS output connector test fails with versioning state error
 Key: CONNECTORS-1548
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1548
 Project: ManifoldCF
  Issue Type: Bug
  Components: CMIS Output Connector
Reporter: Karl Wright
Assignee: Piergiorgio Lucidi
 Fix For: ManifoldCF 2.12


While working on the upgrade to Tika 1.19.1, I ran into CMIS output connector 
test failures.  Specifically, here's the trace:

{code}
[junit] org.apache.manifoldcf.core.interfaces.ManifoldCFException: The versioning state flag is imcompatible to the type definition.
[junit] at org.apache.manifoldcf.agents.output.cmisoutput.CmisOutputConnector.addOrReplaceDocumentWithException(CmisOutputConnector.java:994)
{code}

This may (or may not) be related to the Tika code now using a different 
implementation of jaxb.  I've moved all of jaxb and its dependent classes into 
connector-common-lib accordingly, and have no specific inclusions of jaxb in 
any connector class that would need it to be in connector-lib.

The change has been committed to trunk as r1844137.  Please verify (or disprove) 
that the problem is the new jaxb implementation.  If it is, we'll need to figure 
out why CMIS cares which implementation is used.
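
One way to start verifying or disproving the jaxb theory is to check where the JAXB classes are actually loaded from in the failing test's classloader. The following is only a minimal sketch under assumptions: the class name ClassOrigin and the method describeOrigin are hypothetical helpers, not part of ManifoldCF, and javax.xml.bind.JAXBContext is used purely as an example of a class to look up. It relies only on standard JDK calls.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

/** Hypothetical helper: reports which jar or directory a class was loaded from. */
final class ClassOrigin {

    /**
     * Returns the location the named class was loaded from,
     * "bootstrap/platform" when no code source is available (JVM-supplied class),
     * or "not found" when the given loader cannot see the class.
     */
    static String describeOrigin(String className, ClassLoader loader) {
        try {
            // initialize=false: we only want to locate the class, not run its static code
            Class<?> cls = Class.forName(className, false, loader);
            ProtectionDomain domain = cls.getProtectionDomain();
            CodeSource source = (domain == null) ? null : domain.getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "bootstrap/platform";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // Example class name only; pass other names as arguments to check them too.
        System.out.println("javax.xml.bind.JAXBContext -> "
            + describeOrigin("javax.xml.bind.JAXBContext", loader));
        for (String name : args) {
            System.out.println(name + " -> " + describeOrigin(name, loader));
        }
    }
}
```

Running the same lookup with the connector's classloader before and after r1844137 would show whether the CMIS code now sees a different jaxb jar; it does not by itself explain the versioning state error.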




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CONNECTORS-1547) No activity record for for excluded documents in WebCrawlerConnector

2018-10-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1547.
-
Resolution: Fixed

r1844120


> No activity record for for excluded documents in WebCrawlerConnector
> 
>
> Key: CONNECTORS-1547
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1547
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Web connector
>Reporter: Olivier Tavard
>Assignee: Karl Wright
>Priority: Minor
> Fix For: ManifoldCF 2.12
>
> Attachments: manifoldcf_local_files.log, manifoldcf_web.log, 
> simple_history_files.jpg, simple_history_web.jpg
>
>
> Hi,
> I noticed that no activity record is logged for documents excluded by the 
> Document Filter transformation connector when the WebCrawler connector is used.
> To reproduce the issue on MCF out of the box:
> Null output connector
> Web repository connector
> Job:
> - DocumentFilter added which only accepts application/msword (doc/docx) 
> documents
> The simple history does not mention the excluded documents (except for HTML 
> documents). They have a fetch activity and that's all (see 
> simple_history_web.jpg).
> The excluded documents are only visible in the MCF log (with DEBUG verbosity 
> on connectors):
> {code:java}
> Removing url 'https://www.datafari.com/assets/img/Logo_Datafari_4_Condensed_No_D_20180606_30x30.png' because it had the wrong content type ('image/png'){code}
> (see manifoldcf_local_files.log)
> The related code is in WebcrawlerConnector.java, line 904:
> {code:java}
> fetchStatus.contextMessage = "it had the wrong content type ('"+contentType+"')";
> fetchStatus.resultSignal = RESULT_NO_DOCUMENT;
> activityResultCode = null;{code}
> The activityResultCode is null.
>
> If we configure the same job with a Local File system connector and the 
> same Document Filter transformation connector, the simple history mentions 
> all the excluded documents (see simple_history_files.jpg), and the code sets 
> a specific error code and logs an activity record (class FileConnector, 
> line 415):
> {code:java}
> if (!activities.checkMimeTypeIndexable(mimeType))
> {
>   errorCode = activities.EXCLUDED_MIMETYPE;
>   errorDesc = "Excluded because mime type ('"+mimeType+"')";
>   Logging.connectors.debug("Skipping file '"+documentIdentifier+"' because mime type ('"+mimeType+"') was excluded by output connector.");
>   activities.noDocument(documentIdentifier,versionString);
>   continue;
> }{code}
>
> So I think the Web Crawler connector should behave the same way as 
> FileConnector and explicitly mention all the documents excluded by the user.
>  
> Best regards,
> Olivier





[jira] [Updated] (CONNECTORS-1547) No activity record for for excluded documents in WebCrawlerConnector

2018-10-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1547:

Fix Version/s: ManifoldCF 2.12

> No activity record for for excluded documents in WebCrawlerConnector
> 
>
> Key: CONNECTORS-1547
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1547
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Web connector
>Reporter: Olivier Tavard
>Assignee: Karl Wright
>Priority: Minor
> Fix For: ManifoldCF 2.12
>
> Attachments: manifoldcf_local_files.log, manifoldcf_web.log, 
> simple_history_files.jpg, simple_history_web.jpg
>
>





[jira] [Assigned] (CONNECTORS-1547) No activity record for for excluded documents in WebCrawlerConnector

2018-10-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1547:
---

Assignee: Karl Wright

> No activity record for for excluded documents in WebCrawlerConnector
> 
>
> Key: CONNECTORS-1547
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1547
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Web connector
>Reporter: Olivier Tavard
>Assignee: Karl Wright
>Priority: Minor
> Attachments: manifoldcf_local_files.log, manifoldcf_web.log, 
> simple_history_files.jpg, simple_history_web.jpg
>
>





[jira] [Created] (CONNECTORS-1547) No activity record for for excluded documents in WebCrawlerConnector

2018-10-17 Thread Olivier Tavard (JIRA)
Olivier Tavard created CONNECTORS-1547:
--

 Summary: No activity record for for excluded documents in WebCrawlerConnector
 Key: CONNECTORS-1547
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1547
 Project: ManifoldCF
  Issue Type: Bug
  Components: Web connector
Reporter: Olivier Tavard
 Attachments: manifoldcf_local_files.log, manifoldcf_web.log, 
simple_history_files.jpg, simple_history_web.jpg

Hi,

I noticed that no activity record is logged for documents excluded by the 
Document Filter transformation connector when the WebCrawler connector is used.

To reproduce the issue on MCF out of the box:

Null output connector

Web repository connector

Job:

- DocumentFilter added which only accepts application/msword (doc/docx) 
documents

The simple history does not mention the excluded documents (except for HTML 
documents). They have a fetch activity and that's all (see 
simple_history_web.jpg).
The excluded documents are only visible in the MCF log (with DEBUG verbosity 
on connectors):
{code:java}
Removing url 'https://www.datafari.com/assets/img/Logo_Datafari_4_Condensed_No_D_20180606_30x30.png' because it had the wrong content type ('image/png'){code}
(see manifoldcf_local_files.log)

The related code is in WebcrawlerConnector.java, line 904:
{code:java}
fetchStatus.contextMessage = "it had the wrong content type ('"+contentType+"')";
fetchStatus.resultSignal = RESULT_NO_DOCUMENT;
activityResultCode = null;{code}
The activityResultCode is null.

If we configure the same job with a Local File system connector and the 
same Document Filter transformation connector, the simple history mentions all 
the excluded documents (see simple_history_files.jpg), and the code sets a 
specific error code and logs an activity record (class FileConnector, 
line 415):
{code:java}
if (!activities.checkMimeTypeIndexable(mimeType))
{
  errorCode = activities.EXCLUDED_MIMETYPE;
  errorDesc = "Excluded because mime type ('"+mimeType+"')";
  Logging.connectors.debug("Skipping file '"+documentIdentifier+"' because mime type ('"+mimeType+"') was excluded by output connector.");
  activities.noDocument(documentIdentifier,versionString);
  continue;
}{code}

So I think the Web Crawler connector should behave the same way as 
FileConnector and explicitly mention all the documents excluded by the user.
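
The requested behaviour can be illustrated with a small self-contained sketch. Everything in it is hypothetical: the Activity class, the checkContentType method and the "EXCLUDEDMIMETYPE" string are stand-ins invented for illustration, not ManifoldCF classes and not the change committed for this issue. It only shows the idea that an exclusion by content type should yield a non-null result code that a history report can display.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: give a content-type exclusion a non-null activity result code. */
final class ExclusionActivitySketch {

    /** Illustrative code string, named after EXCLUDED_MIMETYPE in the FileConnector snippet. */
    static final String EXCLUDED_MIMETYPE = "EXCLUDEDMIMETYPE";

    /** Minimal stand-in for one history row; not a ManifoldCF class. */
    static final class Activity {
        final String url;
        final String resultCode;   // null means the document was not excluded
        final String description;  // human-readable reason, null when not excluded

        Activity(String url, String resultCode, String description) {
            this.url = url;
            this.resultCode = resultCode;
            this.description = description;
        }
    }

    /** Returns an activity whose result code is set whenever the content type is rejected. */
    static Activity checkContentType(String url, String contentType, Set<String> accepted) {
        if (accepted.contains(contentType)) {
            return new Activity(url, null, null);
        }
        return new Activity(url, EXCLUDED_MIMETYPE,
            "Excluded because of content type ('" + contentType + "')");
    }

    public static void main(String[] args) {
        Set<String> accepted = new HashSet<>(Arrays.asList("application/msword"));
        Activity a = checkContentType("https://example.com/logo.png", "image/png", accepted);
        System.out.println(a.url + " -> " + a.resultCode + ": " + a.description);
    }
}
```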

 

Best regards,

Olivier





Re: ManifoldCF database model

2018-10-17 Thread Gustavo Beneitez
Ok thanks!

On Wed, Oct 17, 2018 at 14:27, Karl Wright ()
wrote:

> Ok, the schema is described in ManifoldCF In Action.
>
> https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs
>
> Karl


Re: ManifoldCF database model

2018-10-17 Thread Karl Wright
Ok, the schema is described in ManifoldCF In Action.

https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs

Karl


On Wed, Oct 17, 2018 at 7:41 AM Gustavo Beneitez 
wrote:

> Hi Karl,
>
> As far as I was able to gather from the history records, MCF is behaving
> as expected. The "problem" shows up when ElasticSearch is down or performing
> badly: MCF says the document was requested to be deleted, but while it has
> been erased from the database, it is still alive on the ElasticSearch side,
> so I need to find out whether those kinds of inconsistencies exist.
>
> Please allow us to check those documents and run new tests in order to see
> what really happens; we don't modify any database records by hand.
>
> Thanks!


Re: ManifoldCF database model

2018-10-17 Thread Gustavo Beneitez
Hi Karl,

As far as I was able to gather from the history records, MCF is behaving
as expected. The "problem" shows up when ElasticSearch is down or performing
badly: MCF says the document was requested to be deleted, but while it has
been erased from the database, it is still alive on the ElasticSearch side,
so I need to find out whether those kinds of inconsistencies exist.

Please allow us to check those documents and run new tests in order to see
what really happens; we don't modify any database records by hand.

Thanks!
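
The kind of consistency check described above comes down to a set difference between the IDs present in the index and the IDs the crawler still knows about. The sketch below is only an illustration under assumptions: OrphanCheck and findOrphans are hypothetical names, and how the two ID lists are obtained (for example from an index export and from MCF's own reports, not from direct table access) is left open.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch: find documents that are in the index but unknown to the crawler. */
final class OrphanCheck {

    /** Returns the IDs that appear in indexedIds but not in knownIds, in sorted order. */
    static SortedSet<String> findOrphans(Collection<String> indexedIds, Collection<String> knownIds) {
        SortedSet<String> orphans = new TreeSet<>(indexedIds);
        orphans.removeAll(knownIds);
        return orphans;
    }

    public static void main(String[] args) {
        // Made-up IDs for illustration only.
        Collection<String> inIndex = Arrays.asList("doc-1", "doc-2", "doc-3");
        Collection<String> knownToCrawler = Arrays.asList("doc-2");
        System.out.println("Candidates for review: " + findOrphans(inIndex, knownToCrawler));
    }
}
```

A non-empty result is only a list of candidates to investigate and turn into a reproducible test case; it is not proof that a document is safe to delete.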







On Tue, Oct 16, 2018 at 19:27, Karl Wright ()
wrote:

> Hi, you can look at ManifoldCF In Action.  There's a link to it on the
> manifoldcf page.
>
> However, you should be aware that we consider it a severe bug if ManifoldCF
> doesn't clean up after itself.  The only time that is not expected is when
> people write buggy connectors or mess with database tables themselves.  I
> would urge you to examine the Simple History report and try to come up with
> a reproducible test case rather than trying to reverse engineer MCF.
> Should you go directly to the database, we will be unable to give you any
> support.
>
> Thanks,
> Karl
>
>
> On Tue, Oct 16, 2018 at 11:51 AM Gustavo Beneitez <
> gustavo.benei...@gmail.com> wrote:
>
> > Hi all,
> >
> > how do you do? I was wondering if there is any technical document about
> > the meaning of each table in the database and the relationships between
> > documents, repositories, jobs and any other output connector (some kind
> > of database model).
> >
> > We are facing some "garbage issues": jobs are created, duplicated, related
> > to transformations, linked to outputs (Elastic Search), run and finally
> > deleted, but in the end documents that should also be deleted from the
> > output connector are sometimes still there. We don't know whether they are
> > visible because they point to an existing job, because of an unexpected
> > job end, or because of some other failure.
> >
> > We need to understand the database model in order to check when documents
> > stored in Elastic can be safely removed because they are no longer
> > referenced by any process. This check would be executed periodically,
> > every week for example.
> >
> > Thanks in advance!
> >
>