RE: Add logs to repository connectors

2022-01-26 Thread Julien Massiera
Hi Karl,

So, can I add a log4j2 logger to each repository connector and log the events 
at DEBUG level, as I described in my previous mail?

Regards,
Julien





RE: Add logs to repository connectors

2022-01-19 Thread Julien Massiera
Hi Karl,

The goal of having a separate class to produce these logs is to either keep 
them in manifoldcf.log (by setting the logger to DEBUG) or write them to a 
separate file by targeting that specific class in logging.xml with a dedicated 
appender and a dedicated logger.
By default we would be in the first case, with those logs in manifoldcf.log at 
DEBUG level. Anyone who wants them in a separate file would have to modify the 
logging.xml configuration themselves.

With the two options you propose, this cannot be done as simply, because there 
is no single logger that targets only these logs in logging.xml. If we use the 
existing Logging class (modified to add the connector class), then targeting 
that class in logging.xml also captures every other log generated through it. 
With the other option, one native log4j logger in each connector, we would 
have to target every logger individually.

If I had to choose, I would go with a native log4j logger in every connector, 
so that we can at least target these logs specifically, even if it requires 
more configuration.

Also the log format would be like this: 

[DOC_PROCESS_START|DOC_PROCESS_END]|CONNECTOR_NAME|DOCUMENT_IDENTIFIER
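
To make the format concrete, here is a minimal sketch of a helper that builds such a line. The class name DocProcessLog, the enum Event and the method format are hypothetical names chosen for illustration; nothing like this exists in ManifoldCF today. The commented usage assumes a static log4j2 logger declared in a connector (MyConnector is also a placeholder).

```java
// Hypothetical helper, for illustration only: builds the proposed log line
// "EVENT|CONNECTOR_NAME|DOCUMENT_IDENTIFIER".
final class DocProcessLog {

    /** The two event markers from the proposed format. */
    enum Event { DOC_PROCESS_START, DOC_PROCESS_END }

    private DocProcessLog() {
        // Static helper, not meant to be instantiated.
    }

    /** Joins the event marker, connector name and document identifier with '|'. */
    static String format(Event event, String connectorName, String documentIdentifier) {
        return event.name() + "|" + connectorName + "|" + documentIdentifier;
    }

    // Assumed usage inside a connector (log4j2 API, not compiled here):
    //
    //   private static final Logger LOGGER = LogManager.getLogger(MyConnector.class);
    //   ...
    //   LOGGER.debug(DocProcessLog.format(
    //       DocProcessLog.Event.DOC_PROCESS_START, "MyConnector", documentIdentifier));
}
```

Since each connector would own its logger, the logger name (the connector class) is what logging.xml would have to target, one entry per connector.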

Regards,
Julien




Re: Add logs to repository connectors

2022-01-18 Thread Karl Wright
Having a second log is pretty non-standard, and obscures the ordering of
events that may be pertinent.  So I am not thrilled with that idea.
However, the current logging system does not make it easy to determine
where a log message is coming from, nor can you filter log messages with
grep easily.

It would be better to include enough information in the ONE manifoldcf.log
file so that you know what connector the log message is coming from.

We already have individual loggers set up that are "catch all" buckets for
events of various kinds, but the logger itself might want to have more
information, e.g. the class that did the logging, as an optional argument.
Or - we could use log4j's native capabilities.  All that is needed to do
that is to simply create a static logger in every connector class from
which you wish to log, and write directly to that one.

Karl


On Tue, Jan 18, 2022 at 10:29 AM Julien Massiera <
julien.massi...@francelabs.com> wrote:

> Hi all,
>
>
>
> In order to improve the tracking of document processing status, in
> particular when something goes wrong with MCF (like hanging processes
> without obvious causes), I would like to propose adding specific logs to
> each repository connector: one log at the beginning of the processing of
> each documentIdentifier in the processDocuments method, and one at the
> end, so that at any time we can easily tell which documents are being
> processed.
>
> To implement that, I was thinking of a common class that would write the
> logs with a custom log level (between INFO and DEBUG), so that the logs
> could be easily isolated into a specific file through the log4j
> configuration. The class would be stored in the
> org.apache.manifoldcf.crawler.system package, like the Logging class.
>
>
>
> What do you think about that?
>
>
>
> Regards,
>
> Julien
>
>