Axel Howind created PDFBOX-5695:
-----------------------------------

             Summary: Improve PDFBox Logging
                 Key: PDFBOX-5695
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5695
             Project: PDFBox
          Issue Type: Improvement
    Affects Versions: 4.0.0
            Reporter: Axel Howind


I know this has been an issue before 
(https://issues.apache.org/jira/browse/PDFBOX-693), but since then a lot of 
time has passed. I propose to switch Logging to either SLF4J (my preference) or 
Log4J.

Why? Currently, PDFBox uses commons logging. That library seems not to be 
actively maintained anymore, the last release was in 2014 with the comment „the 
main purpose of the 1.2 release is to drop support for Java 1.1“. PDFBox 4 
requires at least Java 11. The Java language has evolved a lot over the last 
few years, and so have to logging frameworks.

*Maintainabilty and Performance*

The main point that speaks against commons logging IMHO is that you have to 
make compromises between readability and performance. Let’s look at an example 
(taken from COSParser.java):

 
{code:java}
if (offsetOrObjstmObNr != null)
{ 
    LOG.debug("Set missing offset " + offsetOrObjstmObNr + " for object " + 
objKey);
    document.getXrefTable().put(objKey, offsetOrObjstmObNr);
} 
{code}
Each time the if-statement's body is entered, the logging message is 
constructed. This involves:
 - converting offsetOrObjstmObNr to a String
 - converting objKey to a String which in turn does two more conversions of 
integer and long values to a String, creates a new String by joining results of 
former conversions
 - creating a new String consisting of the results of the both operations above 
and some static text

A lot of temporary objects are created and all of this happens regardless of 
whether logging is necessary or not.

Both SLF4J and Log4J (and even JUL logging) support a message format syntax 
where arguments to logging calls are only evaluated when needed:

 
{code:java}
if (offsetOrObjstmObNr != null)
{
    LOG.debug("Set missing offset {} for object {}", offsetOrObjstmObNr, 
objKey);
    document.getXrefTable().put(objKey, offsetOrObjstmObNr);
}
{code}
Using common s logging, you can guard the logging statement:

 
{code:java}
if (offsetOrObjstmObNr != null)
{
    if (LOG.isDebugEnabled())
    {
        LOG.debug("Set missing offset " + offsetOrObjstmObNr + " for object " + 
objKey);
    }
    document.getXrefTable().put(objKey, offsetOrObjstmObNr);
}
{code}
This works, but makes the code less readable and harder to maintain because the 
level has to be changed in two places instead of one.

*Flexibility*

Another thing is that both SLF4J and Log4J offer you a facade that you can 
bridge to nearly every logging implementation in widespread use nowadays 
whereas commons logging is a full logging implementation that most of us do not 
need (there are ways to make commons logging work with other backends, but they 
are more of a workaround than a clean solution).

*JPMS support*

And of course, both SLF4J and Log4J offer first class support for the module 
system.

So what are the PDFBox maintainer's thoughts on this? I think PDFBox 4 offers 
an opportunity to switch to a new Logging facade/framework, and if there's a 
concensus, I volunteer to contribute a patch to make the transition.

*Support Lambda Logging*

In some cases, even using a message format ist not enough, for example there 
are instances in the code where a `byte[]` value is logged as a String. Since 
the String conversion has to be done explicitly, it would be done before the 
actual logging method is entered. This can be solved using lambdas by passing 
`() -> new String(value)` instead (syntax may vary).

*Alternatives*

The alternatives (sorted according to my own preference) are:
 - use SLF4J as logging facade
 - use Log4J2 as logging facade
 - use JUL logging (comes with a cost when used with other logging backends)
 - stay with commons logging and add guards in the code where they are not yet 
present
 - stay with commons logging and try to convince the maintainers (if there 
still are any) to add a message format support that is comparable to what the 
other three and then switch to the new version of commons logging once the 
functionality is there (and yes, if there's a chance this gets approved by the 
maintainers I'd see if I can contribute the necessary changes to commons 
logging provided the idea gets support over there)

Cheers,
Axel



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org

Reply via email to