Dear Andy and All,

Thank you very much for the information that you have provided to me. You 
really helped me a lot.
Below I list the most frequent issue types that I found while examining the 
evolution of Jena.

Issue Name                                                    Issues over time    Fixed  Currently Open
The members of an interface declaration or class should
  appear in a pre-defined order                                          52875    47530            5345
Sections of code should not be "commented out"                           35465    31905            3560
Method names should comply with a naming convention                      32809    29448            3361
Constant names should comply with a naming convention                    18397    16572            1825
String literals should not be duplicated                                 18106    16390            1716
Standard outputs should not be used directly to log anything             16174    14506            1668
Exception handlers should preserve the original exceptions               10763     9747            1016
Source files should not have any duplicated blocks                        9141     8078            1063
Methods should not be empty                                               8653     7796             857
switch case clauses should not have too many lines of code                8280     7491             789
throws declarations should not be superfluous                             7012     6138             874
Class variable fields should not have public accessibility                6593     5954             639

Rule descriptions linked from the table:
String literals should not be duplicated
<https://sbforge.org/sonar/rules/show/squid:S1192?layout=false>
Standard outputs should not be used directly to log anything
<https://sbforge.org/sonar/rules/show/squid:S106?layout=false>
Exception handlers should preserve the original exceptions
<https://sbforge.org/sonar/rules/show/squid:S1166?layout=false>
Methods should not be empty
<https://sbforge.org/sonar/rules/show/squid:S1186?layout=false>
switch case clauses should not have too many lines of code
<https://sbforge.org/sonar/rules/show/squid:S1151?layout=false>
throws declarations should not be superfluous
<https://sbforge.org/sonar/rules/show/squid:RedundantThrowsDeclarationCheck?layout=false>
Class variable fields should not have public accessibility
<https://sbforge.org/sonar/rules/show/squid:ClassVariableVisibilityCheck?layout=false>


  1.  The issue that appears the most frequently is: The members of an 
interface declaration or class should appear in a pre-defined order. "According 
to the Java Code Conventions as defined by Oracle, the members of a class or 
interface declaration should appear in the following order in the source files: 
1. Class and instance variables, 2. Constructors, and 3. Methods."
  2.  The second most frequent is: Sections of code should not be "commented 
out". "Programmers should not comment out code as it bloats programs and 
reduces readability.  Unused code should be deleted and can be retrieved from 
source control history if required."
  3.  Method names should comply with a naming convention. "Shared naming 
conventions allow teams to collaborate efficiently. This rule checks that all 
method names match the default provided regular expression ^[a-z][a-zA-Z0-9]*$"
  4.  Constant names should comply with a naming convention. "Shared coding 
conventions allow teams to collaborate efficiently. This rule checks that all 
constant names match the default regular expression 
^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$"
  5.  String literals should not be duplicated. "Duplicated string literals 
make the process of refactoring error-prone, since you must be sure to update 
all occurrences.  On the other hand, constants can be referenced from many 
places, but only need to be updated in a single place."
  6.  Standard outputs should not be used directly to log anything. "When 
logging a message there are several important requirements which must be 
fulfilled:
* The user must be able to easily retrieve the logs
* The format of all logged message must be uniform to allow the user to easily 
read the log
* Logged data must actually be recorded
* Sensitive data must only be logged securely
If a program directly writes to the standard outputs, there is absolutely no 
way to comply with those requirements. That's why defining and using a 
dedicated logger is highly recommended."
  7.  Exception handlers should preserve the original exceptions. "When 
handling a caught exception, the original exception's message and stack trace 
should be logged or passed forward."
  8.  Source files should not have any duplicated blocks. "An issue is created 
on a file as soon as there is at least one block of duplicated code on this 
file"
  9.  Methods should not be empty. "There are several reasons for a method not 
to have a method body:
* It is an unintentional omission, and should be fixed to prevent an unexpected 
behavior in production.
* It is not yet, or never will be, supported. In this case an 
UnsupportedOperationException should be thrown.
* The method is an intentionally-blank override. In this case a nested comment 
should explain the reason for the blank override."
  10. switch case clauses should not have too many lines of code. "The switch 
statement should be used only to clearly define some new branches in the 
control flow. As soon as a case clause contains too many statements this highly 
decreases the readability of the overall control flow statement. In such case, 
the content of the case clause should be extracted into a dedicated method."
  11. "throws" declarations should not be superfluous. "An exception in a throws 
declaration in Java is superfluous if it is:
* listed multiple times
* a subclass of another listed exception
* a RuntimeException, or one of its descendants
* completely unnecessary because the declared exception type cannot actually be 
thrown"
  12. Class variable fields should not have public accessibility. "Public class 
variable fields do not respect the encapsulation principle and has three main 
disadvantages:
* Additional behavior such as validation cannot be added.
* The internal representation is exposed, and cannot be changed afterwards.
* Member values are subject to change from anywhere in the code and may not 
meet the programmer's assumptions.
By using private attributes and accessor methods (set and get), unauthorized 
modifications are prevented."
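
For illustration only, here is a small Java sketch of what code that satisfies 
several of the rules above might look like. It is my own example and not code 
from Jena: the class name RuleSketch, its methods, and the INVALID_SIZE message 
are hypothetical, and java.util.logging is used only to keep the example 
self-contained. The two regular expressions are the defaults quoted in items 3 
and 4.

```java
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.regex.Pattern;

/** Hypothetical illustration class; not taken from the Jena code base. */
final class RuleSketch {
    // Rule 1: fields first, then constructors, then methods.

    // Rules 4 and 5: one upper-case constant instead of a repeated literal.
    private static final String INVALID_SIZE = "Invalid size: ";

    // Rule 6: a dedicated logger instead of System.out / System.err.
    private static final Logger LOG = Logger.getLogger(RuleSketch.class.getName());

    // Default patterns quoted in items 3 and 4 above.
    private static final Pattern METHOD_NAME = Pattern.compile("^[a-z][a-zA-Z0-9]*$");
    private static final Pattern CONSTANT_NAME =
            Pattern.compile("^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$");

    // Rule 12: the field is private; access goes through get/set.
    private int size;

    RuleSketch(int size) {
        setSize(size);
    }

    static boolean isValidMethodName(String name) {
        return METHOD_NAME.matcher(name).matches();
    }

    static boolean isValidConstantName(String name) {
        return CONSTANT_NAME.matcher(name).matches();
    }

    // Rule 7: the original exception is logged and passed forward as the cause.
    // Rule 11: no "throws" clause, because only a RuntimeException can escape.
    static int parseSize(String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            LOG.log(Level.WARNING, INVALID_SIZE + text, e);
            throw new IllegalArgumentException(INVALID_SIZE + text, e);
        }
    }

    int getSize() {
        return size;
    }

    // The setter can validate its argument, which a public field could not do.
    void setSize(int size) {
        if (size < 0) {
            throw new IllegalArgumentException(INVALID_SIZE + size);
        }
        this.size = size;
    }

    // Rule 9: an unsupported operation says so instead of being left empty.
    void clear() {
        throw new UnsupportedOperationException("clear is not supported");
    }
}
```

With this sketch, isValidMethodName("listStatements") and 
isValidConstantName("MAX_SIZE") both hold, while parseSize("abc") fails with an 
IllegalArgumentException whose cause is the original NumberFormatException.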

I would like to ask you:

  *   Do you agree or disagree with these findings (and why)?
  *   Were you aware of these issues?
  *   Why do you think they were introduced?
  *   Do you think they were introduced because Jena does not enforce a 
"squash-before-merging" policy?
  *   Are you planning to fix any issues in the above categories?
  *   All the currently open issues are listed on the sheet "Jena: Open Issues 
- October 7, 2017" of this spreadsheet:
<https://docs.google.com/spreadsheets/d/1DloQ_GS9l2KS6ldgdHOQkjsCB1J_rrMyUauHC_Ymgfk/edit?usp=sharing>

Thank you in advance!


With kind regards,


George Digkas

________________________________
From: Andy Seaborne <a...@apache.org>
Sent: Saturday, November 18, 2017 6:43 PM
To: Γεώργιος Δίγκας; dev@jena.apache.org
Subject: Re: Issues fixed in Apache Jena

Jena was registered at SourceForge 2001-11-20

I found this in ASF SVN:

[[
Added Mon Nov 26 17:41:44 2001 UTC (15 years, 11 months ago) by der
]]
on copyright.txt.

so it looks like we have full history somewhere in the Apache
infrastructure.

CVS:SF->SVN:SF->SVN:ASF

ASF git does not include the pre-git history.

     Andy

On 16/11/17 11:55, Andy Seaborne wrote:
> Do not take git as complete!
>
> Jena started in 2000.
> https://lists.w3.org/Archives/Public/www-rdf-interest/2000Aug/0128.html
>
> Jena 2.0 was released 2003-08-28.
> A whole 40M including dependencies! A 14.7M zip file!
> https://sourceforge.net/projects/jena/files/
>
> The whole of SF SVN history was imported by the Apache infrastructure
> team (a herculean effort) into Apache SVN. I don't know how to get to it
> from git, it may not be there and only in SVN.
>
> The earliest git root commit is for the move to Apache from SF
> [4298106f1e], 6 years ago. (There are 4 root commits due to merges)
>
> ---
>
> It's an interesting start. To make the analysis usefully inform the
> reader as to the state of the project, I suggest treating different kinds
> of issues differently, not as uniformly important.
>
> There are many (many) minor things and they outweigh the major
> problems. Calling them all "issues" gives them equal weight. Some are
> about canonicalization of the code.
>
> Yet reformatting the whole code base (if practical, which is arguable)
> would greatly decrease the usefulness of git history. That would be a
> huge loss.
>
> (NB the "issue" word has a specific meaning for JIRA which a lot of
> Apache projects use. Jena's current total, now, is 1424.)
>
>      Andy
>
>>
>> Thank you in advance!
>>
>>
>> With kind regards,
>>
>> George Digkas
