Re: [PR] RAT-320: Add -o/--output option to CLI [creadur-rat]

2023-11-01 Thread via GitHub


Claudenw commented on code in PR #161:
URL: https://github.com/apache/creadur-rat/pull/161#discussion_r1378464274


##
apache-rat-core/src/main/java/org/apache/rat/Report.java:
##
@@ -349,29 +357,20 @@ private Report() {
  * @return the IReportale instance containing the files.
  */
 private static IReportable getDirectory(String baseDirectory, 
ReportConfiguration config) {
-try (PrintStream out = new PrintStream(config.getOutput().get())) {

Review Comment:
   This line should close the output.
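
For readers following the diff, here is a minimal sketch of why the removed try-with-resources line mattered for closing the output. It is not code from the PR: `CloseOnExitDemo` and `TrackingStream` are hypothetical names, and the in-memory stream stands in for whatever `config.getOutput().get()` returns.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;

// Hypothetical demo class; not part of Rat.
class CloseOnExitDemo {

    /** An in-memory stream that records whether close() was called. */
    static final class TrackingStream extends ByteArrayOutputStream {
        boolean closed;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Writes inside try-with-resources and reports whether the target was closed. */
    static boolean writeAndReportClosed(TrackingStream target, String message) {
        try (PrintStream out = new PrintStream(target)) {
            out.print(message);
        }
        // PrintStream.close() flushes and then closes the wrapped stream too.
        return target.closed;
    }

    public static void main(String[] args) {
        TrackingStream target = new TrackingStream();
        boolean closed = writeAndReportClosed(target, "ERROR: missing directory\n");
        System.out.println("closed=" + closed + ", bytes=" + target.size());
    }
}
```

The same mechanism is a hazard when the wrapped stream is System.out, since closing the wrapper closes standard output for the rest of the run.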



##
apache-rat-core/src/main/java/org/apache/rat/Report.java:
##
@@ -349,29 +357,20 @@ private Report() {
  * @return the IReportale instance containing the files.
  */
 private static IReportable getDirectory(String baseDirectory, 
ReportConfiguration config) {
-try (PrintStream out = new PrintStream(config.getOutput().get())) {
-File base = new File(baseDirectory);
-if (!base.exists()) {
-out.print("ERROR: ");
-out.print(baseDirectory);
-out.print(" does not exist.\n");
-return null;
-}
+File base = new File(baseDirectory);

Review Comment:
   I see that what you have done is remove the output from the method and let 
the exceptions provide the info.  
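
A minimal sketch of the approach described in this comment, where the method reports a missing directory through an exception instead of printing to the output stream. The class name, method name and the choice of IllegalArgumentException are assumptions for illustration; the PR may use a different exception type.

```java
import java.io.File;

// Hypothetical demo class; not the actual Report.getDirectory implementation.
class DirectoryCheckDemo {

    /** Returns the base directory, or throws if it does not exist. */
    static File requireDirectory(String baseDirectory) {
        File base = new File(baseDirectory);
        if (!base.exists()) {
            // The exception message carries what used to be printed to the output.
            throw new IllegalArgumentException(baseDirectory + " does not exist.");
        }
        return base;
    }

    public static void main(String[] args) {
        try {
            requireDirectory("no-such-directory-for-this-sketch");
        } catch (IllegalArgumentException e) {
            // The caller decides where the message goes.
            System.err.println("ERROR: " + e.getMessage());
        }
    }
}
```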



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@creadur.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (RAT-325) Performance degradation compared to 0.15

2023-11-01 Thread Claude Warren (Jira)


[ 
https://issues.apache.org/jira/browse/RAT-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781683#comment-17781683
 ] 

Claude Warren commented on RAT-325:
---

That is what I would expect.  If you edit the configuration file 
/org/apache/rat/default.xml and remove the "not" clause on lines 22-24, I expect 
you will see a speed-up.

The issue is that "not" requires the block of input to be processed to the 
end before it can determine that the enclosed option is false; in this case, 
that the copyright does not exist. 

Since it has to process to the end, and since the Copyright check is a regex 
applied to every line, it is expensive.

This is where I think the idea of specifying whether a process is line or block 
oriented may make sense, though I think that all of the line-oriented checks 
work on the block as well.

The code in o.a.r.analysis.HeaderCheckWorker.readLine() reads each line and 
calls the matcher to see if it matches.  I think it would be much faster to 
modify HeaderCheckWorker.read() so that it reads the entire header block into a 
buffer first and then passes that buffer to the matcher to see if it matches. 

This will amortise the "not", "regex" and "long text" costs.

Also, we should be able to simplify the "long text" checks, as each instance 
will no longer have to build the buffer itself.  In the future we can probably 
convert them to a regex, provided we do some work when we build the buffer to 
extract only the comments from the source files.  This would be a further 
optimization.
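
To illustrate the block-oriented idea above, here is a minimal sketch, not the HeaderCheckWorker code: the header is read into one buffer once, and every matcher, including a negated one, runs against that buffer. `BlockMatchDemo`, `readHeaderBlock`, `regex` and `text` are hypothetical names, and the 50-line limit in `main` is an arbitrary assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical demo class; names do not correspond to Rat's API.
class BlockMatchDemo {

    /** Reads up to maxLines lines into a single buffer; done once per file. */
    static String readHeaderBlock(BufferedReader reader, int maxLines) {
        StringBuilder block = new StringBuilder();
        try {
            String line;
            for (int i = 0; i < maxLines && (line = reader.readLine()) != null; i++) {
                block.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return block.toString();
    }

    /** A matcher that searches the whole block with one compiled regex. */
    static Predicate<String> regex(String expression) {
        Pattern pattern = Pattern.compile(expression);
        return block -> pattern.matcher(block).find();
    }

    /** A matcher that looks for a literal text anywhere in the block. */
    static Predicate<String> text(String needle) {
        return block -> block.contains(needle);
    }

    public static void main(String[] args) {
        String source = "/*\n * Licensed under the Example License.\n */\npackage demo;\n";
        String block = readHeaderBlock(new BufferedReader(new StringReader(source)), 50);

        Predicate<String> copyright = regex("Copyright\\s+\\d{4}");
        Predicate<String> licensed = text("Licensed under");

        // "not" is answered by a single scan of the buffer instead of a
        // per-line regex call that must run until the last header line.
        boolean result = licensed.and(copyright.negate()).test(block);
        System.out.println("licensed without copyright: " + result);
    }
}
```

With this shape a "not" costs one scan of the buffer, and the buffer itself is built once no matter how many matchers consume it.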

> Performance degradation compared to 0.15
> 
>
> Key: RAT-325
> URL: https://issues.apache.org/jira/browse/RAT-325
> Project: Apache Rat
>  Issue Type: Bug
>  Components: cli
>Affects Versions: 0.16
>Reporter: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 0.16
>
>
> While testing 0.16-SNAPSHOT, I found that Rat takes much longer to execute 
> than with 0.15.
> I'm investigating why.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (RAT-325) Performance degradation compared to 0.15

2023-11-01 Thread Claude Warren (Jira)


[ 
https://issues.apache.org/jira/browse/RAT-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781683#comment-17781683
 ] 

Claude Warren edited comment on RAT-325 at 11/1/23 9:52 AM:


That is what I would expect.  If you edit the configuration file 
/org/apache/rat/default.xml and remove the "not" clause on lines 22-24, I expect 
you will see a speed-up.

The issue is that "not" requires the block of input to be processed to the 
end before it can determine that the enclosed option is false; in this case, 
that the copyright does not exist. 

Since it has to process to the end, and since the Copyright check is a regex 
applied to every line, it is expensive.

This is where I think the idea of specifying whether a process is line or block 
oriented may make sense, though I think that all of the line-oriented checks 
work on the block as well.

The code in o.a.r.analysis.HeaderCheckWorker.readLine() reads each line and 
calls the matcher to see if it matches.  I think it would be much faster to 
modify HeaderCheckWorker.read() so that it reads the entire header block into a 
buffer first and then passes that buffer to the matcher to see if it matches. 

This will amortise the "not", "regex" and "long text" costs.

Also, we should be able to simplify the "long text" checks, as each instance 
will no longer have to build the buffer itself.  In the future we can probably 
convert them to a regex, provided we do some work when we build the buffer to 
extract only the comments from the source files.  This would be a further 
optimization.

I will try to code up an example to see if we get better performance.




[jira] [Commented] (RAT-325) Performance degradation compared to 0.15

2023-11-01 Thread Claude Warren (Jira)


[ 
https://issues.apache.org/jira/browse/RAT-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781890#comment-17781890
 ] 

Claude Warren commented on RAT-325:
---

I have created a branch on my repository that switches to block processing and 
defines an IHeaders interface that exposes the raw header as well as a pruned 
one, so the header block is created and pruned only once.  This should improve 
performance and reduce the memory footprint.  We can decide later whether to 
change the prune function (it is as originally developed prior to v0.16).

The code is on: 
https://github.com/Claudenw/creadur-rat/tree/performance_improvement
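
As a rough illustration of the idea (not the code on that branch), a header holder that computes the pruned form once and hands both forms to every matcher could look like the sketch below. `HeaderTextSketch` and its methods are hypothetical, and the prune rule used here (keep letters and digits, lower-case the result) is an assumption made for the example.

```java
import java.util.Locale;

// Hypothetical sketch; the real IHeaders interface may look different.
class HeaderTextSketch {
    private final String raw;
    private final String pruned;

    HeaderTextSketch(String rawHeader) {
        this.raw = rawHeader;
        // Pruning happens exactly once, when the header block is created.
        this.pruned = prune(rawHeader);
    }

    /** The header text as read from the file. */
    String raw() {
        return raw;
    }

    /** The header text with everything except letters and digits removed. */
    String pruned() {
        return pruned;
    }

    /** Assumed prune rule: drop non-alphanumerics and lower-case the rest. */
    static String prune(String text) {
        StringBuilder buffer = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                buffer.append(c);
            }
        }
        return buffer.toString().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        HeaderTextSketch header = new HeaderTextSketch("/* Apache License, 2.0 */");
        System.out.println(header.raw());
        System.out.println(header.pruned());
    }
}
```

Matchers that need exact text can use raw(), while full-text matchers can compare against pruned() without each building its own buffer.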



Re: [PR] RAT-321 cleanup of testing issues [creadur-rat]

2023-11-01 Thread via GitHub


ottlinger commented on PR #159:
URL: https://github.com/apache/creadur-rat/pull/159#issuecomment-1789705372

   > absolutely. if you have a standard setup I would gladly follow it. Perhaps 
it would make sense to add a checkstyle process to align the code base and keep 
it so.
   
   I proposed an automated formatter such as 
https://github.com/google/google-java-format but some people opposed that 
automation. Maybe checkstyle integration is a step in that direction. 
Thanks.





Re: [PR] RAT-321 cleanup of testing issues [creadur-rat]

2023-11-01 Thread via GitHub


ottlinger commented on PR #159:
URL: https://github.com/apache/creadur-rat/pull/159#issuecomment-1789714822

   @Claudenw Thanks for the many fixes - let's see how things evolve in other 
scenarios and additional PRs.





Re: [PR] RAT-321 cleanup of testing issues [creadur-rat]

2023-11-01 Thread via GitHub


ottlinger commented on code in PR #159:
URL: https://github.com/apache/creadur-rat/pull/159#discussion_r1379350975


##
apache-rat-core/src/main/java/org/apache/rat/Report.java:
##
@@ -147,27 +146,106 @@ public static final void main(String[] args) throws 
Exception {
 if (args == null || args.length != 1) {
 printUsage(opts);
 } else {
+

Review Comment:
   I'll merge the changes; feel free to remove the commented-out code in a 
separate PR. Thanks.






Re: [PR] RAT-321 cleanup of testing issues [creadur-rat]

2023-11-01 Thread via GitHub


ottlinger merged PR #159:
URL: https://github.com/apache/creadur-rat/pull/159





[jira] [Commented] (RAT-321) Allow configuration to define SimplePatternBasedLicense instances

2023-11-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/RAT-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781899#comment-17781899
 ] 

ASF subversion and git services commented on RAT-321:
-

Commit 3b06360b768f11d06803cf6e26196da8e0145927 in creadur-rat's branch 
refs/heads/master from P. Ottlinger
[ https://gitbox.apache.org/repos/asf?p=creadur-rat.git;h=3b06360b ]

RAT-321: Merge pull request #159 from Claudenw/RAT-321-fix

RAT-321 cleanup of testing issues

> Allow configuration to define SimplePatternBasedLicense instances
> -
>
> Key: RAT-321
> URL: https://issues.apache.org/jira/browse/RAT-321
> Project: Apache Rat
>  Issue Type: Improvement
>  Components: engine
>Affects Versions: 0.15
>Reporter: Claude Warren
>Priority: Minor
> Fix For: 0.16
>
>
> The concept here is to rework the license matching definitions so that the 
> SimplePatternBasedLicense can be instantiated from a configuration file.
> The reasoning is that there are a number of licences that are of particular 
> interest in projects both inside and outside of the ASF that are not 
> currently included in the code.  This change will allow users to define the 
> string matching for any license that can be matched by simple string matching.
> The proposal is to move all the static definitions into a default file and 
> allow the default to be ignored/overridden.
> *Command Line Flags*
> Add a flag --no-default-file to skip reading the default definitions from 
> within the package.
> Add a flag --definition-file that accepts a file name argument, reads it and 
> adds the definitions to the list of definitions that are checked.  Ensure 
> that multiple --definition-file arguments may be provided and that they are 
> added in the order in which they are given.
> *Static definition changes*
>  * Modify Meta.Data so that the license family category values are not 
> statically defined.
>  * Modify Meta.Data so that the license family names are not statically 
> defined.
> see https://github.com/apache/creadur-rat/pull/157 for details
>  
>  
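
A minimal sketch of how the two proposed flags could be collected, written with plain argument scanning rather than Rat's actual CLI code. `DefinitionFlagsDemo`, `Options` and `parse` are hypothetical names; only the flag names come from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not Rat's option handling.
class DefinitionFlagsDemo {

    /** Holds the outcome of scanning the command line. */
    static final class Options {
        boolean useDefaultFile = true;
        final List<String> definitionFiles = new ArrayList<>();
    }

    static Options parse(String... args) {
        Options options = new Options();
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "--no-default-file":
                    options.useDefaultFile = false;
                    break;
                case "--definition-file":
                    if (i + 1 >= args.length) {
                        throw new IllegalArgumentException("--definition-file requires a file name");
                    }
                    // A list keeps the files in the order they were given.
                    options.definitionFiles.add(args[++i]);
                    break;
                default:
                    break; // other arguments are ignored in this sketch
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Options options = parse("--no-default-file",
                "--definition-file", "first.xml",
                "--definition-file", "second.xml");
        System.out.println("use default file: " + options.useDefaultFile);
        System.out.println("definition files: " + options.definitionFiles);
    }
}
```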





[jira] [Commented] (RAT-321) Allow configuration to define SimplePatternBasedLicense instances

2023-11-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/RAT-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781901#comment-17781901
 ] 

ASF subversion and git services commented on RAT-321:
-

Commit d320f157b7ebd8aef0af828ec6b9b000b04c1c83 in creadur-rat's branch 
refs/heads/master from Philipp Ottlinger
[ https://gitbox.apache.org/repos/asf?p=creadur-rat.git;h=d320f157 ]

RAT-321: Remove old manual exclusion as we are past RAT 0.14




[jira] [Commented] (RAT-321) Allow configuration to define SimplePatternBasedLicense instances

2023-11-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/RAT-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781900#comment-17781900
 ] 

ASF subversion and git services commented on RAT-321:
-

Commit 8c4b23d9b185f4d4d50b8dbf66b54ff6d1fa47a5 in creadur-rat's branch 
refs/heads/master from Philipp Ottlinger
[ https://gitbox.apache.org/repos/asf?p=creadur-rat.git;h=8c4b23d9 ]

LHF: RAT-321: fix typo




[jira] [Assigned] (RAT-321) Allow configuration to define SimplePatternBasedLicense instances

2023-11-01 Thread Philipp Ottlinger (Jira)


 [ 
https://issues.apache.org/jira/browse/RAT-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philipp Ottlinger reassigned RAT-321:
-

Assignee: Claude Warren



Re: fyi: INFRA-25131 - Trying to change Creadur's Jira configuration

2023-11-01 Thread P. Ottlinger

Hi,

On 30.10.23 at 23:06, P. Ottlinger wrote:
> I've filed https://issues.apache.org/jira/browse/INFRA-25131 in order to 
> ask how to allow non-PMC members to be issue assignees.


After some discussion I've added
* Claude
* Jean-Baptiste
as contributors in Jira, so they are now able to work on tickets.

Thanks for your help and the newly created MRs :)

Cheers,
Phil




[jira] [Assigned] (RAT-320) Add a command line option to output to a file.

2023-11-01 Thread Philipp Ottlinger (Jira)


 [ 
https://issues.apache.org/jira/browse/RAT-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philipp Ottlinger reassigned RAT-320:
-

Assignee: Jean-Baptiste Onofré

> Add a command line option to output to a file.
> --
>
> Key: RAT-320
> URL: https://issues.apache.org/jira/browse/RAT-320
> Project: Apache Rat
>  Issue Type: Improvement
>  Components: cli
>Affects Versions: 0.15
>Reporter: Claude Warren
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>
> Currently the only way to capture output is to pipe it to a file.  When 
> working in the codebase it is often necessary to review the output of a run. 
> Adding the ability to write the output of the command line to a file is 
> simple and helpful.
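
A minimal sketch of what such an option could do, assuming the -o/--output spelling from the PR title. This is not the code from PR #161: `OutputOptionDemo`, `outputFile` and `writeReport` are hypothetical names, and the argument scanning is deliberately simplistic.

```java
import java.io.IOException;
import java.io.PrintStream;

// Hypothetical demo class; not the implementation in the PR.
class OutputOptionDemo {

    /** Returns the file named after -o/--output, or null when the option is absent. */
    static String outputFile(String... args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equals("-o") || args[i].equals("--output")) {
                return args[i + 1];
            }
        }
        return null;
    }

    /** Writes the report to the requested file, or to standard output by default. */
    static void writeReport(String report, String... args) throws IOException {
        String fileName = outputFile(args);
        if (fileName == null) {
            // Standard output is left open for the rest of the program.
            System.out.print(report);
            return;
        }
        try (PrintStream out = new PrintStream(fileName, "UTF-8")) {
            out.print(report);
        }
    }

    public static void main(String[] args) throws IOException {
        writeReport("0 unapproved licenses\n", args);
    }
}
```

Only the file-backed stream is wrapped in try-with-resources, so the program never closes System.out by accident.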





Re: [PR] RAT-320: Add -o/--output option to CLI [creadur-rat]

2023-11-01 Thread via GitHub


ottlinger commented on code in PR #161:
URL: https://github.com/apache/creadur-rat/pull/161#discussion_r1379361410


##
apache-rat-core/src/main/java/org/apache/rat/Report.java:
##
@@ -349,29 +357,20 @@ private Report() {
  * @return the IReportale instance containing the files.
  */
 private static IReportable getDirectory(String baseDirectory, 
ReportConfiguration config) {
-try (PrintStream out = new PrintStream(config.getOutput().get())) {
-File base = new File(baseDirectory);
-if (!base.exists()) {
-out.print("ERROR: ");
-out.print(baseDirectory);
-out.print(" does not exist.\n");
-return null;
-}
+File base = new File(baseDirectory);

Review Comment:
   @jbonofre I merged other changes concerning RAT-321. Please retry.






[jira] [Commented] (RAT-314) .mvn folder default exclude is not respected recursively

2023-11-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/RAT-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781919#comment-17781919
 ] 

ASF subversion and git services commented on RAT-314:
-

Commit 5a858f6e5ed84ff65c21e63de1f671f9c0f06b48 in creadur-rat's branch 
refs/heads/master from Philipp Ottlinger
[ https://gitbox.apache.org/repos/asf?p=creadur-rat.git;h=5a858f6e ]

RAT-314: Add note about redundant exclusion once 0.16 is used in RAT


> .mvn folder default exclude is not respected recursively
> 
>
> Key: RAT-314
> URL: https://issues.apache.org/jira/browse/RAT-314
> Project: Apache Rat
>  Issue Type: Bug
>  Components: maven
>Affects Versions: 0.15
>Reporter: François Guillot
>Assignee: Philipp Ottlinger
>Priority: Minor
> Fix For: 0.16
>
>
> The RAT plugin defines default excludes in `ExclusionHelper.addMavenDefaults` 
> for Maven projects. One of them is ".mvn".
>  
> Yet, if you declare an extension in ".mvn/extensions.xml", the file is not 
> excluded.
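
To make the expected behaviour concrete, here is a minimal sketch of a recursive check that treats any file below a ".mvn" folder as excluded. It is not the plugin's fix and does not use `ExclusionHelper`; `MvnExclusionDemo` and `isUnderMvnFolder` are hypothetical names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class; not the plugin's exclusion logic.
class MvnExclusionDemo {

    /** True when any element of the relative path is the ".mvn" folder. */
    static boolean isUnderMvnFolder(Path relativePath) {
        // Iterating a Path visits its name elements one by one.
        for (Path segment : relativePath) {
            if (segment.toString().equals(".mvn")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = {
            ".mvn/extensions.xml",
            "module/.mvn/extensions.xml",
            "src/main/java/App.java"
        };
        for (String sample : samples) {
            System.out.println(sample + " excluded: " + isUnderMvnFolder(Paths.get(sample)));
        }
    }
}
```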


