[ 
https://issues.apache.org/jira/browse/RAT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847052#comment-17847052
 ] 

Claude Warren commented on RAT-265:
-----------------------------------

[~raphinesse] 

I believe that [pull 252|https://github.com/apache/creadur-rat/pull/252] solves 
this problem. The pull request has not been merged yet, but you should be able 
to build and test it if you would like.

The issue, as you rightly pointed out, was that the filter builder failed when 
compiling the exclusion as a regular expression and then skipped the other 
potential filters. This has been fixed.
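
To illustrate the idea (this is only a sketch, not the code from the pull 
request): each interpretation of the exclusion is built independently, so a 
failed regular expression compile drops only the regex filter. The class name 
`ExclusionSketch` and the method `forExclusion` are hypothetical, and the JDK 
glob matcher is used here as a stand-in for the wildcard filter so that the 
example needs no extra libraries.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical helper for illustration; not the actual Rat implementation. */
final class ExclusionSketch {

    /**
     * Returns a predicate that is true when a file name matches the exclusion
     * under any interpretation that could be built (exact name, glob, regex).
     */
    static Predicate<String> forExclusion(String exclusion) {
        List<Predicate<String>> filters = new ArrayList<>();

        // Exact file name match never fails to build.
        filters.add(name -> name.equals(exclusion));

        // Glob interpretation (JDK matcher, assumed stand-in for a wildcard filter).
        try {
            PathMatcher glob =
                    FileSystems.getDefault().getPathMatcher("glob:" + exclusion);
            filters.add(name -> glob.matches(Paths.get(name)));
        } catch (IllegalArgumentException e) {
            System.err.println("Skipping glob form of '" + exclusion + "': " + e.getMessage());
        }

        // Regex interpretation: a syntax error skips only this filter.
        try {
            Pattern regex = Pattern.compile(exclusion);
            filters.add(name -> regex.matcher(name).matches());
        } catch (PatternSyntaxException e) {
            System.err.println("Skipping regex form of '" + exclusion + "': " + e.getMessage());
        }

        return name -> filters.stream().anyMatch(f -> f.test(name));
    }
}
```

With this structure, `ExclusionSketch.forExclusion("*.txt")` still reports the 
regex problem on `stderr`, but the resulting predicate excludes `bad.txt` 
through the glob form.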


> CLI: Certain wildcard file filters do not work anymore
> ------------------------------------------------------
>
>                 Key: RAT-265
>                 URL: https://issues.apache.org/jira/browse/RAT-265
>             Project: Apache Rat
>          Issue Type: Sub-task
>          Components: Client - cli
>    Affects Versions: 0.13, 0.14
>            Reporter: Raphael von der Grün
>            Assignee: Claude Warren
>            Priority: Major
>             Fix For: 0.17
>
>
> Run the following command in the root of the `rat` repo:
> {noformat}
> java -jar apache-rat-0.14-20191120.132901-66.jar -e "*.txt" -d 
> apache-rat-core/src/test/resources/violations{noformat}
> This will give the following output on `stderr`: 
> {noformat}
> Will skip given exclusion '*.txt' due to 
> java.util.regex.PatternSyntaxException: Dangling meta character '*' near 
> index 0
> *.txt
> ^
> {noformat}
> Furthermore, `bad.txt` will NOT be excluded from the license check.
> The error that causes this is thrown in [line 132 of 
> `org.apache.rat.Report.java`|#L132]]. The reason is simple: any glob pattern 
> that starts with `*` or `?` is not a valid regex. When Line 132 throws, the 
> next two lines will also be skipped, so the pattern will not be added at all.
> Unfortunately, a solution to this problem is not so simple. In `v0.12` the 
> `-e` option always added wildcard filters while `-E` always added regex 
> filters. The documentation still states the same in the latest `v0.14` 
> snapshot. Beginning with `v0.13` the code tries to add any exclude rule as 
> three different filters. I believe this approach is inherently flawed.
> Firstly, the `new NameFileFilter(exclusion)` is redundant if we also add `new 
> WildcardFileFilter(exclusion)`. The files matched by the `NameFileFilter` are 
> a subset of those matched by the `WildcardFileFilter` since any magic 
> character (i.e. `?` or `*`) in `exclusion` also matches itself when used in a 
> `WildcardFileFilter`.
> So let's assume we only register the `WildcardFileFilter` and the 
> `RegexFileFilter`. Even if we properly add patterns as wildcard filters that 
> are not a valid RegEx, there are still patterns where we cannot decide what 
> the user's intention was. Consider the pattern `bi.ini`. Should it be 
> interpreted as a wildcard pattern and match only itself or should it be 
> interpreted as a regex and also match `bikini` for example?
> My recommendation for a quick patch solution would be to go back to the 
> exclusion behavior of `v0.12`.
> Beyond that, the nicest solution IMHO would be support for ignore files with 
> the same semantics as `.gitignore` (via `-E`) and support for giving extended 
> shell globs via `-e`.
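
The `bi.ini` ambiguity described in the report above can be reproduced with 
the JDK alone. This is a minimal sketch: the class `AmbiguitySketch` and its 
method names are hypothetical, and the JDK glob matcher is only assumed to be 
a reasonable stand-in for Rat's wildcard filter.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;

/** Hypothetical demo class; shows one pattern under two interpretations. */
final class AmbiguitySketch {

    /** Treats the pattern as a glob, where '.' is a literal character. */
    static boolean matchesAsGlob(String pattern, String fileName) {
        PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(fileName));
    }

    /** Treats the pattern as a regex, where '.' matches any character. */
    static boolean matchesAsRegex(String pattern, String fileName) {
        return Pattern.matches(pattern, fileName);
    }

    public static void main(String[] args) {
        System.out.println(matchesAsGlob("bi.ini", "bi.ini"));   // true
        System.out.println(matchesAsGlob("bi.ini", "bikini"));   // false
        System.out.println(matchesAsRegex("bi.ini", "bi.ini"));  // true
        System.out.println(matchesAsRegex("bi.ini", "bikini"));  // true
    }
}
```

Because both interpretations are valid for this input, the code cannot infer 
which one the user meant, which is the argument for keeping separate options 
for wildcard and regex exclusions.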



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
