[ 
https://issues.apache.org/jira/browse/HADOOP-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kengo Seki updated HADOOP-12301:
--------------------------------
    Description: 
1. rubocop.sh counts only lines which have at least five fields separated by a 
colon:

{code}
  calcdiffs "${PATCH_DIR}/branch-rubocop-result.txt" 
"${PATCH_DIR}/patch-rubocop-result.txt" > "${PATCH_DIR}/diff-patch-rubocop.txt"
  diffPostpatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
"${PATCH_DIR}/diff-patch-rubocop.txt")

  if [[ ${diffPostpatch} -gt 0 ]] ; then
    # shellcheck disable=SC2016
    numPrepatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
"${PATCH_DIR}/branch-rubocop-result.txt")

    # shellcheck disable=SC2016
    numPostpatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
"${PATCH_DIR}/patch-rubocop-result.txt")
{code}

This is because the diff result can contain multiple lines for one issue. For 
example:

{code}
[sekikn@mobile hadoop]$ cat 
/private/tmp/test-patch-hbase/25821/diff-patch-rubocop.txt
bin/draining_servers.rb:165:1: C: Do not introduce global variables.
$foo
^^^^
{code}

Checking the number of fields is intended to skip the second and third lines in 
the above diff file. But four or more colons can appear in the source code 
itself, for example:

{code}
| Vote |       Subsystem |  Runtime   | Comment
============================================================================
|  -1  |        rubocop  |  0m 02s    | The applied patch generated 4 new 
|      |                 |            | rubocop issues (total was 77, now 81).


|| Subsystem || Report/Notes ||
============================================================================
| rubocop | v0.32.1 |
| rubocop | /private/tmp/test-patch-hbase/5632/diff-patch-rubocop.txt |


[sekikn@mobile hadoop]$ cat 
/private/tmp/test-patch-hbase/5632/diff-patch-rubocop.txt
bin/draining_servers.rb:165:4: C: Do not use :: for method calls.
foo::bar::baz
   ^^
bin/draining_servers.rb:165:9: C: Do not use :: for method calls.
foo::bar::baz
        ^^
[sekikn@mobile hadoop]$ 
{code}

In this case, new rubocop issues should be 2, but counted as 4 incorrectly. 
More reliable way to count is needed.

2. pylint.sh has the same problem. In addition, I removed awk's '-F:' option by 
mistake on HADOOP-12286. It can report a wrong number for now.


  was:
Some test-patch plugins have problems about counting the diff lines.

1. ruby-lint.sh counts only lines which have five or more fields:

{code}
  diffPostpatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
"${PATCH_DIR}/diff-patch-ruby-lint.txt")

  if [[ ${diffPostpatch} -gt 0 ]] ; then
    # shellcheck disable=SC2016
    numPrepatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
"${PATCH_DIR}/branch-ruby-lint-result.txt")

    # shellcheck disable=SC2016
    numPostpatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
"${PATCH_DIR}/patch-ruby-lint-result.txt")
{code}

This is because the diff result can contain multiple lines for one issue. For 
example:

{code}
[sekikn@mobile hadoop]$ cat /tmp/test-patch-hbase/1149/diff-patch-rubocop.txt 
bin/draining_servers.rb:165:1: C: Do not introduce global variables.
$FOO
^^^^
bin/draining_servers.rb:166:1: C: Do not introduce global variables.
$BAR
^^^^
{code}

Checking the number of fields is intended to skip the second, third, fifth and 
sixth lines above. But four or more colons can appear in the source code 
itself, so more reliable way is needed.

2. pylint.sh has the same problem. In addition, I removed awk's '-F:' option by 
mistake on HADOOP-12286. It must report a wrong number for now.

3. rubocop.sh should also follow ruby-lint.sh. Or, change its formatter to 
"emacs" or "simple" style (currently "clang"). They are parser-friendly because 
they outputs one line by one issue, so we can count issues by 'wc -l' like 
perlcritic.sh. https://github.com/bbatsov/rubocop#emacs-style-formatter


> Fix some test-patch plugins to count the diff lines correctly
> -------------------------------------------------------------
>
>                 Key: HADOOP-12301
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12301
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: yetus
>    Affects Versions: HADOOP-12111
>            Reporter: Kengo Seki
>
> 1. rubocop.sh counts only lines which have at least five fields separated by 
> a colon:
> {code}
>   calcdiffs "${PATCH_DIR}/branch-rubocop-result.txt" 
> "${PATCH_DIR}/patch-rubocop-result.txt" > 
> "${PATCH_DIR}/diff-patch-rubocop.txt"
>   diffPostpatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
> "${PATCH_DIR}/diff-patch-rubocop.txt")
>   if [[ ${diffPostpatch} -gt 0 ]] ; then
>     # shellcheck disable=SC2016
>     numPrepatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
> "${PATCH_DIR}/branch-rubocop-result.txt")
>     # shellcheck disable=SC2016
>     numPostpatch=$(${AWK} -F: 'BEGIN {sum=0} 4<NF {sum+=1} END {print sum}' 
> "${PATCH_DIR}/patch-rubocop-result.txt")
> {code}
> This is because the diff result can contain multiple lines for one issue. For 
> example:
> {code}
> [sekikn@mobile hadoop]$ cat 
> /private/tmp/test-patch-hbase/25821/diff-patch-rubocop.txt
> bin/draining_servers.rb:165:1: C: Do not introduce global variables.
> $foo
> ^^^^
> {code}
> Checking the number of fields is intended to skip the second and third lines 
> in the above diff file. But four or more colons can appear in the source code 
> itself, for example:
> {code}
> | Vote |       Subsystem |  Runtime   | Comment
> ============================================================================
> |  -1  |        rubocop  |  0m 02s    | The applied patch generated 4 new 
> |      |                 |            | rubocop issues (total was 77, now 81).
> || Subsystem || Report/Notes ||
> ============================================================================
> | rubocop | v0.32.1 |
> | rubocop | /private/tmp/test-patch-hbase/5632/diff-patch-rubocop.txt |
> [sekikn@mobile hadoop]$ cat 
> /private/tmp/test-patch-hbase/5632/diff-patch-rubocop.txt
> bin/draining_servers.rb:165:4: C: Do not use :: for method calls.
> foo::bar::baz
>    ^^
> bin/draining_servers.rb:165:9: C: Do not use :: for method calls.
> foo::bar::baz
>         ^^
> [sekikn@mobile hadoop]$ 
> {code}
> In this case, new rubocop issues should be 2, but counted as 4 incorrectly. 
> More reliable way to count is needed.
> 2. pylint.sh has the same problem. In addition, I removed awk's '-F:' option 
> by mistake on HADOOP-12286. It can report a wrong number for now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to