nyoungstudios commented on PR #35564:
URL: https://github.com/apache/beam/pull/35564#issuecomment-3082185850

   > You've impacted 1,775 files here, I'm assuming by applying the whitespace 
linting to the entire code base (although I have no clue how you managed to 
also impact things like SVGs for the website.) I would suggest narrowing the 
scope _significantly_, as A) I'd prefer to not impact `git blame` on all of 
these files and B) actually reviewing this PR is a nightmare to ensure that all 
of the changes are legitimately scoped to pure whitespace changes.
   
   Thanks Jack for your feedback and perspective here (also enjoyed your 
session when you spoke at Beam Summit too). 
   
   For SVGs, they are ultimately text files too, not binary files. So, applying 
the whitespace rules to them seems fair in my opinion. Fwiw, there were many 
SVGs that did not require any changes which means that they already follow 
these new linting rules. So, in my opinion, enforcing consistency here is a 
good thing.
   
   If you prefer my changes to not effect the existing git blame, I could 
create a separate pr after this one to add the git hash for this merged change 
to an ignore file. You can see an example from another Apache project, Apache 
Airflow, 
[here](https://github.com/apache/airflow/blob/main/.git-blame-ignore-revs).
   
   Finally, I do not think any human should try to manually review the 
whitespace changes in this pr. To make it easier to, I have pushed 4 separate 
commits that should allow you to review the non-whitespace changes separately. 
Let's go through each of them.
   
   
https://github.com/apache/beam/pull/35564/commits/2f6d47d48ab64b3a43e170d32dc3953000223614
 (`adds editorconfig and pre-commit config`)
   - First, this commit adds the `.editorconfig` file which makes it easier for 
your editor to pick up the correct indentation for the file extension. Mainly, 
most editors and projects use 4 spaces for Python indents where this project 
uses 2. Let me know if I missed any extensions though. So, this allows the new 
contributor to start contributing with less hassle. This `.editorconfig` also 
removes trailing whitespaces at the end of life and makes sure you have a new 
line at the end of the file when you save it.
   - Secondly, this commit also adds the changes in the 
`.pre-commit-config.yaml` file to enforce the whitespace linting.
   
   
https://github.com/apache/beam/pull/35564/commits/be0f63539bdb4d6c9e76e624e25314dba160a5a6
 (`comments out these pre-commit hooks temporarily`)
   - This commit comments out the yapf and pylint pre-commit hooks in the 
`.pre-commit-config.yaml` temporarily. No need to run these pre-commit hooks if 
we didn't modify any Python code. And mainly to save some runtime when running 
the next step.
   
   
https://github.com/apache/beam/pull/35564/commits/0029f5f18b933959ca3223cfb8641f2a7430b322
 (`applies pre-commit changes`)
   - This commit is the result of running `pre-commit run --all-files` on top 
of the previous commit.
   - To verify this is all I did, you can checkout my branch to the previous 
commit 
(https://github.com/apache/beam/pull/35564/commits/be0f63539bdb4d6c9e76e624e25314dba160a5a6)
 and run `pre-commit run --all-files` after installing it if you don't already 
have it setup locally. Then, you can do a git diff to see that the result is 
the same.
   
   
https://github.com/apache/beam/pull/35564/commits/b38afdc3f619df3031b3635f6c80968d13aadf95
 (`Revert "comments out these pre-commit hooks temporarily"`)
   - This commit reverts our commit from earlier when we commented out the yapf 
and pylint pre-commit hooks.
   
   ------------
   
   Okay, after walking through each commit and running the pre-commit command 
locally yourself, I think you can have more confidence in my change. But how do 
we know that the pre-commit hook I am adding does what it says. 
   - Well, you can write a few different files locally and run `pre-commit run 
--files /path/to/test/file` to see how it changed. 
   - You can check this pre-commit hook's popularity on GitHub with this 
search: 
https://github.com/search?q=path%3A%2F%5E%5C.pre-commit-config%5C.yaml%24%2F+%2Fhttps%3A%5C%2F%5C%2Fgithub%5C.com%5C%2Fpre-commit%5C%2Fpre-commit-hooks%2F&type=code.
 It is used in 96.3k other places. I think that is pretty trustworthy.
   - Add it is even used my our Apache Airflow friends here: 
https://github.com/apache/airflow/blob/d8e6f4bb7560aa01443d338aa77317bf27ac90bd/.pre-commit-config.yaml#L254-L304
   
   ------------
   
   Finally, in order for this to be beneficial, I think we need we need to be 
running the `.pre-commit-config.yaml` in the GitHub Actions. I am not sure 
where this is happening, but since I am new to this repo, I may just be missing 
where it is being called currently. But if not, I'll be happy to add a GitHub 
action that runs the pre-commit config (I've done this before, 
[here](https://github.com/nyoungstudios/alfa/blob/49f50f6776ce203603f0209756aabffb680e06f0/.github/workflows/code_style.yml)
 is an example in another public repo that I made a GitHub action workflow to 
run the pre-commit config). 
   
   Separately, I did this script `sdks/python/scripts/run_whitespacelint.sh` 
which runs https://github.com/bendikro/whitespacelint Python library to check 
whitespaces if I am understanding how the CI process works. I do think this 
pre-commit hook I am adding is more robust and more popular than the existing 
tool. But any guidance on the existing process would be appreciated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to