nyoungstudios commented on PR #35564: URL: https://github.com/apache/beam/pull/35564#issuecomment-3082185850
> You've impacted 1,775 files here, I'm assuming by applying the whitespace linting to the entire code base (although I have no clue how you managed to also impact things like SVGs for the website.) I would suggest narrowing the scope _significantly_, as A) I'd prefer to not impact `git blame` on all of these files and B) actually reviewing this PR is a nightmare to ensure that all of the changes are legitimately scoped to pure whitespace changes. Thanks Jack for your feedback and perspective here (also enjoyed your session when you spoke at Beam Summit too). For SVGs, they are ultimately text files too, not binary files. So, applying the whitespace rules to them seems fair in my opinion. Fwiw, there were many SVGs that did not require any changes which means that they already follow these new linting rules. So, in my opinion, enforcing consistency here is a good thing. If you prefer my changes to not effect the existing git blame, I could create a separate pr after this one to add the git hash for this merged change to an ignore file. You can see an example from another Apache project, Apache Airflow, [here](https://github.com/apache/airflow/blob/main/.git-blame-ignore-revs). Finally, I do not think any human should try to manually review the whitespace changes in this pr. To make it easier to, I have pushed 4 separate commits that should allow you to review the non-whitespace changes separately. Let's go through each of them. https://github.com/apache/beam/pull/35564/commits/2f6d47d48ab64b3a43e170d32dc3953000223614 (`adds editorconfig and pre-commit config`) - First, this commit adds the `.editorconfig` file which makes it easier for your editor to pick up the correct indentation for the file extension. Mainly, most editors and projects use 4 spaces for Python indents where this project uses 2. Let me know if I missed any extensions though. So, this allows the new contributor to start contributing with less hassle. This `.editorconfig` also removes trailing whitespaces at the end of life and makes sure you have a new line at the end of the file when you save it. - Secondly, this commit also adds the changes in the `.pre-commit-config.yaml` file to enforce the whitespace linting. https://github.com/apache/beam/pull/35564/commits/be0f63539bdb4d6c9e76e624e25314dba160a5a6 (`comments out these pre-commit hooks temporarily`) - This commit comments out the yapf and pylint pre-commit hooks in the `.pre-commit-config.yaml` temporarily. No need to run these pre-commit hooks if we didn't modify any Python code. And mainly to save some runtime when running the next step. https://github.com/apache/beam/pull/35564/commits/0029f5f18b933959ca3223cfb8641f2a7430b322 (`applies pre-commit changes`) - This commit is the result of running `pre-commit run --all-files` on top of the previous commit. - To verify this is all I did, you can checkout my branch to the previous commit (https://github.com/apache/beam/pull/35564/commits/be0f63539bdb4d6c9e76e624e25314dba160a5a6) and run `pre-commit run --all-files` after installing it if you don't already have it setup locally. Then, you can do a git diff to see that the result is the same. https://github.com/apache/beam/pull/35564/commits/b38afdc3f619df3031b3635f6c80968d13aadf95 (`Revert "comments out these pre-commit hooks temporarily"`) - This commit reverts our commit from earlier when we commented out the yapf and pylint pre-commit hooks. ------------ Okay, after walking through each commit and running the pre-commit command locally yourself, I think you can have more confidence in my change. But how do we know that the pre-commit hook I am adding does what it says. - Well, you can write a few different files locally and run `pre-commit run --files /path/to/test/file` to see how it changed. - You can check this pre-commit hook's popularity on GitHub with this search: https://github.com/search?q=path%3A%2F%5E%5C.pre-commit-config%5C.yaml%24%2F+%2Fhttps%3A%5C%2F%5C%2Fgithub%5C.com%5C%2Fpre-commit%5C%2Fpre-commit-hooks%2F&type=code. It is used in 96.3k other places. I think that is pretty trustworthy. - Add it is even used my our Apache Airflow friends here: https://github.com/apache/airflow/blob/d8e6f4bb7560aa01443d338aa77317bf27ac90bd/.pre-commit-config.yaml#L254-L304 ------------ Finally, in order for this to be beneficial, I think we need we need to be running the `.pre-commit-config.yaml` in the GitHub Actions. I am not sure where this is happening, but since I am new to this repo, I may just be missing where it is being called currently. But if not, I'll be happy to add a GitHub action that runs the pre-commit config (I've done this before, [here](https://github.com/nyoungstudios/alfa/blob/49f50f6776ce203603f0209756aabffb680e06f0/.github/workflows/code_style.yml) is an example in another public repo that I made a GitHub action workflow to run the pre-commit config). Separately, I did this script `sdks/python/scripts/run_whitespacelint.sh` which runs https://github.com/bendikro/whitespacelint Python library to check whitespaces if I am understanding how the CI process works. I do think this pre-commit hook I am adding is more robust and more popular than the existing tool. But any guidance on the existing process would be appreciated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org