Thanks. I commented on the PR. I think if we're going this route we should add a pre-commit, plus instructions on how to run the tool (similar to spotless).
On Mon, Jan 27, 2020 at 10:00 AM Udi Meiri <eh...@google.com> wrote: > > I've done a pass on the PR on code I'm familiar with. > Please make a pass and add your suggestions on the PR. > > On Fri, Jan 24, 2020 at 7:15 AM Ismaël Mejía <ieme...@gmail.com> wrote: >> >> Java build fails on any unformatted code so python probably should be like >> that. >> We have to ensure however that it fails early on that. >> As Robert said time to debate the knobs :) >> >> On Fri, Jan 24, 2020 at 3:19 PM Kamil Wasilewski >> <kamil.wasilew...@polidea.com> wrote: >>> >>> PR is ready: https://github.com/apache/beam/pull/10684. Please share your >>> comments ;-) I've managed to reduce the impact a bit: >>> 501 files changed, 18245 insertions(+), 19495 deletions(-) >>> >>> We still need to consider how to enforce the usage of autoformatter. >>> Pre-commit sounds like a nice addition, but it still needs to be installed >>> manually by a developer. On the other hand, Jenkins precommit job that >>> fails if any unformatted code is detected looks like too strict. What do >>> you think? >>> >>> On Thu, Jan 23, 2020 at 8:37 PM Robert Bradshaw <rober...@google.com> wrote: >>>> >>>> Thanks! Now we get to debate what knobs to twiddle :-P >>>> >>>> FYI, I did a simple run (just pushed to >>>> https://github.com/apache/beam/compare/master...robertwb:yapf) to see >>>> the impact. The diff is >>>> >>>> $ git diff --stat master >>>> ... >>>> 547 files changed, 22118 insertions(+), 21129 deletions(-) >>>> >>>> For reference >>>> >>>> $ find sdks/python/apache_beam -name '*.py' | xargs wc >>>> ... >>>> 200424 612002 7431637 total >>>> >>>> which means a little over 10% of lines get touched. I think there are >>>> some options, such as SPLIT_ALL_TOP_LEVEL_COMMA_SEPARATED_VALUES and >>>> COALESCE_BRACKETS, that will conform more to the style we are already >>>> (mostly) following. >>>> >>>> >>>> On Thu, Jan 23, 2020 at 1:59 AM Kamil Wasilewski >>>> <kamil.wasilew...@polidea.com> wrote: >>>> > >>>> > Thank you Michał for creating the ticket. I have some free time and I'd >>>> > like to volunteer myself for this task. >>>> > Indeed, it looks like there's consensus for `yapf`, so I'll try `yapf` >>>> > first. >>>> > >>>> > Best, >>>> > Kamil >>>> > >>>> > >>>> > On Thu, Jan 23, 2020 at 10:37 AM Michał Walenia >>>> > <michal.wale...@polidea.com> wrote: >>>> >> >>>> >> Hi all, >>>> >> I created a JIRA issue for this and summarized the available tools >>>> >> >>>> >> https://issues.apache.org/jira/browse/BEAM-9175 >>>> >> >>>> >> Cheers, >>>> >> Michal >>>> >> >>>> >> On Thu, Jan 23, 2020 at 1:49 AM Udi Meiri <eh...@google.com> wrote: >>>> >>> >>>> >>> Sorry, backing off on this due to time constraints. >>>> >>> >>>> >>> On Wed, Jan 22, 2020 at 3:39 PM Udi Meiri <eh...@google.com> wrote: >>>> >>>> >>>> >>>> It sounds like there's a consensus for yapf. I volunteer to take this >>>> >>>> on >>>> >>>> >>>> >>>> On Wed, Jan 22, 2020, 10:31 Udi Meiri <eh...@google.com> wrote: >>>> >>>>> >>>> >>>>> +1 to autoformatting >>>> >>>>> >>>> >>>>> On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik <lc...@google.com> wrote: >>>> >>>>>> >>>> >>>>>> +1 to autoformatters. Also the Beam Java SDK went through a one >>>> >>>>>> time pass to apply the spotless formatting. >>>> >>>>>> >>>> >>>>>> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay <al...@google.com> >>>> >>>>>> wrote: >>>> >>>>>>> >>>> >>>>>>> +1 to autoformatters and yapf. It appears to be a well maintained >>>> >>>>>>> project. I do support making a one time pass to apply formatting >>>> >>>>>>> the whole code base. >>>> >>>>>>> >>>> >>>>>>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova <chad...@gmail.com> >>>> >>>>>>> wrote: >>>> >>>>>>>>> >>>> >>>>>>>>> >>>> >>>>>>>>> It'd be good if there was a way to only apply to violating (or at >>>> >>>>>>>>> least changed) lines. >>>> >>>>>>>> >>>> >>>>>>>> >>>> >>>>>>>> I assumed the first thing we’d do is convert all of the code in >>>> >>>>>>>> one go, since it’s a very safe operation. Did you have something >>>> >>>>>>>> else in mind? >>>> >>>>>>>> >>>> >>>>>>>> -chad >>>> >>>>>>>> >>>> >>>>>>>> >>>> >>>>>>>> >>>> >>>>>>>> >>>> >>>>>>>>> >>>> >>>>>>>>> >>>> >>>>>>>>> On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova >>>> >>>>>>>>> <chad...@gmail.com> wrote: >>>> >>>>>>>>> > >>>> >>>>>>>>> > +1 to autoformatting >>>> >>>>>>>>> > >>>> >>>>>>>>> > Let me add some nuance to that. >>>> >>>>>>>>> > >>>> >>>>>>>>> > The way I see it there are 2 varieties of formatters: those >>>> >>>>>>>>> > which take the original formatting into consideration >>>> >>>>>>>>> > (autopep8) and those which disregard it (yapf, black). >>>> >>>>>>>>> > >>>> >>>>>>>>> > I much prefer yapf to black, because you have plenty of >>>> >>>>>>>>> > options to tweak with yapf (enough to make the output a pretty >>>> >>>>>>>>> > close match to the current Beam style), and you can mark areas >>>> >>>>>>>>> > to preserve the original formatting, which could be very >>>> >>>>>>>>> > useful with Pipeline building with pipe operators. Please >>>> >>>>>>>>> > don't pick black. >>>> >>>>>>>>> > >>>> >>>>>>>>> > autopep8 is more along the lines of spotless in Java -- it >>>> >>>>>>>>> > only corrects code that breaks the project's style rules. The >>>> >>>>>>>>> > big problem with Beam's current style is that it is so >>>> >>>>>>>>> > esoteric that autopep8 can't enforce it -- and I'm not just >>>> >>>>>>>>> > talking about 2-spaces, which I don't really have a problem >>>> >>>>>>>>> > with -- the problem is the use of either 2 or 4 spaces >>>> >>>>>>>>> > depending on context (expression start vs hanging indent, >>>> >>>>>>>>> > etc). This is my *biggest* gripe about the current style. >>>> >>>>>>>>> > PyCharm doesn't have enough control either. So, if we can >>>> >>>>>>>>> > choose a style that can be expressed by flake8 or pycodestyle >>>> >>>>>>>>> > then we can use autopep8 to enforce it. >>>> >>>>>>>>> > >>>> >>>>>>>>> > I'd prefer autopep8 to yapf because I like having a little >>>> >>>>>>>>> > wiggle room to influence the style, but on a big project like >>>> >>>>>>>>> > Beam all that wiggle room ends up to minor but noticeable >>>> >>>>>>>>> > inconsistencies in style throughout the project. yapf ensures >>>> >>>>>>>>> > completely consistent style, but the tradeoff is that it's >>>> >>>>>>>>> > sometimes ugly, especially in scenarios with similar repeated >>>> >>>>>>>>> > entries like argparse, where yapf might insert line breaks in >>>> >>>>>>>>> > visually inconsistent and unappealing ways depending on the >>>> >>>>>>>>> > lengths of the keywords and expressions involved. >>>> >>>>>>>>> > >>>> >>>>>>>>> > Either way (but especially if we choose yapf) I think it'd be >>>> >>>>>>>>> > a nice addition to setup a pre-commit [1] config so that >>>> >>>>>>>>> > people can opt in to running *lightweight* autofixers prior to >>>> >>>>>>>>> > commit. This will not only reduce dev frustration but will >>>> >>>>>>>>> > also reduce the amount of cpu cycles that Jenkins spends >>>> >>>>>>>>> > pointing out lint errors. >>>> >>>>>>>>> > >>>> >>>>>>>>> > [1] https://pre-commit.com/ >>>> >>>>>>>>> > >>>> >>>>>>>>> > -chad >>>> >>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> > On Tue, Jan 21, 2020 at 12:52 PM Ismaël Mejía >>>> >>>>>>>>> > <ieme...@gmail.com> wrote: >>>> >>>>>>>>> >> >>>> >>>>>>>>> >> Last time we discussed this there seems not to be much >>>> >>>>>>>>> >> progress into autoformatting. >>>> >>>>>>>>> >> This tool looks more tweakable, so maybe it could be more >>>> >>>>>>>>> >> appropriate for Beam's use case. >>>> >>>>>>>>> >> https://github.com/google/yapf/ >>>> >>>>>>>>> >> WDYT? >>>> >>>>>>>>> >> >>>> >>>>>>>>> >> >>>> >>>>>>>>> >> On Thu, May 30, 2019 at 10:50 AM Łukasz Gajowy >>>> >>>>>>>>> >> <lgaj...@apache.org> wrote: >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> +1 for any autoformatter for Python SDK that does the job. >>>> >>>>>>>>> >>> My experience is that since spotless in Java SDK I would >>>> >>>>>>>>> >>> never start a new Java project without it. So many great >>>> >>>>>>>>> >>> benefits not only for one person coding but for all >>>> >>>>>>>>> >>> community. >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> It is a GitHub UI issue that you cannot easily browse past >>>> >>>>>>>>> >>> the reformat. It is not actually that hard, but does take a >>>> >>>>>>>>> >>> couple extra clicks to get GitHub to display blame before a >>>> >>>>>>>>> >>> reformat. It is easier with the command line. I do a lot of >>>> >>>>>>>>> >>> code history digging and the global Java reformat is not >>>> >>>>>>>>> >>> really a problem. >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> It's actually one more click on Github but I agree it's not >>>> >>>>>>>>> >>> the best way to search the history. The most convenient and >>>> >>>>>>>>> >>> clear one I've found so far is in Jetbrains IDEs (Intelij) >>>> >>>>>>>>> >>> where you can: >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> right click on line number -> "annotate" -> click again -> >>>> >>>>>>>>> >>> "annotate previous revision" -> ... >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> You can also use "compare with" to see the diff between two >>>> >>>>>>>>> >>> revisions. >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> Łukasz >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> >>>> >>>>>>>>> >>> czw., 30 maj 2019 o 06:15 Kenneth Knowles <k...@apache.org> >>>> >>>>>>>>> >>> napisał(a): >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> +1 pending good enough tooling (I can't quite tell - seems >>>> >>>>>>>>> >>>> there are some issues?) >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> On Wed, May 29, 2019 at 2:40 PM Katarzyna Kucharczyk >>>> >>>>>>>>> >>>> <ka.kucharc...@gmail.com> wrote: >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> What else actually we gain? My guess is faster PR review >>>> >>>>>>>>> >>>>> iteration. We will skip some of conversations about code >>>> >>>>>>>>> >>>>> style. >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> ... >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> Last but not least, new contributor may be less >>>> >>>>>>>>> >>>>> discouraged. When I started contribute I didn’t know how >>>> >>>>>>>>> >>>>> to format my code and I lost a lot of time to add pylint >>>> >>>>>>>>> >>>>> and adjust IntelliJ. I eventually failed. Currently I >>>> >>>>>>>>> >>>>> write code intuitively and when I don’t forget I rerun tox. >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> This is a huge benefit. This is why I supported it so much >>>> >>>>>>>>> >>>> for Java. It is a community benefit. You do not have to be >>>> >>>>>>>>> >>>> a contributor to the Python SDK to support this. That is >>>> >>>>>>>>> >>>> why I am writing here. Just eliminate all discussion of >>>> >>>>>>>>> >>>> formatting. It doesn't really matter what the resulting >>>> >>>>>>>>> >>>> format is, if it is not crazy to read. I strongly oppose >>>> >>>>>>>>> >>>> maintaining a non-default format. >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> Reformating 20k lines or 200k is not hard. The Java global >>>> >>>>>>>>> >>>> reformat touched 50k lines. It does not really matter how >>>> >>>>>>>>> >>>> big it is. Definitely do it all at once if you think the >>>> >>>>>>>>> >>>> tool is good enough. And you should pin a version, so churn >>>> >>>>>>>>> >>>> is not a problem. You can upgrade the version and reformat >>>> >>>>>>>>> >>>> in a PR later and that is also easy. >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> It is a GitHub UI issue that you cannot easily browse past >>>> >>>>>>>>> >>>> the reformat. It is not actually that hard, but does take a >>>> >>>>>>>>> >>>> couple extra clicks to get GitHub to display blame before a >>>> >>>>>>>>> >>>> reformat. It is easier with the command line. I do a lot of >>>> >>>>>>>>> >>>> code history digging and the global Java reformat is not >>>> >>>>>>>>> >>>> really a problem. >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> Kenn >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> Also everything will be formatted in a same way, so >>>> >>>>>>>>> >>>>> eventually it would be easier to read. >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> Moreover, as it was mentioned in previous emails - a lot >>>> >>>>>>>>> >>>>> of Jenkins failures won’t take place, so we save time and >>>> >>>>>>>>> >>>>> resources. >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> One of disadvantages is that our pipelines has custom >>>> >>>>>>>>> >>>>> syntax and after formatting they looks a little bit weird, >>>> >>>>>>>>> >>>>> but maybe extending the only configurable option in Black >>>> >>>>>>>>> >>>>> - lines, from 88 to 110 would be solution. >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> Second one is that Black requires Python 3 to be run. I >>>> >>>>>>>>> >>>>> don’t know how big obstacle it would be. >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> I believe there are two options how it would be possible >>>> >>>>>>>>> >>>>> to introduce Black. First: just do it, it will hurt but >>>> >>>>>>>>> >>>>> then it would be ok (same as a dentist appointment). Of >>>> >>>>>>>>> >>>>> course it may require some work to adjust linters. On the >>>> >>>>>>>>> >>>>> other hand we can do it gradually and start including sdk >>>> >>>>>>>>> >>>>> parts one by one - maybe it will be less painful? >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> As an example I can share one of projects [2] I know that >>>> >>>>>>>>> >>>>> uses Black (they use also other cool checkers and >>>> >>>>>>>>> >>>>> pre-commit [3]). This is how looks their build with all >>>> >>>>>>>>> >>>>> checks [4]. >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> To sum up I believe that if we want improve our coding >>>> >>>>>>>>> >>>>> experience, we should improve our toolset. Black seems be >>>> >>>>>>>>> >>>>> recent and quite popular tool what makes think they won’t >>>> >>>>>>>>> >>>>> stop developing it. >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> [1] >>>> >>>>>>>>> >>>>> https://stackoverflow.com/questions/4112410/git-change-styling-whitespace-without-changing-ownership-blame >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> [2] >>>> >>>>>>>>> >>>>> https://github.com/GoogleCloudPlatform/oozie-to-airflow >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> [3] https://pre-commit.com >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> [4] >>>> >>>>>>>>> >>>>> https://travis-ci.org/GoogleCloudPlatform/oozie-to-airflow/builds/538725689 >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> >>>> >>>>>>>>> >>>>> On Wed, May 29, 2019 at 2:01 PM Robert Bradshaw >>>> >>>>>>>>> >>>>> <rober...@google.com> wrote: >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> Reformatting to 4 spaces seems a non-starter to me, as it >>>> >>>>>>>>> >>>>>> would change nearly every single line in the codebase >>>> >>>>>>>>> >>>>>> (and the loss of all context as well as that particular >>>> >>>>>>>>> >>>>>> line). >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> This is probably why the 2-space fork exists. However, we >>>> >>>>>>>>> >>>>>> don't conform to that either--we use 2 spaces for >>>> >>>>>>>>> >>>>>> indentation, but 4 for continuation indentation. (As for >>>> >>>>>>>>> >>>>>> the history of this, this goes back to Google's internal >>>> >>>>>>>>> >>>>>> style guide, probably motivated by consistency with C++, >>>> >>>>>>>>> >>>>>> Java, ... and the fact that with an indent level of 4 one >>>> >>>>>>>>> >>>>>> ends up wrapping lines quite frequently (it's telling >>>> >>>>>>>>> >>>>>> that black's default line length is 88)). This turns out >>>> >>>>>>>>> >>>>>> to be an easy change to the codebase. >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> Once we move beyond the 2 vs. 4 whitespace thing, I found >>>> >>>>>>>>> >>>>>> that this tool introduces a huge amount of vertical >>>> >>>>>>>>> >>>>>> whitespace (e.g. closing parentheses on their own line), >>>> >>>>>>>>> >>>>>> e.g. >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> def foo( >>>> >>>>>>>>> >>>>>> args >>>> >>>>>>>>> >>>>>> ): >>>> >>>>>>>>> >>>>>> if ( >>>> >>>>>>>>> >>>>>> long expression) >>>> >>>>>>>>> >>>>>> ): >>>> >>>>>>>>> >>>>>> func( >>>> >>>>>>>>> >>>>>> args >>>> >>>>>>>>> >>>>>> ) >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> I wrote a simple post-processor to put closing >>>> >>>>>>>>> >>>>>> parentheses on the same lines, as well as omit the >>>> >>>>>>>>> >>>>>> newline after "if (", and disabling formatting of >>>> >>>>>>>>> >>>>>> strings, which reduce the churn in our codebase to 15k >>>> >>>>>>>>> >>>>>> lines (adding about 4k) out of 200k total. >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> https://github.com/apache/beam/pull/8712/files >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> It's still very opinionated, often in different ways then >>>> >>>>>>>>> >>>>>> me, and doesn't understand the semantics of the code, but >>>> >>>>>>>>> >>>>>> possibly something we could live with given the huge >>>> >>>>>>>>> >>>>>> advantages of an autoformatter. >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> An intermediate point would be to allow, but not require, >>>> >>>>>>>>> >>>>>> autoformatting of changed lines. >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> As for being beta quality, it looks like it's got a >>>> >>>>>>>>> >>>>>> decent number of contributors and in my book being in the >>>> >>>>>>>>> >>>>>> python github project is a strong positive signal. But, >>>> >>>>>>>>> >>>>>> due to the above issues, I think we'd have to maintain a >>>> >>>>>>>>> >>>>>> fork. (The code is pretty lightweight, the 2 vs. 4 space >>>> >>>>>>>>> >>>>>> issue is a 2-line change, and the rest implemented as a >>>> >>>>>>>>> >>>>>> post-processing step (for now, incomplete), so it'd be >>>> >>>>>>>>> >>>>>> easy to stay in sync with upstream.) >>>> >>>>>>>>> >>>>>> >>>> >>>>>>>>> >>>>>> On Wed, May 29, 2019 at 11:03 AM Ismaël Mejía >>>> >>>>>>>>> >>>>>> <ieme...@gmail.com> wrote: >>>> >>>>>>>>> >>>>>> > >>>> >>>>>>>>> >>>>>> > > I think the question is if it can be configured in a >>>> >>>>>>>>> >>>>>> > > way to fit our >>>> >>>>>>>>> >>>>>> > > current linter's style. I don't think it is feasible >>>> >>>>>>>>> >>>>>> > > to reformat the >>>> >>>>>>>>> >>>>>> > > entire Python SDK. >>>> >>>>>>>>> >>>>>> > >>>> >>>>>>>>> >>>>>> > It cannot be configured to do what we actually do >>>> >>>>>>>>> >>>>>> > because Black is >>>> >>>>>>>>> >>>>>> > configurable only to support the standard python >>>> >>>>>>>>> >>>>>> > codestyle guidelines >>>> >>>>>>>>> >>>>>> > (PEP-8) which recommends 4 spaces and is what most >>>> >>>>>>>>> >>>>>> > projects in the >>>> >>>>>>>>> >>>>>> > python world use. >>>> >>>>>>>>> >>>>>> > >>>> >>>>>>>>> >>>>>> > > Reformatted lines don't allow quick access to the Git >>>> >>>>>>>>> >>>>>> > > history. This >>>> >>>>>>>>> >>>>>> > > effect is still visible in the Java SDK. However, I >>>> >>>>>>>>> >>>>>> > > have the feeling >>>> >>>>>>>>> >>>>>> > > that this might be less of a problem with Python >>>> >>>>>>>>> >>>>>> > > because the linter has >>>> >>>>>>>>> >>>>>> > > more rules than Checkstyle had. >>>> >>>>>>>>> >>>>>> > >>>> >>>>>>>>> >>>>>> > Yes that’s the bad side effect but there are always >>>> >>>>>>>>> >>>>>> > tradeoffs we have >>>> >>>>>>>>> >>>>>> > to deal with. >>>> >>>>>>>>> >>>>>> > >>>> >>>>>>>>> >>>>>> > >>>> >>>>>>>>> >>>>>> > >>>> >>>>>>>>> >>>>>> > >>>> >>>>>>>>> >>>>>> > On Wed, May 29, 2019 at 10:52 AM Maximilian Michels >>>> >>>>>>>>> >>>>>> > <m...@apache.org> wrote: >>>> >>>>>>>>> >>>>>> > > >>>> >>>>>>>>> >>>>>> > > I think the question is if it can be configured in a >>>> >>>>>>>>> >>>>>> > > way to fit our >>>> >>>>>>>>> >>>>>> > > current linter's style. I don't think it is feasible >>>> >>>>>>>>> >>>>>> > > to reformat the >>>> >>>>>>>>> >>>>>> > > entire Python SDK. >>>> >>>>>>>>> >>>>>> > > >>>> >>>>>>>>> >>>>>> > > Reformatted lines don't allow quick access to the Git >>>> >>>>>>>>> >>>>>> > > history. This >>>> >>>>>>>>> >>>>>> > > effect is still visible in the Java SDK. However, I >>>> >>>>>>>>> >>>>>> > > have the feeling >>>> >>>>>>>>> >>>>>> > > that this might be less of a problem with Python >>>> >>>>>>>>> >>>>>> > > because the linter has >>>> >>>>>>>>> >>>>>> > > more rules than Checkstyle had. >>>> >>>>>>>>> >>>>>> > > >>>> >>>>>>>>> >>>>>> > > -Max >>>> >>>>>>>>> >>>>>> > > >>>> >>>>>>>>> >>>>>> > > On 29.05.19 10:16, Ismaël Mejía wrote: >>>> >>>>>>>>> >>>>>> > > >> My concerns are: >>>> >>>>>>>>> >>>>>> > > >> - The product is clearly marked as beta with a big >>>> >>>>>>>>> >>>>>> > > >> warning. >>>> >>>>>>>>> >>>>>> > > >> - It looks like mostly a single person project. >>>> >>>>>>>>> >>>>>> > > >> For the same reason I also strongly prefer not >>>> >>>>>>>>> >>>>>> > > >> using a fork for a specific setting. Fork will >>>> >>>>>>>>> >>>>>> > > >> only have less people looking at it. >>>> >>>>>>>>> >>>>>> > > > >>>> >>>>>>>>> >>>>>> > > > I suppose the project is marked as beta because it >>>> >>>>>>>>> >>>>>> > > > is recent, it was >>>> >>>>>>>>> >>>>>> > > > presented in 2018’s pycon, and because some things >>>> >>>>>>>>> >>>>>> > > > can change since >>>> >>>>>>>>> >>>>>> > > > auto-formatters are pretty tricky beasts, I think >>>> >>>>>>>>> >>>>>> > > > beta in that case is >>>> >>>>>>>>> >>>>>> > > > like our own ‘@Experimental’. If you look at the >>>> >>>>>>>>> >>>>>> > > > contribution page [1] >>>> >>>>>>>>> >>>>>> > > > you can notice that it is less and less a single >>>> >>>>>>>>> >>>>>> > > > person project, there >>>> >>>>>>>>> >>>>>> > > > have been 93 independent contributions since the >>>> >>>>>>>>> >>>>>> > > > project became >>>> >>>>>>>>> >>>>>> > > > public, and the fact that it is hosted in the >>>> >>>>>>>>> >>>>>> > > > python organization >>>> >>>>>>>>> >>>>>> > > > github [2] gives some confidence on the project >>>> >>>>>>>>> >>>>>> > > > continuity. >>>> >>>>>>>>> >>>>>> > > > >>>> >>>>>>>>> >>>>>> > > > You are right however about the fact that the main >>>> >>>>>>>>> >>>>>> > > > author seems to be >>>> >>>>>>>>> >>>>>> > > > the ‘benevolent’ dictator, and in the 2-spaces >>>> >>>>>>>>> >>>>>> > > > issue he can seem >>>> >>>>>>>>> >>>>>> > > > arbitrary, but he is just following pep8 style >>>> >>>>>>>>> >>>>>> > > > guide recommendations >>>> >>>>>>>>> >>>>>> > > > [3]. I am curious of why we (Beam) do not follow >>>> >>>>>>>>> >>>>>> > > > the 4 spaces >>>> >>>>>>>>> >>>>>> > > > recommendation of PEP-8 or even Google's own Python >>>> >>>>>>>>> >>>>>> > > > style guide [4], >>>> >>>>>>>>> >>>>>> > > > So, probably it should be to us to reconsider the >>>> >>>>>>>>> >>>>>> > > > current policy to >>>> >>>>>>>>> >>>>>> > > > adapt to the standards (and the tool). >>>> >>>>>>>>> >>>>>> > > > >>>> >>>>>>>>> >>>>>> > > > I did a quick run of black with python 2.7 >>>> >>>>>>>>> >>>>>> > > > compatibility on >>>> >>>>>>>>> >>>>>> > > > sdks/python and got only 4 parsing errors which is >>>> >>>>>>>>> >>>>>> > > > positive given the >>>> >>>>>>>>> >>>>>> > > > size of our code base. >>>> >>>>>>>>> >>>>>> > > > >>>> >>>>>>>>> >>>>>> > > > 415 files reformatted, 45 files left unchanged, 4 >>>> >>>>>>>>> >>>>>> > > > files failed to reformat. >>>> >>>>>>>>> >>>>>> > > > >>>> >>>>>>>>> >>>>>> > > > error: cannot format >>>> >>>>>>>>> >>>>>> > > > /home/ismael/upstream/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py: >>>> >>>>>>>>> >>>>>> > > > Cannot parse: 47:22: _display_progress = print >>>> >>>>>>>>> >>>>>> > > > error: cannot format >>>> >>>>>>>>> >>>>>> > > > /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/log_handler.py: >>>> >>>>>>>>> >>>>>> > > > Cannot parse: 151:18: file=sys.stderr) >>>> >>>>>>>>> >>>>>> > > > error: cannot format >>>> >>>>>>>>> >>>>>> > > > /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py: >>>> >>>>>>>>> >>>>>> > > > Cannot parse: 160:34: print(traceback_string, >>>> >>>>>>>>> >>>>>> > > > file=sys.stderr) >>>> >>>>>>>>> >>>>>> > > > error: cannot format >>>> >>>>>>>>> >>>>>> > > > /home/ismael/upstream/beam/sdks/python/apache_beam/typehints/trivial_inference.py: >>>> >>>>>>>>> >>>>>> > > > Cannot parse: 335:51: print('-->' if pc == >>>> >>>>>>>>> >>>>>> > > > last_pc else ' ', >>>> >>>>>>>>> >>>>>> > > > end=' ') >>>> >>>>>>>>> >>>>>> > > > >>>> >>>>>>>>> >>>>>> > > > I still think this can be positive for the project >>>> >>>>>>>>> >>>>>> > > > but well I am >>>> >>>>>>>>> >>>>>> > > > barely a contributor to the python code base so I >>>> >>>>>>>>> >>>>>> > > > let you the python >>>> >>>>>>>>> >>>>>> > > > maintainers to reconsider this, in any case it >>>> >>>>>>>>> >>>>>> > > > seems like a good >>>> >>>>>>>>> >>>>>> > > > improvement for the project. >>>> >>>>>>>>> >>>>>> > > > >>>> >>>>>>>>> >>>>>> > > > [1] >>>> >>>>>>>>> >>>>>> > > > https://github.com/python/black/graphs/contributors >>>> >>>>>>>>> >>>>>> > > > [2] https://github.com/python >>>> >>>>>>>>> >>>>>> > > > [3] >>>> >>>>>>>>> >>>>>> > > > https://www.python.org/dev/peps/pep-0008/#indentation >>>> >>>>>>>>> >>>>>> > > > [4] >>>> >>>>>>>>> >>>>>> > > > https://github.com/google/styleguide/blob/gh-pages/pyguide.md#34-indentation >>>> >>>>>>>>> >>>>>> > > > >>>> >>>>>>>>> >>>>>> > > > On Tue, May 28, 2019 at 11:15 PM Ahmet Altay >>>> >>>>>>>>> >>>>>> > > > <al...@google.com> wrote: >>>> >>>>>>>>> >>>>>> > > >> >>>> >>>>>>>>> >>>>>> > > >> I am in the same boat with Robert, I am in favor >>>> >>>>>>>>> >>>>>> > > >> of autoformatters but I am not familiar with this >>>> >>>>>>>>> >>>>>> > > >> one. My concerns are: >>>> >>>>>>>>> >>>>>> > > >> - The product is clearly marked as beta with a big >>>> >>>>>>>>> >>>>>> > > >> warning. >>>> >>>>>>>>> >>>>>> > > >> - It looks like mostly a single person project. >>>> >>>>>>>>> >>>>>> > > >> For the same reason I also strongly prefer not >>>> >>>>>>>>> >>>>>> > > >> using a fork for a specific setting. Fork will >>>> >>>>>>>>> >>>>>> > > >> only have less people looking at it. >>>> >>>>>>>>> >>>>>> > > >> >>>> >>>>>>>>> >>>>>> > > >> IMO, this is in an early stage for us. That said >>>> >>>>>>>>> >>>>>> > > >> lint issues are real as pointed in the thread. If >>>> >>>>>>>>> >>>>>> > > >> someone would like to give it a try and see how it >>>> >>>>>>>>> >>>>>> > > >> would look like for us that would be interesting. >>>> >>>>>>>>> >>>>>> > > >> >>>> >>>>>>>>> >>>>>> > > >> On Tue, May 28, 2019 at 4:44 AM Katarzyna >>>> >>>>>>>>> >>>>>> > > >> Kucharczyk <ka.kucharc...@gmail.com> wrote: >>>> >>>>>>>>> >>>>>> > > >>> >>>> >>>>>>>>> >>>>>> > > >>> This sounds really good. A lot of Jenkins jobs >>>> >>>>>>>>> >>>>>> > > >>> failures are caused by lint problems. >>>> >>>>>>>>> >>>>>> > > >>> I think it would be great to have something >>>> >>>>>>>>> >>>>>> > > >>> similar to Spotless in Java SDK (I heard there is >>>> >>>>>>>>> >>>>>> > > >>> problem with configuring Black with IntelliJ). >>>> >>>>>>>>> >>>>>> > > >>> >>>> >>>>>>>>> >>>>>> > > >>> On Mon, May 27, 2019 at 10:52 PM Robert Bradshaw >>>> >>>>>>>>> >>>>>> > > >>> <rober...@google.com> wrote: >>>> >>>>>>>>> >>>>>> > > >>>> >>>> >>>>>>>>> >>>>>> > > >>>> I'm generally in favor of autoformatters, though >>>> >>>>>>>>> >>>>>> > > >>>> I haven't looked at >>>> >>>>>>>>> >>>>>> > > >>>> how well this particular one works. We might >>>> >>>>>>>>> >>>>>> > > >>>> have to go with >>>> >>>>>>>>> >>>>>> > > >>>> https://github.com/desbma/black-2spaces given >>>> >>>>>>>>> >>>>>> > > >>>> https://github.com/python/black/issues/378 . >>>> >>>>>>>>> >>>>>> > > >>>> >>>> >>>>>>>>> >>>>>> > > >>>> On Mon, May 27, 2019 at 10:43 PM Pablo Estrada >>>> >>>>>>>>> >>>>>> > > >>>> <pabl...@google.com> wrote: >>>> >>>>>>>>> >>>>>> > > >>>>> >>>> >>>>>>>>> >>>>>> > > >>>>> This looks pretty good:) I know at least a >>>> >>>>>>>>> >>>>>> > > >>>>> couple people (myself included) who've been >>>> >>>>>>>>> >>>>>> > > >>>>> annoyed by having to take care of lint issues >>>> >>>>>>>>> >>>>>> > > >>>>> that maybe a code formatter could save us. >>>> >>>>>>>>> >>>>>> > > >>>>> Thanks for sharing Ismael. >>>> >>>>>>>>> >>>>>> > > >>>>> -P. >>>> >>>>>>>>> >>>>>> > > >>>>> >>>> >>>>>>>>> >>>>>> > > >>>>> >>>> >>>>>>>>> >>>>>> > > >>>>> On Mon, May 27, 2019, 12:24 PM Ismaël Mejía >>>> >>>>>>>>> >>>>>> > > >>>>> <ieme...@gmail.com> wrote: >>>> >>>>>>>>> >>>>>> > > >>>>>> >>>> >>>>>>>>> >>>>>> > > >>>>>> I stumbled by chance into Black [1] a python >>>> >>>>>>>>> >>>>>> > > >>>>>> code auto formatter that >>>> >>>>>>>>> >>>>>> > > >>>>>> is becoming the 'de-facto' auto-formatter for >>>> >>>>>>>>> >>>>>> > > >>>>>> python, and wanted to >>>> >>>>>>>>> >>>>>> > > >>>>>> bring to the ML Is there interest from the >>>> >>>>>>>>> >>>>>> > > >>>>>> python people to get this >>>> >>>>>>>>> >>>>>> > > >>>>>> into the build? >>>> >>>>>>>>> >>>>>> > > >>>>>> >>>> >>>>>>>>> >>>>>> > > >>>>>> The introduction of spotless for Java has been >>>> >>>>>>>>> >>>>>> > > >>>>>> a good improvement and >>>> >>>>>>>>> >>>>>> > > >>>>>> maybe the python code base may benefit of this >>>> >>>>>>>>> >>>>>> > > >>>>>> too. >>>> >>>>>>>>> >>>>>> > > >>>>>> >>>> >>>>>>>>> >>>>>> > > >>>>>> WDYT? >>>> >>>>>>>>> >>>>>> > > >>>>>> >>>> >>>>>>>>> >>>>>> > > >>>>>> [1] https://github.com/python/black >>>> >> >>>> >> >>>> >> >>>> >> -- >>>> >> >>>> >> Michał Walenia >>>> >> Polidea | Software Engineer >>>> >> >>>> >> M: +48 791 432 002 >>>> >> E: michal.wale...@polidea.com >>>> >> >>>> >> Unique Tech >>>> >> Check out our projects!