Re: [DISCUSS] Autoformat python code with Black

Robert Bradshaw Mon, 27 Jan 2020 11:04:25 -0800

Thanks. I commented on the PR. I think if we're going this route we
should add a pre-commit, plus instructions on how to run the tool
(similar to spotless).


On Mon, Jan 27, 2020 at 10:00 AM Udi Meiri <eh...@google.com> wrote:
>
> I've done a pass on the PR on code I'm familiar with.
> Please make a pass and add your suggestions on the PR.
>
> On Fri, Jan 24, 2020 at 7:15 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>>
>> Java build fails on any unformatted code so python probably should be like 
>> that.
>> We have to ensure however that it fails early on that.
>> As Robert said time to debate the knobs :)
>>
>> On Fri, Jan 24, 2020 at 3:19 PM Kamil Wasilewski 
>> <kamil.wasilew...@polidea.com> wrote:
>>>
>>> PR is ready: https://github.com/apache/beam/pull/10684. Please share your 
>>> comments ;-) I've managed to reduce the impact a bit:
>>> 501 files changed, 18245 insertions(+), 19495 deletions(-)
>>>
>>> We still need to consider how to enforce the usage of autoformatter. 
>>> Pre-commit sounds like a nice addition, but it still needs to be installed 
>>> manually by a developer. On the other hand, Jenkins precommit job that 
>>> fails if any unformatted code is detected looks like too strict. What do 
>>> you think?
>>>
>>> On Thu, Jan 23, 2020 at 8:37 PM Robert Bradshaw <rober...@google.com> wrote:
>>>>
>>>> Thanks! Now we get to debate what knobs to twiddle :-P
>>>>
>>>> FYI, I did a simple run (just pushed to
>>>> https://github.com/apache/beam/compare/master...robertwb:yapf) to see
>>>> the impact. The diff is
>>>>
>>>>     $ git diff --stat master
>>>>     ...
>>>>      547 files changed, 22118 insertions(+), 21129 deletions(-)
>>>>
>>>> For reference
>>>>
>>>>     $ find sdks/python/apache_beam -name '*.py' | xargs wc
>>>>     ...
>>>>     200424  612002 7431637 total
>>>>
>>>> which means a little over 10% of lines get touched. I think there are
>>>> some options, such as SPLIT_ALL_TOP_LEVEL_COMMA_SEPARATED_VALUES and
>>>> COALESCE_BRACKETS, that will conform more to the style we are already
>>>> (mostly) following.
>>>>
>>>>
>>>> On Thu, Jan 23, 2020 at 1:59 AM Kamil Wasilewski
>>>> <kamil.wasilew...@polidea.com> wrote:
>>>> >
>>>> > Thank you Michał for creating the ticket. I have some free time and I'd 
>>>> > like to volunteer myself for this task.
>>>> > Indeed, it looks like there's consensus for `yapf`, so I'll try `yapf` 
>>>> > first.
>>>> >
>>>> > Best,
>>>> > Kamil
>>>> >
>>>> >
>>>> > On Thu, Jan 23, 2020 at 10:37 AM Michał Walenia 
>>>> > <michal.wale...@polidea.com> wrote:
>>>> >>
>>>> >> Hi all,
>>>> >> I created a JIRA issue for this and summarized the available tools
>>>> >>
>>>> >> https://issues.apache.org/jira/browse/BEAM-9175
>>>> >>
>>>> >> Cheers,
>>>> >> Michal
>>>> >>
>>>> >> On Thu, Jan 23, 2020 at 1:49 AM Udi Meiri <eh...@google.com> wrote:
>>>> >>>
>>>> >>> Sorry, backing off on this due to time constraints.
>>>> >>>
>>>> >>> On Wed, Jan 22, 2020 at 3:39 PM Udi Meiri <eh...@google.com> wrote:
>>>> >>>>
>>>> >>>> It sounds like there's a consensus for yapf. I volunteer to take this 
>>>> >>>> on
>>>> >>>>
>>>> >>>> On Wed, Jan 22, 2020, 10:31 Udi Meiri <eh...@google.com> wrote:
>>>> >>>>>
>>>> >>>>> +1 to autoformatting
>>>> >>>>>
>>>> >>>>> On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik <lc...@google.com> wrote:
>>>> >>>>>>
>>>> >>>>>> +1 to autoformatters. Also the Beam Java SDK went through a one 
>>>> >>>>>> time pass to apply the spotless formatting.
>>>> >>>>>>
>>>> >>>>>> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay <al...@google.com> 
>>>> >>>>>> wrote:
>>>> >>>>>>>
>>>> >>>>>>> +1 to autoformatters and yapf. It appears to be a well maintained 
>>>> >>>>>>> project. I do support making a one time pass to apply formatting 
>>>> >>>>>>> the whole code base.
>>>> >>>>>>>
>>>> >>>>>>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova <chad...@gmail.com> 
>>>> >>>>>>> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> It'd be good if there was a way to only apply to violating (or at
>>>> >>>>>>>>> least changed) lines.
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>> I assumed the first thing we’d do is convert all of the code in 
>>>> >>>>>>>> one go, since it’s a very safe operation. Did you have something 
>>>> >>>>>>>> else in mind?
>>>> >>>>>>>>
>>>> >>>>>>>> -chad
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova 
>>>> >>>>>>>>> <chad...@gmail.com> wrote:
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > +1 to autoformatting
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > Let me add some nuance to that.
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > The way I see it there are 2 varieties of formatters:  those 
>>>> >>>>>>>>> > which take the original formatting into consideration 
>>>> >>>>>>>>> > (autopep8) and those which disregard it (yapf, black).
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > I much prefer yapf to black, because you have plenty of 
>>>> >>>>>>>>> > options to tweak with yapf (enough to make the output a pretty 
>>>> >>>>>>>>> > close match to the current Beam style), and you can mark areas 
>>>> >>>>>>>>> > to preserve the original formatting, which could be very 
>>>> >>>>>>>>> > useful with Pipeline building with pipe operators.  Please 
>>>> >>>>>>>>> > don't pick black.
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > autopep8 is more along the lines of spotless in Java -- it 
>>>> >>>>>>>>> > only corrects code that breaks the project's style rules.  The 
>>>> >>>>>>>>> > big problem with Beam's current style is that it is so 
>>>> >>>>>>>>> > esoteric that autopep8 can't enforce it -- and I'm not just 
>>>> >>>>>>>>> > talking about 2-spaces, which I don't really have a problem 
>>>> >>>>>>>>> > with -- the problem is the use of either 2 or 4 spaces 
>>>> >>>>>>>>> > depending on context (expression start vs hanging indent, 
>>>> >>>>>>>>> > etc).  This is my *biggest* gripe about the current style.  
>>>> >>>>>>>>> > PyCharm doesn't have enough control either.  So, if we can 
>>>> >>>>>>>>> > choose a style that can be expressed by flake8 or pycodestyle 
>>>> >>>>>>>>> > then we can use autopep8 to enforce it.
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > I'd prefer autopep8 to yapf because I like having a little 
>>>> >>>>>>>>> > wiggle room to influence the style, but on a big project like 
>>>> >>>>>>>>> > Beam all that wiggle room ends up to minor but noticeable 
>>>> >>>>>>>>> > inconsistencies in style throughout the project.  yapf ensures 
>>>> >>>>>>>>> > completely consistent style, but the tradeoff is that it's 
>>>> >>>>>>>>> > sometimes ugly, especially in scenarios with similar repeated 
>>>> >>>>>>>>> > entries like argparse, where yapf might insert line breaks in 
>>>> >>>>>>>>> > visually inconsistent and unappealing ways depending on the 
>>>> >>>>>>>>> > lengths of the keywords and expressions involved.
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > Either way (but especially if we choose yapf) I think it'd be 
>>>> >>>>>>>>> > a nice addition to setup a pre-commit [1] config so that 
>>>> >>>>>>>>> > people can opt in to running *lightweight* autofixers prior to 
>>>> >>>>>>>>> > commit.  This will not only reduce dev frustration but will 
>>>> >>>>>>>>> > also reduce the amount of cpu cycles that Jenkins spends 
>>>> >>>>>>>>> > pointing out lint errors.
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > [1] https://pre-commit.com/
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > -chad
>>>> >>>>>>>>> >
>>>> >>>>>>>>> >
>>>> >>>>>>>>> >
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > On Tue, Jan 21, 2020 at 12:52 PM Ismaël Mejía 
>>>> >>>>>>>>> > <ieme...@gmail.com> wrote:
>>>> >>>>>>>>> >>
>>>> >>>>>>>>> >> Last time we discussed this there seems not to be much 
>>>> >>>>>>>>> >> progress into autoformatting.
>>>> >>>>>>>>> >> This tool looks more tweakable, so maybe it could be more 
>>>> >>>>>>>>> >> appropriate for Beam's use case.
>>>> >>>>>>>>> >> https://github.com/google/yapf/
>>>> >>>>>>>>> >> WDYT?
>>>> >>>>>>>>> >>
>>>> >>>>>>>>> >>
>>>> >>>>>>>>> >> On Thu, May 30, 2019 at 10:50 AM Łukasz Gajowy 
>>>> >>>>>>>>> >> <lgaj...@apache.org> wrote:
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> +1 for any autoformatter for Python SDK that does the job. 
>>>> >>>>>>>>> >>> My experience is that since spotless in Java SDK I would 
>>>> >>>>>>>>> >>> never start a new Java project without it. So many great 
>>>> >>>>>>>>> >>> benefits not only for one person coding but for all 
>>>> >>>>>>>>> >>> community.
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> It is a GitHub UI issue that you cannot easily browse past 
>>>> >>>>>>>>> >>> the reformat. It is not actually that hard, but does take a 
>>>> >>>>>>>>> >>> couple extra clicks to get GitHub to display blame before a 
>>>> >>>>>>>>> >>> reformat. It is easier with the command line. I do a lot of 
>>>> >>>>>>>>> >>> code history digging and the global Java reformat is not 
>>>> >>>>>>>>> >>> really a problem.
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> It's actually one more click on Github but I agree it's not 
>>>> >>>>>>>>> >>> the best way to search the history. The most convenient and 
>>>> >>>>>>>>> >>> clear one I've found so far is in Jetbrains IDEs (Intelij) 
>>>> >>>>>>>>> >>> where you can:
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> right click on line number -> "annotate" -> click again -> 
>>>> >>>>>>>>> >>> "annotate previous revision" -> ...
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> You can also use "compare with" to see the diff between two 
>>>> >>>>>>>>> >>> revisions.
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> Łukasz
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> czw., 30 maj 2019 o 06:15 Kenneth Knowles <k...@apache.org> 
>>>> >>>>>>>>> >>> napisał(a):
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>> +1 pending good enough tooling (I can't quite tell - seems 
>>>> >>>>>>>>> >>>> there are some issues?)
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>> On Wed, May 29, 2019 at 2:40 PM Katarzyna Kucharczyk 
>>>> >>>>>>>>> >>>> <ka.kucharc...@gmail.com> wrote:
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> What else actually we gain? My guess is faster PR review 
>>>> >>>>>>>>> >>>>> iteration. We will skip some of conversations about code 
>>>> >>>>>>>>> >>>>> style.
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>> ...
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> Last but not least, new contributor may be less 
>>>> >>>>>>>>> >>>>> discouraged. When I started contribute I didn’t know how 
>>>> >>>>>>>>> >>>>> to format my code and I lost a lot of time to add pylint 
>>>> >>>>>>>>> >>>>> and adjust IntelliJ. I eventually failed. Currently I 
>>>> >>>>>>>>> >>>>> write code intuitively and when I don’t forget I rerun tox.
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>> This is a huge benefit. This is why I supported it so much 
>>>> >>>>>>>>> >>>> for Java. It is a community benefit. You do not have to be 
>>>> >>>>>>>>> >>>> a contributor to the Python SDK to support this. That is 
>>>> >>>>>>>>> >>>> why I am writing here. Just eliminate all discussion of 
>>>> >>>>>>>>> >>>> formatting. It doesn't really matter what the resulting 
>>>> >>>>>>>>> >>>> format is, if it is not crazy to read. I strongly oppose 
>>>> >>>>>>>>> >>>> maintaining a non-default format.
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>> Reformating 20k lines or 200k is not hard. The Java global 
>>>> >>>>>>>>> >>>> reformat touched 50k lines. It does not really matter how 
>>>> >>>>>>>>> >>>> big it is. Definitely do it all at once if you think the 
>>>> >>>>>>>>> >>>> tool is good enough. And you should pin a version, so churn 
>>>> >>>>>>>>> >>>> is not a problem. You can upgrade the version and reformat 
>>>> >>>>>>>>> >>>> in a PR later and that is also easy.
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>> It is a GitHub UI issue that you cannot easily browse past 
>>>> >>>>>>>>> >>>> the reformat. It is not actually that hard, but does take a 
>>>> >>>>>>>>> >>>> couple extra clicks to get GitHub to display blame before a 
>>>> >>>>>>>>> >>>> reformat. It is easier with the command line. I do a lot of 
>>>> >>>>>>>>> >>>> code history digging and the global Java reformat is not 
>>>> >>>>>>>>> >>>> really a problem.
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>> Kenn
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> Also everything will be formatted in a same way, so 
>>>> >>>>>>>>> >>>>> eventually it would be easier to read.
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> Moreover, as it was mentioned in previous emails - a lot 
>>>> >>>>>>>>> >>>>> of Jenkins failures won’t take place, so we save time and 
>>>> >>>>>>>>> >>>>> resources.
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> One of disadvantages is that our pipelines has custom 
>>>> >>>>>>>>> >>>>> syntax and after formatting they looks a little bit weird, 
>>>> >>>>>>>>> >>>>> but maybe extending the only configurable option in Black 
>>>> >>>>>>>>> >>>>> - lines, from 88 to 110 would be solution.
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> Second one is that Black requires Python 3 to be run. I 
>>>> >>>>>>>>> >>>>> don’t know how big obstacle it would be.
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> I believe there are two options how it would be possible 
>>>> >>>>>>>>> >>>>> to introduce Black. First: just do it, it will hurt but 
>>>> >>>>>>>>> >>>>> then it would be ok (same as a dentist appointment). Of 
>>>> >>>>>>>>> >>>>> course it may require some work to adjust linters. On the 
>>>> >>>>>>>>> >>>>> other hand we can do it gradually and start including sdk 
>>>> >>>>>>>>> >>>>> parts one by one - maybe it will be less painful?
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> As an example I can share one of projects [2] I know that 
>>>> >>>>>>>>> >>>>> uses Black (they use also other cool checkers and 
>>>> >>>>>>>>> >>>>> pre-commit [3]). This is how looks their build with all 
>>>> >>>>>>>>> >>>>> checks [4].
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> To sum up I believe that if we want improve our coding 
>>>> >>>>>>>>> >>>>> experience, we should improve our toolset. Black seems be 
>>>> >>>>>>>>> >>>>> recent and quite popular tool what makes think they won’t 
>>>> >>>>>>>>> >>>>> stop developing it.
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> [1] 
>>>> >>>>>>>>> >>>>> https://stackoverflow.com/questions/4112410/git-change-styling-whitespace-without-changing-ownership-blame
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> [2]  
>>>> >>>>>>>>> >>>>> https://github.com/GoogleCloudPlatform/oozie-to-airflow
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> [3] https://pre-commit.com
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> [4] 
>>>> >>>>>>>>> >>>>> https://travis-ci.org/GoogleCloudPlatform/oozie-to-airflow/builds/538725689
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>>
>>>> >>>>>>>>> >>>>> On Wed, May 29, 2019 at 2:01 PM Robert Bradshaw 
>>>> >>>>>>>>> >>>>> <rober...@google.com> wrote:
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> Reformatting to 4 spaces seems a non-starter to me, as it 
>>>> >>>>>>>>> >>>>>> would change nearly every single line in the codebase 
>>>> >>>>>>>>> >>>>>> (and the loss of all context as well as that particular 
>>>> >>>>>>>>> >>>>>> line).
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> This is probably why the 2-space fork exists. However, we 
>>>> >>>>>>>>> >>>>>> don't conform to that either--we use 2 spaces for 
>>>> >>>>>>>>> >>>>>> indentation, but 4 for continuation indentation. (As for 
>>>> >>>>>>>>> >>>>>> the history of this, this goes back to Google's internal 
>>>> >>>>>>>>> >>>>>> style guide, probably motivated by consistency with C++, 
>>>> >>>>>>>>> >>>>>> Java, ... and the fact that with an indent level of 4 one 
>>>> >>>>>>>>> >>>>>> ends up wrapping lines quite frequently (it's telling 
>>>> >>>>>>>>> >>>>>> that black's default line length is 88)). This turns out 
>>>> >>>>>>>>> >>>>>> to be an easy change to the codebase.
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> Once we move beyond the 2 vs. 4 whitespace thing, I found 
>>>> >>>>>>>>> >>>>>> that this tool introduces a huge amount of vertical 
>>>> >>>>>>>>> >>>>>> whitespace (e.g. closing parentheses on their own line), 
>>>> >>>>>>>>> >>>>>> e.g.
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> def foo(
>>>> >>>>>>>>> >>>>>>     args
>>>> >>>>>>>>> >>>>>> ):
>>>> >>>>>>>>> >>>>>>   if (
>>>> >>>>>>>>> >>>>>>       long expression)
>>>> >>>>>>>>> >>>>>>   ):
>>>> >>>>>>>>> >>>>>>     func(
>>>> >>>>>>>>> >>>>>>         args
>>>> >>>>>>>>> >>>>>>     )
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> I wrote a simple post-processor to put closing 
>>>> >>>>>>>>> >>>>>> parentheses on the same lines, as well as omit the 
>>>> >>>>>>>>> >>>>>> newline after "if (", and disabling formatting of 
>>>> >>>>>>>>> >>>>>> strings, which reduce the churn in our codebase to 15k 
>>>> >>>>>>>>> >>>>>> lines (adding about 4k) out of 200k total.
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> https://github.com/apache/beam/pull/8712/files
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> It's still very opinionated, often in different ways then 
>>>> >>>>>>>>> >>>>>> me, and doesn't understand the semantics of the code, but 
>>>> >>>>>>>>> >>>>>> possibly something we could live with given the huge 
>>>> >>>>>>>>> >>>>>> advantages of an autoformatter.
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> An intermediate point would be to allow, but not require, 
>>>> >>>>>>>>> >>>>>> autoformatting of changed lines.
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> As for being beta quality, it looks like it's got a 
>>>> >>>>>>>>> >>>>>> decent number of contributors and in my book being in the 
>>>> >>>>>>>>> >>>>>> python github project is a strong positive signal. But, 
>>>> >>>>>>>>> >>>>>> due to the above issues, I think we'd have to maintain a 
>>>> >>>>>>>>> >>>>>> fork. (The code is pretty lightweight, the 2 vs. 4 space 
>>>> >>>>>>>>> >>>>>> issue is a 2-line change, and the rest implemented as a 
>>>> >>>>>>>>> >>>>>> post-processing step (for now, incomplete), so it'd be 
>>>> >>>>>>>>> >>>>>> easy to stay in sync with upstream.)
>>>> >>>>>>>>> >>>>>>
>>>> >>>>>>>>> >>>>>> On Wed, May 29, 2019 at 11:03 AM Ismaël Mejía 
>>>> >>>>>>>>> >>>>>> <ieme...@gmail.com> wrote:
>>>> >>>>>>>>> >>>>>> >
>>>> >>>>>>>>> >>>>>> > > I think the question is if it can be configured in a 
>>>> >>>>>>>>> >>>>>> > > way to fit our
>>>> >>>>>>>>> >>>>>> > > current linter's style. I don't think it is feasible 
>>>> >>>>>>>>> >>>>>> > > to reformat the
>>>> >>>>>>>>> >>>>>> > > entire Python SDK.
>>>> >>>>>>>>> >>>>>> >
>>>> >>>>>>>>> >>>>>> > It cannot be configured to do what we actually do 
>>>> >>>>>>>>> >>>>>> > because Black is
>>>> >>>>>>>>> >>>>>> > configurable only to support the standard python 
>>>> >>>>>>>>> >>>>>> > codestyle guidelines
>>>> >>>>>>>>> >>>>>> > (PEP-8) which recommends 4 spaces and is what most 
>>>> >>>>>>>>> >>>>>> > projects in the
>>>> >>>>>>>>> >>>>>> > python world use.
>>>> >>>>>>>>> >>>>>> >
>>>> >>>>>>>>> >>>>>> > > Reformatted lines don't allow quick access to the Git 
>>>> >>>>>>>>> >>>>>> > > history. This
>>>> >>>>>>>>> >>>>>> > > effect is still visible in the Java SDK. However, I 
>>>> >>>>>>>>> >>>>>> > > have the feeling
>>>> >>>>>>>>> >>>>>> > > that this might be less of a problem with Python 
>>>> >>>>>>>>> >>>>>> > > because the linter has
>>>> >>>>>>>>> >>>>>> > > more rules than Checkstyle had.
>>>> >>>>>>>>> >>>>>> >
>>>> >>>>>>>>> >>>>>> > Yes that’s the bad side effect but there are always 
>>>> >>>>>>>>> >>>>>> > tradeoffs we have
>>>> >>>>>>>>> >>>>>> > to deal with.
>>>> >>>>>>>>> >>>>>> >
>>>> >>>>>>>>> >>>>>> >
>>>> >>>>>>>>> >>>>>> >
>>>> >>>>>>>>> >>>>>> >
>>>> >>>>>>>>> >>>>>> > On Wed, May 29, 2019 at 10:52 AM Maximilian Michels 
>>>> >>>>>>>>> >>>>>> > <m...@apache.org> wrote:
>>>> >>>>>>>>> >>>>>> > >
>>>> >>>>>>>>> >>>>>> > > I think the question is if it can be configured in a 
>>>> >>>>>>>>> >>>>>> > > way to fit our
>>>> >>>>>>>>> >>>>>> > > current linter's style. I don't think it is feasible 
>>>> >>>>>>>>> >>>>>> > > to reformat the
>>>> >>>>>>>>> >>>>>> > > entire Python SDK.
>>>> >>>>>>>>> >>>>>> > >
>>>> >>>>>>>>> >>>>>> > > Reformatted lines don't allow quick access to the Git 
>>>> >>>>>>>>> >>>>>> > > history. This
>>>> >>>>>>>>> >>>>>> > > effect is still visible in the Java SDK. However, I 
>>>> >>>>>>>>> >>>>>> > > have the feeling
>>>> >>>>>>>>> >>>>>> > > that this might be less of a problem with Python 
>>>> >>>>>>>>> >>>>>> > > because the linter has
>>>> >>>>>>>>> >>>>>> > > more rules than Checkstyle had.
>>>> >>>>>>>>> >>>>>> > >
>>>> >>>>>>>>> >>>>>> > > -Max
>>>> >>>>>>>>> >>>>>> > >
>>>> >>>>>>>>> >>>>>> > > On 29.05.19 10:16, Ismaël Mejía wrote:
>>>> >>>>>>>>> >>>>>> > > >> My concerns are:
>>>> >>>>>>>>> >>>>>> > > >> - The product is clearly marked as beta with a big 
>>>> >>>>>>>>> >>>>>> > > >> warning.
>>>> >>>>>>>>> >>>>>> > > >> - It looks like mostly a single person project. 
>>>> >>>>>>>>> >>>>>> > > >> For the same reason I also strongly prefer not 
>>>> >>>>>>>>> >>>>>> > > >> using a fork for a specific setting. Fork will 
>>>> >>>>>>>>> >>>>>> > > >> only have less people looking at it.
>>>> >>>>>>>>> >>>>>> > > >
>>>> >>>>>>>>> >>>>>> > > > I suppose the project is marked as beta because it 
>>>> >>>>>>>>> >>>>>> > > > is recent, it was
>>>> >>>>>>>>> >>>>>> > > > presented in 2018’s pycon, and because some things 
>>>> >>>>>>>>> >>>>>> > > > can change since
>>>> >>>>>>>>> >>>>>> > > > auto-formatters are pretty tricky beasts, I think 
>>>> >>>>>>>>> >>>>>> > > > beta in that case is
>>>> >>>>>>>>> >>>>>> > > > like our own ‘@Experimental’. If you look at the 
>>>> >>>>>>>>> >>>>>> > > > contribution page [1]
>>>> >>>>>>>>> >>>>>> > > > you can notice that it is less and less a single 
>>>> >>>>>>>>> >>>>>> > > > person project, there
>>>> >>>>>>>>> >>>>>> > > > have been 93 independent contributions since the 
>>>> >>>>>>>>> >>>>>> > > > project became
>>>> >>>>>>>>> >>>>>> > > > public, and the fact that it is hosted in the 
>>>> >>>>>>>>> >>>>>> > > > python organization
>>>> >>>>>>>>> >>>>>> > > > github [2] gives some confidence on the project 
>>>> >>>>>>>>> >>>>>> > > > continuity.
>>>> >>>>>>>>> >>>>>> > > >
>>>> >>>>>>>>> >>>>>> > > > You are right however about the fact that the main 
>>>> >>>>>>>>> >>>>>> > > > author seems to be
>>>> >>>>>>>>> >>>>>> > > > the ‘benevolent’ dictator, and in the 2-spaces 
>>>> >>>>>>>>> >>>>>> > > > issue he can seem
>>>> >>>>>>>>> >>>>>> > > > arbitrary, but he is just following pep8 style 
>>>> >>>>>>>>> >>>>>> > > > guide recommendations
>>>> >>>>>>>>> >>>>>> > > > [3]. I am curious of why we (Beam) do not follow 
>>>> >>>>>>>>> >>>>>> > > > the 4 spaces
>>>> >>>>>>>>> >>>>>> > > > recommendation of PEP-8 or even Google's own Python 
>>>> >>>>>>>>> >>>>>> > > > style guide [4],
>>>> >>>>>>>>> >>>>>> > > > So, probably it should be to us to reconsider the 
>>>> >>>>>>>>> >>>>>> > > > current policy to
>>>> >>>>>>>>> >>>>>> > > > adapt to the standards (and the tool).
>>>> >>>>>>>>> >>>>>> > > >
>>>> >>>>>>>>> >>>>>> > > > I did a quick run of black with python 2.7 
>>>> >>>>>>>>> >>>>>> > > > compatibility on
>>>> >>>>>>>>> >>>>>> > > > sdks/python and got only 4 parsing errors which is 
>>>> >>>>>>>>> >>>>>> > > > positive given the
>>>> >>>>>>>>> >>>>>> > > > size of our code base.
>>>> >>>>>>>>> >>>>>> > > >
>>>> >>>>>>>>> >>>>>> > > > 415 files reformatted, 45 files left unchanged, 4 
>>>> >>>>>>>>> >>>>>> > > > files failed to reformat.
>>>> >>>>>>>>> >>>>>> > > >
>>>> >>>>>>>>> >>>>>> > > > error: cannot format
>>>> >>>>>>>>> >>>>>> > > > /home/ismael/upstream/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py:
>>>> >>>>>>>>> >>>>>> > > > Cannot parse: 47:22:   _display_progress = print
>>>> >>>>>>>>> >>>>>> > > > error: cannot format
>>>> >>>>>>>>> >>>>>> > > > /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/log_handler.py:
>>>> >>>>>>>>> >>>>>> > > > Cannot parse: 151:18:               file=sys.stderr)
>>>> >>>>>>>>> >>>>>> > > > error: cannot format
>>>> >>>>>>>>> >>>>>> > > > /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py:
>>>> >>>>>>>>> >>>>>> > > > Cannot parse: 160:34:       print(traceback_string, 
>>>> >>>>>>>>> >>>>>> > > > file=sys.stderr)
>>>> >>>>>>>>> >>>>>> > > > error: cannot format
>>>> >>>>>>>>> >>>>>> > > > /home/ismael/upstream/beam/sdks/python/apache_beam/typehints/trivial_inference.py:
>>>> >>>>>>>>> >>>>>> > > > Cannot parse: 335:51:       print('-->' if pc == 
>>>> >>>>>>>>> >>>>>> > > > last_pc else '    ',
>>>> >>>>>>>>> >>>>>> > > > end=' ')
>>>> >>>>>>>>> >>>>>> > > >
>>>> >>>>>>>>> >>>>>> > > > I still think this can be positive for the project 
>>>> >>>>>>>>> >>>>>> > > > but well I am
>>>> >>>>>>>>> >>>>>> > > > barely a contributor to the python code base so I 
>>>> >>>>>>>>> >>>>>> > > > let you the python
>>>> >>>>>>>>> >>>>>> > > > maintainers to reconsider this, in any case it 
>>>> >>>>>>>>> >>>>>> > > > seems like a good
>>>> >>>>>>>>> >>>>>> > > > improvement for the project.
>>>> >>>>>>>>> >>>>>> > > >
>>>> >>>>>>>>> >>>>>> > > > [1] 
>>>> >>>>>>>>> >>>>>> > > > https://github.com/python/black/graphs/contributors
>>>> >>>>>>>>> >>>>>> > > > [2] https://github.com/python
>>>> >>>>>>>>> >>>>>> > > > [3] 
>>>> >>>>>>>>> >>>>>> > > > https://www.python.org/dev/peps/pep-0008/#indentation
>>>> >>>>>>>>> >>>>>> > > > [4] 
>>>> >>>>>>>>> >>>>>> > > > https://github.com/google/styleguide/blob/gh-pages/pyguide.md#34-indentation
>>>> >>>>>>>>> >>>>>> > > >
>>>> >>>>>>>>> >>>>>> > > > On Tue, May 28, 2019 at 11:15 PM Ahmet Altay 
>>>> >>>>>>>>> >>>>>> > > > <al...@google.com> wrote:
>>>> >>>>>>>>> >>>>>> > > >>
>>>> >>>>>>>>> >>>>>> > > >> I am in the same boat with Robert, I am in favor 
>>>> >>>>>>>>> >>>>>> > > >> of autoformatters but I am not familiar with this 
>>>> >>>>>>>>> >>>>>> > > >> one. My concerns are:
>>>> >>>>>>>>> >>>>>> > > >> - The product is clearly marked as beta with a big 
>>>> >>>>>>>>> >>>>>> > > >> warning.
>>>> >>>>>>>>> >>>>>> > > >> - It looks like mostly a single person project. 
>>>> >>>>>>>>> >>>>>> > > >> For the same reason I also strongly prefer not 
>>>> >>>>>>>>> >>>>>> > > >> using a fork for a specific setting. Fork will 
>>>> >>>>>>>>> >>>>>> > > >> only have less people looking at it.
>>>> >>>>>>>>> >>>>>> > > >>
>>>> >>>>>>>>> >>>>>> > > >> IMO, this is in an early stage for us. That said 
>>>> >>>>>>>>> >>>>>> > > >> lint issues are real as pointed in the thread. If 
>>>> >>>>>>>>> >>>>>> > > >> someone would like to give it a try and see how it 
>>>> >>>>>>>>> >>>>>> > > >> would look like for us that would be interesting.
>>>> >>>>>>>>> >>>>>> > > >>
>>>> >>>>>>>>> >>>>>> > > >> On Tue, May 28, 2019 at 4:44 AM Katarzyna 
>>>> >>>>>>>>> >>>>>> > > >> Kucharczyk <ka.kucharc...@gmail.com> wrote:
>>>> >>>>>>>>> >>>>>> > > >>>
>>>> >>>>>>>>> >>>>>> > > >>> This sounds really good. A lot of Jenkins jobs 
>>>> >>>>>>>>> >>>>>> > > >>> failures are caused by lint problems.
>>>> >>>>>>>>> >>>>>> > > >>> I think it would be great to have something 
>>>> >>>>>>>>> >>>>>> > > >>> similar to Spotless in Java SDK (I heard there is 
>>>> >>>>>>>>> >>>>>> > > >>> problem with configuring Black with IntelliJ).
>>>> >>>>>>>>> >>>>>> > > >>>
>>>> >>>>>>>>> >>>>>> > > >>> On Mon, May 27, 2019 at 10:52 PM Robert Bradshaw 
>>>> >>>>>>>>> >>>>>> > > >>> <rober...@google.com> wrote:
>>>> >>>>>>>>> >>>>>> > > >>>>
>>>> >>>>>>>>> >>>>>> > > >>>> I'm generally in favor of autoformatters, though 
>>>> >>>>>>>>> >>>>>> > > >>>> I haven't looked at
>>>> >>>>>>>>> >>>>>> > > >>>> how well this particular one works. We might 
>>>> >>>>>>>>> >>>>>> > > >>>> have to go with
>>>> >>>>>>>>> >>>>>> > > >>>> https://github.com/desbma/black-2spaces given
>>>> >>>>>>>>> >>>>>> > > >>>> https://github.com/python/black/issues/378 .
>>>> >>>>>>>>> >>>>>> > > >>>>
>>>> >>>>>>>>> >>>>>> > > >>>> On Mon, May 27, 2019 at 10:43 PM Pablo Estrada 
>>>> >>>>>>>>> >>>>>> > > >>>> <pabl...@google.com> wrote:
>>>> >>>>>>>>> >>>>>> > > >>>>>
>>>> >>>>>>>>> >>>>>> > > >>>>> This looks pretty good:) I know at least a 
>>>> >>>>>>>>> >>>>>> > > >>>>> couple people (myself included) who've been 
>>>> >>>>>>>>> >>>>>> > > >>>>> annoyed by having to take care of lint issues 
>>>> >>>>>>>>> >>>>>> > > >>>>> that maybe a code formatter could save us.
>>>> >>>>>>>>> >>>>>> > > >>>>> Thanks for sharing Ismael.
>>>> >>>>>>>>> >>>>>> > > >>>>> -P.
>>>> >>>>>>>>> >>>>>> > > >>>>>
>>>> >>>>>>>>> >>>>>> > > >>>>>
>>>> >>>>>>>>> >>>>>> > > >>>>> On Mon, May 27, 2019, 12:24 PM Ismaël Mejía 
>>>> >>>>>>>>> >>>>>> > > >>>>> <ieme...@gmail.com> wrote:
>>>> >>>>>>>>> >>>>>> > > >>>>>>
>>>> >>>>>>>>> >>>>>> > > >>>>>> I stumbled by chance into Black [1] a python 
>>>> >>>>>>>>> >>>>>> > > >>>>>> code auto formatter that
>>>> >>>>>>>>> >>>>>> > > >>>>>> is becoming the 'de-facto' auto-formatter for 
>>>> >>>>>>>>> >>>>>> > > >>>>>> python, and wanted to
>>>> >>>>>>>>> >>>>>> > > >>>>>> bring to the ML Is there interest from the 
>>>> >>>>>>>>> >>>>>> > > >>>>>> python people to get this
>>>> >>>>>>>>> >>>>>> > > >>>>>> into the build?
>>>> >>>>>>>>> >>>>>> > > >>>>>>
>>>> >>>>>>>>> >>>>>> > > >>>>>> The introduction of spotless for Java has been 
>>>> >>>>>>>>> >>>>>> > > >>>>>> a good improvement and
>>>> >>>>>>>>> >>>>>> > > >>>>>> maybe the python code base may benefit of this 
>>>> >>>>>>>>> >>>>>> > > >>>>>> too.
>>>> >>>>>>>>> >>>>>> > > >>>>>>
>>>> >>>>>>>>> >>>>>> > > >>>>>> WDYT?
>>>> >>>>>>>>> >>>>>> > > >>>>>>
>>>> >>>>>>>>> >>>>>> > > >>>>>> [1] https://github.com/python/black
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >>
>>>> >> Michał Walenia
>>>> >> Polidea | Software Engineer
>>>> >>
>>>> >> M: +48 791 432 002
>>>> >> E: michal.wale...@polidea.com
>>>> >>
>>>> >> Unique Tech
>>>> >> Check out our projects!

Re: [DISCUSS] Autoformat python code with Black

Reply via email to