Thank you all for doing the work. The demo PR looks decent. I still have 3
major concerns. (1) I prefer not to maintain a fork with our own
customization for a tool. It will add up to our maintenance costs. (2) Even
the small change touches about 20k lines of code. (3) On beta note, the
specific note is "until the formatter becomes stable, you should expect
some formatting to change in the future." That signals like we might see
formatting churn.

And as a minor point I do not understand why some reformatting happened.
For example, was the following change due to pep-8 compliance or the tool
being opinionated?

if (not hasattr(result, 'has_job')        # direct runner
    or result.has_job):               # not just a template creation

-->

if (not hasattr(result, 'has_job') or result.has_job  # direct runner
 ):  # not just a template creation

So far all of us agree on benefits of an autoformatter. I do not wish to
block this. Consider me -0 as long as someone is willing to maintain a fork
that works for our purposes.

Ahmet

On Wed, May 29, 2019 at 5:01 AM Robert Bradshaw <rober...@google.com> wrote:

> Reformatting to 4 spaces seems a non-starter to me, as it would change
> nearly every single line in the codebase (and the loss of all context as
> well as that particular line).
>
> This is probably why the 2-space fork exists. However, we don't conform to
> that either--we use 2 spaces for indentation, but 4 for continuation
> indentation. (As for the history of this, this goes back to Google's
> internal style guide, probably motivated by consistency with C++, Java, ...
> and the fact that with an indent level of 4 one ends up wrapping lines
> quite frequently (it's telling that black's default line length is 88)).
> This turns out to be an easy change to the codebase.
>
> Once we move beyond the 2 vs. 4 whitespace thing, I found that this tool
> introduces a huge amount of vertical whitespace (e.g. closing parentheses
> on their own line), e.g.
>
> def foo(
>     args
> ):
>   if (
>       long expression)
>   ):
>     func(
>         args
>     )
>
> I wrote a simple post-processor to put closing parentheses on the same
> lines, as well as omit the newline after "if (", and disabling formatting
> of strings, which reduce the churn in our codebase to 15k lines (adding
> about 4k) out of 200k total.
>
> https://github.com/apache/beam/pull/8712/files
>
> It's still very opinionated, often in different ways then me, and doesn't
> understand the semantics of the code, but possibly something we could live
> with given the huge advantages of an autoformatter.
>
> An intermediate point would be to allow, but not require, autoformatting
> of changed lines.
>
> As for being beta quality, it looks like it's got a decent number of
> contributors and in my book being in the python github project is a strong
> positive signal. But, due to the above issues, I think we'd have to
> maintain a fork. (The code is pretty lightweight, the 2 vs. 4 space issue
> is a 2-line change, and the rest implemented as a post-processing step (for
> now, incomplete), so it'd be easy to stay in sync with upstream.)
>
> On Wed, May 29, 2019 at 11:03 AM Ismaël Mejía <ieme...@gmail.com> wrote:
> >
> > > I think the question is if it can be configured in a way to fit our
> > > current linter's style. I don't think it is feasible to reformat the
> > > entire Python SDK.
> >
> > It cannot be configured to do what we actually do because Black is
> > configurable only to support the standard python codestyle guidelines
> > (PEP-8) which recommends 4 spaces and is what most projects in the
> > python world use.
> >
> > > Reformatted lines don't allow quick access to the Git history. This
> > > effect is still visible in the Java SDK. However, I have the feeling
> > > that this might be less of a problem with Python because the linter has
> > > more rules than Checkstyle had.
> >
> > Yes that’s the bad side effect but there are always tradeoffs we have
> > to deal with.
> >
> >
> >
> >
> > On Wed, May 29, 2019 at 10:52 AM Maximilian Michels <m...@apache.org>
> wrote:
> > >
> > > I think the question is if it can be configured in a way to fit our
> > > current linter's style. I don't think it is feasible to reformat the
> > > entire Python SDK.
> > >
> > > Reformatted lines don't allow quick access to the Git history. This
> > > effect is still visible in the Java SDK. However, I have the feeling
> > > that this might be less of a problem with Python because the linter has
> > > more rules than Checkstyle had.
> > >
> > > -Max
> > >
> > > On 29.05.19 10:16, Ismaël Mejía wrote:
> > > >> My concerns are:
> > > >> - The product is clearly marked as beta with a big warning.
> > > >> - It looks like mostly a single person project. For the same reason
> I also strongly prefer not using a fork for a specific setting. Fork will
> only have less people looking at it.
> > > >
> > > > I suppose the project is marked as beta because it is recent, it was
> > > > presented in 2018’s pycon, and because some things can change since
> > > > auto-formatters are pretty tricky beasts, I think beta in that case
> is
> > > > like our own ‘@Experimental’. If you look at the contribution page
> [1]
> > > > you can notice that it is less and less a single person project,
> there
> > > > have been 93 independent contributions since the project became
> > > > public, and the fact that it is hosted in the python organization
> > > > github [2] gives some confidence on the project continuity.
> > > >
> > > > You are right however about the fact that the main author seems to be
> > > > the ‘benevolent’ dictator, and in the 2-spaces issue he can seem
> > > > arbitrary, but he is just following pep8 style guide recommendations
> > > > [3]. I am curious of why we (Beam) do not follow the 4 spaces
> > > > recommendation of PEP-8 or even Google's own Python style guide [4],
> > > > So, probably it should be to us to reconsider the current policy to
> > > > adapt to the standards (and the tool).
> > > >
> > > > I did a quick run of black with python 2.7 compatibility on
> > > > sdks/python and got only 4 parsing errors which is positive given the
> > > > size of our code base.
> > > >
> > > > 415 files reformatted, 45 files left unchanged, 4 files failed to
> reformat.
> > > >
> > > > error: cannot format
> > > >
> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py:
> > > > Cannot parse: 47:22:   _display_progress = print
> > > > error: cannot format
> > > >
> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/log_handler.py:
> > > > Cannot parse: 151:18:               file=sys.stderr)
> > > > error: cannot format
> > > >
> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py:
> > > > Cannot parse: 160:34:       print(traceback_string, file=sys.stderr)
> > > > error: cannot format
> > > >
> /home/ismael/upstream/beam/sdks/python/apache_beam/typehints/trivial_inference.py:
> > > > Cannot parse: 335:51:       print('-->' if pc == last_pc else '    ',
> > > > end=' ')
> > > >
> > > > I still think this can be positive for the project but well I am
> > > > barely a contributor to the python code base so I let you the python
> > > > maintainers to reconsider this, in any case it seems like a good
> > > > improvement for the project.
> > > >
> > > > [1] https://github.com/python/black/graphs/contributors
> > > > [2] https://github.com/python
> > > > [3] https://www.python.org/dev/peps/pep-0008/#indentation
> > > > [4]
> https://github.com/google/styleguide/blob/gh-pages/pyguide.md#34-indentation
> > > >
> > > > On Tue, May 28, 2019 at 11:15 PM Ahmet Altay <al...@google.com>
> wrote:
> > > >>
> > > >> I am in the same boat with Robert, I am in favor of autoformatters
> but I am not familiar with this one. My concerns are:
> > > >> - The product is clearly marked as beta with a big warning.
> > > >> - It looks like mostly a single person project. For the same reason
> I also strongly prefer not using a fork for a specific setting. Fork will
> only have less people looking at it.
> > > >>
> > > >> IMO, this is in an early stage for us. That said lint issues are
> real as pointed in the thread. If someone would like to give it a try and
> see how it would look like for us that would be interesting.
> > > >>
> > > >> On Tue, May 28, 2019 at 4:44 AM Katarzyna Kucharczyk <
> ka.kucharc...@gmail.com> wrote:
> > > >>>
> > > >>> This sounds really good. A lot of Jenkins jobs failures are caused
> by lint problems.
> > > >>> I think it would be great to have something similar to Spotless in
> Java SDK (I heard there is problem with configuring Black with IntelliJ).
> > > >>>
> > > >>> On Mon, May 27, 2019 at 10:52 PM Robert Bradshaw <
> rober...@google.com> wrote:
> > > >>>>
> > > >>>> I'm generally in favor of autoformatters, though I haven't looked
> at
> > > >>>> how well this particular one works. We might have to go with
> > > >>>> https://github.com/desbma/black-2spaces given
> > > >>>> https://github.com/python/black/issues/378 .
> > > >>>>
> > > >>>> On Mon, May 27, 2019 at 10:43 PM Pablo Estrada <
> pabl...@google.com> wrote:
> > > >>>>>
> > > >>>>> This looks pretty good:) I know at least a couple people (myself
> included) who've been annoyed by having to take care of lint issues that
> maybe a code formatter could save us.
> > > >>>>> Thanks for sharing Ismael.
> > > >>>>> -P.
> > > >>>>>
> > > >>>>>
> > > >>>>> On Mon, May 27, 2019, 12:24 PM Ismaël Mejía <ieme...@gmail.com>
> wrote:
> > > >>>>>>
> > > >>>>>> I stumbled by chance into Black [1] a python code auto
> formatter that
> > > >>>>>> is becoming the 'de-facto' auto-formatter for python, and
> wanted to
> > > >>>>>> bring to the ML Is there interest from the python people to get
> this
> > > >>>>>> into the build?
> > > >>>>>>
> > > >>>>>> The introduction of spotless for Java has been a good
> improvement and
> > > >>>>>> maybe the python code base may benefit of this too.
> > > >>>>>>
> > > >>>>>> WDYT?
> > > >>>>>>
> > > >>>>>> [1] https://github.com/python/black
>

Reply via email to