Hi,

Thanks everyone for sharing your views!

Some of this conversation is starting to feel like boiling the ocean. I
believe in refactoring with purpose and discussing class-by-class or
module-by-module does not make sense to me. Can we first list down what we
want to achieve? So far, I have only heard fixing IDE/IntelliJ warnings.
Also instead of focussing on new work, how about looking at the pending
JIRAs under "Testing" "Code Cleanup" components first and see if those are
worth tackling.

We went down this path for code formatting and today we still have
inconsistencies. Looking back, I feel we should have clearly defined end
goals for the cleanups and we can then rank them based on ROI.

Thanks
Vinoth

On Wed, Jan 22, 2020 at 7:05 PM vino yang <[email protected]> wrote:

> Hi Shiyan and Bhavani:
>
> Thanks for sharing your thoughts.
>
> As I originally stated. The advantage of using modules as a unit to split
> work is that the decomposition is clear, but the disadvantage is that the
> volume of changes may be huge, which brings huge risks (considering that
> Hudi's test coverage is still not very high) and the workload of review.
> The advantage of splitting by class is that the volume of changes is small
> and the review is more convenient, but the disadvantages are too many tasks
> and high maintenance costs.
>
>
> *In addition, we need to define the boundaries of the "code cleanup" I
> expressed in this topic: it is limited to the smart tips shown by Intellij
> IDEA. If the boundaries are too wide, then this discussion will lose
> control.*
> I agree with Bhavani that we don't take it as the actual goal. But we are
> not opposed to the community to help improve the quality of the code
> (basically, these tips given by the IDE are more reasonable).
>
>
> So, I still give my thoughts: We manage this work with Jira. Before we
> start working, we need to find a committer as a mentor. The mentor must
> decide whether the scale of the subtasks is reasonable and whether
> additional unit tests need to be added to verify the changes. And the
> mentor should be responsible for merged changes.
>
> What do you think?
>
> Best,
> Vino
>
> Bhavani Sudha <[email protected]> 于2020年1月22日周三 下午2:22写道:
>
> > Hi @vinoyang thanks for bringing this to discussion. I feel it would be
> > less disruptive to clean up code as part of individual classes being
> > touched for a specific goal rather than code cleanup being the actual
> goal.
> > This would narrow the touch point and ensure test coverage (both unit and
> > integration tests)  catches any accidental/unintentional changes. Also it
> > would give chance to change any documentation quoting/referencing that
> > code. Wanted to share my personal opinion.
> >
> > Thanks,
> > Sudha
> >
> >
> >
> > On Tue, Jan 21, 2020 at 11:36 AM Shiyan Xu <[email protected]>
> > wrote:
> >
> > > The clean-up work can actually be split by modules.
> > >
> > > Though it is generally a good practice to follow, my concern is the
> > > clean-up is likely to cause conflicts with some on-going changes. If I
> > may
> > > suggest, the dedicated clean-up tasks should avoid
> > > - modules that are undergoing multiple feature changes/PRs
> > > - modules that are planned to have major refactoring due to design
> > changes
> > > (since clean-up can be done altogether during refactoring)
> > >
> > > On Tue, Jan 21, 2020 at 4:17 AM Vinoth Chandar <[email protected]>
> > wrote:
> > >
> > > > Not sure if I fully agree with sweeping statements being made. But,
> +1
> > > for
> > > > structuring this work via Jiras and having some committer “accept”
> the
> > > > issue first.  Some of these tend to be subjective and we do need to
> > make
> > > > different tradeoffs.
> > > >
> > > > On Tue, Jan 21, 2020 at 1:28 AM vino yang <[email protected]>
> > wrote:
> > > >
> > > > > Hi Pratyaksh,
> > > > >
> > > > > Thanks for your thought.
> > > > >
> > > > > Let's listen to others' comments. If there is no objection, we will
> > > > follow
> > > > > this way.
> > > > >
> > > > > Best,
> > > > > Vino
> > > > >
> > > > >
> > > > > Pratyaksh Sharma <[email protected]> 于2020年1月21日周二 下午4:56写道:
> > > > >
> > > > > > Hi Vino,
> > > > > >
> > > > > > Big +1 for this initiative. I have done this code cleanup for
> test
> > > > > classes
> > > > > > in the past and strongly feel there is a need to do the same at
> > other
> > > > > > places as well. I would definitely like to volunteer for this.
> > > > > >
> > > > > > On Tue, Jan 21, 2020 at 1:52 PM vino yang <[email protected]
> >
> > > > wrote:
> > > > > >
> > > > > > > Hi folks,
> > > > > > >
> > > > > > > Currently, the code quality of some Hudi module is not very
> well.
> > > As
> > > > > many
> > > > > > > developers have seen, the Intellij IDEA has shown many
> > intellisense
> > > > > about
> > > > > > > cleanup and improvement. The community does not object to doing
> > the
> > > > > > cleanup
> > > > > > > and improvement work and the work has been started via some
> > direct
> > > > > > "minor"
> > > > > > > PRs by some volunteers. The current way is unorganized and hard
> > to
> > > > > > manage.
> > > > > > > For tracking this work, I prefer to manage this work with the
> > Jira
> > > > > issue.
> > > > > > > We can create an umbrella issue. Then, split the work into
> > several
> > > > > > > subtasks.
> > > > > > >
> > > > > > > Since those "bad smell" lays anywhere in the whole project.
> It's
> > > > > > difficult
> > > > > > > to give a standard to split the subtasks. For example, some
> files
> > > > have
> > > > > a
> > > > > > > lot while some modules have few. So I suggest the standard
> would
> > > > depend
> > > > > > on
> > > > > > > the volume of the changes. Before working, any subtask should
> > find
> > > a
> > > > > > > committer as a mentor who would judge and approve the scope is
> > > > > suitable.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > Any comments and suggestions would be appreciated.
> > > > > > >
> > > > > > > Best,
> > > > > > > Vino
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to