Re: [Pdl-devel] Faster PDL Development Cycle---But How?

Chris Marshall Thu, 03 Sep 2015 05:39:17 -0700

Good points.  I definitely think having a warning in each of the PDLA
duplicate modules is going to be important to keep things straight for
users and developers.


I think that staged migration of features from PDLA -> PDL is the way to
go.  Once something is really stable in PDLA, we can propagate into PDL so
that users needing a stable, reliable, robust PDL will have one.  On the
other hand, users needing or desiring the latest features could use the
PDLA based modules.

NOTE:  We should probably make sure that it is possible to install both PDL
and PDLA on the same system without conflict to make this simple and safe.

--Chris


On Thu, Sep 3, 2015 at 7:53 AM, kmx <[email protected]> wrote:

> My view:
>
> I agree with PDLA and I support basically all goals that were mentioned.
> On the other hand, I also see a couple of possible gotchas.
>
> First is (potential) user confusion. With ongoing work on breaking PDL(A)
> into smaller pieces there will sooner or later be more PDLA::* modules than
> PDL::* on CPAN. Maybe all PDLA modules should have something like this:
>
> ...
> =head1 NAME
>
> (experimental PDL fork)  PDL interface to something cool.
>
> =head1 SYNOPSIS
> ...
>
> Then it will be clear from the search result page on metacpan.org or
> search.cpan.org that the module is what it is.
>
> Next: although PDLA is intended to be a place for agile development, I
> think it should still have some milestones set (like: a/ unbundling
> jumbo-PDL, b/ reshaping makefiles, c/ rewriting pdl-pp parser, d/ new core,
> e/ new object model, ...). Considering how many developers will be actively
> contributing, the milestones should perhaps be sorted and worked on
> sequentially. These milestones, once finished, might be a good opportunity
> for thorough peer review from those PDL devs/users that will not be
> actively participating in PDLA.
>
> And the last: considering the high demand for stability (see other posts
> in this thread) I am not quite sure that the idea of "in the end PDLA
> repository will replace PDL" will work. Maybe the
> finished/polished/discussed/agreed/reviewed/tested changes and ideas from
> PDLA should be step-by-step (milestone-by-milestone) brought to mainstream
> PDL so that in the end PDL == PDLA (at some point on this way the major
> version will bump from 2 to 3).
>
> --
> kmx
>
>
>
> On 25.8.2015 19:42, Zakariyya Mughal wrote:
>
> On 2015-08-24 at 23:48:51 +0000, Chris Marshall wrote:
>
> PDL Developers-
>
>      With the addition of two active and highly motivated PDL developers
> (Zakariyya Mughal and Guggle "Ed" Worth) we've made significant progress
> in cleaning up both the PDL distribution and the development
> process.  PDL is now run through test builds automatically on git commit
> via the Travis-CI framework of github.  Many perl platforms and PDL
> configuration options are exercised.  PDL-2.013 was the best tested
> pre-release release ever.
>
>      The current process we've been working toward is to make
> PDL development faster and more responsive by breaking up the current
> monolithic PDL distribution into a lean core (roughly the current
> PDL::Core, PDL::PP, and PDL::Slices) and spinning off the other modules
> for IO, Graphics, and Library interfaces as their own CPAN releases.
> This would enable the separate module/distributions to have a faster
> development-test-release cycle since that process would not be held up by
> the testing of the full PDL distribution with all its subcomponents,
> even if they are completely independent/unrelated to the separate module
> changes being made.
>
>      We're ready to make the split, but there is a catch...  How can we
> have the rapid agile development needed to make the next generation
> PDL3 possible _without_ losing the "PDL just works" that has been one of
> the primary focuses of PDL-2.x development since I volunteered as release
> manager circa PDL-2.4.3 [sic]?
>
>      There has been some discussion, largely on #pdl, about how to best
> proceed.  One idea is to move to a constant release mode which could be
> expedited by adding co-maints to PDL.  I've not acted on that largely
> because I feel that PDL just working, easy to get and start to use, is
> essential to survive as a minority numeric computation engine (compared
> with R, NumPy, Octave/MATLAB).  How can we grow market share if it takes
> a perl expert to start using PDL?
>
>      That said, I think the "big split" is the best way forward for PDL
> to grow and thrive.  The ideas for the PDL3 core engine show great
> promise for the kind of dynamic development that occurred when Karl first
> conceived and implemented the idea that would become PDL.
> Unfortunately, my experience with rapid sequential releases is a sort of
> "churn" where it is difficult to know if you'll be able to get a working
> module at any given release.  So what to do...
>
>      One idea I had is to change the stable PDL release distribution into a
> PDL bundle.  That would be the "stable PDL" that would be easy to get
> and install.  The sub-modules would then be able to have independent
> development forming the "experimental PDL" track.  Another way, a bit
> more crude, would be to make a fixed "stable PDL" release that would be
> the one to install.  Maybe we could use specific version information to
> work with cpan, cpanm,...
>
>      Here's where we need your input for discussion and consensus.
> Please feel free to comment on any of the above, or to offer your own
> thoughts.  The goal is to select the preferred approach for modern PDL
> development and move out on it.  I would like to complete this discussion
> process within the next two weeks.  At that point we should be able to
> make a specific plan for any final comments with the agile development
> to begin shortly after.
>
> Let the discussions begin!
>
> Hello Chris,
>
> First off, thank you for starting this conversation.
>
> Ed and I have been working on and off as time permits on preparing for
> the split. The work we've been doing hasn't really generated much
> traffic on the pdl-devel mailing list, but #pdl and the PDLPorters
> GitHub organisation tell a very different story. There is a lot going
> on there every few days. The discussion on those two mediums is a little
> more agile than the mailing list or SourceForge and helps with formulating
>
> I highly recommend joining both by watching the repositories in
> PDLPorters and following the IRC by either joining in a client or
> tracking the backlog with <http://irclog.perlgeek.de/pdl/>.
>
> I'd like to summarise some of what we came up with on GitHub/IRC:
>
>  1. A split is necessary to not only make releases easier, but also
>     development. We have worked on reducing the time required to build
>     PDL across multiple environments down to a little over 1 hour.
>
>     This is still too long when you have perhaps 1.5 hours of tuits that
>     day. So the work inevitably gets spread out over weeks.
>
>     A split would help decrease this friction.
>
>  2. Making `cpanm PDL` always work has always been part of the plan.
>     Improving the PDL devops has helped with that. The plan is to
>     continue doing that.
>
>     But large refactors such as this split can be quite daunting. We
>     can't be sure we will stick the landing right the first time. But
>     the job needs to move forward or it will fail via analysis paralysis
>     even before it has begun.
>
>  3. Ed and I have been thinking about releasing a more agile, friendly
>     fork of PDL under the PDLA namespace (for PDL Agile). The
>     repositories will continue to live under the PDLPorters GitHub
>     organisation.
>
>     We will start by applying the split. This will be followed by
>     improving code coverage, fixes to the 64-bit indexing, formalising
>     the badvalue semantics for more functions, and bug-fixes.
>
>     We plan on making sure that libraries such as PDL-Stats, PDL-IO-CSV,
>     etc. remain compatible with this library. I believe there is a way
>     to do this without making changes to the original code (via a subref
>     in @INC).
>
>  4. The modules that come from the split will each be improved so that
>     they are easy to install on their own. We already have plans to
>     write Alien::Base modules for all of them.
>
>  5. In parallel with this, we will begin reaching out to distribution
>     packagers. PDL has not been updated on many of them (some of which
>     are on 2.4.x). This is already on the wishlist at
>     <https://github.com/PDLPorters/pdl/issues/139>.
>
>  6. The current PDL distribution will remain as it is. Bugfixes will
>     continue on PDL and they will be backported from PDLA. This approach
>     has worked well for IPython/Jupyter (which underwent a split earlier
>     this summer)[^jupyter-split]. Back porting fixes was a large part
>     of what they had to go through.
>
>  7. Eventually, after we are sure that PDLA has maintained
>     compatibility with PDL, the changes of PDLA will replace the
>     current PDL repository.
>
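
A minimal sketch of the "subref in @INC" idea from item 3, assuming the goal is for an unmodified `use PDL::Something` in downstream code to pick up the PDLA version instead. Everything named here is hypothetical: `PDLA::Demo` is an inline stand-in for a real PDLA module, and aliasing the package symbol table is just one possible way to do the redirect, not necessarily what PDLA would ship.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an installed PDLA module, defined inline so the
# sketch runs without PDLA on the system.
BEGIN {
    package PDLA::Demo;
    sub answer { 42 }
    $INC{'PDLA/Demo.pm'} = __FILE__;    # mark it as already loaded
}

# @INC hook: a request for PDL/<Name>.pm is answered with a generated stub
# that loads PDLA::<Name> and aliases the PDL::<Name> package to it.
# The top-level PDL.pm is deliberately not matched by this sketch.
unshift @INC, sub {
    my ($self, $file) = @_;             # $file looks like "PDL/Demo.pm"
    return unless $file =~ m{\APDL/(.+)\.pm\z};
    (my $name = $1) =~ s{/}{::}g;       # "IO/CSV" becomes "IO::CSV"

    my $stub = <<"END_STUB";
require PDLA::$name;
*{'PDL::${name}::'} = \\*{'PDLA::${name}::'};
1;
END_STUB

    # Hand the stub back to require as an in-memory filehandle.
    open my $fh, '<', \$stub or die "cannot open in-memory stub: $!";
    return $fh;
};

# Downstream code stays unchanged and still asks for the PDL:: name.
require PDL::Demo;
my $value = PDL::Demo->answer;
print "$value\n";
```

Because the hook only answers for names under `PDL/`, every other module falls through to the normal @INC search. Aliasing whole packages may have side effects (for example in what `ref` and `caller` report), so a real implementation would need testing against PDL-Stats, PDL-IO-CSV and friends.
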
> Finally, I also have some ideas for PDL3 that I will post in about a
> month's time. One of the top priorities on the feature list of PDL3's C
> API needs to be the ability to do optimisations such as loop fusion. I
> need to ponder on how to combine this with the Moo-like metaprogramming
> that we envision. The Julia developers seem to be working on this, but
> there are still big unresolved questions on the issue tracker.
>
> By the way, I think it might be better to avoid putting a number in the
> name of this next major version of PDL. It's a personal opinion that
> stems from marketing issues that are similar to what happened with
> Osborne 1 <https://en.wikipedia.org/wiki/Osborne_effect> and somewhat
> with Perl 6. This isn't a strongly held opinion, but I feel that it is
> worth bringing up.
>
> [^jupyter-split]: http://blog.jupyter.org/2015/04/15/the-big-split/
>
> Cheers,
> - Zaki Mughal
>
>
> --Chris
>
>
>
>
> _______________________________________________
> pdl-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/pdl-devel
>
>

