Hi,

I have just shared my draft GSoC proposal, "Port Codes from Commons Math":
<https://docs.google.com/document/d/1sqSa0hrYc2AD75RZyJRkeqCOBOqTOeMnPaBsE9U5YhU/edit>

Devs, would you please review it? I welcome any suggestions to improve it.

Best Regards,
Gimhana

On 17 March 2018 at 05:06, Gilles <gil...@harfang.homelinux.org> wrote:

> Hi.
>
> On Fri, 16 Mar 2018 23:12:38 +0530, Gimhana Nadeeshan wrote:
>
>> Hi devs,
>>
>> Sorry for the delayed reply due to my academics.
>>
>>> If you want to start playing with the code, we could just begin
>>> by having discussions here (on design) and on JIRA (for processing
>>> minor issues) based on the current state of your repository.
>>> [What's the link to look it up?]
>>
>> Should I create my own repo and start code in there? [Not in the forked
>> repo]
>
> What's the difference? IOW, someone else should answer. :-}
>
>> Actually it will be more helpful to me if someone [ @Gilles or @Eric ] can
>> guide me more. Like, to give me some minor issues in the current
>> implementation to solve or as a new feature implementation and gradually
>> we can go for deeper
>
> IMO, the top priority would be to release "Commons Numbers":
> http://commons.apache.org/proper/commons-numbers/
>
> There are some blocking issues on JIRA:
> https://issues.apache.org/jira/projects/NUMBERS
>
>> and eventually I can go further my my own way. Then I
>> can gradually familiar with the code and I think it is the most efficient
>> way to learn the design architecture. [I spent hours to understand the
>> current code basis and I felt that was not so efficient as I thought]
>
> Refactoring the package "stat" is not straightforward...
> However, to get to that, it would be useful to record your thoughts
> as you browse through the code(s): what seems easy to port, what should
> be changed/fixed, what you don't understand, and so on.
>
>> And if there is a format of Proposal regarding ASF ?
>
> I don't think so. This ML is the forum where project directions
> are discussed.
>
>> If not what should I mention in the proposal basically?
>
> This can be a work in progress, I think (see above suggestions).
>
> Best regards,
> Gilles
>
>> Best Regards,
>>
>> On 14 March 2018 at 19:07, Gilles <gil...@harfang.homelinux.org> wrote:
>>
>>> Hi.
>>>
>>> On Tue, 13 Mar 2018 23:37:24 +0530, Gimhana Nadeeshan wrote:
>>>
>>>> Hello Devs,
>>>>
>>>> Thanks Gilles and Eric for guidance.
>>>>
>>>> I have cloned the Commons repos and forked the Common's Stat repo. Is it
>>>> possible to make pull requests to that repo to be reviewed?
>>>
>>> That's certainly possible, but I'm afraid that it will become
>>> quite unwieldy from my side if I have to delete/create branches
>>> for every PR.
>>>
>>> If you want to start playing with the code, we could just begin
>>> by having discussions here (on design) and on JIRA (for processing
>>> minor issues) based on the current state of your repository.
>>> [What's the link to look it up?]
>>>
>>>> Or should I follow a specific method?
>>>
>>> I'll inquire about a more efficient method (than the above)...
>>>
>>>> By referring the API docs I got some idea of the separation of modules.
>>>>
>>>> In the current Commons's stat repo there are some classes under the
>>>> package distribution. I think those can be refactored using java 8 in
>>>> build statistics functionalities. Please correct me if I wrong.
>>>
>>> An example perhaps?
>>>
>>>> As Eric said separation of function and streaming implementations is good
>>>> idea as designing. (In my point of view, it means method overloading ->
>>>> Again correct me if I didn't understand your fact correctly)
>>>
>>> ?
>>>
>>>> And I will share my draft proposal here for your review soon.
>>>
>>> OK.
>>>
>>> Thanks again for your interest,
>>> Gilles
>>>
>>>> Best Regards.
>>>>
>>>> On 13 March 2018 at 20:50, Gilles <gil...@harfang.homelinux.org> wrote:
>>>>
>>>>> Hello.
>>>>>
>>>>> On Tue, 13 Mar 2018 09:25:19 +0100, Eric Barnhill wrote:
>>>>>
>>>>>> On Tue, Mar 13, 2018 at 12:47 AM, Gilles <gil...@harfang.homelinux.org>
>>>>>> wrote:
>>>>>>
>>>>>>>> Where can we find the old code before port into new Commons
>>>>>>>> components?
>>>>>>>
>>>>>>> The code bases are managed by the "git" software; the whole history
>>>>>>> is available:
>>>>>>> https://git1-us-west.apache.org/repos/asf?p=commons-math.git;a=log
>>>>>>>
>>>>>>> [I'd advise to "clone" the repositories on your local computer, and
>>>>>>> use the command line tools.]
>>>>>>
>>>>>> I believe you will want to clone the commons-math repositories, but
>>>>>> then develop your own "fork" of the commons-statistics repository.
>>>>>> Gilles can correct me if that is wrong.
>>>>>
>>>>> Actually, I know only my workflow:
>>>>>   $ git clone ...
>>>>>   $ git branch ...
>>>>>   $ git commit ...
>>>>>   $ git push
>>>>>
>>>>> :-}
>>>>>
>>>>> I didn't find it very easy to cooperate with developers who
>>>>> fork on GitHub and submit PRs.
>>>>> I've now found the "git" command that creates a branch from
>>>>> a PR, but it would be so much more comfortable to just switch
>>>>> directory and do "git pull".
>>>>>
>>>>> In the context of GSoC, would it be possible to grant some
>>>>> privilege to non-committers so that they can update a selected
>>>>> "git" repository?
>>>>> If not, what is the next easiest way to share a "common space"
>>>>> (aka "sandbox") from which it would be easy to copy reviewed
>>>>> bits over to the official source repository?
>>>>>
>>>>>>>> As you mentioned it will be a good approach to redesign process.
>>>>>>>
>>>>>>> You don't necessarily need to analyze how the code was before
>>>>>>> the port/refactoring; looking at how it is now is sufficient,
>>>>>>> unless you suspect that something is wrong now and might have
>>>>>>> been better before. ;-)
>>>>>>
>>>>>> In particular, the statistics library was designed before Java 8.
>>>>>> Java 8 however has provided both efficient programming strategies for
>>>>>> these statistical methods (in the form of lambdas and streams) as well
>>>>>> as some built-in methods providing summary statistics functions (see
>>>>>> discussion at http://markmail.org/message/7t2mjaprsuvb3waj).
>>>>>
>>>>> Very good point, indeed.
>>>>> IMO, the new component should be targeted Java 8.
>>>>> Even Java 9 (enforcing modularity with JPMS): if by the time we think
>>>>> of releasing the code, we still want to avoid "multi-release" JARs it
>>>>> will be easy to just remove the "module-info" files (I don't think much
>>>>> else Java 9 specific would used by "Commons Statistics").
>>>>>
>>>>> In fact, given the very slow pace at which new components are being
>>>>> brought to releasable state, I'd like to ask whether it would be OK
>>>>> to make "incremental" releases? That would mean: focus on (maven)
>>>>> modules that seem close to feature-complete and bug-free, fix the
>>>>> remaining issues and perform a release with that module added.
>>>>>
>>>>> It seems that the expectations were set to high (content-wise given
>>>>> the amount of human resources), so that neither CM can be released
>>>>> (too many non-fixed issues) nor its "Commons Numbers" spin-off that
>>>>> contains many modules, some of which are blocked by lack of consensus
>>>>> or dangling discussions.
>>>>>
>>>>>> It probably makes sense, as a design strategy, to separate the function
>>>>>> implementation from the streaming implementation.
>>>>>> For example, a 2D integer array will probably require a different
>>>>>> streaming implementation than a 1D double array, but they can probably
>>>>>> both be passed the same function handle to collect, say, the mean or
>>>>>> max value.
>>>>>>
>>>>>> The role of commons might then be to provide a convenient interface,
>>>>>> so that the user can simply call a static method like
>>>>>> SummaryStats.mean() and not have to worry about the implementation.
>>>>>>
>>>>>> The other difficulty I see, is that quantile and median statistics
>>>>>> will not be as easy to stream as statistics with a closed-form solution
>>>>>> like mean or variance. There may however be great algorithms out there
>>>>>> for pulling the median or the 95% quantile out of a stream -- if so
>>>>>> they should be used.
>>>>>>
>>>>>> Eric
>>>>>
>>>>> Eric,
>>>>>
>>>>> Would you be the official "mentor" for the GSoC participants that
>>>>> are interested in helping with the porting of "o.a.c.math4.stat"?
>>>>>
>>>>> Thank you,
>>>>> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
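
To make the design idea quoted above more concrete (separating the statistic
itself from the way the data are streamed), here is a minimal Java 8 sketch.
It is only an illustration under my own assumptions, not the Commons
Statistics API: the class name SummaryStatsSketch, the constants MEAN and MAX,
and the stream(...) helpers are all hypothetical. Only the idea of a static
convenience call in the style of SummaryStats.mean() comes from the thread.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;
import java.util.stream.DoubleStream;

// Hypothetical sketch; none of these names exist in Commons Statistics.
final class SummaryStatsSketch {

    // "Function" part: a statistic reduces a stream of doubles to one value.
    // It knows nothing about where the values come from.
    static final ToDoubleFunction<DoubleStream> MEAN =
        s -> s.average().orElse(Double.NaN);
    static final ToDoubleFunction<DoubleStream> MAX =
        s -> s.max().orElse(Double.NaN);

    // "Streaming" part: one adapter per data layout.
    public static DoubleStream stream(double[] data) {
        return Arrays.stream(data);
    }

    public static DoubleStream stream(int[][] data) {
        // Flatten the rows, then widen each int to a double.
        return Arrays.stream(data).flatMapToInt(Arrays::stream).asDoubleStream();
    }

    // Convenience facade: callers do not see streams at all.
    public static double mean(double[] data) {
        return MEAN.applyAsDouble(stream(data));
    }

    public static double mean(int[][] data) {
        return MEAN.applyAsDouble(stream(data));
    }

    public static double max(int[][] data) {
        return MAX.applyAsDouble(stream(data));
    }

    public static void main(String[] args) {
        System.out.println(mean(new double[] {1.0, 2.0, 3.0}));
        System.out.println(mean(new int[][] {{1, 2}, {3, 6}}));
        System.out.println(max(new int[][] {{1, 2}, {3, 6}}));
    }
}
```

The same MEAN handle is reused for both layouts; only the stream(...) adapter
differs. Median and quantiles would not fit this single-pass reduction so
directly, which is the difficulty Eric points out.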