Hi,

I have just shared my draft proposal for GSoC, "Port Codes from Commons Math":
<https://docs.google.com/document/d/1sqSa0hrYc2AD75RZyJRkeqCOBOqTOeMnPaBsE9U5YhU/edit>
Devs, would you please review it? I always welcome your valuable
suggestions to improve it.

Best Regards,
Gimhana

On 17 March 2018 at 05:06, Gilles <gil...@harfang.homelinux.org> wrote:

> Hi.
>
> On Fri, 16 Mar 2018 23:12:38 +0530, Gimhana Nadeeshan wrote:
>
>> Hi devs,
>>
>> Sorry for the delayed reply due to my academics.
>>
>>
>>> If you want to start playing with the code, we could just begin
>>> by having discussions here (on design) and on JIRA (for processing
>>> minor issues) based on the current state of your repository.
>>> [What's the link to look it up?]
>>>
>>>
>> Should I create my own repo and start coding there? [Not in the forked
>> repo]
>>
>
> What's the difference?  IOW, someone else should answer. :-}
>
>> Actually, it would be more helpful to me if someone [ @Gilles or @Eric ]
>> could guide me more, for example by giving me some minor issues in the
>> current implementation to solve, or a new feature to implement; then we
>> can gradually go deeper
>>
>
> IMO, the top priority would be to release "Commons Numbers":
>   http://commons.apache.org/proper/commons-numbers/
>
> There are some blocking issues on JIRA:
>   https://issues.apache.org/jira/projects/NUMBERS
>
>> and eventually I can go further on my own.  Then I can gradually
>> become familiar with the code, and I think that is the most efficient
>> way to learn the design architecture. [I spent hours trying to understand
>> the current code base, and I felt that was not as efficient as I thought.]
>>
>
> Refactoring the package "stat" is not straightforward...
> However, to get to that, it would be useful to record your thoughts
> as you browse through the code(s): what seems easy to port, what should
> be changed/fixed, what you don't understand, and so on.
>
>
>> And is there a proposal format required by the ASF?
>>
>
> I don't think so.  This ML is the forum where project directions
> are discussed.
>
>> If not, what should I
>> basically mention in the proposal?
>>
>
> This can be a work in progress, I think (see above suggestions).
>
> Best regards,
> Gilles
>
>
>
>> Best Regards,
>>
>>
>>
>>
>> On 14 March 2018 at 19:07, Gilles <gil...@harfang.homelinux.org> wrote:
>>
>> Hi.
>>>
>>> On Tue, 13 Mar 2018 23:37:24 +0530, Gimhana Nadeeshan wrote:
>>>
>>> Hello Devs,
>>>>
>>>> Thanks Gilles and Eric for guidance.
>>>>
>>>> I have cloned the Commons repos and forked the Commons Statistics repo.
>>>> Is it possible to make pull requests to that repo for review?
>>>>
>>>>
>>> That's certainly possible, but I'm afraid that it will become
>>> quite unwieldy from my side if I have to delete/create branches
>>> for every PR.
>>>
>>> If you want to start playing with the code, we could just begin
>>> by having discussions here (on design) and on JIRA (for processing
>>> minor issues) based on the current state of your repository.
>>> [What's the link to look it up?]
>>>
>>>> Or should I
>>>> follow a specific method?
>>>>
>>>>
>>> I'll inquire about a more efficient method (than the above)...
>>>
>>>> By referring to the API docs, I got some idea of the separation of modules.
>>>
>>>>
>>>> In the current Commons Statistics repo there are some classes under the
>>>> package "distribution". I think those can be refactored using Java 8's
>>>> built-in statistics functionality. Please correct me if I am wrong.
>>>>
>>>>
>>> An example perhaps?
>>>
>>>> As Eric said, separating the function and streaming implementations is a
>>>> good design idea. (From my point of view, it means method overloading;
>>>> again, correct me if I didn't understand your point correctly.)
>>>>
>>>>
>>> ?
>>>
>>> And I will share my draft proposal here for your review soon.
>>>
>>>>
>>>>
>>> OK.
>>>
>>> Thanks again for your interest,
>>> Gilles
>>>
>>>
>>>
>>> Best Regards.
>>>>
>>>> On 13 March 2018 at 20:50, Gilles <gil...@harfang.homelinux.org> wrote:
>>>>
>>>> Hello.
>>>>
>>>>>
>>>>> On Tue, 13 Mar 2018 09:25:19 +0100, Eric Barnhill wrote:
>>>>>
>>>>>> On Tue, Mar 13, 2018 at 12:47 AM, Gilles <gil...@harfang.homelinux.org>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>> Where can we find the old code before the port into new Commons
>>>>>>>> components?
>>>>>>>>
>>>>>>> The code bases are managed by the "git" software; the whole history is
>>>>>>> available:
>>>>>>>   https://git1-us-west.apache.org/repos/asf?p=commons-math.git;a=log
>>>>>>>
>>>>>>> [I'd advise to "clone" the repositories on your local computer, and
>>>>>>> use the command line tools.]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> I believe you will want to clone the commons-math repositories, but then
>>>>>> develop your own "fork" of the commons-statistics repository. Gilles can
>>>>>> correct me if that is wrong.
>>>>>>
>>>>>>
>>>>> Actually, I know only my workflow:
>>>>>  $ git clone ...
>>>>>  $ git branch ...
>>>>>  $ git commit ...
>>>>>  $ git push
>>>>>
>>>>> :-}
>>>>>
>>>>> I didn't find it very easy to cooperate with developers who
>>>>> fork on GitHub and submit PRs.
>>>>> I've now found the "git" command that creates a branch from
>>>>> a PR, but it would be so much more comfortable to just switch
>>>>> directory and do "git pull".
>>>>>
>>>>> In the context of GSoC, would it be possible to grant some
>>>>> privilege to non-committers so that they can update a selected
>>>>> "git" repository?
>>>>> If not, what is the next easiest way to share a "common space"
>>>>> (aka "sandbox") from which it would be easy to copy reviewed
>>>>> bits over to the official source repository?
>>>>>
>>>>>
>>>>>>>> As you mentioned, it will be a good approach to the redesign process.
>>>>>>>>
>>>>>>> You don't necessarily need to analyze how the code was before
>>>>>>> the port/refactoring; looking at how it is now is sufficient,
>>>>>>> unless you suspect that something is wrong now and might have
>>>>>>> been better before. ;-)
>>>>>>>
>>>>>>>
>>>>>> In particular, the statistics library was designed before Java 8. Java 8,
>>>>>> however, has provided both efficient programming strategies for these
>>>>>> statistical methods (in the form of lambdas and streams) as well as some
>>>>>> built-in methods providing summary statistics functions (see discussion at
>>>>>> http://markmail.org/message/7t2mjaprsuvb3waj).
>>>>>>
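As a small illustration of the built-in summary statistics mentioned above,
here is a minimal sketch using the JDK's own java.util.DoubleSummaryStatistics.
The class name BuiltInSummaryExample and the helper summarize() are made up for
this example; they are not part of Commons Math or Commons Statistics.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

// Hypothetical example class; only the java.util classes are real JDK API.
public class BuiltInSummaryExample {

    /** Collects count, sum, min, max and mean in a single pass over the data. */
    static DoubleSummaryStatistics summarize(double[] values) {
        return DoubleStream.of(values).summaryStatistics();
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};
        DoubleSummaryStatistics stats = summarize(data);
        System.out.println("count = " + stats.getCount());
        System.out.println("mean  = " + stats.getAverage());
        System.out.println("min   = " + stats.getMin());
        System.out.println("max   = " + stats.getMax());
    }
}
```

Note that DoubleSummaryStatistics covers only count, sum, min, max and
average; it offers no variance or quantiles, so a statistics component would
presumably still have to supply those itself.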
>>>>>>
>>>>> Very good point, indeed.
>>>>> IMO, the new component should target Java 8.
>>>>> Even Java 9 (enforcing modularity with JPMS): if by the time we think
>>>>> of releasing the code, we still want to avoid "multi-release" JARs it
>>>>> will be easy to just remove the "module-info" files (I don't think much
>>>>> else Java 9 specific would be used by "Commons Statistics").
>>>>>
>>>>> In fact, given the very slow pace at which new components are being
>>>>> brought to releasable state, I'd like to ask whether it would be OK
>>>>> to make "incremental" releases?  That would mean: focus on (maven)
>>>>> modules that seem close to feature-complete and bug-free, fix the
>>>>> remaining issues and perform a release with that module added.
>>>>>
>>>>> It seems that the expectations were set too high (content-wise given
>>>>> the amount of human resources), so that neither CM can be released
>>>>> (too many non-fixed issues) nor its "Commons Numbers" spin-off that
>>>>> contains many modules, some of which are blocked by lack of consensus
>>>>> or dangling discussions.
>>>>>
>>>>>> It probably makes sense, as a design strategy, to separate the function
>>>>>> implementation from the streaming implementation. For example, a 2D
>>>>>> integer array will probably require a different streaming implementation
>>>>>> than a 1D double array, but they can probably both be passed the same
>>>>>> function handle to collect, say, the mean or max value.
>>>>>>
>>>>>> The role of Commons might then be to provide a convenient interface, so
>>>>>> that the user can simply call a static method like SummaryStats.mean()
>>>>>> and not have to worry about the implementation.
>>>>>>
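One possible reading of the separation described above, as a rough sketch
only: the statistic is defined once as a function handle over a DoubleStream,
each data layout gets its own small streaming adapter, and static methods tie
the two together. Only the name SummaryStats.mean() comes from the message;
the MEAN and MAX handles, the stream() adapters and the max() method are
assumptions made for this illustration, not an existing Commons API.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;
import java.util.stream.DoubleStream;

// Hypothetical facade; nothing here exists in Commons Math or Commons Statistics.
public class SummaryStats {

    // "Function" side: each statistic is written once, independent of data layout.
    static final ToDoubleFunction<DoubleStream> MEAN =
        s -> s.average().orElse(Double.NaN);
    static final ToDoubleFunction<DoubleStream> MAX =
        s -> s.max().orElse(Double.NaN);

    // "Streaming" side: one adapter per data layout.
    static DoubleStream stream(double[] data) {
        return Arrays.stream(data);
    }

    static DoubleStream stream(int[][] data) {
        // Flatten the rows, then widen each int to a double.
        return Arrays.stream(data).flatMapToInt(Arrays::stream).asDoubleStream();
    }

    // Convenience layer: the caller never sees streams or function handles.
    public static double mean(double[] data) {
        return MEAN.applyAsDouble(stream(data));
    }

    public static double mean(int[][] data) {
        return MEAN.applyAsDouble(stream(data));
    }

    public static double max(int[][] data) {
        return MAX.applyAsDouble(stream(data));
    }

    public static void main(String[] args) {
        System.out.println(mean(new double[] {1.0, 2.0, 3.0}));
        System.out.println(mean(new int[][] {{1, 2}, {3, 4}}));
        System.out.println(max(new int[][] {{1, 2}, {3, 4}}));
    }
}
```

In this sketch the overloads appear only in the thin convenience layer; the
statistic itself is not duplicated per input type. Empty input yields NaN
here, which is just one possible convention.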
>>>>>> The other difficulty I see is that quantile and median statistics will
>>>>>> not be as easy to stream as statistics with a closed-form solution like
>>>>>> mean or variance. There may, however, be great algorithms out there for
>>>>>> pulling the median or the 95% quantile out of a stream -- if so they
>>>>>> should be used.
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>
>>>>> Eric,
>>>>>
>>>>> Would you be the official "mentor" for the GSoC participants that
>>>>> are interested in helping with the porting of "o.a.c.math4.stat"?
>>>>>
>>>>> Thank you,
>>>>> Gilles
>>>>>
>>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>
