Hi,

Thanks for your interest in Apache Commons.

The GSoC project for Statistics is part of the ongoing effort to refactor the large Commons Math (CM) component into smaller modular components (see [1-5]). I have CC'd the Commons developers' list on this e-mail. If you subscribe you will be able to track all the discussion on GSoC by searching the subject for the GSoC tag.

The suggested project, STATISTICS-54 ([6]), is to develop the various univariate statistics in CM for use in Java 8 streams. You can see the statistics in the latest javadoc for CM ([7]); the relevant packages are under 'descriptive'. A starting point would be to look at the storeless statistics such as mean, variance and moments, as well as the summary statistics classes, which group together more than one statistic.

The project would be to develop an API that complements the SummaryStatistics classes in Java (see [8]) for double, long and int. In general, a collector for a stream has to be able both to accept a single value and to be combined with another collector to create an aggregate, e.g.:

    Mean.add(double)
    Mean.add(Mean)

This is to allow parallel stream support.

Currently the JDK only offers a summary containing min, max, count, average and sum. Extending this would involve developing some aggregator classes for individual statistics, and some type of generic aggregator class that can be constructed to summarise the statistics of interest, e.g. mean and standard deviation; the statistics could be user-configurable.

Please take a look at the current code in CM and then ask any questions, either on the dev mailing list or on the Jira ticket.

If you wish to register for a Jira account so that you can track the GSoC issue, see [9, 10]. Send your preferred username, alternate username and display name to priv...@commons.apache.org and we shall create an account for you.
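
As a rough illustration of the accept/combine pattern mentioned earlier in this message (Mean.add(double) and Mean.add(Mean)), here is a minimal sketch of a storeless mean that can be used with DoubleStream.collect. The class name MeanSketch and the getCount()/getAsDouble() accessors are hypothetical and are not taken from CM or Commons Statistics. The running-mean update is one common choice, not necessarily what the final API would use.

```java
import java.util.stream.DoubleStream;

/** Hypothetical storeless mean; NOT the Commons Math or Commons Statistics API. */
final class MeanSketch {
    private long n;       // number of values seen so far
    private double mean;  // running mean of those values

    /** Accepts a single value (the accumulator step of a stream collect). */
    void add(double value) {
        n++;
        mean += (value - mean) / n;
    }

    /** Merges another partial result (the combiner step used by parallel streams). */
    void add(MeanSketch other) {
        if (other.n == 0) {
            return; // nothing to merge
        }
        if (n == 0) {
            // This instance is empty: take the other's state unchanged.
            n = other.n;
            mean = other.mean;
            return;
        }
        long total = n + other.n;
        // Weighted update: shift this mean towards the other by its share of the total.
        mean += (other.mean - mean) * other.n / total;
        n = total;
    }

    long getCount() {
        return n;
    }

    /** Returns NaN when no values have been added (an assumption of this sketch). */
    double getAsDouble() {
        return n == 0 ? Double.NaN : mean;
    }

    public static void main(String[] args) {
        MeanSketch m = DoubleStream.of(1, 2, 3, 4)
            .parallel()
            .collect(MeanSketch::new, MeanSketch::add, MeanSketch::add);
        System.out.println(m.getCount() + " values, mean = " + m.getAsDouble());
    }
}
```

The two add overloads map directly onto the accumulator and combiner arguments of collect, which is what lets a parallel stream compute partial means on separate threads and merge them afterwards.
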

Regards,
Alex

[1] https://commons.apache.org/proper/commons-rng/
[2] https://commons.apache.org/proper/commons-geometry/
[3] https://commons.apache.org/proper/commons-statistics/
[4] https://commons.apache.org/proper/commons-numbers/
[5] https://commons.apache.org/proper/commons-math/
[6] https://issues.apache.org/jira/browse/STATISTICS-54
[7] https://commons.apache.org/proper/commons-math/javadocs/api-4.0-beta1/index.html
[8] https://docs.oracle.com/javase/8/docs/api/java/util/DoubleSummaryStatistics.html
[9] https://infra.apache.org/jira-guidelines.html
[10] https://issues.apache.org/jira/secure/Dashboard.jspa

On Fri, 24 Feb 2023 at 21:03, Md Tanvir Alfesani <tanviralfesani3...@gmail.com> wrote:
>
> I hope this email finds you well. My name is Md Tanvir Alfesani and I'm a
> student who is interested in contributing to Apache Foundation's project
> 'Summary Statistics API for JAVA 8 Streams', for Google Summer of Code 2023.
>
> As I was going through the project idea, I realized that I need to learn more
> about the project. I'm particularly interested in the functionalities of the
> Common Statistics Library and how to access them to get a good idea about the
> aforesaid project. I would appreciate any advice or resources you could
> provide to help me prepare for the project.
>
> Thank you for taking the time to read my email. I am looking forward to
> hearing from you and hopefully working together on the project.
>
> Best regards,
> Md Tanvir Alfesani